Tutorial 1: Working with graphs and networks using networkx, graph formats, drawing graphs

NetworkX library

During this tutorial we try some basic manipulation with graphs and networks. We will use package networkx for Python.

Before using this package, it is necessary to install it into your Python, for example by running command pip3 install networkx.

This package contains many important functionalities (as well as many predefined graphs, e.g. the Petersen graph). During the tutorials, we will work with only some of them. If you'll need anything while doing network analysis, I recommend to search the documentation: https://networkx.org/documentation/stable/

In this package we will use the Graph object.

This object allows many basic operations such as adding a vertex

or adding list of vertices.

We can list all vertices using

Of course it is also possible to add edges, again one by one

or using a list

Similarly we can also remove nodes

and edges

It also might be useful to update the graph using a list of edges and vertices.

Following function returns all neighbours of a given vertex.

If we want to iterate through neighbours of all vertices, there are also more sophisticated and effective functions G.adjacency() and G.adj, see the docs for more details.

Complete list of all methods on object Graph is here: https://networkx.org/documentation/stable/reference/classes/graph.html?highlight=graph#networkx.Graph

Task 1: Implement function cycle(n), which returns a cycle of length n.

Task 2: Implement function cartesian(G,H), which for two objects of type Graph returns their Cartesian product.

Cartesian product for graphs $G = (V, E)$ and $H = (W, F)$ is a graph with vertex set $V \times W$. Let $u,v \in V$ and $x,y \in W$. Then $(u,x)$ is adjacent to $(v,y)$ if

Formats for saving graphs

Sometimes we work with graphs in various programming languages or software. For this we need different graph formats. We will discuss some of them below. We use path of length five as a simple showcase.

Adjacency matrix

First possibility to represent graphs is through their adjacency matrix. Adjacency matrix $A$ is defined by $A_{ij} = 1$ for $(i,j) \in E(G)$ and $A_{ij} = 0$ otherwise. With respect to space complexity of this representation it is suitable mainly for dense graphs, i.e. graphs with many edges.

Task 3: Implement function adjacency_matrix(G), which for given graph returns adjacency matrix as a list of its rows.

List of edges

Handling graphs as a list of edges is quite common. We have already seen above how to generate the list of edges from given Graph object.

GML

Now we get to first 'real' format. Let's take a look.

More details to this format, including information about reading a graph in this form, are on https://networkx.org/documentation/stable/reference/readwrite/gml.html?highlight=gml#module-networkx.readwrite.gml

GraphML

Keep in mind that this is different from GML. This is a special version of the xml format.

Other details are listed here: https://networkx.org/documentation/stable/reference/readwrite/graphml.html

JSON

This is a universal format, which is very easy to generate and read in Python. For example, we can export the list of edges of adjacency matrix in this format. There are programs using more specified JSON as one of nativ ways to save graphs, for example Cytoscape, which we'll see later. More information about reading and generating graphs in JSON is on https://networkx.org/documentation/stable/reference/readwrite/json_graph.html

SparseGraph6

This format is suitable for compact saving of unoriented graphs using ASCII characters. The main difference from the previous formats is that it is not human readable. Main benefit of using it is low space complexity. Nice example of the usage of this format are pre-generated lists of all unisomorphic graphs on given number of vertices, which can be found on the webpages of professor Brendan McKay: https://users.cecs.anu.edu.au/~bdm/data/graphs.html

This format has two variants, Graph6 for small or very dense graphs, i.e. with approximately quadratic number of edges, and Sparse6 for big dense graphs, i.e. with approximately linear number of edges. More information about the import and export of graphs in this form can be again found in the documentation to NetworkX: https://networkx.org/documentation/stable/reference/readwrite/sparsegraph6.html

Pajek

This is another one of many formats for saving graphs. Its specification is on http://vlado.fmf.uni-lj.si/pub/networks/pajek/doc/draweps.htm

Other formats supported by NetworkX can be found on https://networkx.org/documentation/stable/reference/readwrite/index.html

Task 4: Read the graph http://vlado.fmf.uni-lj.si/pub/networks/data/sport/football.htm and export it to some other graph formats.

By formats Sparse6 and Graph6 we bump into the problem that they are defined only for unoriented graphs, while the downloaded network is oriented. This problem might occur more frequently, so let's see how to solve it. Typically, oriented graph is made unoriented by forgetting the orientation of the edges. The method G.to_undirected() can be used to handle this. However, in this case it results in unoriented multigraph. In our case we can solve it by following simple trick.

Drawing graphs

While working with graphs and networks, it will be handy to visualize them. Mostly we'll need not only any picture, but a drawing which will help us to understand the main structures inside the given network.

Basic drawing and its setup is included inside NetworkX. More exactly, NetworkX generates set of coordinates which can be visualized by a different program or package, e.g. matplotlib. Again it is necessary to install this package, for example by calling pip install matplotlib.

Note that if you do not run pyplot in Jupyter notebook, you also need to call plt.show() to open window with the result.

Layouts are the main tool for the setup of the view. Simply said, we can choose a formation in which the vertices are layed. (Un)suitable combinations of graphs and layouts are shown below.

Most frequently used layouts are directly supported by variants of the function draw.

Other layouts and possibilities to draw graphs directly using NetworkX are described in the documentation: https://networkx.org/documentation/stable/reference/drawing.html#

Task 5: Try to find a suitable layout for the football network from previous task.

Package NetworkX is primarily made for computational analysis of networks. Suitable drawing of graphs is a complex problem and hence it is often best to use a specialized software.

GraphViz

GraphViz is a set of a dozen programs specialized in drawing graphs and networks. The programs are able to find quite suitable drawings of different stuctures and furthermore they are easy to configure. Namely program Dot may come handy not only during network analysis, but also whenever you store some complicated data structure and you want to make sure that pointers are everywhere set correctly. More information about the program Dot can be found in the documentation: https://www.graphviz.org/pdf/dotguide.pdf

It is again necessary to install GraphViz, e.g. by pip install pygraphviz.

A picture is generated from this file by program dot as follows: dot -Tjpg heawood.dot > heawood.jpg

heawood.jpg

football.jpg

Cytoscape

This software is specifically made for the analysis of complex networks. Apart from computing of some network parameters, which we'll discuss later, this program can be used also for drawing of the networks. One of the main benefits of this program is that you can change the resulting picture not only by changing the parameters for drawing, but also by manually moving the vertices. This may be especially useful during drawing of small networks.

More information is available and download is possible on https://cytoscape.org/

football-tree.png

football-tree-orthogonal.png

football.xml.png

football-radial.png

football-hierarchic.png