Tutorial 4: Betweenness centrality

Recapitulation of theory

Betweenness centrality is defined as follows. Let

$\sigma_{x,y}$ number of shortest paths between $x$ and $y$ and for

$\sigma_{x,y}(v)$ number of shortest paths between $x$ and $y$ passing through $v$, then

$$C_B(v) = \sum_{\substack{x,y \in V(G) \setminus v\\x \neq y}} \frac{\sigma_{x,y}(v)}{\sigma_{x,y}}$$

Global betweenness centrality $$\bar{C}_B(G) = \frac{1}{n} \sum_{v \in V(G)} C_B(v)$$

Average shortest path length $$L(G) = \frac{1}{n(n-1)} \sum_{\substack{u,v \in V(G)\\ u \neq v}} d(u,v)$$

Theorem Let $G$ be unoriented graph of size $|V(G)| = n$, $L(G)$ its average path length and $\overline{C}_B(G)$ its global betweeness centrality. Then it holds that $\overline{C}_B(G) = (n - 1) (L(G) - 1)$.

Lemma Let $G$ be a graph of size $n$ and let $G'$ be a graph obtained from $G$ by adding an edge between random vertices $u,v$ in distance $d = d(u,v) > 1$. Then it holds that $$\overline{C}_B(G') \leq \overline{C}_B(G) - \frac{2(d - 1)}{n}.$$

A vertex is called a leaf if $\deg(v) = 1$ and a k-neighbourhood of a vertex is defined as $$\Gamma_k(v) = \{v\in V : d(u,v) = k\}.$$

Theorem Let $G$ be a graph of size $n$, $w \in V$ a vertex with eccentricity $\epsilon(w)$ and let $G'$ be a graph obtained from $G$ by adding a vertex $u$ which is connected to $G$ by the edge $\{u,w\}$. Then $$\overline{C}_B(G') = \frac{n}{n - 1}\overline{C}_B(G) + \frac{2}{n - 1}\sum_{k=1}^{\epsilon(w)} k|\Gamma_k(w)|.$$

Task 1: Express the betweenness centrality of vertices in

Use $n$ as number of vertices in these graphs. Use some additional parameter to express betweenness if necessary.

Solution:

For vertex $x_i \in P_n$, where $P_n$ is a path on $n$ vertices $\{x_1, \dots, x_n\}$ we have exactly one shortest path leading through $x_i$ between each pair of vertces where one is to the left of $x_i$ and the other one is to the right of $x_i$: $$ B(x_i) = (\sum_{j=1}^{i-1} 1) \cdot (\sum_{j=i+1}^{n} 1) = (i-1)(n-i)$$

We notice that each shortest path of length $i$ (which is on $(i+1)$ vertices) can contain our chosen vertex as one of $(i-1)$ internal vertices. For such chosen vertex of a cycle it is enough to sum through all lengths of shortest paths up to diameter, i.e. the longest from the shortest paths, such that each length is multiplied by its length decreased by one. For exact computation we need to split the task according to the parity of $n$.

Because our definition uses oriented paths, we need to multiply the result by two.

We have $C_n$ for $n = 2k + 1$ and no shortest path is longer than $k$, as otherwise it is shorter to go through the other half of the cycle.
$$B(x) = 2 \cdot \sum_{i=2}^k (i-1) = 2 \cdot \sum_{i=1}^{k-1} i = 2 \cdot \frac{k(k-1)}{2} = k(k-1).$$ As $k = \frac{n-1}{2}$, we obtain $$ B(x) = \frac{n-1}{2} \cdot \Big(\frac{n-1}{2} - 1\Big) = \frac{n-1}{2} \cdot \Big(\frac{n-3}{2}\Big) = \frac{(n-1)(n-3)}{4}$$

We have $C_n$ for $n = 2k$. The longest of the shortest paths are of length $k$ and they have always two possibilities as to what half of the cycle they take. For this reason we will count them separately. Let $P$ be a contribution of these paths and $Q$ be contribution of all other paths. $$Q = 2 \cdot \sum_{i=2}^{k-1} (i-1) = 2 \cdot \sum_{i=1}^{k-2} i = 2 \cdot \frac{(k-2)(k-1)}{2} = (k-2)(k-1).$$ As there are $k-1$ paths of length $k$ ($2(k-1)$ if oriented) and each adds $\frac{1}{2}$, we have $$P = 2 \cdot (k-1) \cdot \frac{1}{2} = k-1.$$ Together, $$B(x) = P + Q = (k-1) + (k-1)(k-2) = (k-1)^2.$$ By plugging in $k = \frac{n}{2}$ we obtain $$B(x) = \Big(\frac{n}{2}-1\Big)^2 = \Big(\frac{n-2}{2}\Big)^2 = \frac{(n-2)^2}{4}.$$

The shortest path is always an edge, which does not contribute to any vertex and thus $B(x) = 0$ for all $x$.

As trees do not contain any cycles, the shortest path between any two vertices is unique. To determine $B(x)$ for $ x \in V(G)$, we are interested in $c_1, \dots, c_K$ which denote the sizes of connected components $C_1, \dots, C_k$ of $G \setminus x$. If we take any vertices $x \in C_i$ and $y \in C_j$, the unique shortest path between them passes through $x$ and so every such pair contributes one to $B(x)$. As a result, $$B(x) = \sum_{i,j \in K} c_i \cdot c_j.$$

Task 2: Bound the change of betweenness after

Solution:

We use bound $$\overline{C}_B(C_n + e) \leq \overline{C}_B(C_n) - \frac{2(d - 1)}{n},$$ where $d$ is the distance of vertices between whose we have added the edge. Clearly, betweenness shriks the most if we connect two vertices which are the furthest from each other and the least if we connect two vertices in distance two. Betweenness always decreases at least by the amount corresponding to $d=2$: $$\overline{C}_B(C_n + e) \leq \overline{C}_B(C_n) - \frac{2}{n}.$$

The maximal change in betweenness is $d=\frac{n}{2}$ for even $n$: $$\overline{C}_B(C_n + e) \leq \overline{C}_B(C_n) - \frac{2(\frac{n-2}{2})}{n} = \overline{C}_B(C_n) - \frac{n-2}{n} = \overline{C}_B(C_n) - 1 - \frac{2}{n}$$ and $d=\frac{n-1}{2}$ for $n$ odd: $$\overline{C}_B(C_n + e) \leq \overline{C}_B(C_n) - \frac{2(\frac{n-3}{2})}{n} = \overline{C}_B(C_n) - \frac{n-3}{n} = \overline{C}_B(C_n) - 1 - \frac{3}{n}.$$

Let us notice that the difference between maximal and minimal decrease in betweenness is one.

We use the bound $$\overline{C}_B(G') = \frac{n}{n + 1}\overline{C}_B(G) + \frac{2}{n + 1}\sum_{k=1}^{\epsilon(w)} k|\Gamma_k(w)|.$$ Notice that betweenness of each vertex in complete graph is zero, so first term is zero. Furthermore, eccentricity of each vertex in a complete graph is one, sum simplifies to only one term $1 \cdot |\Gamma(w)| = (n-1)$. We obtain $$\overline{C}_B(K_n + \text{leaf}) = 0 + \frac{2}{n+1}(n-1) = \frac{2(n-1)}{(n+1)} = 2 - \frac{4}{n+1}.$$