Tutorial 10: Graph properties and random network models

Recapitulation of the lectures

Graph properties and thershold functions

Graph property $\mathcal{P}$ is a subset of the set of all labelled graphs on the set $[n]$, i.e. $\mathcal{P} \subseteq 2^{{V \choose 2}}$.

Monotone (increasing) property is a property $\mathcal{P}$ for which holds $\forall e\in {n \choose 2} ~ G \in \mathcal{P} \Rightarrow G + e \in \mathcal{P}$.

Lemma: Let $\mathcal{P}$ be monotone increasing graph property and $0 \leq p_1 < p_2 \leq 1$. Moreover let $G_i = G(n,p_i)~i=1,2$. Then $$\mathbb{P}[G_1 \in \mathcal{P}] \leq \mathbb{P}[G_2 \in \mathcal{P}].$$

Lemma: Let $\mathcal{P}$ be a graph property and $p = \frac{m}{{n \choose 2}}$, where $m=m(n)\rightarrow\infty$ and ${n\choose 2} - m\rightarrow\infty$. Then for large $n$, $$\mathbb{P}[G(n,m) \in \mathcal{P}] \leq \sqrt{2\pi m}\mathbb{P}[G(n,p) \in \mathcal{P}].$$

Lemma: Let $\mathcal{P}$ be monotone increasing graph property and $p = \frac{m}{N}$, where $N = {n \choose 2}$. Then for large $n$ and $p$ s.t. $Np$, $N(1-p)/\sqrt{Np} \rightarrow \infty$ we have $$\mathbb{P}[G(n,m) \in \mathcal{P}] \leq 3\mathbb{P}[G(n,p) \in \mathcal{P}].$$

Function $p^* = p^*(n)$ is a threshold for monotone ingreasing graph property $\mathcal{P}$ in a random graph $G(n,p)$ if $$\lim_{n\rightarrow\infty}\mathbb{P}[G(n,p) \in \mathcal{P}] = \begin{cases} 0 & \text{ if } p/p^* \rightarrow 0\\ 1 & \text{ if } p/p^* \rightarrow \infty \end{cases}$$

Markov inequality Let $X$ be nonnegative random variable. Then for all $t > 0$ holds $$\mathbb{P}[X \geq t] \leq \frac{\mathbb{E}[X]}{t}.$$

First moment method Let $X$ be nonnegative random integer variable. Then $$\mathbb{P}[X > 0] \leq \mathbb{E}[X].$$

Chebyshev inequality Let $X$ be random variable with finite expectation and variance. Then for $t > 0$ holds $$\mathbb{P}[|X -\mathbb{E}[X]|\geq t] \leq \frac{\text{Var}(X)}{t^2}.$$

Second moment method Let $X$ be nonnegative integer random variable. Then $$\mathbb{P}[X = 0] \leq \frac{\text{Var} X}{(\mathbb{E}[X])^2} = \frac{\mathbb{E}[X^2]}{(\mathbb{E}[X])^2} - 1.$$

Lemma: Let $\mathcal{P}$ be graph property stating that a graph contains at least one edge. Then $$\lim_{n\rightarrow\infty}\mathbb{P}[G(n,p) \in \mathcal{P}] = \begin{cases} 0 & \text{ if } p \gg n^{-2}\\ 1 & \text{ if } p \ll n^{-2} \end{cases}$$

We say that sequence of events $\mathcal{X}_n$ occurs asymptotically almost surely if it occurs with probability $1 - o(1)$ or equivalently $$\lim_{n\rightarrow\infty}\mathbb{P}[\mathcal{X}_n] = 1.$$

Věta: If $m/n\rightarrow\infty$ then asymptotically almost surely $G(n,m)$ contains at least one triangle.

Community structure

A community is a set of vertices, which have greater probability of being connected between each other, then of being connected with other vertices in the network.

Let community $C$ be a set of vertices. For a vertex $v \in C$ we define internal degree $k_v^{\text{in}}(C)$ as $|\{w \in N(v)\ |\ w \in C\}|$ and external degree $k_v^{\text{out}}(C) = |\{w \in N(v)\ |\ w \notin C\}|$.

Total internal degree and total external degree are then defined as $k_C^{\text{in}} = \sum_{v_\in V(C)} k_v^{\text{in}}$ and $k_C^{\text{ext}} = \sum_{v_\in V(C)} k_v^{\text{ext}}$. Total degree of $C$ is $k_C = k_C^{\text{in}} + k_C^{\text{ext}}$. Community $C$ is strong, if $k_v^{\text{in}}(C) > k_v^{\text{ext}}(C)$ for all $v\in C$, or weak, if $k_C^{\text{in}} > k_C^{\text{ext}}$.

Density of community $C$ in graph $G$, where $|C| = n_c$ and $|G| = n$, can be defined as internal denoted by $$\rho_{\text{int}} = \frac{|E(C)|}{{n_c \choose 2}}$$ or external denoted by $$\rho_{\text{ext}} = \frac{|E(C,G\setminus C)|}{(n_c(n - n_c))}.$$ For $\rho = |E(G)| / {n \choose 2}$ average density of a graph and a community $C$ we want $$\rho_{\text{int}} \gg \rho \gg \rho_{\text{ext}}.$$ In algorithms we then maximize $\rho_{\text{int}} - \rho_{\text{ext}}$.

Other conditions on the structure of communities:

We typically want that the communities form decomposition of the graph, i.e. the communities are disjoint. In some cases there might be a reason to consider overlapping communities, in which case the membership in a community is given by weight.

$k$-clique is a maximal subgraph such that $\forall u,v,~ d_G(u,v) \leq k$. The shortest paths can lead outside of $C$ and so in some cases $\text{diam}(G[C]) > k$.

$k$-club is a $k$-clique with $\text{diam}(G[C]) \leq k$. Contrary to $k$-clique, $k$-club is always connected.

$k$-plex is a set of vertices, where each vertex is adjacent to all other from the set except for at most $k-1$, i.e. $\deg_{G[C]}(v) = |N[v] \cap C | \geq |C| - k$ for all $v\in C$.

$k$-PLEX is a problem, where for each given graph $G$ and numbers $c$ and $k$ we are searching for a $k$-plex of size $c$. For $k=1$ this problem contains finding a maximum clique, which means that this problem is NP-hard.

Věta (Balasundaram, Butenko, Hicks, Sachdeva 2011) $k$-PLEX is NP-complete for any fixed $k$.

This problem can be also formulated using integer programming as follows:

Let

For $k > 1$ we have program

where $x_i = 1$ reprezents that $i\in V$ is in a $k$-plex a otherwise $x_i = 0$. The middle condition assures that $i$ has at least $k - 1$ neighbours in the $k$-plex.

Task 1: We say that a graph has Eulerian trail, if there exists a trail visiting each edge of the given graph exactly once. A graph has Eulerian circuit if there is an Eulerian trail starting and ending in the same vertex. Show that the following graph classes are not monotone increasing:

Task 2: Let $X_1, X_2, \dots $ are nonnegative random integer variables. Prove the following:

Task 3: What is the smallest $k$ such that a

is covering all vertices of

Task 4: Find maximal

in the cycle on five vertices $C_5$.