Community detection in a network in a nutshell

Wencao Yang
4 min readDec 31, 2021

--

The network is a branch of the complex system. Let us have a look at the basics of a network first.

  1. What is a network?

A network is made of node and link, or vertex and edge, e.g., social network, internet, power grid. it can be classified based on if the link has direction or weight

2. Network representation

an adjacency matrix is a simple way to represent a network, and most adjacency matrix of real-world example is the sparse matrix:

Aij = 1 if there is a link pointing from node j to node i

Aij = 0 if nodes i and j are not connected

3. The properties of a network

a) degree distribution. see the following example, where Pk is the ratio of the number of nodes with k links over the total node number

Fig1, from Network Science by Albert-László Barabási

another example, where c is a log-log plot

Fig2, from Network Science by Albert-László Barabási

b) clustering coefficient, a variable on a node and its neighborhood, defined like this:

Where Li is the total links between all the neighbors of node I, Ki is the degree of node i.

C of a graphic the average of Ci, see the following examples:

Fig3, from Network Science by Albert-László Barabási

Besides these two properties, there is something else, like the ‘hub,’ ‘giant component,’ and so on.

3. Random network and real word network
the random network was proposed by Pál Erdős (random network also be called Pál Erdős network). The model says a random network G (N, L) has N node and L link, and nodes are connected with a probability P.
Easy to prove, Pk of a random network follows a binomial distribution.

when N is huge, it becomes Poisson distribution

However, as you can see from the table, a real-world network does not follow a Poisson distribution.

Table1, From Network Science by Albert-László Barabási

Instead, most networks have a power law.

4. What else can we get from a network?

Community detection!

Some nodes have more connections in a network, and just like a sub-group, we usually say it is a community. A natural question is how to find the community from a network, precisely what community detection does.

2003, MJ Newman published a paper, Fast algorithm for detecting community structure in networks, which used modularity to partition/cut a network

modularity is defined as follows, quoted from the article.

with modularity Q defined, we can have a greedy procedure like the following

Initially, every single node is a community. For every step, we combine a pair of communities that maximize the Q, and finally, we look back, finding out the community structure with a max Q

Newman’s greedy algorithm is commonly used in network libs like networkx and igraph.

5. Can we do better?

Yes, stochastic block model!

Stochastic blockmodels and community structure in networks, PHYSICAL REVIEW E 83, 016107 (2011)

A more modern method for community detection.

It has a likelihood function.

We need to know how many communities there are first, unfortunately. Let’s assume n community. mrs is the link number between community r and s,kr and ks are the degrees of r and s,g is community assignment, G is the graph. We randomly assign nodes into n communities and then swap nodes to maximize L (G | g).

--

--

Wencao Yang
Wencao Yang

Written by Wencao Yang

Data Scientist & Physics PhD

No responses yet