Overlapping community detection with igraph or other libaries

十年热恋 提交于 2019-11-30 16:29:48

I have implemented the hierarchical link clustering algorithm of Ahn et al a while ago using the Python interface of igraph; see its source code here.

Also, implementing CFinder in Python using igraph is fairly easy; this is what I came up with:

#!/usr/bin/env python
from itertools import combinations

import igraph
import optparse

parser = optparse.OptionParser(usage="%prog [options] infile")
parser.add_option("-k", metavar="K", default=3, type=int,
        help="use a clique size of K")

options, args = parser.parse_args()

if not args:
    parser.error("Required input file as first argument")

k = options.k
g = igraph.load(args[0], format="ncol", directed=False)
cls = map(set, g.maximal_cliques(min=k))

edgelist = []
for i, j in combinations(range(len(cls)), 2):
    if len(cls[i].intersection(cls[j])) >= k-1:
        edgelist.append((i, j))

cg = igraph.Graph(edgelist, directed=False)
clusters = cg.clusters()
for cluster in clusters:
    members = set()
    for i in cluster:
        members.update(cls[i])
    print "\t".join(g.vs[members]["name"])

If you don't mind using another programming language, you have CFinder (Java), which is based on clique percolation (it basically looks for tightly connected cliques), OSLOM (C++), which optimizes a statistical measure, and certainly others.

Otherwise, if you want to stick to python, you could also apply the method by Evans & Lambiotte '09: 1) transform your graph into a line graph, 2) apply a regular community detection method on it, for example using igraph, and 3) obtain overlapping communities. Converting your graph into a line graph does not look too complicated, and should be fast, provided your original graph is not too large. It will be faster than performing the community detection itself, in any case.

Note there are alternatives methods to that of Evans & Lambiotte to get overlapping communities from a regular (mutually exclusive communities) method, such as those of Bennet et al. '12 or Wang et al. '09. However, implementing them seems less straightforward.

According to this blog, networkx can now compute for overlapping communities.

The code below is for clique percolation method and is available in Networkx 11.6. Github here

import networkx as nx
from itertools import combinations

def get_percolated_cliques(G, k):
perc_graph = nx.Graph()
cliques = list(frozenset(c) for c in nx.find_cliques(G) if len(c) >= k)
perc_graph.add_nodes_from(cliques)

# Add an edge in the clique graph for each pair of cliques that percolate
for c1, c2 in combinations(cliques, 2):
    if len(c1.intersection(c2)) >= (k - 1):
        perc_graph.add_edge(c1, c2)

for component in nx.connected_components(perc_graph):
    yield(frozenset.union(*component))

The CFinder implementing the Clique Percolation Method (CPM). If you use python, Networkx have already implement the same(see this link).

>>> G = nx.complete_graph(5)
>>> K5 = nx.convert_node_labels_to_integers(G,first_label=2)
>>> G.add_edges_from(K5.edges())
>>> c = list(nx.k_clique_communities(G, 4))
>>> list(c[0])
[0, 1, 2, 3, 4, 5, 6]
>>> list(nx.k_clique_communities(G, 6))

The Python networkx library now has more extensive community detection algorithms. The example given by Carla is now:

>>> from networkx.algorithms.community import k_clique_communities
>>> G = nx.complete_graph(5)
>>> K5 = nx.convert_node_labels_to_integers(G,first_label=2)
>>> G.add_edges_from(K5.edges())
>>> c = list(k_clique_communities(G, 4))
>>> list(c[0])
[0, 1, 2, 3, 4, 5, 6]
>>> list(k_clique_communities(G, 6))
[]

The communities documentation is here: https://networkx.github.io/documentation/latest/reference/algorithms/community.html

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!