Co-occurrence matrix from nested list of words

无人及你 2020-11-30 10:27

I have a list of names like:

names = ['A', 'B', 'C', 'D']

and a list of documents, where in each document some of these names are m…

8 answers
  •  死守一世寂寞
    2020-11-30 10:44

    We can hugely simplify this using NetworkX. Here, names are the nodes we want to consider, and the lists in document contain the nodes to connect.

    We can connect the nodes in each sublist by taking their length-2 combinations, and build a MultiGraph so that a pair appearing in several sublists is kept as parallel edges and counted as a co-occurrence each time:

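    import networkx as nx
    from itertools import combinations
    
    # Each length-2 combination within a sublist becomes one edge; the
    # MultiGraph keeps parallel edges, so repeated pairs add up.
    G = nx.from_edgelist((c for n_nodes in document for c in combinations(n_nodes, r=2)),
                         create_using=nx.MultiGraph)
    # Restrict the adjacency matrix to the names of interest.
    nx.to_pandas_adjacency(G, nodelist=names, dtype='int')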
    
       A  B  C  D
    A  0  2  1  1
    B  2  0  2  1
    C  1  2  0  1
    D  1  1  1  0
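
To see what the graph is counting, here is a rough standard-library sketch of the same pair counting, without NetworkX. The `document` value and the helper name `cooccurrence` are assumptions made up for illustration; they are not taken from the question. Unlike the MultiGraph version, this sketch ignores a name that is repeated inside a single sublist.

```python
from collections import Counter
from itertools import combinations

names = ['A', 'B', 'C', 'D']
# Hypothetical sample data, assumed for illustration only.
document = [['A', 'B'], ['C', 'B', 'K'], ['A', 'B', 'C', 'D', 'Z']]


def cooccurrence(names, document):
    """Return a nested-list co-occurrence matrix ordered like `names`."""
    allowed = set(names)
    counts = Counter()
    for doc in document:
        # Keep only the names of interest, once per sublist.
        present = sorted(set(doc) & allowed)
        for a, b in combinations(present, 2):
            # Record the pair in both directions so the matrix is symmetric.
            counts[(a, b)] += 1
            counts[(b, a)] += 1
    return [[counts[(row, col)] for col in names] for row in names]


matrix = cooccurrence(names, document)
for name, row in zip(names, matrix):
    print(name, row)
```

With this assumed sample, the counts match the adjacency table shown above.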
    
