Get node list from random walk in networkX

Question

I am new to networkX. I created a graph as follows:

G = nx.read_edgelist(filename,
                     nodetype=int,
                     delimiter=',',
                     data=(('weight', float),))

where the edge weights are positive, but do not sum to one.

Is there a built-in method that makes a random walk of k steps from a given node and returns the list of visited nodes? If not, what is the easiest way of doing it (nodes can repeat)?

Pseudo-code:

node = random
res = [node]
for i in range(0, k):
    read edge weights from this node
    an edge from this node has probability weight / sum_weights
    node = endpoint of an edge picked from this node
    res.append(node)

Answer 1:

The easiest way is to build the transition matrix T and run a plain Markovian random walk on it (in brief, the graph can be treated as a finite-state Markov chain).

Let A and D be the adjacency and degree matrices of a graph G, respectively. The transition matrix T is defined as T = D^(-1) A, so every row of T sums to one.
Let p^(0) be the state vector at the beginning of the walk (the i-th component is the probability of being at node i). Because T is row-stochastic, a column state vector is propagated with the transpose of T, written T' here: the first step is p^(1) = T' p^(0).
Iteratively, the state vector after k steps is p^(k) = T' p^(k-1).
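
As a quick check of these definitions, the sketch below builds T for a three-node path graph and propagates the state vector for two steps. The graph and the variable names are assumptions made for illustration. The state vector is written as a row vector multiplied on the left of T, which is the same as applying the transpose of T to a column vector.

```python
import networkx as nx
import numpy as np

# Small example graph (an assumption for illustration): path 0 - 1 - 2
G = nx.path_graph(3)

A = nx.to_numpy_array(G)        # adjacency matrix
D = np.diag(A.sum(axis=1))      # degree matrix
T = np.linalg.inv(D) @ A        # transition matrix, T = D^(-1) A

# Every row of T is a probability distribution over the next node
row_sums = T.sum(axis=1)

# Start at node 0 and propagate the distribution (row-vector convention)
p0 = np.array([1.0, 0.0, 0.0])
p1 = p0 @ T                     # distribution after one step
p2 = p1 @ T                     # distribution after two steps

# The same two-step result, computed in one shot from a matrix power
p2_direct = p0 @ np.linalg.matrix_power(T, 2)
```

Here p1 is [0, 1, 0] (from node 0 the walker must move to node 1) and p2 is [0.5, 0, 0.5] (from node 1 it moves to either end with equal probability).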

In plain NetworkX terms:

import networkx
import numpy
# let's generate a graph G
G = networkx.gnp_random_graph(5, 0.5)
# let networkx return the adjacency matrix A as a dense float array
# (adj_matrix is not available in recent NetworkX releases)
A = networkx.to_numpy_array(G, dtype=numpy.float64)
# let's evaluate the degree matrix D (row sums of A)
D = numpy.diag(numpy.sum(A, axis=1))
# ...and the transition matrix T = D^(-1) A
# (the inverse does not exist if G has an isolated node, i.e. a zero degree)
T = numpy.dot(numpy.linalg.inv(D), A)
# let's define the random walk length, say 10
walkLength = 10
# define the starting node, say the 0-th
p = numpy.array([1, 0, 0, 0, 0], dtype=numpy.float64).reshape(-1, 1)
visited = list()
for k in range(walkLength):
    # evaluate the next state vector; the rows of T sum to one,
    # so a column vector is propagated with the transpose of T
    p = numpy.dot(T.T, p)
    # record the most probable node at this step
    # (this is deterministic: no node is sampled)
    visited.append(int(numpy.argmax(p)))
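
To obtain an actual sampled walk that honours the edge weights from the question, one option is to draw each next node from the current node's neighbours in proportion to their weights. This is a minimal sketch, assuming a simple (non-multi) graph whose edges carry a `weight` attribute; `random_walk` and its parameters are hypothetical names, not a NetworkX API.

```python
import random
import networkx as nx

def random_walk(G, start, k, weight="weight", seed=None):
    """Return the nodes visited by a k-step weighted random walk.

    Hypothetical helper for illustration, not part of NetworkX.
    """
    rng = random.Random(seed)
    node = start
    walk = [node]
    for _ in range(k):
        neighbors = list(G[node])
        if not neighbors:
            break  # dead end: no edge to follow, so the walk stops early
        # Edges without the attribute count as weight 1.
        # random.choices normalises the weights, so they need not sum to one.
        weights = [G[node][nbr].get(weight, 1.0) for nbr in neighbors]
        node = rng.choices(neighbors, weights=weights, k=1)[0]
        walk.append(node)
    return walk

# Example graph (an assumption): positive weights that do not sum to one
G = nx.Graph()
G.add_weighted_edges_from([(0, 1, 2.0), (1, 2, 0.5), (2, 0, 3.0), (2, 3, 1.5)])
walk = random_walk(G, start=0, k=10, seed=42)
```

The returned list has k + 1 entries (the start node plus one per step) unless the walk reaches a node with no neighbours. Sampling directly from the adjacency structure avoids building a dense matrix, which matters for large graphs.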

Answer 2:

You can use the adjacency matrix: normalise it so that every row sums to 1, and each row becomes the probability distribution over the nodes the walker can move to from that node. You can also add a restart probability, with which the walker jumps to a random node instead of following an edge.

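import random
import networkx as nx
import numpy as np
g = nx.karate_club_graph()  # example graph (an assumption); use your own
restart_prob = 0.15  # example value (an assumption)
M = nx.to_numpy_array(g)  # obtain the adj. matrix for the graph as a dense array
# normalise every non-empty row of the adjacency matrix
for i in range(M.shape[0]):
    if np.sum(M[i]) > 0:
        M[i] = M[i] / np.sum(M[i])
n = M.shape[0]
current = random.randrange(n)  # choose the start node at random
p = random.random()  # random number in [0, 1)
if p < restart_prob:
    # do restart: jump to a uniformly random node
    current = random.randrange(n)
else:
    # choose next node using row `current` of M as probabilities
    current = int(np.random.choice(n, p=M[current]))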

In other words, pick a start node at random; then, at each step, move to a next node with probability 1 - restart_prob, or restart the walker with probability restart_prob.

To understand the algorithm better you can look at how PageRank works.
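
The restart scheme described above can be wrapped in a loop to produce a full node list. The sketch below is one possible reading, assuming that a restart means jumping to a uniformly random node and that a walker stuck on a node without edges also restarts; `random_walk_with_restart` and its parameters are hypothetical names, and the example graph and the value 0.15 are arbitrary choices.

```python
import networkx as nx
import numpy as np

def random_walk_with_restart(G, k, restart_prob=0.15, seed=None):
    """Return the nodes visited by a k-step random walk with restarts.

    Hypothetical helper for illustration, not part of NetworkX.
    """
    rng = np.random.default_rng(seed)
    nodes = list(G)
    M = nx.to_numpy_array(G, nodelist=nodes)

    # Normalise every non-empty row so it sums to 1; all-zero rows stay zero
    row_sums = M.sum(axis=1, keepdims=True)
    M = np.divide(M, row_sums, out=np.zeros_like(M), where=row_sums > 0)

    n = len(nodes)
    current = int(rng.integers(n))          # random start node
    walk = [nodes[current]]
    for _ in range(k):
        stuck = M[current].sum() == 0       # node with no edges to follow
        if stuck or rng.random() < restart_prob:
            current = int(rng.integers(n))  # restart at a random node
        else:
            current = int(rng.choice(n, p=M[current]))
        walk.append(nodes[current])
    return walk

G = nx.karate_club_graph()
walk = random_walk_with_restart(G, k=20, restart_prob=0.15, seed=0)
```

Passing `nodelist=nodes` keeps the matrix rows aligned with the node labels, so the function returns the graph's own node labels instead of matrix indices.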

Source: https://stackoverflow.com/questions/37311651/get-node-list-from-random-walk-in-networkx

Tags: python, graph, machine-learning, statistics, networkx