I\'m looking to find a way to in real-time find the shortest path between nodes in a huge graph. It has hundreds of thousands of vertices and millions of edges. I know this
For large graphs, try the Python interface of igraph. Its core is implemented in C, therefore it can cope with graphs with millions of vertices and edges relatively easily. It contains a BFS implementation (among other algorithms) and it also includes Dijkstra's algorithm and the Bellman-Ford algorithm for weighted graphs.
As for "realtimeness", I made some quick tests as well:
from igraph import *
from random import randint
import time
def test_shortest_path(graph, tries=1000):
t1 = time.time()
for _ in xrange(tries):
v1 = randint(0, graph.vcount()-1)
v2 = randint(0, graph.vcount()-1)
sp = graph.get_shortest_paths(v1, v2)
t2 = time.time()
return (t2-t1)/tries
>>> print test_shortest_path(Graph.Barabasi(100000, 100))
0.010035698396
>>> print test_shortest_path(Graph.GRG(1000000, 0.002))
0.413572219742
According to the code snippet above, finding a shortest path between two given vertices in a small-world graph having 100K vertices and 10M edges (10M = 100K * 100) takes about 0.01003 seconds on average (averaged from 1000 tries). This was the first test case and it is a reasonable estimate if you are working with social network data or some other network where the diameter is known to be small compared to the size of the network. The second test is a geometric random graph where 1 million points are dropped randomly on a 2D plane and two points are connected if their distance is less than 0.002, resulting in a graph with about 1M vertices and 6.5M edges. In this case, the shortest path calculation takes longer (as the paths themselves are longer), but it is still pretty close to real-time: 0.41357 seconds on average.
Disclaimer: I am one of the authors of igraph.