Big graph in memory

Posted by 你。 on 2020-01-26 04:33:09

Question


I want to record all ports used within huge pcaps. There are 65535 ports available, and each port can talk to every other port: 65535 x 65535 links in total.

The matrix will be very sparse (many 0 entries). Additionally, I think the edges don't have to be directed, so Port1->Port2 can be merged with Port2->Port1 (which reduces the number of values to 65535 * 65536 / 2). How would you store this in Python? In NumPy? What is the estimated memory consumption?

Afterwards, I want to find the port with the greatest sum and pop() it (the whole row and column). For example, I want to find that Port1 was used 500 times (100 times from Port2 to Port1, 300 times from Port3 to Port1, 100 times from Port4 to Port1)...

In graph terms, I want 65535 nodes that can all be connected to each other. Then I want to find the node with the highest sum of values on its connected edges. Afterwards, I want to pop that node (and delete the corresponding edges, which will decrease the sums of other nodes).

Thanks!


Answer 1:


In Python, and depending on how sparse is sparse, a dict-of-dicts will handle this quite well.

connections = { ..., 8080: { 4545:17, 20151:3, ...}, ...}

If I have understood what you are doing correctly, then the count of connections to port p is

count = sum( connections[p].values() )

and removing port p is

del connections[p]
for conn in connections.values():
    conn.pop(p, None)  # drop the edge to p if this port has one
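Putting the pieces above together, here is a minimal sketch of the whole workflow. It assumes every link is stored in both directions (so each port's inner dict lists all of its neighbours), and the function names `build_connections` and `pop_busiest` are made up for illustration. Because the storage is symmetric, removing a port only needs to visit that port's own neighbours rather than every inner dict.

```python
from collections import defaultdict


def build_connections(pairs):
    """Count undirected links from an iterable of (src_port, dst_port) pairs."""
    connections = defaultdict(lambda: defaultdict(int))
    for src, dst in pairs:
        connections[src][dst] += 1
        if src != dst:  # a port talking to itself is counted only once
            connections[dst][src] += 1
    return connections


def pop_busiest(connections):
    """Remove the port with the largest edge sum and return (port, total).

    Raises ValueError if there are no ports left.
    """
    busiest = max(connections, key=lambda port: sum(connections[port].values()))
    neighbours = connections.pop(busiest)
    total = sum(neighbours.values())
    for other in neighbours:
        if other == busiest:
            continue  # self-link was removed together with the port
        del connections[other][busiest]
        if not connections[other]:
            del connections[other]  # drop ports that have no links left
    return busiest, total
```

With the numbers from the question (100 packets between Port2 and Port1, 300 between Port3 and Port1, 100 between Port4 and Port1), `pop_busiest` would return `(1, 500)` and the sums of ports 2, 3 and 4 would shrink accordingly.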

If you want to try to save memory by storing only half the pairs, then simplicity suffers greatly.
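To illustrate that trade-off, here is a rough sketch of the "half the pairs" variant, assuming each unordered pair is stored once under a sorted `(low, high)` key in a `Counter`. The helper names are hypothetical. Memory per link is roughly halved, but both the per-port sum and the removal now need a scan over all stored links (or an extra index, which gives the memory back).

```python
from collections import Counter


def edge_key(a, b):
    """Canonical key for an undirected link: smaller port first."""
    return (a, b) if a <= b else (b, a)


def count_links(pairs):
    """Count each unordered port pair once, whatever the direction."""
    links = Counter()
    for src, dst in pairs:
        links[edge_key(src, dst)] += 1
    return links


def port_total(links, port):
    """Sum of all link counts touching `port`; scans every stored link."""
    return sum(n for key, n in links.items() if port in key)


def remove_port(links, port):
    """Delete every link touching `port`; also a full scan."""
    for key in [k for k in links if port in k]:
        del links[key]
```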




Answer 2:


Look into the adjacency list representation of a graph; it will most probably suit your needs.

However, a graph containing 65535 vertices is not that big, even if you cannot represent it with a simple dense matrix.

The memory consumption is O(E+V), where V is the number of vertices (65535) and E is the number of edges (in a sparse graph, E has the same order of magnitude as V).
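As a back-of-the-envelope check of these sizes, the arithmetic can be sketched as below. The 4-byte counter (uint32) and the packed edge-list layout (two 2-byte port numbers plus one counter per edge) are assumptions made for this estimate, not something stated in the thread, and the function names are made up. A Python dict-of-dicts will use considerably more per entry than this packed figure.

```python
N_PORTS = 65535        # figure used in the question
BYTES_PER_COUNT = 4    # assumption: one uint32 counter per matrix cell / edge


def dense_bytes(n=N_PORTS, item=BYTES_PER_COUNT):
    """Full n x n matrix of counters."""
    return n * n * item


def triangular_bytes(n=N_PORTS, item=BYTES_PER_COUNT):
    """Only one half of the symmetric matrix, diagonal included."""
    return n * (n + 1) // 2 * item


def edge_list_bytes(n_edges, item=BYTES_PER_COUNT, port_bytes=2):
    """Packed edge list: two port numbers and one counter per stored edge."""
    return n_edges * (2 * port_bytes + item)


if __name__ == "__main__":
    gib = 2 ** 30
    print(f"dense matrix:      {dense_bytes() / gib:.1f} GiB")
    print(f"triangular matrix: {triangular_bytes() / gib:.1f} GiB")
    # 1,000,000 distinct links is a made-up figure for illustration
    print(f"1M-edge list:      {edge_list_bytes(1_000_000) / 2 ** 20:.1f} MiB")
```

Under these assumptions the dense matrix needs about 16 GiB and the triangular half about 8 GiB, while a sparse representation grows only with the number of port pairs actually seen.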



Source: https://stackoverflow.com/questions/37271660/big-graph-in-memory
