Efficient way to implement LinkedIn like “How you are connected to” feature?

北城余情 提交于 2019-12-03 04:38:26

问题


LinkedIn has this cool feature in which while visiting some user's profile, LinkedIn prompts how you are connecting to that user through the network.

Assuming that the visitor and the profile owner are two nodes of a graph where the nodes represent users and edge represents friendship, a simple solution could be a bfs starting from both the nodes up to certain level and see if there are any intersections. The intersections would be the network link-nodes.

Although this sounds neat, the problem is that in order to determine friends of each person, a separate DB query is needed. When the network goes deeper than 2 levels, it would be highly time consuming algorithm. Is there a better efficient alternative? If not, how can we add better hardware support (parallel computing, grids, distributed database etc) in order to bring down the time required for computation?


回答1:


You can see how this can be done in the article Graphs in the database: SQL meets social networks by Lorenzo Alberton. The example code is written for PostgreSQL using CTEs. However, I doubt that using a RDBMS for this will perform well. I wrote up an article on how to do the same stuff as in the mentioned article using a native graph database, in this case Neo4j: Social networks in the database: using a graph database. Other than the differences in performance, a graph database also simplifies the task by providing a graph API that makes it easy to handle traversals that would be extremely complex to write in SQL (or by using stored procedures). I wrote a bit more on graph databases in this thread and see this one too.




回答2:


Without some kind of recursive stored procedure (CTE in SQL Server 2005+), you'll need multiple round trips as the levels get deeper. However, a good cache infrastructure could really help performance as the most popular / active users' connection lists would remain cached. A read/write through cache mechanism would make things even better (cache updates cascade to db updates, cache reads cascade to db reads)



来源:https://stackoverflow.com/questions/1558290/efficient-way-to-implement-linkedin-like-how-you-are-connected-to-feature

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!