Graph database performance

Posted by 独自空忆成欢 on 2019-12-08 08:56:38

Question


I was reading a book recommended on the Neo4j site (http://neo4j.com/books/graph-databases/) about graph database performance, and it said:

"In contrast to relational databases, where join-intensive query performance deteriorates as the dataset gets bigger, with a graph database performance tends to remain relatively constant, even as the dataset grows. This is because queries are localized to a portion of the graph. As a result, the execution time for each query is proportional only to the size of the part of the graph traversed to satisfy that query, rather than the size of the overall graph."

So, for example, say I want to return only nodes with the label "Doctor"; that's localized to a portion of the graph. But my question is: how does the database itself know where those nodes are? In other words, doesn't it need to traverse all nodes to find out whether they satisfy the query, and make its decision based on that?


Answer 1:


Neo4j maintains a dedicated index for node labels, so it can find all nodes with a given label without scanning every node. Beyond that you can:

  • Create your own indexes based on node properties (either schema indexes or legacy indexes) in order to find nodes as starting points
  • Query by node IDs to find a starting point (though I'd suggest using your own property with an index if you need to identify nodes more permanently)
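As a minimal sketch of the options above, assuming Doctor nodes carry a `name` property (the value "Alice" is purely hypothetical): the `CREATE INDEX ON` form is the schema-index syntax of the Neo4j 2.x/3.x releases current when this was answered; newer releases use `CREATE INDEX FOR (d:Doctor) ON (d.name)` instead.

```cypher
// Schema index on a node property (Neo4j 2.x/3.x syntax).
CREATE INDEX ON :Doctor(name);

// Label-only lookup: answered from the label index,
// so nodes without the Doctor label are never visited.
MATCH (d:Doctor)
RETURN d.name;

// Label + property lookup: can use the schema index above
// to jump straight to the matching starting node(s).
MATCH (d:Doctor {name: "Alice"})
RETURN d;
```

Prefixing a query with `PROFILE` shows which operator the planner actually picked, for example a label scan versus an index seek.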



Answer 2:


In general, a localized search means you start from a smallish set of starting points, which can be people, products, places, orders, etc.

A portion of the graph that is annotated with a label often doesn't fall into that category, i.e. the set of all doctors is not a smallish set of starting points.

Your query would probably touch a large portion of the graph if you traversed out from all doctors to their neighborhoods.

A query like this would be a graph local one:

MATCH (:City {name:"SFO"})<-[:RESIDES_IN]-(d:Doctor)-[presc:PRESCRIBES]->(m:Medicine)
RETURN d.name, m.name, sum(presc.amount) as amount
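
For contrast, here is a rough sketch of the non-local counterpart, which starts from every Doctor node rather than from a single city. The index on `:City(name)` is an assumption about how the starting point of the local query above would be located quickly; it uses the Neo4j 2.x/3.x syntax.

```cypher
// Assumed schema index so the single City starting node
// of the graph-local query can be found without a scan.
CREATE INDEX ON :City(name);

// Non-local variant: every Doctor node is a starting point,
// so the work grows with the total number of doctors.
MATCH (d:Doctor)-[presc:PRESCRIBES]->(m:Medicine)
RETURN d.name, m.name, sum(presc.amount) AS amount;
```

Running both variants with `PROFILE` should make the difference in touched nodes and relationships visible.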


Source: https://stackoverflow.com/questions/29629903/graph-database-performance
