Neo4j Cypher query performance optimization

試著忘記壹切 提交于 2020-01-04 14:02:31

问题


I have the following Neo4j Cypher query

MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision) 
WHERE dg.id = 1 
MATCH (childD)-[relationshipValueRel4:HAS_VALUE_ON]-(filterCharacteristic4:Characteristic) 
WHERE filterCharacteristic4.id = 4 
WITH relationshipValueRel4, childD, dg 
WHERE  (ANY (id IN [2,3] 
WHERE id IN relationshipValueRel4.optionIds ))  
WITH childD, dg  
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion) 
WHERE c.id IN [2, 3] 
WITH childD, dg, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes 
WITH childD , dg , toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes  
ORDER BY  weight DESC 
SKIP 0 LIMIT 10 
WITH * MATCH (childD)-[ru:CREATED_BY]->(u:User) OPTIONAL MATCH (childD)-[rup:UPDATED_BY]->(up:User)  
RETURN ru, u, rup, up, childD AS decision, weight, totalVotes, 
[ (dg)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD) | {entityId: toInt(entity.id),  types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups, 
[ (dg)<-[:DEFINED_BY]-(c1)<-[vg1:HAS_VOTE_ON]-(childD) | {criterionId: toInt(c1.id),  weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria, [ (dg)<-[:DEFINED_BY]-(ch1:Characteristic)<-[v1:HAS_VALUE_ON]-(childD)  WHERE  NOT ((ch1)<-[:DEPENDS_ON]-())  | {characteristicId: toInt(ch1.id),  optionIds: v1.optionIds, valueIds: v1.valueIds, value: v1.value, available: v1.available, totalHistoryValues: v1.totalHistoryValues, totalFlags: v1.totalFlags, description: v1.description, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics

I'm not sutisfied with the performance of this query execution.

This is PROFILE output:

Cypher version: CYPHER 3.3, planner: COST, runtime: INTERPRETED. 3296130 total db hits in 2936 ms

Is there any chance to optimize this query performance ?


回答1:


It will be a little hard to optimize this query without a dataset, knowledge of you graph and what you are searching to do.

Performances depend on :

  1. Query itself
  2. Schema (index & constrainsts)
  3. Graph modeling
  4. Neo4j configuration
  5. Hardware

There is no big problem on your query, even if it can be written into a more readable state for me (ex: one big match, sugar syntax on where clause in the match, replace the any by an or, ...) , but it will not change the query plan.

Be sure to use query parameters with this query to avoid to recalculate the query plan of this long query everytimes.

Your query pass most of its times into (childD)-[relationshipValueRel4:HAS_VALUE_ON]-(:Characteristic) + the where clause on it (ie. 1.5M * 2 dbhits). So a solution can be to change the model by creating some relationships like that : HAS_VALUE_ON_WITH_OPTID_1, HAS_VALUE_ON_WITH_OPTID_2 ...



来源:https://stackoverflow.com/questions/47748573/neo4j-cypher-query-performance-optimization

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!