Neo4j Cypher query performance optimization

Question

I have the following Neo4j Cypher query:

MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision) 
WHERE dg.id = 1 
MATCH (childD)-[relationshipValueRel4:HAS_VALUE_ON]-(filterCharacteristic4:Characteristic) 
WHERE filterCharacteristic4.id = 4 
WITH relationshipValueRel4, childD, dg 
WHERE  (ANY (id IN [2,3] 
WHERE id IN relationshipValueRel4.optionIds ))  
WITH childD, dg  
OPTIONAL MATCH (childD)-[vg:HAS_VOTE_ON]->(c:Criterion) 
WHERE c.id IN [2, 3] 
WITH childD, dg, vg.avgVotesWeight as weight, vg.totalVotes as totalVotes 
WITH childD , dg , toFloat(sum(weight)) as weight, toInt(sum(totalVotes)) as totalVotes  
ORDER BY  weight DESC 
SKIP 0 LIMIT 10 
WITH * 
MATCH (childD)-[ru:CREATED_BY]->(u:User) 
OPTIONAL MATCH (childD)-[rup:UPDATED_BY]->(up:User) 
RETURN ru, u, rup, up, childD AS decision, weight, totalVotes, 
[ (dg)<-[:DEFINED_BY]-(entity)<-[:COMMENTED_ON]-(comg:CommentGroup)-[:COMMENTED_FOR]->(childD) 
  | {entityId: toInt(entity.id), types: labels(entity), totalComments: toInt(comg.totalComments)} ] AS commentGroups, 
[ (dg)<-[:DEFINED_BY]-(c1)<-[vg1:HAS_VOTE_ON]-(childD) 
  | {criterionId: toInt(c1.id), weight: vg1.avgVotesWeight, totalVotes: toInt(vg1.totalVotes)} ] AS weightedCriteria, 
[ (dg)<-[:DEFINED_BY]-(ch1:Characteristic)<-[v1:HAS_VALUE_ON]-(childD) 
  WHERE NOT ((ch1)<-[:DEPENDS_ON]-()) 
  | {characteristicId: toInt(ch1.id), optionIds: v1.optionIds, valueIds: v1.valueIds, value: v1.value, available: v1.available, totalHistoryValues: v1.totalHistoryValues, totalFlags: v1.totalFlags, description: v1.description, valueType: ch1.valueType, visualMode: ch1.visualMode} ] AS valuedCharacteristics

I'm not satisfied with the execution performance of this query.

This is the PROFILE output:

Cypher version: CYPHER 3.3, planner: COST, runtime: INTERPRETED. 3296130 total db hits in 2936 ms

Is there any way to optimize the performance of this query?

Answer 1:

It will be a little hard to optimize this query without a dataset, knowledge of your graph, and what you are trying to do.

Performance depends on:

Query itself
Schema (indexes & constraints)
Graph modeling
Neo4j configuration
Hardware
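
For the schema item, here is a minimal sketch of what to check. It assumes that id is the lookup property on DecisionGroup, Characteristic and Criterion and that it is unique per label; the question does not say whether such constraints already exist. The syntax is the Cypher 3.x form, matching the version shown in the PROFILE output, and each statement is meant to be run on its own:

```cypher
// Assumption: id uniquely identifies the nodes of each label.
// A uniqueness constraint also creates the index used for the lookup.
CREATE CONSTRAINT ON (dg:DecisionGroup) ASSERT dg.id IS UNIQUE;
CREATE CONSTRAINT ON (ch:Characteristic) ASSERT ch.id IS UNIQUE;
CREATE CONSTRAINT ON (c:Criterion) ASSERT c.id IS UNIQUE;

// If id is not unique for a label, a plain index is the alternative:
// CREATE INDEX ON :DecisionGroup(id);

// List what is already defined before adding anything.
CALL db.indexes();
```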

There is no big problem with your query. I would find it more readable written differently (e.g. one big MATCH, inline property syntax in the MATCH instead of separate WHERE clauses, replacing the ANY with an OR, ...), but that will not change the query plan.
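
As an illustration of that more readable form, the beginning of the query could be sketched as follows. The variable name rel is a shortened stand-in for relationshipValueRel4, and the trailing RETURN is only there so the fragment runs on its own:

```cypher
// Replaces the first two MATCH clauses and the ANY(...) filter of the original query.
MATCH (dg:DecisionGroup {id: 1})-[:CONTAINS]->(childD:Decision)
      -[rel:HAS_VALUE_ON]-(:Characteristic {id: 4})
WHERE 2 IN rel.optionIds OR 3 IN rel.optionIds
// In the full query this line is `WITH childD, dg`, followed by the
// unchanged clauses from OPTIONAL MATCH onward.
RETURN childD, dg
```

Comparing the PROFILE output of both forms is the way to confirm that the plan is indeed the same.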

Be sure to use query parameters with this query, so that the plan of this long query is not recomputed every time it runs.
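
A sketch of the filtering part with parameters in place of the literals. The parameter names ($dgId, $characteristicId, $optionIds, $skip, $limit) are hypothetical and would be supplied by the client, for instance as {dgId: 1, characteristicId: 4, optionIds: [2, 3], skip: 0, limit: 10}:

```cypher
// Hypothetical parameter names; the values come from the client, not the query text.
MATCH (dg:DecisionGroup)-[:CONTAINS]->(childD:Decision)
WHERE dg.id = $dgId
MATCH (childD)-[rel:HAS_VALUE_ON]-(ch:Characteristic)
WHERE ch.id = $characteristicId
  AND ANY (id IN $optionIds WHERE id IN rel.optionIds)
RETURN childD, dg
// SKIP and LIMIT accept parameters as well.
SKIP $skip LIMIT $limit
```

Because the query text stays identical from one call to the next, the cached plan can be reused. The ANY form is kept here because the length of $optionIds can vary between calls.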

Your query spends most of its time in (childD)-[relationshipValueRel4:HAS_VALUE_ON]-(:Characteristic) plus the WHERE clause on it (i.e. 1.5M * 2 db hits). So one solution could be to change the model by creating relationships such as HAS_VALUE_ON_WITH_OPTID_1, HAS_VALUE_ON_WITH_OPTID_2, ...
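
A rough sketch of that model change, hard-coded for option ids 2 and 3. It assumes that HAS_VALUE_ON points from the Decision to the Characteristic, as the pattern comprehensions in the question suggest, and that the two statements are run separately:

```cypher
// One-off migration for option id 2; repeat with the matching type name
// for every other option id.
MATCH (d:Decision)-[r:HAS_VALUE_ON]->(ch:Characteristic)
WHERE 2 IN r.optionIds
MERGE (d)-[:HAS_VALUE_ON_WITH_OPTID_2]->(ch);

// Filter on the relationship type instead of reading optionIds on every relationship.
// DISTINCT is needed because a decision holding both options would otherwise
// be returned once per matching relationship.
MATCH (dg:DecisionGroup {id: 1})-[:CONTAINS]->(childD:Decision)
      -[:HAS_VALUE_ON_WITH_OPTID_2|:HAS_VALUE_ON_WITH_OPTID_3]->(:Characteristic {id: 4})
RETURN DISTINCT childD, dg
```

The trade-off is that these extra relationships have to be kept in sync with optionIds on every write, and the gain should be checked with PROFILE on real data before adopting the new model.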

Source: https://stackoverflow.com/questions/47748573/neo4j-cypher-query-performance-optimization

Tags

neo4j

cypher