Question
I'm trying to get all profiles* that have a specific directed relation to a user's profile*. If one of those profiles has an alternate profile*, I also want that alternate profile, in case the user's alternate profile* has a relation to it. I also need the direction of the relations.
My problem is that with about 10,000 nodes it takes about 5 seconds to get the data. I have auto-indexing enabled on nodes and relationships.
This is how my nodes are related:
User-[:profile]->ProfileA-[:related]->ProfileB<-[?:me]->ProfileB2<-[?:related]-ProfileA2<-[:profile]-User
My query looks like this:
START User=node({source})
MATCH User-[:profile]->ProfileA-[rel:related]->ProfileB
WHERE User-->ProfileA-->ProfileB
WITH ProfileA, rel, ProfileB
MATCH ProfileB<-[?:me]->ProfileB2<-[relB?:related]-ProfileA2<-[:profile]-User
WHERE relB IS NULL OR User-->ProfileA-->ProfileB<-->ProfileB2<--ProfileA2<--User
RETURN ProfileB, COLLECT(ProfileB2), rel, relB
LIMIT 25
Any idea how I can optimize the query?
- profiles: ProfileB
- user's profile: ProfileA
- alternate profile: ProfileB2
- user's alternate profile: ProfileA2
Answer 1:
You're using WHERE clauses where you don't need to. Let's look at the first one, for example:
WHERE User-->ProfileA-->ProfileB
This clause says "restrict the results only to users that have a relationship to a ProfileA which itself has a relationship to a ProfileB". However, that is already guaranteed to be true by your match clause. You're wasting CPU cycles re-verifying something that is already true.
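To make the redundancy concrete, here is a minimal sketch in Python rather than Cypher, so that it can run without a database. It is a toy in-memory model, not how Neo4j actually evaluates queries; the edge list, the extra "friend" relationship and ProfileC, and the helper names (out_edges, match, where_path_exists) are all hypothetical. It only illustrates that every row produced by the typed MATCH pattern already satisfies the looser untyped WHERE pattern, so the filter removes nothing.

```python
# Toy graph: each edge is (start, relationship_type, end). Hypothetical data.
EDGES = [
    ("User", "profile", "ProfileA"),
    ("ProfileA", "related", "ProfileB"),
    ("ProfileA", "related", "ProfileC"),
    ("User", "friend", "Other"),
]


def out_edges(node, rel_type=None):
    """Return (type, end) pairs for edges leaving `node`, optionally filtered by type."""
    return [
        (t, end)
        for (start, t, end) in EDGES
        if start == node and (rel_type is None or t == rel_type)
    ]


def match(user):
    """Mimics: MATCH User-[:profile]->ProfileA-[rel:related]->ProfileB"""
    rows = []
    for _, profile_a in out_edges(user, "profile"):
        for _, profile_b in out_edges(profile_a, "related"):
            rows.append((user, profile_a, profile_b))
    return rows


def where_path_exists(row):
    """Mimics: WHERE User-->ProfileA-->ProfileB (any relationship type)."""
    user, profile_a, profile_b = row
    return any(end == profile_a for _, end in out_edges(user)) and any(
        end == profile_b for _, end in out_edges(profile_a)
    )


matched = match("User")
filtered = [row for row in matched if where_path_exists(row)]

# The WHERE step re-checks every row but never drops one.
print(matched == filtered)
```

In this model the filter costs one extra pass over the matched rows and changes nothing, which is the same kind of wasted work described above.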
WITH ProfileA, rel, ProfileB
You aren't doing any sort of aggregation, calculation or reassignment, so there is no need for this WITH clause. You can continue on without it.
WHERE relB IS NULL OR User-->ProfileA-->ProfileB<-->ProfileB2<--ProfileA2<--User
Again, you're not getting any value out of this WHERE clause. This one says "restrict the results to paths where a relB wasn't found OR where one was found with the following path..." and then you list the exact same path that was in your MATCH.
So, remove all those extraneous clauses and you get this:
START User=node({source})
MATCH User-[:profile]->ProfileA-[rel:related]->ProfileB<-[?:me]->ProfileB2<-[relB?:related]-ProfileA2<-[:profile]-User
RETURN ProfileB, COLLECT(ProfileB2), rel, relB
LIMIT 25
Try that and see whether the performance is any better. If it's not enough, you may need to add more information to your question. For my part, I don't fully understand what your relationships actually mean (for example, what is the "me" relationship, and what does it symbolize?).
Answer 2:
This is how I solved it:
START User=node({source})
MATCH User-[:profile]->ProfileA-[rel:related]->ProfileB<-[?:me]->ProfileB2-[relB?:related]-ProfileA2
WHERE relB IS NULL OR User-[:profile]->ProfileA2
RETURN ProfileB, COLLECT(ProfileB2), rel, relB
LIMIT 25
The ProfileA2<-[:profile]-User part of the pattern seemed to produce an endless loop.
Recommendations are still welcome.
Source: https://stackoverflow.com/questions/15146883/optimize-neo4j-cypher-query