Question
My data is modeled this way:
I have nodes for soccer players, goals and matches. A player is connected to a match by a [:played] relationship, and to a goal by a [:scored] relationship. A goal is connected to a match by a [:scoredIn] relationship. Each match has a date property of type Long.
I'm trying to find the players who scored the most goals in their last 5 matches, where "last 5" means the first 5 matches when sorted by match.date in descending order.
What would be the most efficient way? I can go over every player's matches sorted DESC, collect the match ids, and then match the (player)-[:scored]->(goal)-[:scoredIn]->(m) pattern against those matches, but this is very slow.
It seems like I'm missing something. How can I find these patterns with a per-player sort?
Thanks
Answer 1:
It may help to PROFILE or EXPLAIN your query and paste the resulting plan into your description (with all elements expanded).
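For example, prefixing a query with PROFILE runs it and reports rows and db hits per operator, while EXPLAIN only shows the plan without running it. A minimal sketch, assuming the Player label and the relationship types from the question (the query body is just a placeholder for your own):

```cypher
// PROFILE executes the query and annotates each operator in the plan.
// Swap PROFILE for EXPLAIN to see the plan without executing anything.
PROFILE
MATCH (p:Player)-[:scored]->(g)-[:scoredIn]->(m)
RETURN p, count(g) AS goals
ORDER BY goals DESC
LIMIT 10
```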
Taking a stab at it, it sounds like we're looking for the most traversal-friendly query. A single player may have scored a great many goals, and expanding every one of those goals and then hash-joining them to the goals scored in a given game sounds like it could be very expensive.
If instead we get a player's last 5 games, and then get the goals scored in the game and filter those based on the player, that may be more efficient.
Something like:
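// Take each player's matches, newest first
MATCH (p:Player)-[:played]->(m)
WITH p, m
ORDER BY m.date DESC
// Keep only the 5 most recent matches per player
WITH p, COLLECT(m)[..5] AS matches
UNWIND matches AS match
// Goals scored in those matches (relationship types are case-sensitive,
// so this uses the [:scoredIn] spelling from the question)
MATCH (match)<-[:scoredIn]-(g)
// Keep only the goals scored by this player
WHERE (g)<-[:scored]-(p)
RETURN p, COUNT(g) AS goalsInLast5
ORDER BY goalsInLast5 DESC
LIMIT 10 // or whichever top n you want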
With a large graph, this may still be an expensive query.
As a small improvement, you may want to consider storing one node per player per match that holds the number of goals scored, instead of a separate goal node for each goal.
Something like:
(player)-[:scored]->(goals:Goals{goals:4})-[:scoredIn]->(match)
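With that model, the earlier query could be adapted roughly as follows. This is only a sketch: it assumes the Goals label and numeric goals property from the pattern above, and it uses the [:scoredIn] spelling from the question. Summing the stored counts replaces counting individual goal nodes, so there is at most one node to expand per player per match.

```cypher
// Each player's 5 most recent matches, newest first
MATCH (p:Player)-[:played]->(m)
WITH p, m
ORDER BY m.date DESC
WITH p, collect(m)[..5] AS matches
UNWIND matches AS match
// At most one Goals node per player per match (assumed model);
// OPTIONAL MATCH keeps matches in which the player did not score
OPTIONAL MATCH (p)-[:scored]->(g:Goals)-[:scoredIn]->(match)
// sum() ignores nulls, so goalless matches contribute nothing
RETURN p, sum(g.goals) AS goalsInLast5
ORDER BY goalsInLast5 DESC
LIMIT 10
```

The trade-off is that per-goal details (minute, assist and so on) would have to live elsewhere or be dropped.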
Source: https://stackoverflow.com/questions/42372580/find-top-sort-patterns-for-each-node