Find cluster in Neo4j

删除回忆录丶 提交于 2019-12-11 05:31:40

问题


Hi I have a neo4j database, similar to below.

CREATE
  (:Person {name: 'Ryan'})-[:TRADES]->(fish:Product {name: 'Fish'}),
  (ken:Person {name: 'Ken'})-[:TRADES]->(fish),
  (mary:Person {name: 'Mary'})-[:TRADES]->(fish),
  (john:Person {name: 'John'})-[:TRADES]->(fish),
  (ken)-[:TRADES]->(book:Product {name: 'Book'}),
  (ken)-[:TRADES]->(plum:Product {name: 'Plum'}),
  (ken)-[:TRADES]->(cabbage:Product {name: 'Cabbage'}),
  (ken)-[:TRADES]->(tomato:Product {name: 'Tomato'}),
  (ken)-[:TRADES]->(pineapple:Product {name: 'Pineapple'}),
  (mary)-[:TRADES]->(Pizza:Product {name: 'Pizza'}),
  (mary)-[:TRADES]->(book),
  (mary)-[:TRADES]->(plum),
  (mary)-[:TRADES]->(cabbage),
  (mary)-[:TRADES]->(tomato),
  (ian:Person {name: 'Ian'})-[:TRADES]->(fish),
  (ian)-[:TRADES]->(pork:Product {name: 'Pork'}),
  (john)-[:TRADES]->(pork),
  (ian)-[:TRADES]->(oil:Product {name: 'Oil'}),
  (ian)-[:TRADES]->(pasta:Product {name: 'Pasta'}),
  (ian)-[:TRADES]->(rice:Product {name: 'Rice'}),
  (ian)-[:TRADES]->(milk:Product {name: 'Milk'}),
  (ian)-[:TRADES]->(orange:Product {name: 'Orange'}),
  (john)-[:TRADES]->(oil),
  (john)-[:TRADES]->(rice),
  (john)-[:TRADES]->(pasta),
  (john)-[:TRADES]->(orange),
  (john)-[:TRADES]->(milk),
  (peter:Person {name: 'Peter'})-[:TRADES]->(rice),
  (peter)-[:TRADES]->(pasta),
  (peter)-[:TRADES]->(orange),
  (peter)-[:TRADES]->(oil),
  (peter)-[:TRADES]->(milk),
  (peter)-[:TRADES]->(apple:Product {name: 'Apple'}),
  (ian)-[:TRADES]->(apple);

I would like to query the names who buy 5 or more same items. (In this case, it's Peter, John and Ian as group1, Ken and Mary as Group2). In for all possible items

[EDITED] Added desire output

My Desire output is similar to below


回答1:


1. Answer for initial question

1.1 Creating your graph

For the ease of possible further answers and solutions I note my graph creating statement:

CREATE
  (:Person {name: 'Ryan'})-[:TRADES]->(fish:Product {name: 'Fish'}),
  (:Person {name: 'Ken'})-[:TRADES]->(fish),
  (:Person {name: 'Mary'})-[:TRADES]->(fish),
  (john:Person {name: 'John'})-[:TRADES]->(fish),
  (ian:Person {name: 'Ian'})-[:TRADES]->(fish),
  (ian)-[:TRADES]->(pork:Product {name: 'Pork'}),
  (john)-[:TRADES]->(pork),
  (ian)-[:TRADES]->(oil:Product {name: 'Oil'}),
  (ian)-[:TRADES]->(pasta:Product {name: 'Pasta'}),
  (ian)-[:TRADES]->(rice:Product {name: 'Rice'}),
  (ian)-[:TRADES]->(milk:Product {name: 'Milk'}),
  (ian)-[:TRADES]->(orange:Product {name: 'Orange'}),
  (john)-[:TRADES]->(oil),
  (john)-[:TRADES]->(rice),
  (john)-[:TRADES]->(pasta),
  (john)-[:TRADES]->(orange),
  (john)-[:TRADES]->(milk),
  (peter:Person {name: 'Peter'})-[:TRADES]->(rice),
  (peter)-[:TRADES]->(pasta),
  (peter)-[:TRADES]->(orange),
  (peter)-[:TRADES]->(oil),
  (peter)-[:TRADES]->(milk),
  (peter)-[:TRADES]->(apple:Product {name: 'Apple'}),
  (ian)-[:TRADES]->(apple);

1.2 Solution

MATCH (person:Person)-[:TRADES]->(product:Product)
WITH person.name AS personName, count(product) AS amount
WHERE amount >=5
RETURN personName, amount;
  • first line: defining the matching pattern
  • second line: count products per person
  • third line: filter for brought products amount
  • fourth line: render the result

1.3 Result

╒════════════╤════════╕
│"personName"│"amount"│
╞════════════╪════════╡
│"John"      │7       │
├────────────┼────────┤
│"Ian"       │8       │
├────────────┼────────┤
│"Peter"     │6       │
└────────────┴────────┘

2. Answer for new question and requirements

2.1 Solution

MATCH path=(sourcePerson:Person)-[:TRADES]->(product:Product)<-[:TRADES]-(targetPerson:Person)
WITH sourcePerson, targetPerson, count(path) AS pathAmount, collect(product.name) AS products
  WHERE pathAmount >= 5 AND id(sourcePerson) > id(targetPerson)
RETURN DISTINCT products, collect(sourcePerson.name) AS sourcePersons, collect(targetPerson.name) AS targetPersons;

2.2 Result

╒════════════════════════════════════════════════════╤═══════════════╤═══════════════╕
│"products"                                          │"sourcePersons"│"targetPersons"│
╞════════════════════════════════════════════════════╪═══════════════╪═══════════════╡
│["Tomato","Cabbage","Plum","Book","Fish"]           │["Mary"]       │["Ken"]        │
├────────────────────────────────────────────────────┼───────────────┼───────────────┤
│["Milk","Orange","Pasta","Rice","Oil"]              │["Peter"]      │["John"]       │
├────────────────────────────────────────────────────┼───────────────┼───────────────┤
│["Milk","Orange","Pasta","Rice","Oil","Pork","Fish"]│["Ian"]        │["John"]       │
├────────────────────────────────────────────────────┼───────────────┼───────────────┤
│["Apple","Orange","Milk","Rice","Pasta","Oil"]      │["Peter"]      │["Ian"]        │
└────────────────────────────────────────────────────┴───────────────┴───────────────┘

2.3 Note

The result shown differs a little from your expectation, since for the relations Ian->Apple<-Peter, John->Pork<-Ian and John->Fish<-Ian your requirement "persons who bought more than four products" is met also and thus it creates a separate cluster.


3. Alternative

If the fine granular clustering does not meet your requirements, you can also drop the "bought >4 products" requirement. In this case the solution would look like this:

3.1 Solution

CALL algo.louvain.stream('', '', {})
YIELD nodeId, community
WITH algo.getNodeById(nodeId) AS node, community
  ORDER BY community
WITH community, collect(node) AS nodes
WITH
  community,
  [x IN nodes WHERE ('Person' IN labels(x)) | x.name] AS persons,
  [x IN nodes WHERE ('Product' IN labels(x)) | x.name] AS products
RETURN community, persons, products;
  • line 1: call the Neo4j Graph Algorithms procedure Louvain algorithm
  • line 2: define result variables
  • line 3: retrieve values from the result stream
  • line 4: order the community values
  • line 8: filter the resulting nodes for label Person
  • line 9: filter the resulting nodes for label Product
  • line 10: render the output

3.2 Result

╒═══════════╤══════════════════════╤═════════════════════════════════════════════════════════════╕
│"community"│"persons"             │"products"                                                   │
╞═══════════╪══════════════════════╪═════════════════════════════════════════════════════════════╡
│0          │["Ryan","Ken","Mary"] │["Fish","Book","Plum","Cabbage","Tomato","Pineapple","Pizza"]│
├───────────┼──────────────────────┼─────────────────────────────────────────────────────────────┤
│1          │["John","Ian","Peter"]│["Pork","Oil","Pasta","Rice","Milk","Orange","Apple"]        │
└───────────┴──────────────────────┴─────────────────────────────────────────────────────────────┘

If you prefer the node itself instead of the names, just remove both | x.name parts in the last WITH clause.



来源:https://stackoverflow.com/questions/53629951/find-cluster-in-neo4j

标签
易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!