Why is findOne(<id>, <depth>) getting unacceptably slow performance when adding more nodes of the same label?

Posted by 寵の児 on 2021-02-11 13:01:44

Question


CONTEXT

I've been developing a spring boot website backed by a Neo4j database. It is designed to work as a university course search system. (the relevant structure is that courses have modulesets, that have modules, that are related to subjects, etc...)

@JsonIdentityInfo(generator=JSOGGenerator.class)
public class Course extends DomainObject {
  @NotNull private String name;
  @NotNull private String courseCode;
  private String description;
  private School school;

  @Convert(AttendanceTypeConverter.class)
  private E_AttendanceType attendanceType;

  @Convert(CourseTypeConverter.class)
  private E_CourseType courseType;

  @Convert(SandwichYearTypeConverter.class)
  private E_SandwichYearType sandwichYearType;

  @Relationship(type = "COURSE_DESCRIPTION_FOR", direction = Relationship.OUTGOING)
  private Set<CourseYearDescription> courseYearDescription;

  @Relationship(type = "COURSE_REQUISITES_SET_FOR", direction = Relationship.OUTGOING)
  private Set<EntryRequirementsSet> entryRequirementsSets;

  @Relationship(type = "RUNS_COURSE", direction = Relationship.OUTGOING)
  Set<MemberOfFaculty> courseRunners;

  // remaining members omitted in the original post
}

For course pages I need to populate all of the complex fields of a course so that they can be displayed. I had been using T findOne(Long var1, int var2) with a depth of 4 through a GraphRepository to get a fully populated Course object. This concerned me because, to my knowledge, a depth of 4 is very uncommon. However, the method initially returned without any noticeable delay.

PROBLEM

While stress testing, I increased the number of courses in the database to 4000 and the delay grew dramatically. Working backwards, depth 2 took up to 20 seconds, depth 3 took about 60 seconds, and depth 4 had not returned after more than 5 minutes. All three depths had previously returned in milliseconds.

I found this odd because I was starting from a single course node (identified by its long node id), so the number of other courses shouldn't have affected the speed of findOne in this way. It would still be building an object of the same size.

TESTING

To test alternatives, I ran the following query to see how long it would take (here the course code, rather than the node id, limits the query to one course node):

MATCH (course:Course {courseCode:'HG65'})-[*1..4]->(x) RETURN *

It returned instantly with exactly what I wanted.

This made me think the problem might lie in how the GraphRepository maps the result to a POJO. To test this, I wrote some mapping functions that take a Neo4jOperations Result object and instantiate and populate my Course object by iterating through the result map, in effect emulating findOne with a depth of 4. This ran with no delay.

My only other thought is that findOne ignores relationship directions, so that paths such as "course1 -> school -> course2" cause a massive increase in the amount of data fetched. However, I do not know how to confirm that this is the case, nor how to get around it.

QUESTION

Why is findOne(ID, 4) running so slowly when I add more Course objects? How can I overcome this issue without writing bespoke queries and result mappers every time I want a complex POJO fetched.

Is there an alternative approach I should take?


Answer 1:


After inspecting the calls from my Spring project to the Neo4j database, I have confirmed the problem. findOne() matches an undirected (n)-[]-(m) pattern. The query it issues is as follows:

MATCH (n) WHERE id(n) = {id} WITH n MATCH p=(n)-[*0..4]-(m) RETURN p

This is what I suspected. If I have 10000 courses that are all related to a single node one hop away, every course reaches every other course at depth 2: course -[]- school -[]- course. Every additional course therefore adds more paths to any findOne call on a course, and the result grows rapidly with depth.

My solution was to alter the default query and place it as a GraphRepository query as follows :

MATCH (n:Course{courseCode:{courseCode}}) WITH n MATCH p=(n)-[*0..4]->(m) RETURN p

Note that the relationship pattern has changed from undirected -[]- to outgoing -[]->. This solution works perfectly with the Spring OGM mapping, and all nested objects in my complex POJO are populated as expected.
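To illustrate why the direction matters, here is a minimal in-memory sketch in plain Java. It is not Neo4j or OGM code: the class TraversalFanOut, its methods, and the toy graph (every course pointing at one shared "school" node and one module of its own) are assumptions made up for this example. It counts the distinct nodes within 4 hops of one course, once following relationships in both directions and once following outgoing relationships only. The real query returns paths rather than distinct nodes, so the actual result would be larger still.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy directed graph used only to illustrate traversal fan-out (hypothetical, not OGM code). */
final class TraversalFanOut {
    private final Map<String, List<String>> outgoing = new HashMap<>();
    private final Map<String, List<String>> incoming = new HashMap<>();

    void addEdge(String from, String to) {
        outgoing.computeIfAbsent(from, k -> new ArrayList<>()).add(to);
        incoming.computeIfAbsent(to, k -> new ArrayList<>()).add(from);
    }

    /** Counts distinct nodes within maxDepth hops of start, including start itself. */
    int reachable(String start, int maxDepth, boolean followIncoming) {
        Set<String> seen = new HashSet<>();
        seen.add(start);
        List<String> frontier = List.of(start);
        for (int depth = 0; depth < maxDepth && !frontier.isEmpty(); depth++) {
            List<String> next = new ArrayList<>();
            for (String node : frontier) {
                List<String> neighbours = new ArrayList<>(outgoing.getOrDefault(node, List.of()));
                if (followIncoming) {
                    // Undirected matching also walks relationships backwards.
                    neighbours.addAll(incoming.getOrDefault(node, List.of()));
                }
                for (String neighbour : neighbours) {
                    if (seen.add(neighbour)) {
                        next.add(neighbour);
                    }
                }
            }
            frontier = next;
        }
        return seen.size();
    }

    /** Assumed shape: each course points at one shared school and one module of its own. */
    static TraversalFanOut build(int courses) {
        TraversalFanOut graph = new TraversalFanOut();
        for (int i = 0; i < courses; i++) {
            graph.addEdge("course" + i, "school");
            graph.addEdge("course" + i, "module" + i);
        }
        return graph;
    }

    public static void main(String[] args) {
        for (int courses : new int[] {10, 4000}) {
            TraversalFanOut graph = build(courses);
            System.out.printf("courses=%d undirected=%d directed=%d%n",
                    courses,
                    graph.reachable("course0", 4, true),
                    graph.reachable("course0", 4, false));
        }
    }
}
```

In this toy graph the outgoing-only count stays at 3 nodes however many courses exist, while the undirected count grows with the number of courses (21 nodes for 10 courses, 8001 for 4000), because the shared school node leads back to every other course.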



Source: https://stackoverflow.com/questions/36488427/why-is-findoneid-depth-getting-unacceptably-slow-performance-when-adding
