In my tests, Neo4j seems so slow compared to MySQL. How can I make it faster?

Posted by 我是研究僧i on 2019-12-06 18:04:45

Question


I am new to Neo4j and wanted to see how fast it is. To test it, I created a table in MySQL and an equivalent set of nodes in Neo4j, with these properties (fields):

id    random_number    time_stamp

I then wrote a program to generate bulk data and inserted about 150 million rows into MySQL (and the same number of nodes into Neo4j). The write speed was almost the same in both.

Next, I tested a select query in both databases: fetch the rows (nodes) whose random_id is 255454. (We know there are more than 30 rows with this random_id.)

Neo4j:

match (t:testLabel {random_id: 255454}) RETURN t LIMIT 50;

MySQL:

SELECT * 
FROM  test 
WHERE  random_id=255454 LIMIT 50

Neo4j took ~47 seconds and MySQL took ~25 seconds to return results.

The Neo4j store grew to ~35 GB on disk, while the MySQL data took ~5.2 GB.

Neither database had an index on the table or on the node properties.

Hardware: CPU: Core i7-4770 | RAM: 12 GB | Disk: SSD


This is a simple test: both databases had a basic structure, and before testing I expected Neo4j to be faster than MySQL. I really like Neo4j, so I want to find a solution and keep using it.

According to this simple test, Neo4j does not look suitable for big, scalable projects. Are there ways to make it dramatically faster? The test was very basic, and I would expect any database to handle this kind of lookup well regardless of data modeling.

And what about the size on disk?

P.S. I also found a related comparison question by Jörg Baach that you may like to see.


Answer 1:


Comparing relational databases and graph databases is a huge task.

I think a much more helpful test would be to check performance on queries that span multiple tables with several joins and foreign keys. Compare that to Neo4j and you will quite possibly find much better performance than with MySQL.

Do this: with your test model, set up 4-5 possible use cases: a couple of things that a DBA will be doing, a couple of things that users will be doing, and so on. Determine how many people are going to be doing each of these, and how often.

Choose both simple and complex tasks, then compare MySQL's performance to Neo4j's. You will find that each database outperforms the other in different situations.
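One way to keep such a comparison honest is to time every use case the same way against both backends. Below is a minimal sketch of that idea in Python. The names `time_query`, `compare`, and the use-case labels are hypothetical, and the lambdas are placeholders for whatever MySQL and Neo4j driver calls you would actually make.

```python
import statistics
import time


def time_query(run_query, repeats=5):
    """Call run_query `repeats` times and return the median wall-clock seconds."""
    samples = []
    for _ in range(repeats):
        start = time.perf_counter()
        run_query()
        samples.append(time.perf_counter() - start)
    return statistics.median(samples)


def compare(use_cases, repeats=5):
    """Map each use-case name to {backend name: median seconds}."""
    return {
        name: {backend: time_query(fn, repeats) for backend, fn in backends.items()}
        for name, backends in use_cases.items()
    }


# Placeholder workloads (assumption): replace each lambda with a function
# that sends the real query through your MySQL or Neo4j client.
use_cases = {
    "lookup_by_property": {
        "mysql": lambda: sum(range(1000)),
        "neo4j": lambda: sum(range(1000)),
    },
    "multi_hop_traversal": {
        "mysql": lambda: sorted(range(1000), reverse=True),
        "neo4j": lambda: sorted(range(1000), reverse=True),
    },
}

results = compare(use_cases, repeats=3)
for name, timings in results.items():
    print(name, timings)
```

Taking the median over several runs reduces the effect of cold caches and one-off stalls, which matter a lot when a single run takes tens of seconds.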

Try to weigh your priorities. How important is it to you to have great performance when matching 50 nodes with a certain property? How important is it that your users (dozens? millions?) have a fast, secure way of creating very complex networks of relationships? Once you have determined what matters to you, go back to the performance tests and decide which database better fits your needs.

If you are only going to run basic queries, you should probably use a relational database such as MySQL. Neo4j is great for complex schemas and queries, not only from a performance perspective but also for readability.

Neo4j stores data in a very different way, hence the difference in disk usage.

Cypher is centered on the graph patterns that are core to your use cases, and it represents them visually as part of its query syntax.

This article is really insightful; it shows the transition from relational to graph databases.




Answer 2:


  1. Did you create an index on label testLabel and property random_id?
  2. You're seeing rather high disk usage because transaction logs are kept for 7 days by default; there is a config option to tweak this.
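To illustrate why the missing index matters, here is a small sketch in Python. It uses an in-memory SQLite table as a stand-in (an assumption, chosen so the example runs without a server; it is not MySQL or Neo4j) and asks the engine for its query plan before and after indexing random_id. The index name `idx_test_random_id` and the helper `plan_for_lookup` are made up for the example. In Neo4j releases from around the time of this question, the corresponding statement was, as far as I know, `CREATE INDEX ON :testLabel(random_id)`.

```python
import random
import sqlite3


def plan_for_lookup(conn, value):
    """Return the plan text SQLite reports for the equality lookup."""
    rows = conn.execute(
        "EXPLAIN QUERY PLAN SELECT * FROM test WHERE random_id = ? LIMIT 50",
        (value,),
    ).fetchall()
    # The last column of each plan row is the human-readable detail string.
    return " ".join(row[-1] for row in rows)


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE test (id INTEGER PRIMARY KEY, random_id INTEGER, time_stamp INTEGER)"
)

# Small synthetic data set (assumption): 10,000 rows instead of 150 million.
rng = random.Random(0)
conn.executemany(
    "INSERT INTO test (random_id, time_stamp) VALUES (?, ?)",
    ((rng.randrange(1_000_000), i) for i in range(10_000)),
)

# Without an index the engine has to scan the whole table.
plan_before = plan_for_lookup(conn, 255454)

# With an index on random_id it can jump straight to the matching rows.
conn.execute("CREATE INDEX idx_test_random_id ON test (random_id)")
plan_after = plan_for_lookup(conn, 255454)

print("before:", plan_before)
print("after: ", plan_after)
```

The exact plan wording differs between SQLite versions, but the first plan should report a scan of the table and the second a search using the index. The same distinction (full scan versus index lookup) is what separates a lookup that takes tens of seconds from a fast one on 150 million rows.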

On a general note: just looking up a single node is not a reasonable performance test for a graph database. You should probably run a query that follows a few connections to see the difference.



Source: https://stackoverflow.com/questions/37378607/in-my-tests-ne4j-seems-so-slow-compared-to-mysql-how-can-i-make-it-faster
