Question
I am new to Neo4j and wanted to see how fast it is. To test it, I created a table in MySQL and the equivalent nodes in Neo4j, with these properties (fields):
id, random_number, time_stamp
I wrote a program to generate bulk data and inserted about 150 million rows (nodes in Neo4j). The write speed was almost the same in both.
Then I tested a select query in both databases: I wanted to get the rows (nodes) with a random_id of 255454 (I know there are more than 30 rows with this random_id).
Neo4j:
MATCH (t:testLabel {random_id: 255454}) RETURN t LIMIT 50;
MySQL:
SELECT *
FROM test
WHERE random_id=255454 LIMIT 50
Neo4j took ~47 seconds and MySQL took ~25 seconds to return results.
Neo4j's size on disk grew to ~35 GB, while MySQL's grew to ~5.2 GB.
Neither database had an index on the table or properties.
Hardware: CPU: Core i7-4770 | RAM: 12 GB | SSD
This is a simple test: both databases had basic structures, and before testing I expected Neo4j to be faster than MySQL. As I really like Neo4j, I want to find a solution and use it again.
According to my simple test, Neo4j is not reasonable for big and scalable projects. I want to know whether there are ways to make it much faster. The performance test was very simple, and every database should handle it well regardless of data modeling.
And what about the size on disk?
Also, I found a related comparison question by Jörg Baach that you may like to see.
Answer 1:
Comparing relational databases and graph databases is a huge task.
I think a much more helpful test would be to check performance on queries across multiple tables with several joins and foreign keys. Compare that to Neo4j and you may well find much better performance than in MySQL.
Do this: with your test model, set up 4-5 possible use cases: a couple of things a DBA will be doing, a couple of things users will be doing, and so on. Determine how many people will be doing each of them, and how often.
Choose both simple and complex tasks, and compare MySQL's performance to Neo4j's. You will find that each database outperforms the other in different situations.
Try to weigh your priorities. How important is it to you to have great performance when matching 50 nodes with a certain property? How important is it that users (dozens? millions?) have a fast, secure method of creating extensively complex networks of relationships? Once you determine what is important to you, refer to the performance tests and decide which database better fits your needs.
If you are only going to perform basic queries, you should probably use a relational database such as MySQL. Neo4j is great for complex schemas and queries, not only from a performance perspective but also for readability.
Neo4j stores data in a very different way, hence the difference in disk usage.
Cypher is centered around the graph patterns that are core to your use-cases and represents them visually as part of its query syntax.
This article is really insightful; it shows the transition from relational to graph databases.
Answer 2:
- Did you create an index on testLabel and property random_id?
- You're seeing a rather high disk usage since transaction logs are kept by default for 7 days, there's a config option to tweak this.
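For the first point, a minimal sketch of what creating that index could look like in Cypher, using the label and property from the question. The index name testLabel_random_id is an assumption made up for this example, and the exact syntax depends on the Neo4j version, so treat it as a starting point rather than the definitive command.

```cypher
// Neo4j 4.x and later: index on the random_id property of :testLabel nodes.
// The index name is illustrative only.
CREATE INDEX testLabel_random_id FOR (t:testLabel) ON (t.random_id);

// Neo4j 3.x used the older form instead:
// CREATE INDEX ON :testLabel(random_id);

// Once the index is online, PROFILE shows whether the lookup uses it
// (a NodeIndexSeek operator) or still scans every node with the label
// (NodeByLabelScan).
PROFILE
MATCH (t:testLabel {random_id: 255454})
RETURN t
LIMIT 50;
```

For a like-for-like comparison, the MySQL table would need an equivalent index on random_id as well.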
On a general note: just looking up a single node is not a reasonable performance test for a graph database. You should probably run some query that follows a few connections to see the difference.
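As an illustration of a query that follows a few connections, here is a sketch against a hypothetical model. The Person label, the KNOWS relationship type, and the id property are assumptions for this example and are not part of the test data in the question.

```cypher
// Hypothetical model: (:Person)-[:KNOWS]->(:Person).
// Start from one person and collect everyone reachable within 1 to 3 hops.
MATCH (p:Person {id: 255454})-[:KNOWS*1..3]->(other:Person)
RETURN DISTINCT other
LIMIT 50;
```

The relational equivalent would typically need one self-join per hop, which is the kind of workload where the two databases are expected to behave differently.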
Source: https://stackoverflow.com/questions/37378607/in-my-tests-ne4j-seems-so-slow-compared-to-mysql-how-can-i-make-it-faster