Is there a way to shard and replicate neo4j data?

问题

I am considering the option of neo4j for some of the new projects I am working for. For the given data needs (inherently graph based) neo4j fits well and a quick prototype is giving good response time for me. What I want to understand is how to scale a neo4j deployment. Specifically:

How do I shard my data across neo4j deployments. Since neo4j is deployed on a single machine, there is a limit to how much data I can store in a single machine and hence I would like to know how to distribute it. Clearly if I split it on users, then relationships between disconnected users (across shards) cannot be maintained.
How do I replicate the neo4j data? I am potentially thinking of putting up a sql-like-setup with masters used for write and slaves used for reads so that we can both scale up our potentially readers and writers, but also have a backup of our data in real time. I understand that all the neo4j data is stored in a filesystem - which is inherently non-replicatable. Is there a way I can do it here? Perhaps, something akin to a mysql bin log?

回答1:

sharding is as of now not handled by Neo4j itself, but by the domain, much as you describe. Neo4j 2.0 is going to target that problem.

For replication, Online Backup is working and real High Availability with Master failover is in the works, using ZooKeeper to track the cluster nodes and elect new masters, etc.

Any more details on your app sharding requirements? What domain etc?

来源：https://stackoverflow.com/questions/3234891/is-there-a-way-to-shard-and-replicate-neo4j-data

标签

replication

sharding

neo4j