What is the fastest way to import to Neo4j?

Submitted by 萝らか妹 on 2020-01-17 14:05:25

Question


I have a list of JSON documents, in the format:

[{a:1, b:[2,5,6]}, {a:2, b:[1,3,5]}, ...]

What I need to do is create a node for each value of a and connect it to every node whose a value appears in its list b. So the first node will connect to nodes 2, 5 and 6. Right now I'm using Python's neo4jrestclient to populate the graph, but it's taking a long time. Is there a faster way to populate it?

Currently this is my script:

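break_list = []  # (source, target) pairs whose relationship could not be created
for each in ans[1:]:
    ref = each[0]
    # One round trip to look up the source node by URL
    q = """MATCH (n) WHERE n.url = '%s' RETURN n;""" %(ref)
    n1 = gdb.query(q, returns=client.Node)[0][0]
    for link in each[6]:
        if len(link)>4:
            text,link = link.split('!__!')
            # Another round trip for every outgoing link
            q2 = """MATCH (n) WHERE n.url = '%s' RETURN n;""" %(link)
            try:
                n2 = gdb.query(q2, returns=client.Node)
                n1.relationships.create("Links", n2[0][0], anchor_text=text)
            except Exception:
                # Target node missing or request failed; record the pair and continue
                break_list.append((ref,link))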

Answer 1:


You might want to consider converting your JSON to CSV (using a tool such as jq) and then importing it with Cypher's LOAD CSV clause. LOAD CSV is optimized for bulk data import, so it should perform much better than creating nodes and relationships one request at a time. With your example, the import would look something like this:

Your JSON converted to CSV:

"a","b"
"1","2,5,6"
"2","1,3,5"
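
If jq is not at hand, the same conversion can be sketched in Python with only the standard library. This is a minimal sketch, not part of the original answer: the function name json_to_csv is hypothetical, and it assumes each document has exactly the keys a and b shown in the question.

```python
import csv
import io
import json


def json_to_csv(docs, out):
    """Write one CSV row per document: column a, and b joined with commas."""
    # QUOTE_ALL matches the quoted layout of the CSV sample above.
    writer = csv.writer(out, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(["a", "b"])
    for doc in docs:
        joined = ",".join(str(value) for value in doc["b"])
        writer.writerow([str(doc["a"]), joined])


# The question's sample data, written as valid JSON.
docs = json.loads('[{"a": 1, "b": [2, 5, 6]}, {"a": 2, "b": [1, 3, 5]}]')

buf = io.StringIO()
json_to_csv(docs, buf)
csv_text = buf.getvalue()
print(csv_text)
```

To produce a file for LOAD CSV, pass a file opened with open(path, "w", newline="") instead of the in-memory buffer.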

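First, create a uniqueness constraint. It ensures that only one node is created for any given "name", and it also creates an index for faster lookups. The statement below uses the syntax of Neo4j versions before 5; newer releases write it as CREATE CONSTRAINT FOR (p:Person) REQUIRE p.name IS UNIQUE.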

CREATE CONSTRAINT ON (p:Person) ASSERT p.name IS UNIQUE;

Given the above CSV file this Cypher script can be used to efficiently import data:

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:///path/to/file.csv" AS row
MERGE (a:Person{name: row.a})
WITH a,row
UNWIND split(row.b,',') AS other
MERGE (b:Person {name:other})
MERGE (a)-[:CONNECTED_TO]->(b);

Other option

Another option is to pass the JSON as a parameter to a Cypher query and then iterate over each element of the JSON array using UNWIND. The parameter is written {d} below; Neo4j 4.0 and later require the $d form instead.

WITH {d} AS json
UNWIND json AS doc
MERGE (a:Person{name: doc.a})
WITH doc, a
UNWIND doc.b AS other
MERGE (b:Person{name:other})
MERGE (a)-[:CONNECTED_TO]->(b);

Note that this approach may run into performance problems with a very large JSON array. See some examples of this here and here.
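
One common way to soften the large-array problem is to send the documents in fixed-size batches, one query per batch. The sketch below is an assumption, not part of the original answer: chunked, import_in_batches and the run_query callable are hypothetical names. run_query stands in for whatever client call executes a query with a parameter dict, and the query uses $d, so on an older server the {d} form above would be needed.

```python
def chunked(items, size):
    """Yield consecutive slices of items, each at most size long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]


# Same shape as the parameterized query above, with $d as the parameter.
IMPORT_QUERY = """
UNWIND $d AS doc
MERGE (a:Person {name: doc.a})
WITH doc, a
UNWIND doc.b AS other
MERGE (b:Person {name: other})
MERGE (a)-[:CONNECTED_TO]->(b)
"""


def import_in_batches(docs, run_query, batch_size=1000):
    """Call run_query(query, params) once per batch; return the batch count."""
    batches = 0
    for batch in chunked(docs, batch_size):
        run_query(IMPORT_QUERY, {"d": batch})
        batches += 1
    return batches


# Demo with a stand-in for the database call that only records the batches.
sent = []
docs = [{"a": 1, "b": [2, 5, 6]}, {"a": 2, "b": [1, 3, 5]}, {"a": 3, "b": [1]}]
n_batches = import_in_batches(docs, lambda query, params: sent.append(params["d"]), batch_size=2)
print(n_batches, [len(batch) for batch in sent])
```

In real use, run_query would wrap the client's query call. The right batch size depends on the data and the server, so the default of 1000 is only a starting point.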



Source: https://stackoverflow.com/questions/34118491/what-is-the-fastest-way-to-import-to-neo4j
