Create Neo4j database using CSV files

问题

I have 2 CSV files which I want to convert into a Neo4j database. They look like this:

first file:

name,enzyme
Aminomonas paucivorans,M1.Apa12260I
Aminomonas paucivorans,M2.Apa12260I
Bacillus cellulosilyticus,M1.BceNI
Bacillus cellulosilyticus,M2.BceNI

second file 

name,motif
Aminomonas paucivorans,GGAGNNNNNGGC
Aminomonas paucivorans,GGAGNNNNNGGC
Bacillus cellulosilyticus,CCCNNNNNCTC

As you can see the common factor is the Name of the organism and the. Each Organism will have a few Enzymes and each Enzyme will have 1 Motif. Motifs can be same between enzymes . I used the following statement to create my database:

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file1.csv" AS csvLine
MATCH (o:Organism { name: csvLine.name}),(e:Enzyme { name: csvLine.enzyme})
CREATE (o)-[:has_enzyme]->(e) //or maybe CREATE UNIQUE?

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file2.csv" AS csvLine
MATCH (o:Organism { name: csvLine.name}),(m:Motif { name: csvLine.motif})
CREATE (o)-[:has_motif]->(m) //or maybe CREATE UNIQUE?

This gives me errors on the very first line at USING PERIODIC COMMIT which says Invalid input 'S': expected. If I get rid of ti, the next error I get is WITH is required between CREATE and LOAD CSV (line 6, column 1) "MATCH (o:Organism { name: csvLine.name}),(m:Motif { name: csvLine.motif})" . I googled this issue which led me to this answer . I tried the answer given ther (refreshing the browser cache) but the problem persists. WHat am I doing wrong here? Is the query correct? Is there an another solution to this issue? Any help will be greatly appreciated

回答1:

Your queries have two issues at once:

You can't refer to a local file just with "file1.csv", because neo4j is expecting a URL
You're using MATCH in cases where the data may not originally exist; you need to use MERGE there instead, which basically acts like the create unique comment you added.

I don't know what the source of your specific error message is, but as written it doesn't look like these queries could possibly work. Here are your queries reformulated, so that they will work (I tested it on my machine with your CSV samples)

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/home/myuser/tmp/file1.csv" AS csvLine
MERGE (o:Organism { name: coalesce(csvLine.name, "No Name")})
MERGE (e:Enzyme { name: csvLine.enzyme})
MERGE (o)-[:has_enzyme]->(e);

Notice here 3 merge statements (MERGE basically does MATCH + CREATE if it doesn't already exist), and the fact that I've used a file: URL.

The second query gets formulated basically the same way:

USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/home/myuser/tmp/file2.csv" AS csvLine
MERGE (o:Organism { name:  coalesce(csvLine.name, "No Name")})
MERGE (m:Motif { name: csvLine.motif})
MERGE (o)-[:has_motif]->(m);

EDIT I added coalesce in the Organism's name property. If you have null values for name in the CSV, then the query would otherwise fail. Coalesce guarantees that if csvLine.name is null, then you'll get back "No Name" instead.

来源：https://stackoverflow.com/questions/26090984/create-neo4j-database-using-csv-files

标签

csv

neo4j