Question
I have two CSV files that I want to convert into a Neo4j database. They look like this:
First file:
name,enzyme
Aminomonas paucivorans,M1.Apa12260I
Aminomonas paucivorans,M2.Apa12260I
Bacillus cellulosilyticus,M1.BceNI
Bacillus cellulosilyticus,M2.BceNI
Second file:
name,motif
Aminomonas paucivorans,GGAGNNNNNGGC
Aminomonas paucivorans,GGAGNNNNNGGC
Bacillus cellulosilyticus,CCCNNNNNCTC
As you can see, the common factor is the name of the organism. Each Organism will have a few Enzymes, and each Enzyme will have 1 Motif. Motifs can be the same between enzymes. I used the following statements to create my database:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file1.csv" AS csvLine
MATCH (o:Organism { name: csvLine.name}),(e:Enzyme { name: csvLine.enzyme})
CREATE (o)-[:has_enzyme]->(e) //or maybe CREATE UNIQUE?
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file2.csv" AS csvLine
MATCH (o:Organism { name: csvLine.name}),(m:Motif { name: csvLine.motif})
CREATE (o)-[:has_motif]->(m) //or maybe CREATE UNIQUE?
This gives me an error on the very first line, at USING PERIODIC COMMIT, which says: Invalid input 'S': expected
If I get rid of it, the next error I get is: WITH is required between CREATE and LOAD CSV (line 6, column 1)
"MATCH (o:Organism { name: csvLine.name}),(m:Motif { name: csvLine.motif})"
I googled this issue, which led me to this answer. I tried the answer given there (refreshing the browser cache), but the problem persists. What am I doing wrong here? Is the query correct? Is there another solution to this issue? Any help will be greatly appreciated.
Answer 1:
Your queries have two issues at once:
- You can't refer to a local file with just "file1.csv", because Neo4j expects a URL.
- You're using MATCH in cases where the data may not already exist; you need to use MERGE there instead, which basically acts like the CREATE UNIQUE comment you added.
I don't know what the source of your specific error message is, but as written it doesn't look like these queries could possibly work. Here are your queries reformulated so that they will work (I tested them on my machine with your CSV samples):
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/home/myuser/tmp/file1.csv" AS csvLine
MERGE (o:Organism { name: coalesce(csvLine.name, "No Name")})
MERGE (e:Enzyme { name: csvLine.enzyme})
MERGE (o)-[:has_enzyme]->(e);
Notice the 3 MERGE statements here (MERGE basically does a MATCH, plus a CREATE if the pattern doesn't already exist), and the fact that I've used a file: URL.
The second query gets formulated basically the same way:
USING PERIODIC COMMIT
LOAD CSV WITH HEADERS FROM "file:/home/myuser/tmp/file2.csv" AS csvLine
MERGE (o:Organism { name: coalesce(csvLine.name, "No Name")})
MERGE (m:Motif { name: csvLine.motif})
MERGE (o)-[:has_motif]->(m);
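To make the MATCH-plus-CREATE behaviour of MERGE concrete, here is a minimal sketch you could run by hand. It assumes a database that does not yet contain this Organism node; the organism name is taken from your sample data, and the count alias is just illustrative.

```cypher
// Run each statement separately (or keep the terminating semicolons in the shell).
// First run: no matching node exists, so MERGE creates it.
MERGE (o:Organism { name: "Aminomonas paucivorans" });

// Second run: the node already exists, so MERGE matches it and creates nothing.
MERGE (o:Organism { name: "Aminomonas paucivorans" });

// Should report a single node if the database had no such Organism beforehand.
MATCH (o:Organism { name: "Aminomonas paucivorans" })
RETURN count(o) AS organisms;
```

This is why repeated organism names across CSV rows (and across both files) end up as one Organism node rather than duplicates.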
EDIT: I added coalesce to the Organism's name property. If you have null values for name in the CSV, the query would otherwise fail. coalesce guarantees that if csvLine.name is null, you'll get back "No Name" instead.
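As a quick illustration of the coalesce behaviour described above, and of one possible alternative, here is a hedged sketch. The second statement is my assumption about what you might prefer (skipping rows with no name rather than grouping them under "No Name"); it reuses the example path from the queries above, which you would need to adapt.

```cypher
// coalesce returns its first non-null argument, so this yields "No Name".
RETURN coalesce(null, "No Name") AS name;

// Alternative sketch (an assumption, not part of the tested queries above):
// drop rows with a missing name before they reach MERGE.
LOAD CSV WITH HEADERS FROM "file:/home/myuser/tmp/file1.csv" AS csvLine
WITH csvLine WHERE csvLine.name IS NOT NULL
MERGE (o:Organism { name: csvLine.name })
MERGE (e:Enzyme { name: csvLine.enzyme })
MERGE (o)-[:has_enzyme]->(e);
```

Which variant fits depends on whether enzymes without an organism name should still be loaded.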
Source: https://stackoverflow.com/questions/26090984/create-neo4j-database-using-csv-files