SET label based on data within LOAD CSV

Submitted by 爷,独闯天下 on 2019-12-19 09:59:15

Question


I'm using Neo4j 2.2.0 and importing data (in the form of a nodes file and relationships file) via LOAD CSV.

The nodes will all be imported under the "Person" label; however, I want to add the "Geotag" label to some of them if their latitude and longitude fields in the nodes file are not empty.

So, for example, given the nodes file below:

"username","id","latitude","longitude"
"abc123","111111111","33.223","33.223"
"abc456","222222222","",""

I would like to create node "abc123" with the Person and Geotag labels and node abc456 with just the Person label because it doesn't have a latitude and longitude.

I thought this would be something along the lines of:

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line 
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
SET p: (CASE WHEN line.latitude IS NOT NULL THEN GEOTAGGED);

I know I am using the CASE statement incorrectly as well as the SET statement, but is this possible to do while importing the nodes? This file has over 3 million nodes in it and it would be helpful to do it upon insertion so that when new nodes get added (usually in batches), we're not exploring all nodes just to get to the new ones.

I've explored other SO questions (How to set relationship type and label in LOAD CSV?, Loading relationships from CSV data into neo4j db, Neo4j Cypher - creating nodes and setting labels with LOAD CSV); however, they differ from my question in that those OPs are trying to use a field in the file as the label, whereas I am simply trying to make a conditional decision on which labels to use based on data in the file.

Thanks!

EDIT: In response to an answer, I am trying the following:

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
CASE WHEN line.latitude IS NOT NULL THEN [1] ELSE [] END AS geotagged 
FOREACH (x IN geotagged | SET p:Geotag); 

I get the following error:

QueryExecutionKernelException: Invalid input 'A': expected 'r/R' (line 3, column 2 (offset: 454)) "CASE WHEN line.latitude IS NOT NULL THEN [1] ELSE [] END AS geotagged"

With the caret under the 'A' in "CASE"

EDIT2:

Below is the complete solution, inspired by and only slightly different from David's solution.

LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude }) 
WITH p, CASE WHEN line.latitude <> "" THEN [1] ELSE [] END AS geotagged 
FOREACH (x IN geotagged | SET p:Geotag); 

Answer 1:


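You are close. You cannot put conditional logic inside the SET label clause itself. Instead, use a CASE expression to build a collection that holds one element when the row has a lat/lon value and no elements otherwise, carry that collection forward with WITH (together with p), and then iterate over it with FOREACH. The SET inside the FOREACH runs once for a one-element collection and not at all for an empty one. If empty fields arrive as empty strings rather than nulls, as in the question's final query, test line.latitude <> "" instead of IS NOT NULL.

...
with p, case when line.latitude IS NOT NULL then [1] else [] end as geotagged
foreach(x in geotagged | set p:Geotag)
...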
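
As a possible alternative to the FOREACH trick, the same conditional label can be sketched with a WITH ... WHERE filter. This is only a sketch under the question's assumptions (the same users.csv file and column names, with empty fields arriving as empty strings); it has not been run against the asker's data. Because a comparison with null does not evaluate to true, the WHERE clause also drops rows where the fields are null.

```cypher
// Create every row as a Person node, exactly as in the question.
LOAD CSV WITH HEADERS FROM "file:/users.csv" AS line
CREATE (p:Person { username: line.username, id: line.id, latitude: line.latitude, longitude: line.longitude })
// Pass the node and the CSV row on, keeping only rows that have coordinates.
WITH p, line
WHERE line.latitude <> "" AND line.longitude <> ""
// Only the rows that survived the filter reach this point.
SET p:Geotag;
```

The trade-off is that WHERE discards the non-matching rows for every clause that follows it, so this form only fits when adding the label is the last step of the statement. The FOREACH version keeps all rows available for further clauses.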


Source: https://stackoverflow.com/questions/29419634/set-label-based-on-data-within-load-csv
