How to load a large csv file into Neo4j

Submitted by 瘦欲@ on 2019-12-11 08:42:46

Question


I'm trying to load a large CSV file (1,458,644 rows) into Neo4j, but I keep getting this error:

Neo.TransientError.General.OutOfMemoryError: There is not enough memory to perform the current task. Please try increasing 'dbms.memory.heap.max_size' in the neo4j configuration (normally in 'conf/neo4j.conf' or, if you you are using Neo4j Desktop, found through the user interface) or if you are running an embedded installation increase the heap by using '-Xmx' command line flag, and then restart the database.

Even after setting dbms.memory.heap.max_size=1024m (m = megabytes), the same error occurs.

Note: the CSV file is 195,888 KB (roughly 195 MB).

This is my code:

load csv with headers from "file:///train.csv" as line
create(pl:pickup_location{latitude:toFloat(line.pickup_latitude),longitude:toFloat(line.pickup_longitude)}),(pt:pickup_time{pickup:line.pickup_datetime}),(dl:dropoff_location{latitude:toFloat(line.dropoff_latitude),longitude:toFloat(line.dropoff_longitude)}),(dt:dropoff_time{dropoff:line.dropoff_datetime})
create (pl)-[:TLR]->(pt),(dl)-[:TLR]->(dt),(pl)-[:Trip]->(dl);

What should I do?


Answer 1:


You should use periodic commits to process the CSV data in batches. Without them, the whole import runs as a single transaction whose state has to fit in the heap. For example, this will process 10,000 lines at a time (the default batch size is 1000):

USING PERIODIC COMMIT 10000
LOAD CSV WITH HEADERS FROM "file:///train.csv" as line
CREATE (pl:pickup_location{latitude:toFloat(line.pickup_latitude),longitude:toFloat(line.pickup_longitude)}),(pt:pickup_time{pickup:line.pickup_datetime}),(dl:dropoff_location{latitude:toFloat(line.dropoff_latitude),longitude:toFloat(line.dropoff_longitude)}),(dt:dropoff_time{dropoff:line.dropoff_datetime})
CREATE (pl)-[:TLR]->(pt),(dl)-[:TLR]->(dt),(pl)-[:Trip]->(dl);
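
As a rough illustration of what "processing in batches" means (a client-side sketch only, not how Neo4j implements periodic commit internally), the following Python generator reads a CSV with a header row in fixed-size chunks, so at most one batch of rows is held in memory at a time. The name iter_batches and its parameters are hypothetical, and the file is assumed to be UTF-8.

```python
import csv
from itertools import islice


def iter_batches(path, batch_size=10000, encoding="utf-8"):
    """Yield lists of at most batch_size row dicts from a CSV that has a header row."""
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.DictReader(handle)
        while True:
            # islice pulls the next batch_size rows without reading the whole file.
            batch = list(islice(reader, batch_size))
            if not batch:
                return
            yield batch
```

Each yielded batch could then be handled in its own unit of work, which is the same idea the periodic commit applies on the database side.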



Answer 2:


I solved the problem by copying a solution originally written for loading a limited number of rows.

So this is my solution:

USING PERIODIC COMMIT
load csv with headers from "file:///train.csv" as line
with line LIMIT 1458644
create   (pl:pickup_location{latitude:toFloat(line.pickup_latitude),longitude:toFloat(line.pickup_longitude)}),(pt:pickup_time{pickup:line.pickup_datetime}),(dl:dropoff_location{latitude:toFloat(line.dropoff_latitude),longitude:toFloat(line.dropoff_longitude)}),(dt:dropoff_time{dropoff:line.dropoff_datetime})
create (pl)-[:TLR]->(pt),(dl)-[:TLR]->(dt),(pl)-[:Trip]->(dl);

The downside of this solution is that you need to know the number of rows in your big CSV file in advance (Excel can't open CSV files this large).
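
One way to get the row count without Excel is a short script. The following is a minimal Python sketch, assuming a UTF-8 file with a single header row; count_data_rows is a hypothetical helper name. It streams the file, so memory use stays small regardless of file size.

```python
import csv


def count_data_rows(path, encoding="utf-8"):
    """Count the records in a CSV file, excluding the header row."""
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle)
        next(reader, None)  # skip the header row, if the file has one
        # Iterate record by record so the file is never loaded whole.
        return sum(1 for _ in reader)
```

Calling count_data_rows("train.csv") from the directory that holds the file returns the number to use in the LIMIT clause. Because it uses the csv module rather than counting raw lines, quoted fields that contain line breaks are still counted as one row.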



Source: https://stackoverflow.com/questions/50975631/how-to-load-a-large-csv-file-into-neo4j
