Creating neo4j graph database from csv file using py2neo

Posted by 半世苍凉 on 2019-12-12 12:23:14

Question


I am currently in a doctoral program and am interested in py2neo, so I am using it to run some experiments on social graphs. However, I have run into some beginner problems. Excuse me for asking such simple questions.

I have an XML dataset containing data about the publications of a journal, which I have converted into a CSV table. There are about 700 records, and each record consists of four fields: date, title, keywords, author. So my first question is how to create a graph from this table programmatically. I considered writing a Python script that loops over the CSV table, reads the fields of each row, and writes them into nodes.

Code:

   #!/usr/bin/env python
   #
   import csv
   from py2neo import neo4j, cypher
   from py2neo import node,  rel

   # calls database service of Neo4j
   #
   graph_db = neo4j.GraphDatabaseService("http://localhost:7474/db/data/")
   #
   # Create nodes and relationships from a csv table
   # since it's a csv table, a reader must be invoked


   ifile  = open('testeout5_cp.csv', "rb")
   reader = csv.reader(ifile)

   # clear database
   graph_db.clear()

   rownum = 0
   for row in reader:
        colnum = 0
        for col in row:
            titulo, autor, rel = graph_db.create(
            {"titulo": col[1]}, {"autor": col[3]}, (1, "eh_autor_de", 0)
            )
            print(titulo,  autor)  
   rownum += 1

   ifile.close()

I got this output (fragment):

    Python 2.7.5 (default, Aug 22 2013, 09:31:58) [GCC 4.8.1 20130603 (Red Hat 4.8.1-1)] on aires2, Standard

    (Node('http://localhost:7474/db/data/node/10392'), Node('http://localhost:7474/db/data/node/10393'))
    (Node('http://localhost:7474/db/data/node/10394'), Node('http://localhost:7474/db/data/node/10395'))
    (Node('http://localhost:7474/db/data/node/10396'), Node('http://localhost:7474/db/data/node/10397'))
    (Node('http://localhost:7474/db/data/node/10398'), Node('http://localhost:7474/db/data/node/10399'))
    (Node('http://localhost:7474/db/data/node/10400'), Node('http://localhost:7474/db/data/node/10401'))
    (Node('http://localhost:7474/db/data/node/10402'), Node('http://localhost:7474/db/data/node/10403'))
    (Node('http://localhost:7474/db/data/node/10404'), Node('http://localhost:7474/db/data/node/10405'))

What is wrong?


Answer 1:


I am not a py2neo expert, so I can't help with that. However, have you tried using a different mechanism to create your graph? Since it is not very big, I would consider using a spreadsheet (I use that a lot). It's dead easy.

See http://blog.neo4j.org/2013/03/importing-data-into-neo4j-spreadsheet.html for some more info.

Hope it makes sense.

Rik




Answer 2:


I think there is nothing wrong; your code looks good.

You print the nodes and get proper py2neo node instances. Try print(titulo, autor, rel) to see if your relationship is also created.

Just check in the web interface at http://localhost:7474/webadmin/ whether your data is there. Since you don't have too many nodes, you could try a simple Cypher query to get all nodes and check that everything is OK.

    START n=node(*) RETURN n;
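One detail in the question's loop may be worth a second look. In `for col in row`, `col` is a single field (a string), so `col[1]` and `col[3]` pick out characters of that field rather than the title and author columns, and `create` runs once per field instead of once per row. The following is only a minimal sketch of a per-row version, written for Python 3 rather than the Python 2.7 of the question. It assumes the column order given in the question (date, title, keywords, author); the helper names, the `relacao` variable (used so the imported `rel` is not shadowed), and the sample rows are hypothetical. `graph_db.create` is called with the same arguments as in the original script and is not exercised here against a real server.

```python
import csv
import io

# Assumed column order, taken from the question: date, title, keywords, author.
TITLE_COL = 1
AUTHOR_COL = 3


def title_author_pairs(csv_file):
    """Yield one (title, author) tuple per CSV row, skipping short rows."""
    for row in csv.reader(csv_file):
        if len(row) <= AUTHOR_COL:
            continue  # blank or malformed line
        yield row[TITLE_COL], row[AUTHOR_COL]


def load_into_graph(graph_db, csv_file):
    """Create two nodes and one relationship per row.

    graph_db is assumed to be the neo4j.GraphDatabaseService object from
    the question; create() receives the same arguments as in the original.
    """
    created = []
    for title, author in title_author_pairs(csv_file):
        titulo, autor, relacao = graph_db.create(
            {"titulo": title}, {"autor": author}, (1, "eh_autor_de", 0)
        )
        created.append((titulo, autor, relacao))
    return created


# Hypothetical sample rows standing in for testeout5_cp.csv.
sample = io.StringIO(
    "2013-01-01,Graph Theory,graphs;networks,Alice\n"
    "2013-02-01,Social Networks,sna,Bob\n"
)
pairs = list(title_author_pairs(sample))
print(pairs)
```

With a real file, something like `load_into_graph(graph_db, open('testeout5_cp.csv', newline=''))` would be the equivalent call. Note that this still creates a fresh author node for every row, so an author with several publications would appear several times unless duplicates are handled separately.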


Source: https://stackoverflow.com/questions/18804254/creating-neo4j-graph-database-from-csv-file-using-py2neo
