Normal JSON to GraphSON format

|▌冷眼眸甩不掉的悲伤 提交于 2019-12-07 07:53:00

问题


I have two questions:

  1. Where I can actually find the basic format for a GraphSON file, that is guaranteed to be successfully loaded by the gremlin console? I'm trying to convert a JSON (with about 10-20 fields) to another file that can be queried by gremlin but I can't actually find any relevant information about the fields reserved by the graphson format or how I should handle the IDs etc. I exported the modern graph they provide and it's not even a valid JSON (Multiple JSON root elements), but a list of JSONs [1] I also saw fields like outE, inE...are these fields something I manually have to create?

  2. If I am able to create the JSON, where do I tell the server to load it as the base graph when I start it? In the config file or in the script?

Thanks! Adrian

[1] https://pastebin.com/drwXhg5k

{"id":1,"label":"person","outE":{"created":[{"id":9,"inV":3,"properties":{"weight":0.4}}],"knows":[{"id":7,"inV":2,"properties":{"weight":0.5}},{"id":8,"inV":4,"properties":{"weight":1.0}}]},"properties":{"name":[{"id":0,"value":"marko"}],"age":[{"id":1,"value":29}]}}
{"id":2,"label":"person","inE":{"knows":[{"id":7,"outV":1,"properties":{"weight":0.5}}]},"properties":{"name":[{"id":2,"value":"vadas"}],"age":[{"id":3,"value":27}]}}
{"id":3,"label":"software","inE":{"created":[{"id":9,"outV":1,"properties":{"weight":0.4}},{"id":11,"outV":4,"properties":{"weight":0.4}},{"id":12,"outV":6,"properties":{"weight":0.2}}]},"properties":{"name":[{"id":4,"value":"lop"}],"lang":[{"id":5,"value":"java"}]}}
{"id":4,"label":"person","inE":{"knows":[{"id":8,"outV":1,"properties":{"weight":1.0}}]},"outE":{"created":[{"id":10,"inV":5,"properties":{"weight":1.0}},{"id":11,"inV":3,"properties":{"weight":0.4}}]},"properties":{"name":[{"id":6,"value":"josh"}],"age":[{"id":7,"value":32}]}}
{"id":5,"label":"software","inE":{"created":[{"id":10,"outV":4,"properties":{"weight":1.0}}]},"properties":{"name":[{"id":8,"value":"ripple"}],"lang":[{"id":9,"value":"java"}]}}
{"id":6,"label":"person","outE":{"created":[{"id":12,"inV":3,"properties":{"weight":0.2}}]},"properties":{"name":[{"id":10,"value":"peter"}],"age":[{"id":11,"value":35}]}}

回答1:


Where I can actually find the basic format for a GraphSON file, that is guaranteed to be successfully loaded by the gremlin console?

There are multiple versions of GraphSON at this point. You can get a reference in the Apache TinkerPop IO Documentation. When you write, "successfully loaded by the gremlin console" I assume that you mean with the GraphSONReader methods described here. Of so, then the format you show above is one form you can use. It is not valid JSON as you can see, though you can build the reader/writer with the wrapAdjacencyList option set to true and it will produce valid JSON. Here is an example:

gremlin> graph = TinkerFactory.createModern();
==>tinkergraph[vertices:6 edges:6]
gremlin> writer =  graph.io(IoCore.graphson()).writer().wrapAdjacencyList(true).create()
==>org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONWriter@24a298a6
gremlin> os = new FileOutputStream('wrapped-adjacency-list.json')
==>java.io.FileOutputStream@6d3c232f
gremlin> writer.writeGraph(os, graph)
gremlin> os.close()
gremlin> newGraph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> ins = new FileInputStream('wrapped-adjacency-list.json')
==>java.io.FileInputStream@7435a578
gremlin> reader = graph.io(IoCore.graphson()).reader().unwrapAdjacencyList(true).create()
==>org.apache.tinkerpop.gremlin.structure.io.graphson.GraphSONReader@63da207f
gremlin> reader.readGraph(ins, newGraph)
gremlin> newGraph
==>tinkergraph[vertices:6 edges:6]

The reason you do not get valid JSON by default is because the standard format for a GraphSON file needs to be splittable for Hadoop and other distributed processing engines. Therefore it produces one line per vertex - a StarGraph format.

If I am able to create the JSON, where do I tell the server to load it as the base graph when I start it? In the config file or in the script?

A script would work. as would the gremlin.tinkergraph.graphLocation and gremlin.tinkergraph.graphFormat configuration options on TinkerGraph.

Ultimately though, if you have existing JSON and you aren't loading tens of millions of graph elements, it is probably easiest to just parse it and use standard g.addV() and g.addE() methods to build the graph:

gremlin> import groovy.json.*
==>org.apache.tinkerpop.gremlin.structure.*,...
gremlin> graph = TinkerGraph.open()
==>tinkergraph[vertices:0 edges:0]
gremlin> g = graph.traversal()
==>graphtraversalsource[tinkergraph[vertices:0 edges:0], standard]
gremlin> jsonSlurper = new JsonSlurper()
==>groovy.json.JsonSlurper@53e3a87a
gremlin> object = jsonSlurper.parseText('[{ "name": "John Doe" }, { "name" : "Jane Doe" }]')
==>[name:John Doe]
==>[name:Jane Doe]
gremlin> object.each {g.addV('name',it.name).iterate() }
==>[name:John Doe]
==>[name:Jane Doe]
gremlin> g.V().valueMap()
==>[name:[John Doe]]
==>[name:[Jane Doe]]

Trying to convert that to GraphSON is overly complicated compared to the approach above.



来源:https://stackoverflow.com/questions/44928121/normal-json-to-graphson-format

易学教程内所有资源均来自网络或用户发布的内容,如有违反法律规定的内容欢迎反馈
该文章没有解决你所遇到的问题?点击提问,说说你的问题,让更多的人一起探讨吧!