Import edges to OrientDB using ETL

 ̄綄美尐妖づ submitted on 2019-12-23 12:19:47

Question


I have 3 tables, one is for vertex A, one is for vertex B, and the third is for edges from B to A. How can I import this graph to OrientDB?

For now, the tutorial only shows how to import two CSV files: one for vertex A, and another for vertex B that also lists the A vertices each B is connected to. You load vertex A, then load vertex B and create the edges from A to B at the same time.

This works for simple graphs. But what about more complicated ones? For example, with three types of vertices (A, B and C) and three types of edges (A -> B, B -> C, C -> A), how can I import the graph?

I want to use the ETL tool to load the graph, although the Java API would also be a solution.


Update:
Finally, I figured out how Stack Overflow works: I shouldn't try to put data in a comment, but update the question instead.
This is a concrete example for my question:
Two types of vertices:
masters(John, Joey, Michael, Robert, Allen),
pets(Snoopy, White, Blue).

Two types of relationships:
like(John likes Snoopy, Michael likes White, Michael likes Blue, Allen likes Snoopy, Michael likes White),
belong to(Snoopy belongs to Joey, White belongs to Robert, Blue belongs to John).

How can I import this little network into OrientDB using OETL?


Answer 1:


The OrientDB ETL facility lets you load a graph from external sources. As the OrientDB-ETL introduction describes, a source made up of a collection of records is read, a transformation is applied to each row to generate a document record (or a vertex and, optionally, edges; remember that OrientDB is a document database under the hood that also supports graphs), and the generated documents (or vertices/edges) are then loaded into OrientDB.
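
As a rough illustration of that read, transform, load flow, here is a minimal in-memory Python sketch using some of the "likes" relationships from the question. It does not touch OrientDB at all; the `load_graph` function and the CSV column names (`master`, `pet`) are made up for this example.

```python
import csv
import io

# Hypothetical CSV source: one row per "likes" relationship.
LIKES_CSV = """master,pet
John,Snoopy
Michael,White
Michael,Blue
Allen,Snoopy
"""

def load_graph(csv_text):
    """Read rows, turn each one into two vertices and an edge, and collect them."""
    vertices = set()  # stands in for the stored vertex records
    edges = []        # stands in for the stored edge records
    for row in csv.DictReader(io.StringIO(csv_text)):  # extract: one record per row
        source = ("Master", row["master"])             # transform: row -> vertices
        target = ("Pet", row["pet"])
        vertices.update([source, target])              # load: store vertices once
        edges.append((source, "Likes", target))        # load: store the edge
    return vertices, edges

vertices, edges = load_graph(LIKES_CSV)
print(len(vertices), "vertices,", len(edges), "edges")
```

The ETL tool does the same kind of per-row work, except that the steps are described in a JSON configuration rather than written as code.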

You should use multiple JSON ETL files to load more than one source, but from each source you may load multiple vertex classes and edges, depending on the transformations applied (you may apply several). Input sources can be SQL databases or CSV- and JSON-formatted files. For an example with multiple vertex types and edges built from CSV sources, see "Import the database of beers".
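
For the masters and pets in the question, that could mean one ETL file per vertex CSV. The Python sketch below only generates two such configurations. The file paths, database URL, class names (`Master`, `Pet`) and the `name` column are assumptions, and the key names should be checked against the ETL documentation for your OrientDB version (older releases use a `row` extractor plus a `csv` transformer instead of a `csv` extractor).

```python
import json

def vertex_config(csv_path, vertex_class, db_url="plocal:/tmp/databases/pets"):
    """Build an ETL configuration that loads one CSV file into one vertex class."""
    return {
        "source": {"file": {"path": csv_path}},
        "extractor": {"csv": {}},
        "transformers": [{"vertex": {"class": vertex_class}}],
        "loader": {
            "orientdb": {
                "dbURL": db_url,
                "dbType": "graph",
                "classes": [{"name": vertex_class, "extends": "V"}],
                # Unique index on the assumed "name" column, so that later
                # edge imports can look vertices up by name.
                "indexes": [
                    {"class": vertex_class, "fields": ["name:string"], "type": "UNIQUE"}
                ],
            }
        },
    }

configs = {
    "masters.json": vertex_config("/tmp/masters.csv", "Master"),
    "pets.json": vertex_config("/tmp/pets.csv", "Pet"),
}
for filename, config in configs.items():
    print(filename)
    print(json.dumps(config, indent=2))
```

Each printed JSON document would be saved to its own file and run through the ETL script separately, vertices first and edges afterwards.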

When using database (SQL) tables as the data source, take a look at Import from DBMS.

For tables, it is typical to have A -edge-> B mapped into a relation table (e.g. for many-to-many relationships, or when the edge itself has properties), while for simpler kinds of edges (one-to-one, one-to-many...) a simple foreign key is customary. The ETL configuration is given in JSON format, and for SQL databases you can configure a SQL query (which yields a "results table") and how to map the result fields to both vertices and related edges.
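
A sketch of the foreign-key case, written as a Python script that prints the JSON configuration. It assumes a hypothetical SQL table `pets(name, owner)` where `owner` holds a master's name, and that the `Master` vertices and a `Master.name` index already exist in the target database. The JDBC driver, URL, credentials, class names and database path are placeholders, and the key names should be verified against the Import from DBMS documentation.

```python
import json

config = {
    "extractor": {
        "jdbc": {
            "driver": "org.postgresql.Driver",             # placeholder driver
            "url": "jdbc:postgresql://localhost/petsdb",   # placeholder database
            "userName": "etl",
            "userPassword": "secret",
            # The query result is the "results table" that gets transformed row by row.
            "query": "SELECT name, owner FROM pets",
        }
    },
    "transformers": [
        # Every result row becomes a Pet vertex.
        {"vertex": {"class": "Pet"}},
        # The foreign key column "owner" is resolved through the Master.name index
        # and turned into a BelongsTo edge from the Pet to that Master.
        {"edge": {"class": "BelongsTo", "joinFieldName": "owner",
                  "lookup": "Master.name", "direction": "out"}},
    ],
    "loader": {
        "orientdb": {
            "dbURL": "plocal:/tmp/databases/pets",
            "dbType": "graph",
            "classes": [
                {"name": "Pet", "extends": "V"},
                {"name": "BelongsTo", "extends": "E"},
            ],
        }
    },
}

print(json.dumps(config, indent=2))
```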

You may need multiple JSON ETL configurations when you have multiple sources, but when processing each source you can use transformers (see VERTEX, EDGE and MERGE in the Transformers documentation) to create vertices and related edges. The CSV transformer may even create an ODocument straight from each CSV row.
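
For a separate edge file, such as the "like" relationships in the question, one possible combination of those transformers is MERGE, then VERTEX, then EDGE. The Python sketch below prints such a configuration. It assumes a hypothetical `likes.csv` with columns `master` and `pet`, that both vertex classes were imported beforehand, and that indexes named `Master.name` and `Pet.name` exist; treat the key names as something to check against the Transformers documentation.

```python
import json

edge_config = {
    "source": {"file": {"path": "/tmp/likes.csv"}},   # assumed path
    "extractor": {"csv": {}},
    "transformers": [
        # Merge each row into the existing Master found through the Master.name index.
        {"merge": {"joinFieldName": "master", "lookup": "Master.name"}},
        # Treat the merged record as a vertex so that an edge can start from it.
        {"vertex": {"class": "Master", "skipDuplicates": True}},
        # Create a Likes edge from that Master to the Pet named in the "pet" column.
        {"edge": {"class": "Likes", "joinFieldName": "pet",
                  "lookup": "Pet.name", "direction": "out"}},
    ],
    "loader": {
        "orientdb": {
            "dbURL": "plocal:/tmp/databases/pets",    # assumed database location
            "dbType": "graph",
            "classes": [{"name": "Likes", "extends": "E"}],
        }
    },
}

print(json.dumps(edge_config, indent=2))
```

The "belong to" relationships would get a second file of the same shape, with the lookups swapped so that the edge starts at the pet and points to the master.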

Note: as an alternative to the ETL facility (which is relatively new), you can use the Java API to import data into OrientDB databases programmatically. If you need to load huge graphs, this is the recommended way, as you have full control over how data is loaded into OrientDB.



Source: https://stackoverflow.com/questions/32778905/import-edges-to-orientdb-using-etl
