Load DBpedia locally using Jena TDB?

Anonymous (unverified), submitted 2019-12-03 01:08:02

Question:

I need to perform a query against DBpedia:

SELECT DISTINCT ?poi ?lat ?long ?photos ?template ?type ?label
WHERE {
  ?poi   ?label .
  ?poi  ?lat .
  ?poi  ?long .
  ?poi  ?photos .
  OPTIONAL {?poi  ?template } .
  OPTIONAL {?poi  ?type } .
  FILTER ( ?lat > x && ?lat  z && ?long 

I'm guessing this information is scattered across different dump (.nt) files, and the SPARQL endpoint somehow assembles a result set from them. I need to download only the relevant .nt files (not all of DBpedia), run my query once, and store the results locally (I don't want to use the SPARQL endpoint).

  • Which parts of Jena should I use for this one-off run?

I'm a bit confused after reading this post:

So, you can load the entire DBPedia data into a single TDB location on disk (i.e. a single directory). This way, you can run SPARQL queries over it.

  • How do I load DBpedia into a single TDB location, in Jena terms, if we have three DBpedia .nt files? How do we apply the above query to those .nt files? (Any code would help.)

  • For example, is this wrong?

 String tdbDirectory = "C:\\TDB";
 String dbdump1 = "C:\\Users\\dump1_en.nt";
 String dbdump2 = "C:\\Users\\dump2_en.nt";
 String dbdump3 = "C:\\Users\\dump3_en.nt";
 Dataset dataset = TDBFactory.createDataset(tdbDirectory);
 Model tdb = dataset.getDefaultModel(); //
  • In the above code we used dataset.getDefaultModel() (to get the default graph as a Jena Model). Is this statement valid? Do we need to create a Dataset to perform the query, or should we go with TDBFactory.createModel(tdbDirectory)?

Answer 1:

To have Jena load and index the data locally:

/** The Constant tdbDirectory. */
public static final String tdbDirectory = "C:\\TDBLoadGeoCoordinatesAndLabels";

/** The Constant dbdump0. */
public static final String dbdump0 = "C:\\Users\\Public\\Documents\\TDB\\dbpedia_3.8\\dbpedia_3.8.owl";

/** The Constant dbdump1. */
public static final String dbdump1 = "C:\\Users\\Public\\Documents\\TDB\\geo_coordinates_en\\geo_coordinates_en.nt";

...

Model tdbModel = TDBFactory.createModel(tdbDirectory);

/* Incrementally read data into the Model, once per run; RAM > 6 GB */
FileManager.get().readModel(tdbModel, dbdump0);
FileManager.get().readModel(tdbModel, dbdump1, "N-TRIPLES");
FileManager.get().readModel(tdbModel, dbdump2, "N-TRIPLES");
FileManager.get().readModel(tdbModel, dbdump3, "N-TRIPLES");
FileManager.get().readModel(tdbModel, dbdump4, "N-TRIPLES");
FileManager.get().readModel(tdbModel, dbdump5, "N-TRIPLES");
FileManager.get().readModel(tdbModel, dbdump6, "N-TRIPLES");
tdbModel.close();
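
The load step reads each dump file in full, although the query only touches a handful of predicates. One optional way to cut load time and TDB size, which is not part of the original answer, is to shrink each .nt file first by keeping only the triples whose predicate is actually needed. The following is a minimal sketch using only the Java standard library; the class name NtPredicateFilter and its command-line arguments are hypothetical, and it assumes well-formed N-Triples with one triple per line.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

/** Hypothetical helper: keeps only the N-Triples lines whose predicate is wanted. */
final class NtPredicateFilter {

    /** Returns the predicate IRI of an N-Triples line, or null for blank, comment or odd lines. */
    static String predicateOf(String line) {
        String s = line.trim();
        if (s.isEmpty() || s.startsWith("#")) {
            return null;
        }
        // Subject and predicate contain no whitespace, so the second token is the predicate.
        String[] parts = s.split("\\s+", 3);
        if (parts.length < 3) {
            return null;
        }
        String p = parts[1];
        if (p.length() < 2 || !p.startsWith("<") || !p.endsWith(">")) {
            return null;
        }
        return p.substring(1, p.length() - 1);
    }

    /** Copies matching lines from in to out and returns how many were kept. */
    static long filter(BufferedReader in, BufferedWriter out, Set<String> wanted) throws IOException {
        long kept = 0;
        String line;
        while ((line = in.readLine()) != null) {
            String predicate = predicateOf(line);
            if (predicate != null && wanted.contains(predicate)) {
                out.write(line);
                out.newLine();
                kept++;
            }
        }
        return kept;
    }

    /** Usage (assumed): java NtPredicateFilter.java input.nt output.nt predicateIri [predicateIri ...] */
    public static void main(String[] args) throws IOException {
        if (args.length < 3) {
            System.err.println("usage: NtPredicateFilter <in.nt> <out.nt> <predicate IRI>...");
            return;
        }
        Set<String> wanted = new HashSet<>(Arrays.asList(args).subList(2, args.length));
        try (BufferedReader in = Files.newBufferedReader(Paths.get(args[0]), StandardCharsets.UTF_8);
             BufferedWriter out = Files.newBufferedWriter(Paths.get(args[1]), StandardCharsets.UTF_8)) {
            System.out.println("kept " + filter(in, out, wanted) + " triples");
        }
    }
}
```

The filtered files can then be passed to the same readModel calls in place of the full dumps. The filter processes one line at a time, so its memory use does not grow with the size of the dump.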

To query Jena:

String queryStr = "dbpedia query ";

Dataset dataset = TDBFactory.createDataset(tdbDirectory);
Model tdb = dataset.getDefaultModel();

Query query = QueryFactory.create(queryStr);
QueryExecution qexec = QueryExecutionFactory.create(query, tdb);

/* Execute the query */
ResultSet results = qexec.execSelect();

while (results.hasNext()) {
    // Do something important
}

qexec.close();
tdb.close();

