Querying large RDF Datasets out of memory

Posted by 三世轮回 on 2019-11-27 14:50:56
Joshua Taylor

Jena's Fuseki can use TDB as a storage mechanism, and TDB stores its data on disk. The TDB documentation on caching on 32- and 64-bit Java systems discusses how the file contents are mapped into memory. I do not believe that TDB/Fuseki loads the entire dataset into memory; that is simply not feasible for large datasets, yet TDB can handle rather large ones. I think you should consider using tdbloader to create a TDB store and then pointing Fuseki at it.

There's an example of setting up a TDB store in this answer. In there, the query is performed with tdbquery, but according to the Running a Fuseki server section of the documentation, all you will need to do to start Fuseki with the same TDB store is use the --loc=DIR option:

  • --loc=DIR
    Use an existing TDB database. Create an empty one if it does not exist.
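The two steps described here (build the store with tdbloader, then start Fuseki with --loc) could be scripted roughly as follows. This is only a sketch: it assumes the tdbloader and fuseki-server scripts from the Jena and Fuseki distributions are on the PATH, and the function names, file name, store directory, and dataset name are placeholders rather than anything from the question.

```python
import subprocess


def tdb_commands(rdf_file, tdb_dir, dataset_name):
    """Return the argument lists for loading a TDB store and serving it.

    Assumes `tdbloader` and `fuseki-server` are on the PATH; all three
    parameters are illustrative placeholders.
    """
    # Bulk-load the RDF file into an on-disk TDB store at tdb_dir.
    load = ["tdbloader", f"--loc={tdb_dir}", rdf_file]
    # Start Fuseki on the same store and expose it under dataset_name.
    serve = ["fuseki-server", f"--loc={tdb_dir}", dataset_name]
    return load, serve


def load_and_serve(rdf_file, tdb_dir, dataset_name):
    """Run the loader to completion, then start the server in the background."""
    load, serve = tdb_commands(rdf_file, tdb_dir, dataset_name)
    subprocess.run(load, check=True)  # raises CalledProcessError if loading fails
    return subprocess.Popen(serve)    # caller is responsible for stopping it
```

Calling load_and_serve("your_ontology.rdf", "/path/to/tdb", "/your_namespace") would run both steps in order; the loader only needs to run once, after which the server can be restarted against the same directory.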
Ali R

As Joshua said, Jena's Fuseki can use TDB, so it can store very large ontologies without using much memory. For example, you can load the Yago2 taxonomy into it and use only about 600MB of RAM. You do not need to embed Fuseki in your Java project; you can just run it from the command line and query it from your project.

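Start it from the Windows command line as follows (all on one line). Note that --file loads the file into an in-memory dataset; to serve a disk-backed TDB store instead, use the --loc=DIR option described above.

java -jar c:\your_ontology_directory\fuseki-server.jar --file=your_ontology.rdf /your_namespace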

Then you can run a SPARQL query against it with any GET/POST application (even in your browser):

http://localhost:3030/your_namespace/sparql?query=SELECT * { ?s ?p ?o }
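A browser will encode that URL for you, but a client in your own code has to percent-encode the spaces, braces, and question marks in the query itself. Here is a minimal sketch using only the Python standard library. It assumes the server is running on localhost:3030 with the dataset name /your_namespace, as in the command above, and the function names are made up for illustration.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_query_url(endpoint, sparql):
    """Append the SPARQL query to the endpoint as an encoded `query` parameter."""
    return endpoint + "?" + urlencode({"query": sparql})


def run_query(endpoint, sparql):
    """Send the query with HTTP GET and return the raw response body as text.

    The Accept header asks for the SPARQL XML results format explicitly.
    """
    request = Request(
        build_query_url(endpoint, sparql),
        headers={"Accept": "application/sparql-results+xml"},
    )
    with urlopen(request) as response:
        return response.read().decode("utf-8")


# Example call (needs a running server, so it is left commented out):
# xml_text = run_query("http://localhost:3030/your_namespace/sparql",
#                      "SELECT * { ?s ?p ?o } LIMIT 10")
```

Adding a LIMIT to exploratory queries like this one is usually a good idea with a large dataset, since SELECT * { ?s ?p ?o } otherwise returns every triple in the store.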

The results are, by default, returned in XML format.

<?xml version="1.0"?>
<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="s"/>
    <variable name="p"/>
    <variable name="o"/>
  </head>
  <results>
    <result>
      <binding name="s">
        <uri>http://yago-knowledge/resource/wordnet_gulag_103467887</uri>
      </binding>
      <binding name="p">
        <uri>http://www.w3.org/2000/01/rdf-schema#subClassOf</uri>
      </binding>
      <binding name="o">
        <uri>http://yago-knowledge/resource/wordnet_prison_camp_104005912</uri>
      </binding>
    </result>
    …
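To use a response like this in a program, the XML can be read with a namespace-aware parser. The sketch below uses Python's standard library and is only an illustration: the function name is invented, and SAMPLE is a completed copy of the single result shown above (the closing tags are filled in so the snippet parses), not real server output.

```python
import xml.etree.ElementTree as ET

# Namespace of the SPARQL Query Results XML Format, as in the response above.
SPARQL_NS = {"sr": "http://www.w3.org/2005/sparql-results#"}

SAMPLE = """<sparql xmlns="http://www.w3.org/2005/sparql-results#">
  <head>
    <variable name="s"/>
    <variable name="p"/>
    <variable name="o"/>
  </head>
  <results>
    <result>
      <binding name="s">
        <uri>http://yago-knowledge/resource/wordnet_gulag_103467887</uri>
      </binding>
      <binding name="p">
        <uri>http://www.w3.org/2000/01/rdf-schema#subClassOf</uri>
      </binding>
      <binding name="o">
        <uri>http://yago-knowledge/resource/wordnet_prison_camp_104005912</uri>
      </binding>
    </result>
  </results>
</sparql>"""


def parse_sparql_xml(xml_text):
    """Return (variable names, list of {variable: value} dicts) from a result document."""
    root = ET.fromstring(xml_text)
    variables = [v.get("name") for v in root.findall("sr:head/sr:variable", SPARQL_NS)]
    rows = []
    for result in root.findall("sr:results/sr:result", SPARQL_NS):
        row = {}
        for binding in result.findall("sr:binding", SPARQL_NS):
            term = binding[0]  # the <uri>, <literal>, or <bnode> child element
            row[binding.get("name")] = term.text
        rows.append(row)
    return variables, rows


variables, rows = parse_sparql_xml(SAMPLE)
```

For very large result sets it may be better to page through the data with LIMIT and OFFSET than to parse one huge response in memory.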