Can't write large owl file with Jena

Submitted by 冷眼眸甩不掉的悲伤 on 2019-12-23 12:19:59

Question


I'm trying to convert the data contained in a database table into a set of triples, so I'm writing an OWL file using the Jena Java library. I have done this successfully with a small number of table records (100), which corresponds to nearly 20,000 lines in the .owl file, and I'm happy with the result.

To write the OWL file, I used the following code (m is an OntModel object):

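 // Requires java.io.BufferedWriter, java.io.FileWriter, java.io.IOException.
 // try-with-resources closes the writer even if m.write() throws.
 try (BufferedWriter out = new BufferedWriter(new FileWriter(FILENAME))) {
    m.write(out);
 } catch (IOException e) {
    // Report the failure instead of silently swallowing it.
    e.printStackTrace();
 }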

Unfortunately, when I try to do the same with the entire result set of the table (800,000 records), the Eclipse console shows this exception:

Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded

The exception is raised by:

m.write(out);

I'm absolutely sure the model is filled correctly, because when I ran the program without creating the OWL file, everything worked fine. To fix the error, I tried increasing the heap size by setting -Xmx4096M in Run -> Configuration -> VM arguments, but the error still appears.

I'm running the application on a MacBook, so I don't have unlimited memory. Is there any chance of completing the task? Is there perhaps a more efficient way to store the model?


Answer 1:


The default format is RDF/XML in a pretty form, but computing the "pretty" layout requires quite a lot of work before writing starts, including building up internal data structures. Some shapes of data cause quite extensive work searching for the "most pretty" variation.

RDF/XML in pretty form is the most expensive format. Even the pretty Turtle form is cheaper though it still involves some preparation calculations.

To write RDF/XML in a simpler format, with no complex pretty-printing features:

RDFDataMgr.write(System.out, m, RDFFormat.RDFXML_PLAIN);

Output streams are preferred, and the output will be UTF-8. By contrast, "new BufferedWriter(new FileWriter(FILENAME))" uses the platform default character set.
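The character-set point can be illustrated with the standard library alone. The following is a minimal sketch, not part of the original answer: the class name CharsetDemo and the helper writeUtf8 are made up for illustration. It pins the encoding to UTF-8 by wrapping an output stream, instead of relying on whatever default FileWriter picks up.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical demo class; the names here are illustrative only.
class CharsetDemo {
    // Writes text through a Writer whose encoding is fixed to UTF-8,
    // independent of the platform default that FileWriter would use.
    static void writeUtf8(Path file, String text) throws IOException {
        try (OutputStream os = new BufferedOutputStream(Files.newOutputStream(file));
             Writer w = new OutputStreamWriter(os, StandardCharsets.UTF_8)) {
            w.write(text);
        }
    }

    public static void main(String[] args) throws IOException {
        // Shows which charset a plain FileWriter would pick on this machine.
        System.out.println("Platform default charset: " + Charset.defaultCharset());

        Path tmp = Files.createTempFile("charset-demo", ".txt");
        writeUtf8(tmp, "caf\u00e9"); // 'é' takes two bytes in UTF-8
        System.out.println("Bytes written: " + Files.size(tmp));
        Files.delete(tmp);
    }
}
```

On Java 18 and later the default charset is already UTF-8 (JEP 400), so the difference mostly matters on older runtimes or when the default has been overridden.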

See the documentation for other formats and variations:

https://jena.apache.org/documentation/io/rdf-output.html

such as RDFFormat.TURTLE_BLOCKS.
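Applied to the file from the question, the advice above might look like the following sketch. It assumes Apache Jena 3 or later on the classpath (org.apache.jena packages). The class name PlainWriteSketch, the file names, and the example.org URIs are placeholders, and the one-triple model only stands in for the real OntModel. Whether a plain format is enough to avoid the OutOfMemoryError for 800,000 records depends on the data and the heap size, so treat this as a starting point.

```java
import java.io.BufferedOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Paths;

import org.apache.jena.rdf.model.Model;
import org.apache.jena.rdf.model.ModelFactory;
import org.apache.jena.riot.RDFDataMgr;
import org.apache.jena.riot.RDFFormat;

// Hypothetical sketch class; names and paths are illustrative only.
class PlainWriteSketch {
    // Writes the model to a file through an OutputStream in the given format.
    // An OntModel is also a Model, so it can be passed in directly.
    static void write(Model m, String filename, RDFFormat format) throws IOException {
        try (OutputStream out = new BufferedOutputStream(
                Files.newOutputStream(Paths.get(filename)))) {
            RDFDataMgr.write(out, m, format);
        }
    }

    public static void main(String[] args) throws IOException {
        // Tiny stand-in model; the question's OntModel would be used instead.
        Model m = ModelFactory.createDefaultModel();
        m.createResource("http://example.org/record/1")
         .addProperty(m.createProperty("http://example.org/ns#label"), "first record");

        write(m, "out.owl", RDFFormat.RDFXML_PLAIN);  // RDF/XML without pretty-printing
        write(m, "out.ttl", RDFFormat.TURTLE_BLOCKS); // Turtle written in blocks
    }
}
```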



Source: https://stackoverflow.com/questions/47719028/cant-write-large-owl-file-with-jena
