How to improve xdmp:document-filter() performance in MarkLogic?

Submitted by Deadly on 2019-12-11 05:51:47

Question


I am using xdmp:document-filter(doc()) to extract metadata from documents (doc, docx, pdf, etc.). We use it because it works for all document formats and generates XHTML for each of them. The major drawback is that it slows the query down: with one or two documents in the database the query runs fine, but with more (e.g. 10 or 15) it becomes slow. We want to extract and show the metadata of all the documents in the database.

We are using this query:

for $d in fn:doc()
return xdmp:document-filter(doc(fn:base-uri($d)))

Is there any way to make this query run faster, or is there an alternative to xdmp:document-filter()?


Answer 1:


xdmp:document-filter() is typically used at ETL time, that is, when the content is loaded. If you use Information Studio to load your content, you can add a 'Filter documents' transform. You can choose between storing the extracted metadata as separate XHTML documents or as document properties. That way the metadata doesn't need to be calculated on the fly at each request.
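
For content that is already loaded, the same idea can be applied by hand with a one-off batch query. The following is only a minimal sketch, not the Information Studio transform itself: it assumes the filter output carries the metadata in XHTML meta elements, that storing those elements as document properties is acceptable, and that the database is small enough to process in a single transaction.

```xquery
xquery version "1.0-ml";

declare namespace xh = "http://www.w3.org/1999/xhtml";

(: One-off batch job: filter each document once and cache the extracted
   <meta> elements as document properties.
   Assumption: the filter output exposes metadata as xh:meta elements. :)
for $d in fn:doc()
let $uri := xdmp:node-uri($d)
(: Skip documents that already carry cached metadata. :)
where fn:empty(xdmp:document-get-properties($uri, xs:QName("xh:meta")))
return
  xdmp:document-add-properties($uri, xdmp:document-filter($d)//xh:meta)
```

Request-time code can then read the cached values with xdmp:document-get-properties($uri, xs:QName("xh:meta")) instead of calling the filter again. For a large database the loop would probably need to be split into batches rather than run as one update.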

HTH!



Source: https://stackoverflow.com/questions/11845977/how-to-improve-xdmpdocument-filter-performance-in-marklogic
