What is the TREC format?

Submitted by 馋奶兔 on 2019-11-30 02:48:45

Question


I'm looking for the specification of the TREC format. I've been googling a lot, but I haven't found a clue.

Does anyone know where to find any information about it?


Answer 1:


AFAIK, TREC stands for NIST's Text REtrieval Conference. For the indexer to know where the document boundaries fall within a file, each document must be wrapped in begin-document and end-document tags. These tags resemble HTML or XML tags, and together they are what is referred to as the TREC document format.

TrecParser: This parser recognizes text in the TEXT, HL, HEAD, HEADLINE, TTL, and LP fields.

Source: TREC Wikipedia

Source: Lemur Guide
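
To make the begin/end tag idea concrete, here is a minimal sketch of reading such a file in Python. It assumes the commonly seen layout in which each document sits between <DOC> and </DOC> and carries its identifier in <DOCNO>; those two tag names and the sample data are assumptions for illustration and are not stated in the answer above. The field list is the one quoted for TrecParser. This is not Lemur's own parser, only a regex-based approximation.

```python
import re

# Hypothetical sample data; the DOC and DOCNO tag names are an assumed convention.
SAMPLE = """<DOC>
<DOCNO> doc-001 </DOCNO>
<TEXT>
First document body.
</TEXT>
</DOC>
<DOC>
<DOCNO> doc-002 </DOCNO>
<HEADLINE>Second headline</HEADLINE>
<TEXT>
Second document body.
</TEXT>
</DOC>
"""

DOC_RE = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)
DOCNO_RE = re.compile(r"<DOCNO>(.*?)</DOCNO>", re.DOTALL)

# Fields listed for TrecParser in the answer above.
FIELDS = ("TEXT", "HL", "HEAD", "HEADLINE", "TTL", "LP")


def parse_trec(data):
    """Return a list of (docno, text) tuples, one per <DOC> block."""
    docs = []
    for doc_match in DOC_RE.finditer(data):
        body = doc_match.group(1)
        docno_match = DOCNO_RE.search(body)
        docno = docno_match.group(1).strip() if docno_match else None
        parts = []
        # Collect text field by field, in the order given in FIELDS.
        for field in FIELDS:
            pattern = r"<%s>(.*?)</%s>" % (field, field)
            for field_match in re.finditer(pattern, body, re.DOTALL):
                parts.append(field_match.group(1).strip())
        docs.append((docno, " ".join(parts)))
    return docs


if __name__ == "__main__":
    for docno, text in parse_trec(SAMPLE):
        print(docno, "|", text)
```

Real TREC collections are often not well-formed XML (unescaped ampersands, unclosed tags), which is why the sketch uses regular expressions rather than an XML parser.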




Answer 2:


Found: http://sourceforge.net/apps/trac/lemur/wiki/Indexer%20File%20Formats




Answer 3:


It is also a new recording file format for TechSmith Camtasia: https://feedback.techsmith.com/techsmith/topics/mac_upgrade-ri5ox




Answer 4:


It is also the file format used by IBM Watson for knowledge ingestion.



Source: https://stackoverflow.com/questions/10480022/what-is-the-trec-format
