Question
I'm looking for the specification of the TREC format. I've googled a lot but haven't found anything.
Does anyone know where to find information about it?
Answer 1:
AFAIK TREC is an abbreviation for NIST's Text REtrieval Conference. So that the indexer knows where document boundaries fall within a file, each document must be wrapped in begin-document and end-document tags. These tags resemble HTML or XML tags, and this tagged layout is what is meant by the TREC document format.
TrecParser: This parser recognizes text in the TEXT, HL, HEAD, HEADLINE, TTL, and LP fields.
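To make the tag layout concrete, here is a minimal Python sketch of splitting such a file into documents. It assumes the `<DOC>` / `<DOCNO>` wrapper tags described in the Lemur guide; the sample contents and the helper name `parse_trec` are made up for illustration, and this is not how Lemur's own TrecParser is implemented.

```python
import re

# Made-up sample in the tagged layout: each document sits between
# <DOC> and </DOC> and carries an identifier in <DOCNO>.
SAMPLE = """\
<DOC>
<DOCNO> doc-001 </DOCNO>
<HEADLINE> First headline </HEADLINE>
<TEXT>
Body of the first document.
</TEXT>
</DOC>
<DOC>
<DOCNO> doc-002 </DOCNO>
<TEXT>
Body of the second document.
</TEXT>
</DOC>
"""

DOC_RE = re.compile(r"<DOC>(.*?)</DOC>", re.DOTALL)
DOCNO_RE = re.compile(r"<DOCNO>(.*?)</DOCNO>", re.DOTALL)
# The text-bearing fields listed above for TrecParser, matched in document order.
FIELD_RE = re.compile(r"<(TEXT|HL|HEAD|HEADLINE|TTL|LP)>(.*?)</\1>", re.DOTALL)


def parse_trec(raw):
    """Return a list of (docno, text) pairs (hypothetical helper)."""
    docs = []
    for body in DOC_RE.findall(raw):
        match = DOCNO_RE.search(body)
        docno = match.group(1).strip() if match else None
        # Collapse whitespace inside each field, then join the fields.
        parts = [" ".join(m.group(2).split()) for m in FIELD_RE.finditer(body)]
        docs.append((docno, " ".join(parts)))
    return docs


if __name__ == "__main__":
    for docno, text in parse_trec(SAMPLE):
        print(docno, "|", text)
```

Regular expressions are enough for a sketch like this, but real TREC collections are often not well-formed XML (unescaped ampersands, stray tags), so a strict XML parser may reject them.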
Source: TREC Wikipedia
Source: Lemur Guide
Answer 2:
Found: http://sourceforge.net/apps/trac/lemur/wiki/Indexer%20File%20Formats
Answer 3:
It is also a new recording file format for TechSmith Camtasia. https://feedback.techsmith.com/techsmith/topics/mac_upgrade-ri5ox
Answer 4:
It is also the file format used by IBM Watson for knowledge ingestion.
Source: https://stackoverflow.com/questions/10480022/what-is-the-trec-format