Export inception output to spacy's training input format

Submitted by こ雲淡風輕ζ on 2021-01-28 21:18:42

Question


I am using INCEpTION 0.11.0 (https://inception-project.github.io/) to annotate my training data, and I would like to use that data with spaCy in Python. INCEpTION offers several export formats, but I am not sure which one is best suited for spaCy.

I could not find any documentation on converting these exported files to spaCy's format.

I could write a new script to do this conversion, but before doing that I was wondering whether someone has already solved this and can give some advice. Which export format should I choose so that it is easiest to convert to spaCy's format?


Answer 1:


Exporting your data as CoNLL-U is likely the most straightforward approach. spaCy can convert CoNLL-U documents to its expected format using its converter script: python -m spacy convert /path/to/input/doc.conllu /path/to/output/doc.jsonl -c conllu.

You'll find that it supports the conversion of CoNLL documents, but it isn't immediately obvious which CoNLL variant is supported. You can find out by trying different values for the -c argument above.
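Since it is not obvious which CoNLL variant a given export matches, it can help to inspect the file before running the converter. The following is a minimal sketch, not part of spaCy or INCEpTION: it assumes the export follows the standard CoNLL-U layout (ten tab-separated columns, "#" comment lines, blank lines between sentences) and reports which columns actually carry annotations. The helper names read_conllu and filled_columns and the SAMPLE text are hypothetical and exist only for illustration.

```python
from collections import Counter

# Column names defined by the CoNLL-U specification (10 tab-separated fields).
CONLLU_FIELDS = ["id", "form", "lemma", "upos", "xpos",
                 "feats", "head", "deprel", "deps", "misc"]


def read_conllu(text):
    """Parse CoNLL-U text into a list of sentences, each a list of token dicts."""
    sentences, current = [], []
    for line in text.splitlines():
        if not line.strip():
            # A blank line ends the current sentence.
            if current:
                sentences.append(current)
                current = []
            continue
        if line.startswith("#"):
            continue  # sentence-level comment such as "# text = ..."
        columns = line.split("\t")
        if len(columns) != len(CONLLU_FIELDS):
            raise ValueError(f"expected 10 columns, got {len(columns)}: {line!r}")
        current.append(dict(zip(CONLLU_FIELDS, columns)))
    if current:
        sentences.append(current)
    return sentences


def filled_columns(sentences):
    """Count, per column, how many tokens carry a value other than '_'."""
    counts = Counter()
    for sentence in sentences:
        for token in sentence:
            for field, value in token.items():
                if value != "_":
                    counts[field] += 1
    return counts


# Hypothetical two-token sample for illustration, not real INCEpTION output.
SAMPLE = (
    "# text = Hello world\n"
    "1\tHello\thello\tINTJ\t_\t_\t2\tdiscourse\t_\t_\n"
    "2\tworld\tworld\tNOUN\t_\t_\t0\troot\t_\t_\n"
    "\n"
)

sentences = read_conllu(SAMPLE)
counts = filled_columns(sentences)
print(len(sentences), dict(counts))
```

To check a real export, read the file with open(path, encoding="utf-8").read() and pass the text to read_conllu. A ValueError about the column count suggests the file is in a different CoNLL variant than CoNLL-U, and a column with a count of zero was not populated by the annotation layers you exported.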



Source: https://stackoverflow.com/questions/57840677/export-inception-output-to-spacys-training-input-format
