What are the supported Date and Time formats in spaCy 2.0?

Submitted by 安稳与你 on 2021-01-01 06:57:25

Question


I am using the following models in my application:

en_core_web_sm

xx_ent_wiki_sm

I would like to know which Date and Time formats the default spaCy models can extract.

Python version used: 3.6. spaCy version used: 2.0.x


Answer 1:


The English models were trained on the OntoNotes 5 corpus, which uses a more extensive label scheme that includes DATE and TIME.
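As a rough illustration of what this means in practice, the sketch below filters a processed document's entities down to the DATE and TIME labels. The helper name `date_time_entities` is hypothetical and not part of spaCy; it only relies on the `doc.ents`, `ent.text` and `ent.label_` attributes. The usage lines at the bottom assume spaCy 2.x and the en_core_web_sm model are installed, and the entities you get back will depend on the model and the text.

```python
# Labels from the OntoNotes 5 scheme that the English models can predict.
DATE_TIME_LABELS = {"DATE", "TIME"}


def date_time_entities(doc):
    """Return (text, label) pairs for entities labelled DATE or TIME.

    `doc` is expected to be a spaCy Doc, or any object exposing `.ents`
    whose items have `.text` and `.label_` attributes.
    """
    return [(ent.text, ent.label_) for ent in doc.ents
            if ent.label_ in DATE_TIME_LABELS]


# Usage, assuming spaCy 2.x and the en_core_web_sm model are installed:
#   import spacy
#   nlp = spacy.load("en_core_web_sm")
#   doc = nlp("Let's meet on 12 June 2018 at 3pm.")
#   print(date_time_entities(doc))
```

Running the same helper on a document processed with xx_ent_wiki_sm would be expected to return an empty list, since that model's label scheme has no DATE or TIME labels.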

The xx_ent_wiki_sm model was trained on a Wikipedia corpus with a more limited label scheme and only recognises PER, LOC, ORG and MISC out of the box (model details here).

When using the models to extract mentions of date and time, it's important to keep in mind that it's a statistical process – so the results you see will depend on the context and the data the models were trained on. Depending on the texts you're working with, you likely want to update and fine-tune the pre-trained models with more examples specific to your application, or try a rule-based approach instead. Also see this thread for more details on date and time parsing.
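To make the rule-based alternative a little more concrete, here is a minimal sketch in plain Python using only the standard library. It is not spaCy's API, and the two patterns (ISO-style `YYYY-MM-DD` dates and `HH:MM` times) are assumptions picked for illustration; real data will usually need more patterns. Unlike a statistical model, it only ever matches the formats you list explicitly, which is the trade-off described above.

```python
import re
from datetime import datetime

# Hypothetical patterns chosen for illustration; extend them for your own data.
DATE_RE = re.compile(r"\b\d{4}-\d{2}-\d{2}\b")
TIME_RE = re.compile(r"\b\d{1,2}:\d{2}\b")


def _is_valid(text, fmt):
    """Check that a regex match is a real date/time, not just digits."""
    try:
        datetime.strptime(text, fmt)
        return True
    except ValueError:
        return False


def find_dates_and_times(text):
    """Return (start, end, label, matched_text) tuples sorted by position."""
    spans = []
    for m in DATE_RE.finditer(text):
        if _is_valid(m.group(), "%Y-%m-%d"):
            spans.append((m.start(), m.end(), "DATE", m.group()))
    for m in TIME_RE.finditer(text):
        if _is_valid(m.group(), "%H:%M"):
            spans.append((m.start(), m.end(), "TIME", m.group()))
    return sorted(spans)


if __name__ == "__main__":
    for span in find_dates_and_times("Posted 2021-01-01 at 06:57, not 2021-13-40."):
        print(span)
```

The validation step with `datetime.strptime` rejects strings such as `2021-13-40` that look like dates but are not. In a spaCy pipeline, rules like these could be combined with the statistical entities, for example by adding spans for matches the model missed.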



Source: https://stackoverflow.com/questions/50792574/what-are-the-supported-date-and-time-formats-in-spacy-2-0
