spacy

How to import text from CoNLL format with named entities into spaCy, infer entities with my model and write them to the same dataset (with Python)?

依然范特西╮ submitted on 2020-12-06 16:20:27
Question: I have a dataset in CoNLL NER format, which is basically a TSV file with two fields. The first field contains tokens from some text, one token per line (each punctuation symbol is also considered a token), and the second field contains named entity tags for the tokens in BIO format. I would like to load this dataset into spaCy, infer new named entity tags for the text with my model, and write these tags into the same TSV file as a new third column. All I know is that I can infer named
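
A minimal sketch of the round trip described above, assuming sentences in the file are separated by blank lines (common in CoNLL files, but not stated in the question) and that `nlp` is an already loaded pipeline with an NER component. The helper names `read_conll`, `predict_with_spacy`, and `add_predictions` are hypothetical. Building a `Doc` from the existing tokens keeps spaCy from re-tokenizing, so the predicted tags should line up one-to-one with the rows of the file.

```python
def read_conll(lines):
    """Yield sentences as lists of (token, tag) pairs.

    Assumption: blank lines separate sentences.
    """
    sentence = []
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            if sentence:
                yield sentence
                sentence = []
            continue
        token, tag = line.split("\t")[:2]
        sentence.append((token, tag))
    if sentence:
        yield sentence


def predict_with_spacy(nlp, tokens):
    """Return one BIO tag per token, keeping the file's own tokenization."""
    # Imported lazily so the I/O helpers can be used without spaCy installed.
    from spacy.tokens import Doc

    doc = Doc(nlp.vocab, words=tokens)
    # Run every pipeline component on the pre-tokenized Doc.
    for _, component in nlp.pipeline:
        doc = component(doc)
    return [
        "O" if t.ent_iob_ in ("", "O") else f"{t.ent_iob_}-{t.ent_type_}"
        for t in doc
    ]


def add_predictions(lines, predict):
    """Return output lines with the predicted tag as a third column."""
    out = []
    for sentence in read_conll(lines):
        tokens = [tok for tok, _ in sentence]
        predicted = predict(tokens)
        for (tok, gold), pred in zip(sentence, predicted):
            out.append(f"{tok}\t{gold}\t{pred}")
        out.append("")  # blank line closes the sentence
    return out
```

With a loaded pipeline, this could be called as `add_predictions(open("data.tsv", encoding="utf-8"), lambda toks: predict_with_spacy(nlp, toks))`, where `data.tsv` is a placeholder file name, and the returned lines joined with newlines and written back out.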

How to add a Spacy model to a requirements.txt file?

别来无恙 submitted on 2020-12-05 10:24:03
Question: I have an app that uses the spaCy model "en_core_web_sm". I have tested the app on my local machine and it works fine. However, when I deploy it to Heroku, it gives me this error: "Can't find model 'en_core_web_sm'. It doesn't seem to be a shortcut link, a Python package or a valid path to a data directory." My requirements file contains spacy==2.2.4. I have been doing some research on this error and found that the model needs to be downloaded separately using this command: python -m spacy
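
One commonly used fix is to list the model's release archive as a direct URL in requirements.txt, next to the spacy pin, so that pip installs it during deployment. The sketch below builds such a line. The function name `model_requirement_line` is hypothetical, the URL pattern follows the explosion/spacy-models GitHub releases, and the model version passed in is an assumption that has to be checked against the spaCy version in use.

```python
def model_requirement_line(name, version):
    """Build a requirements.txt line that points pip at a model archive.

    Assumption: the archive lives under the explosion/spacy-models
    GitHub releases and is named <name>-<version>.tar.gz.
    """
    base = "https://github.com/explosion/spacy-models/releases/download"
    return f"{base}/{name}-{version}/{name}-{version}.tar.gz"


# Example: the model version here is a guess, not taken from the question.
requirements = [
    "spacy==2.2.4",
    model_requirement_line("en_core_web_sm", "2.2.5"),
]
print("\n".join(requirements))
```

Pasting the printed lines into requirements.txt should let the deployment install the model as an ordinary package, without a separate download step.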

Python: Chunking others than noun phrases (e.g. prepositional) using Spacy, etc

最后都变了- submitted on 2020-12-01 07:24:22
Question: Since I was told spaCy was such a powerful Python module for natural speech processing, I am now desperately looking for a way to group words together into more than noun phrases, most importantly prepositional phrases. I doubt there is a spaCy function for this, but that would be the easiest way, I guess (the spaCy import is already implemented in my project). Nevertheless, I'm open to any possibility of phrase recognition/chunking. Answer 1: Here's a solution to get PPs. In general you can get
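
The answer's own code is cut off here, so the following is only a sketch of one common approach, not necessarily the answerer's: treat every token tagged as an adposition (`pos_ == "ADP"`) as the head of a prepositional phrase and join the tokens of its dependency subtree. It assumes `doc` has been produced by a pipeline with a tagger and a parser; the function name `get_pps` is hypothetical.

```python
def get_pps(doc):
    """Collect prepositional phrases from a parsed document.

    Each adposition is taken as the head of a phrase, and the phrase
    text is the adposition's dependency subtree joined by spaces.
    """
    pps = []
    for token in doc:
        if token.pos_ == "ADP":
            pps.append(" ".join(tok.text for tok in token.subtree))
    return pps
```

The same loop can be tried with other part-of-speech tags or with a check on `token.dep_` to pull out other kinds of subtrees.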

spaCy's regex is different to Python's regex

与世无争的帅哥 submitted on 2020-11-29 10:28:02
Question: I have the following text text = 'Monday to Friday 12 midnight to 5am 30% . Midnight Friday to 6am Saturday 30% . 9pm Saturday to Midnight Saturday 25% . Midnight Saturday to 6am Sunday 100% . 6am Sunday to 9pm Sunday 50%' When I used normal regex, I obtained the following import re regex = '\d{1}[a|p]m' re.findall(regex, text) # Returned: ['5am', '6am', '9pm', '6am', '6am', '9pm'] However, when I used the same regex in spaCy, I got nothing back from spacy.matcher import Matcher nlp = spacy
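
A likely cause, stated here as an assumption because the question's Matcher pattern is cut off: regular expressions inside a `Matcher` pattern are applied to one token at a time, and the tokenizer may split a string such as "5am" into separate tokens, so a pattern that expects the whole string never matches. One workaround is to run the regex over `doc.text` and map the character offsets back to spans with `doc.char_span`. The sketch below also writes the character class as `[ap]`, since `[a|p]` additionally matches a literal `|`. The names `TIME_PATTERN`, `find_times`, and `regex_spans` are hypothetical.

```python
import re

# One digit followed by "am" or "pm".
TIME_PATTERN = re.compile(r"\d[ap]m")


def find_times(text):
    """Plain-Python baseline: return all matching substrings."""
    return TIME_PATTERN.findall(text)


def regex_spans(doc, pattern=TIME_PATTERN):
    """Map regex matches on the raw text back to spaCy spans.

    doc.char_span returns None when a match does not line up with
    token boundaries, so those matches are skipped.
    """
    spans = []
    for match in pattern.finditer(doc.text):
        span = doc.char_span(match.start(), match.end())
        if span is not None:
            spans.append(span)
    return spans
```

Given a `doc` created with `nlp(text)`, `regex_spans(doc)` returns `Span` objects that can be used like any other spaCy span.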

Spacy Update msvc not found

别来无恙 submitted on 2020-11-29 10:25:46
Question: I'm trying to update spaCy from version 2.0.18 to version 2.1.1. But every time I try to run the command pip install spacy-nightly or pip install -U spacy==2.1.1 I just get error: [WinError 2] System cannot find file specified msvc py_compiler msvc with a lot of unreadable output. Now I figured it has something to do with the C++ compiler spaCy uses, and I installed like every package I found at the Microsoft Visual Website, but that didn't solve my problem. I really would appreciate some help!

spaCy's regex is different to Python's regex

我的梦境 submitted on 2020-11-29 10:25:09
Question: I have the following text text = 'Monday to Friday 12 midnight to 5am 30% . Midnight Friday to 6am Saturday 30% . 9pm Saturday to Midnight Saturday 25% . Midnight Saturday to 6am Sunday 100% . 6am Sunday to 9pm Sunday 50%' When I used normal regex, I obtained the following import re regex = '\d{1}[a|p]m' re.findall(regex, text) # Returned: ['5am', '6am', '9pm', '6am', '6am', '9pm'] However, when I used the same regex in spaCy, I got nothing back from spacy.matcher import Matcher nlp = spacy