spacy

ImportError: No module named 'spacy.en'

久未见 posted on 2019-12-17 09:47:10
Question: I'm working on a codebase that uses spaCy. I installed spaCy using:

    sudo pip3 install spacy

and then:

    sudo python3 -m spacy download en

At the end of this last command, I got a message:

    Linking successful
    /home/rayabhik/.local/lib/python3.5/site-packages/en_core_web_sm --> /home/rayabhik/.local/lib/python3.5/site-packages/spacy/data/en
    You can now load the model via spacy.load('en')

Now, when I try running my code, on the line:

    from spacy.en import English

it gives me the following error:
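
A quick way to see which import paths an installed spaCy actually provides is to probe for them with the standard library. The sketch below is only a diagnostic aid, not part of the original question: `module_available` and `spacy_layout` are hypothetical helper names, and the assumption that language classes live under `spacy.lang` applies to spaCy 2.x and later, where the old `spacy.en` module no longer exists.

```python
import importlib.util


def module_available(name):
    """Return True if the module `name` can be located.

    Parent packages are imported as a side effect; `name` itself is not.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except ImportError:
        # A parent package (for example `spacy` itself) is not installed.
        return False


def spacy_layout():
    """Report which spaCy import paths exist in the current environment."""
    names = ("spacy", "spacy.en", "spacy.lang.en")
    return {name: module_available(name) for name in names}
```

If `spacy_layout()` reports `spacy.en` as missing and `spacy.lang.en` as present, the failing line would likely need to become `from spacy.lang.en import English`, or the model can be loaded with `spacy.load('en')` as the installer message suggests.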

Dataflow job failed after more than 6 hours with “The worker lost contact with the service”?

余生颓废 posted on 2019-12-14 04:02:23
Question: I am using Dataflow to read data from BigQuery and then do NLP preprocessing using Python. I am using Python 3 and SDK 2.16.0. I am using 100 workers (private IP, private access and Cloud NAT) with workers in europe-west6 and the endpoint in europe-west1. The BigQuery tables are in US. Test jobs were working without any issue, but when trying to process the full table (32 GB), the job failed after 6 h 40 min, and it is hard to fully understand what the underlying error is. First the following is

Train spaCy's existing POS tagger with my own training examples

烂漫一生 posted on 2019-12-14 03:44:25
Question: I am trying to train the existing POS tagger on my own lexicon, not starting from scratch (I do not want to create an "empty model"). spaCy's documentation says "Load the model you want to start with", and the next step is "Add the tag map to the tagger using add_label method". However, when I try to load the English small model and add the tag map, it throws this error:

    ValueError: [T003] Resizing pre-trained Tagger models is not currently supported.

I was wondering how it can be

Running out of RAM when writing to a file line by line [Python]

风流意气都作罢 posted on 2019-12-13 05:49:36
Question: I have a data processing task on some large data. I run the script on EC2 using Python that looks something like the following:

    with open(LARGE_FILE, 'r') as f:
        with open(OUTPUT_FILE, 'w') as out:
            for line in f:
                results = some_computation(line)
                out.write(json.dumps(results))
                out.write('\n')

I loop over the data line by line and write the results to another file line by line. After running it for a few hours, I can't log in to the server. I would have to restart the instance to continue.

    $ ssh
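
The excerpt stops before any cause is identified, so the following is only a diagnostic sketch. Reading and writing line by line should not by itself accumulate memory, so one thing worth checking is whether memory grows inside the per-line work. The sketch instruments the same loop with the standard-library `tracemalloc` module; `some_computation` here is a hypothetical stand-in for the question's function, and `process` and `report_every` are names introduced for illustration.

```python
import json
import tracemalloc


def some_computation(line):
    # Hypothetical stand-in for the question's some_computation().
    return {"length": len(line), "words": line.split()}


def process(in_path, out_path, report_every=1000):
    """Stream in_path to out_path line by line.

    Returns the peak Python-level memory (in bytes) traced during the run.
    """
    tracemalloc.start()
    with open(in_path, "r") as f, open(out_path, "w") as out:
        for i, line in enumerate(f, 1):
            out.write(json.dumps(some_computation(line)))
            out.write("\n")
            if i % report_every == 0:
                # Periodic snapshot: steady growth here points at retained objects.
                current, peak = tracemalloc.get_traced_memory()
                print(f"line {i}: current={current} B, peak={peak} B")
    _, peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return peak
```

If the reported `current` value stays roughly flat, the loop is probably not the source of the memory growth, and memory allocated outside Python's allocator (for example by native extensions), which `tracemalloc` does not see, would be the next thing to look at.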

Not able to import en_core_web_sm or Spacy

[亡魂溺海] posted on 2019-12-13 03:47:12
Question: I am trying to import en_core_web_sm as an independent package, and I also tried importing it through spaCy. In both cases I get an error in the ujson module:

    ModuleNotFoundError: No module named 'srsly.ujson.ujson'

I installed en_core_web_sm with the following command:

    python -m spacy download en_core_web_sm

According to the spaCy documentation this should work, but it does not. I want to import en_core_web_sm.

Answer 1: srsly is listed as a dependency, so it should be installed. If it's not installed, just install it

spaCy: errors attempting to load serialized Doc

…衆ロ難τιáo~ posted on 2019-12-13 03:44:13
Question: I am trying to serialize/deserialize spaCy documents (setup is Windows 7, Anaconda) and am getting errors. I haven't been able to find any explanations. Here is a snippet of code and the error it generates:

    import spacy
    nlp = spacy.load('en')
    text = 'This is a test.'
    doc = nlp(text)
    fout = 'test.spacy'  # <-- according to the API for Doc.to_disk(), this needs to be a directory (but for me, spaCy writes a file)
    doc.to_disk(fout)
    doc.from_disk(fout)

    Traceback (most recent call last):
      File "

Better way to use SpaCy to parse sentences?

依然范特西╮ posted on 2019-12-13 03:31:24
Question: I'm using spaCy to find sentences that contain 'is' or 'was' that have pronouns as their subjects and return the object of the sentence. My code works, but I feel like there must be a much better way to do this.

    import spacy
    nlp = spacy.load('en_core_web_sm')
    ex_phrase = nlp("He was a genius. I really liked working with him. He is a dog owner. She is very kind to animals.")
    #create an empty list to hold any instance of this particular construction
    list_of_responses = []
    #split into sentences

Spacy-nightly (spacy 2.0) issue with “thinc.extra.MaxViolation has wrong size”

…衆ロ難τιáo~ posted on 2019-12-12 18:02:12
Question: After an apparently successful installation of spacy-nightly (spacy-nightly-2.0.0a14) and the English model (en_core_web_sm), I still received an error message when trying to run it:

    import spacy
    nlp = spacy.load('en_core_web_sm')

    ValueError: thinc.extra.search.MaxViolation has the wrong size, try recompiling. Expected 104, got 128

I tried reinstalling spaCy and the model as well, and it did not help. I tried it again within a new venv (Python 3.6).

Answer 1: The issue is probably with the thinc package, spacy-nightly

Rasa NLU parse does not give the correct intent, gives the same intent in every result

假如想象 posted on 2019-12-12 08:16:13
Question: Rasa version: 0.10.5. spaCy version: 1.9.0. Installed models: en, en_core_web_sm. I created training data from a Dialogflow data export and training finished successfully, but when I send a request with some text, it returns the wrong intent. It always returns the same intent, and the intent_ranking is also the same every time. Please let me know how I can get proper intent results as well as entity results.

Answer 1: The general recommendation when Rasa NLU seems to be functioning correctly, but

spaCy English model install is failing

|▌冷眼眸甩不掉的悲伤 posted on 2019-12-12 05:59:58
Question: Windows 10, Python 2.7 (32-bit), VC++ 32-bit, console run as admin. I am failing to install the English model as instructed here. I also tried German, and tried to download and link it manually. Something is wrong with the spacy link command. Does anyone know about this issue?

    Traceback (most recent call last):
      File "c:\python27\lib\runpy.py", line 174, in _run_module_as_main
        "__main__", fname, loader, pkg_name)
      File "c:\python27\lib\runpy.py", line 72, in _run_code
        exec code in run_globals
      File "c:\python27\lib\site