spacy

How does the spaCy lemmatizer work?

主宰稳场 submitted on 2019-11-29 13:35:54
Question: For lemmatization, spaCy has lists of words (adjectives, adverbs, verbs...) and also lists of exceptions (adverbs_irreg...). For the regular ones there is a set of rules. Let's take the word "wider" as an example. As it is an adjective, the rule for lemmatization should be taken from this list:

    ADJECTIVE_RULES = [
        ["er", ""],
        ["est", ""],
        ["er", "e"],
        ["est", "e"]
    ]

As I understand it, the process will be like this:

1) Get the POS tag of the word to know whether it is a noun, a verb...
2) If the word is
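
The rule-driven step described in this question can be sketched in plain Python. This is only an illustration of suffix rewriting checked against a word list, not spaCy's actual implementation. ADJECTIVE_RULES is copied from the question; ADJECTIVE_INDEX, ADJECTIVE_EXCEPTIONS and lemmatize_adjective are hypothetical names with made-up contents.

```python
# Illustrative sketch only; this is NOT spaCy's lemmatizer code.

# Suffix rewrite rules, copied from the question: [old_suffix, new_suffix].
ADJECTIVE_RULES = [["er", ""], ["est", ""], ["er", "e"], ["est", "e"]]

# Hypothetical stand-ins for the word list and the irregular-forms table.
ADJECTIVE_INDEX = {"wide", "fast", "nice"}
ADJECTIVE_EXCEPTIONS = {"better": ["good"], "worse": ["bad"]}


def lemmatize_adjective(word,
                        index=ADJECTIVE_INDEX,
                        exceptions=ADJECTIVE_EXCEPTIONS,
                        rules=ADJECTIVE_RULES):
    """Return candidate lemmas for an adjective (assumed POS)."""
    word = word.lower()
    # Irregular forms are looked up directly and bypass the rules.
    if word in exceptions:
        return list(exceptions[word])
    candidates = []
    for old, new in rules:
        if word.endswith(old):
            form = word[: len(word) - len(old)] + new
            # Keep a rewritten form only if it is a known word.
            if form in index and form not in candidates:
                candidates.append(form)
    # Fall back to the word itself when no rule yields a known word.
    return candidates or [word]


print(lemmatize_adjective("wider"))
```

For "wider", the rule ["er", ""] produces "wid", which is not in the word list and is discarded, while ["er", "e"] produces "wide", which is kept.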

Noun phrases with spacy

与世无争的帅哥 submitted on 2019-11-28 18:45:55
How can I extract noun phrases from text using spaCy? I am not referring to part-of-speech tags. In the documentation I cannot find anything about noun phrases or regular parse trees.

Answer by syllogism_: If you want base NPs, i.e. NPs without coordination, prepositional phrases or relative clauses, you can use the noun_chunks iterator on the Doc and Span objects:

    >>> from spacy.en import English
    >>> nlp = English()
    >>> doc = nlp(u'The cat and the dog sleep in the basket near the door.')
    >>> for np in doc.noun_chunks:
    ...     np.text
    u'The cat'
    u'the dog'
    u'the basket'
    u'the door'

If you need something else

spacy adding special case tokenization rules by regular expression or pattern

冷暖自知 submitted on 2019-11-28 09:36:28
Question: I want to add special cases for tokenization in spaCy according to the documentation. The documentation shows how specific words can be treated as special cases. I want to be able to specify a pattern (e.g. a suffix). For example, I have a string like this:

    text = "A sample string with <word-1> and <word-2>"

where <word-i> specifies a single word. I know I can handle one special case at a time with the following code. But how can I specify a pattern for that?

    import spacy
    from spacy
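
To make the goal of this question concrete, here is a spaCy-independent sketch that uses only Python's re module. It shows what treating a pattern as a special case means in practice: every <...> placeholder stays a single token. It does not use spaCy's tokenizer API. NAIVE_TOKEN, PATTERN_AWARE_TOKEN and tokenize are hypothetical names, and the regular expressions are assumptions about what counts as a placeholder.

```python
import re

# Naive tokenization: runs of word characters, or single punctuation marks.
NAIVE_TOKEN = re.compile(r"\w+|[^\w\s]")

# Pattern-aware tokenization: try the placeholder pattern first, so that
# "<word-1>" is matched as a whole before the generic alternatives apply.
PATTERN_AWARE_TOKEN = re.compile(r"<[^<>\s]+>|\w+|[^\w\s]")


def tokenize(text, token_re):
    """Split text into tokens using the given compiled pattern."""
    return token_re.findall(text)


text = "A sample string with <word-1> and <word-2>"

print(tokenize(text, NAIVE_TOKEN))          # placeholders are split apart
print(tokenize(text, PATTERN_AWARE_TOKEN))  # placeholders stay whole
```

The ordering of the alternatives matters: because the placeholder pattern comes first, it wins wherever a "<" starts a well-formed placeholder.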

I get CERTIFICATE_VERIFY_FAILED when I try to install the spaCy English language model

假装没事ソ submitted on 2019-11-28 09:19:32
I'm running OS X El Capitan on Python 3.5.2 via Anaconda and have spaCy 0.101.0. I'm trying to install the spaCy English language model using python -m spacy.en.download. However, when I do that, I get an error that says urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)>. The complete traceback is as follows:

    Traceback (most recent call last):
      File "/Users/bsherman/anaconda/lib/python3.5/urllib/request.py", line 1254, in do_open
        h.request(req.get_method(), req.selector, req.data, headers)
      File "/Users/bsherman/anaconda/lib/python3.5

Failed building wheel for spacy

孤街浪徒 submitted on 2019-11-28 08:20:57
I'm trying to install spaCy by running pip install spacy for Python 3.6.1, but I keep getting errors like the one below. How can I get rid of this issue? Previously I was getting a "cl.exe not found" error; after that I added the Visual Studio path where cl.exe exists to the environment variables.

    Failed building wheel for spacy
    Running setup.py clean for spacy
    Running setup.py bdist_wheel for murmurhash ... error
    Complete output from command c:\users\sh00428701\appdata\local\programs\python\python36\python.exe -u -c "import setuptools, tokenize;__file__='C:\\Users\\SH0042~1\\AppData\\Local\\Temp\\pip

Spacy annotation tool entities indices

本秂侑毒 submitted on 2019-11-28 02:14:48
How can I read my annotated data in spaCy?

1) The form of my annotated data:

    "annotation": [ [ 79, 99, "Nom complet" ],

2) The form of the annotated data in the script:

    "annotation": [ { "label": [ "Companies worked at" ], "points": [ { "start": 1749, "end": 1754, "text": "Oracle" } ] },

3) How can I change this code so that it can read my annotated data?

    for line in lines:
        data = json.loads(line)
        text = data['text']
        entities = []
        for annotation in data['annotation']:
            # only a single point in text annotation.
            point = annotation['points'][0]
            labels = annotation['label']
            # handle both list of labels or a single
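
One possible way to read the first annotation form is sketched below. It rests on assumptions: each input line is a JSON object with a 'text' key (as in the code above) and an 'annotation' list of [start, end, label] triples, and the offsets are passed through unchanged, so whether 'end' is inclusive or exclusive in the real data still has to be checked. read_span_annotations is a hypothetical name and the sample record is invented for illustration. The output uses the (text, {"entities": [...]}) tuple shape found in spaCy v2 training examples.

```python
import json


def read_span_annotations(lines):
    """Convert JSON lines with [start, end, label] triples into
    (text, {"entities": [(start, end, label), ...]}) tuples."""
    training_data = []
    for line in lines:
        data = json.loads(line)
        text = data["text"]
        entities = []
        for start, end, label in data["annotation"]:
            # Offsets are copied as-is; adjust here if 'end' is inclusive.
            entities.append((start, end, label))
        training_data.append((text, {"entities": entities}))
    return training_data


# Invented sample record, for illustration only.
sample_line = json.dumps({
    "text": "Name: Jean Dupont",
    "annotation": [[6, 17, "Nom complet"]],
})

result = read_span_annotations([sample_line])
print(result)
```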

How to get the dependency tree with spaCy?

你说的曾经没有我的故事 submitted on 2019-11-27 17:27:10
I have been trying to find out how to get the dependency tree with spaCy, but I can't find anything on how to get the tree, only on how to navigate the tree.

Answer by Christos Baziotis: In case someone wants to easily view the dependency tree produced by spaCy, one solution would be to convert it to an nltk.tree.Tree and use the nltk.tree.Tree.pretty_print method. Here is an example:

    import spacy
    from nltk import Tree

    en_nlp = spacy.load('en')
    doc = en_nlp("The quick brown fox jumps over the lazy dog.")

    def to_nltk_tree(node):
        if node.n_lefts + node.n_rights > 0:
            return Tree(node.orth_, [to_nltk_tree(child)
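
The recursive idea behind this kind of conversion (a node with children becomes a subtree, a node without children becomes a leaf) can be illustrated without spaCy or NLTK. The sketch below is not the answer's code. The head indices are written by hand for illustration and are not parser output, the root is marked by pointing at itself, and build_children, to_nested and render are hypothetical names.

```python
# Hand-written example: words and the index of each word's head.
# These heads are an assumption for illustration, not real parser output.
words = ["The", "quick", "brown", "fox", "jumps",
         "over", "the", "lazy", "dog", "."]
heads = [3, 3, 3, 4, 4, 4, 8, 8, 5, 4]  # the root (index 4) points to itself


def build_children(heads):
    """Return (root_index, {head_index: [child indices in word order]})."""
    children = {i: [] for i in range(len(heads))}
    root = None
    for i, head in enumerate(heads):
        if head == i:
            root = i
        else:
            children[head].append(i)
    return root, children


def to_nested(i, children):
    """Leaf -> word; inner node -> (word, [subtrees])."""
    if not children[i]:
        return words[i]
    return (words[i], [to_nested(c, children) for c in children[i]])


def render(i, children, depth=0):
    """Return the subtree under i as indented text lines."""
    lines = ["  " * depth + words[i]]
    for c in children[i]:
        lines.extend(render(c, children, depth + 1))
    return lines


root, children = build_children(heads)
nested = to_nested(root, children)
print("\n".join(render(root, children)))
```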

ImportError: No module named 'spacy.en'

允我心安 submitted on 2019-11-27 14:18:02
I'm working on a codebase that uses spaCy. I installed spaCy using:

    sudo pip3 install spacy

and then

    sudo python3 -m spacy download en

At the end of this last command, I got a message:

    Linking successful
    /home/rayabhik/.local/lib/python3.5/site-packages/en_core_web_sm --> /home/rayabhik/.local/lib/python3.5/site-packages/spacy/data/en
    You can now load the model via spacy.load('en')

Now, when I try running my code, the line:

    from spacy.en import English

gives me the following error:

    ImportError: No module named 'spacy.en'

I've looked on Stack Exchange and the closest is: Import error with

Spacy annotation tool entities indices

懵懂的女人 submitted on 2019-11-27 04:52:19
Question: How can I read my annotated data in spaCy?

1) The form of my annotated data:

    "annotation": [ [ 79, 99, "Nom complet" ],

2) The form of the annotated data in the script:

    "annotation": [ { "label": [ "Companies worked at" ], "points": [ { "start": 1749, "end": 1754, "text": "Oracle" } ] },

3) How can I change this code so that it can read my annotated data?

    for line in lines:
        data = json.loads(line)
        text = data['text']
        entities = []
        for annotation in data['annotation']:
            # only a single point in text annotation.

I get CERTIFICATE_VERIFY_FAILED when I try to install the spaCy English language model

帅比萌擦擦* submitted on 2019-11-27 02:49:32
Question: I'm running OS X El Capitan on Python 3.5.2 via Anaconda and have spaCy 0.101.0. I'm trying to install the spaCy English language model using python -m spacy.en.download. However, when I do that, I get an error that says urllib.error.URLError: <urlopen error [SSL: CERTIFICATE_VERIFY_FAILED] certificate verify failed (_ssl.c:645)>. The complete traceback is as follows:

    Traceback (most recent call last):
      File "/Users/bsherman/anaconda/lib/python3.5/urllib/request.py", line 1254, in do_open
        h