nlp

baidu接口使用

感情迁移 submitted on 2019-12-12 12:30:10
1. How to use the Baidu API (https://ai.baidu.com/tech/nlp/dnnlm_cn): in the application list, choose "Create Application"; this generates a newly created application together with an AK (API Key) and SK (Secret Key). 2. How to call the API: Source: CSDN Author: liulina603 Link: https://blog.csdn.net/liulina603/article/details/103486580
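The excerpt breaks off before the actual call. The general pattern for Baidu AI services is to exchange the AK/SK for an access token and then POST to the service endpoint. A minimal Python sketch, assuming the standard OAuth token endpoint and the dnnlm_cn path (both should be verified against the official documentation):

    import requests

    AK = "your-api-key"      # from the newly created application
    SK = "your-secret-key"

    # Step 1: exchange AK/SK for an access token.
    token_resp = requests.post(
        "https://aip.baidubce.com/oauth/2.0/token",
        params={"grant_type": "client_credentials",
                "client_id": AK, "client_secret": SK},
    )
    access_token = token_resp.json()["access_token"]

    # Step 2: call the DNN language model endpoint (path assumed, check the docs).
    resp = requests.post(
        "https://aip.baidubce.com/rpc/2.0/nlp/v2/dnnlm_cn",
        params={"access_token": access_token},
        json={"text": "百度是一家高科技公司"},
    )
    print(resp.json())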

Mixing words and PoS tags in NLTK parser grammars

天大地大妈咪最大 submitted on 2019-12-12 12:22:25
Question: I've been playing with NLTK for a while and am at the point of defining a custom parser grammar for special chunking. I am following the description in http://nltk.googlecode.com/svn/trunk/doc/book/ch07.html, but what I want to do is slightly different from what is described in that chapter. For instance, in example 7.10, instead of using the following for the verb phrase: VP: {<VB.*><NP|PP|CLAUSE>+$} I would like to match only sentences that use one particular verb, not any verb.
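The excerpt stops here without an answer. One possible workaround (a sketch of my own, not necessarily the accepted answer): RegexpParser patterns only see tags, so the specific verb can be given a custom tag before chunking and the VP rule restricted to that tag.

    import nltk

    # Cascaded grammar adapted from NLTK book example 7.10; the VP rule is
    # restricted to a custom VBTARGET tag instead of any verb (<VB.*>).
    grammar = r"""
      NP: {<DT|JJ|NN.*>+}
      PP: {<IN><NP>}
      VP: {<VBTARGET><NP|PP|CLAUSE>+$}
      CLAUSE: {<NP><VP>}
    """
    cp = nltk.RegexpParser(grammar, loop=2)

    sentence = [("Mary", "NNP"), ("saw", "VBD"), ("the", "DT"), ("cat", "NN"),
                ("on", "IN"), ("the", "DT"), ("mat", "NN")]

    target_verb = "saw"  # only this verb may start a VP
    retagged = [(w, "VBTARGET") if w == target_verb else (w, t)
                for (w, t) in sentence]
    print(cp.parse(retagged))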

Python convert list of multiple words to single words

夙愿已清 submitted on 2019-12-12 11:17:56
Question: I have a list of words, for example: words = ['one','two','three four','five','six seven'] # quote was missing And I am trying to create a new list where each item in the list is just one word, so I would have: words = ['one','two','three','four','five','six','seven'] Would the best thing to do be to join the entire list into a string and then tokenize the string? Something like this: word_string = ' '.join(words) tokenize_list = nltk.tokenize(word_string) Or is there a better option? Answer 1: You can
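For completeness, a plain-Python way to get the flattened list (a small sketch, independent of whatever the truncated answer goes on to say): split each item on whitespace and flatten the result.

    words = ['one', 'two', 'three four', 'five', 'six seven']

    # Split each item on whitespace and flatten into a single list of words.
    flattened = [w for item in words for w in item.split()]
    print(flattened)  # ['one', 'two', 'three', 'four', 'five', 'six', 'seven']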

Tensorflow : ValueError: Shape must be rank 2 but is rank 3

邮差的信 submitted on 2019-12-12 10:53:41
Question: I'm new to tensorflow and I'm trying to update some code for a bidirectional LSTM from an old version of tensorflow to the newest (1.0), but I get this error: Shape must be rank 2 but is rank 3 for 'MatMul_3' (op: 'MatMul') with input shapes: [100,?,400], [400,2]. The error happens on pred_mod. _weights = { # Hidden layer weights => 2*n_hidden because of forward + backward cells 'w_emb' : tf.Variable(0.2 * tf.random_uniform([max_features,FLAGS.embedding_dim], minval=-1.0, maxval=1.0, dtype=tf
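The excerpt cuts off before pred_mod itself. As a rough illustration of what the error message implies (a TF 1.x sketch with made-up variable names and the shapes from the message, not the asker's code): tf.matmul only accepts rank-2 tensors, so the [batch, time, 2*n_hidden] RNN output has to be reduced to rank 2 first, for example by taking the last time step.

    import tensorflow as tf

    # Shapes taken from the error message: outputs is [100, ?, 400],
    # the output weight matrix is [400, 2].
    outputs = tf.placeholder(tf.float32, [100, None, 400])
    w_out = tf.Variable(tf.random_uniform([400, 2], minval=-1.0, maxval=1.0))
    b_out = tf.Variable(tf.zeros([2]))

    # Reduce the rank-3 output to rank 2 before the matmul,
    # e.g. by keeping only the last time step.
    last_step = outputs[:, -1, :]                     # shape [100, 400]
    pred_mod = tf.matmul(last_step, w_out) + b_out    # shape [100, 2]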

NLTK data out of date - Python 3.4

孤街醉人 submitted on 2019-12-12 10:37:38
Question: I'm trying to install NLTK for Python 3.4. The actual NLTK module appears to have installed fine. I then ran import nltk nltk.download() and chose to download everything. However, after it was done, the window simply says 'out of date'. I tried refreshing and downloading again, yet it stays 'out of date', as shown here: NLTK Window 1 (screenshot). I looked online and tried various fixes, but I haven't found any that helped my case yet. I also tried to manually find the missing parts, which turned out to be 'Open
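One thing worth noting (not a fix for the 'out of date' label itself, just a way to sidestep the GUI downloader): individual packages can be fetched non-interactively, either from the shell or from Python.

    # From the shell:
    #   python -m nltk.downloader punkt

    # Or from Python:
    import nltk
    nltk.download("punkt")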

TypeError: sparse matrix length is ambiguous; use getnnz() or shape[0] while using RF classifier?

穿精又带淫゛_ submitted on 2019-12-12 10:36:25
Question: I am learning about random forests in scikit-learn and as an example I would like to use a random forest classifier for text classification, with my own dataset. So first I vectorized the text with tfidf, and then, for classification: from sklearn.ensemble import RandomForestClassifier classifier=RandomForestClassifier(n_estimators=10) classifier.fit(X_train, y_train) prediction = classifier.predict(X_test) When I run the classification I got this: TypeError: A sparse matrix was passed, but dense data
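The error excerpt is cut off, but the usual cause is that older scikit-learn random forests do not accept the sparse matrices that TfidfVectorizer produces. A minimal, self-contained sketch of the common workaround (densify before fitting; assumes the matrices fit in memory):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.ensemble import RandomForestClassifier

    docs = ["good movie", "bad movie", "great film", "terrible film"]
    labels = [1, 0, 1, 0]

    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(docs)   # scipy sparse matrix

    clf = RandomForestClassifier(n_estimators=10)
    clf.fit(X.toarray(), labels)         # .toarray() densifies the sparse matrix
    print(clf.predict(vectorizer.transform(["good film"]).toarray()))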

Is there an easy way generate a probable list of words from an unspaced sentence in python?

本小妞迷上赌 submitted on 2019-12-12 10:33:24
Question: I have some text: s="Imageclassificationmethodscan beroughlydividedinto two broad families of approaches:" I'd like to parse this into its individual words. I quickly looked into enchant and nltk, but didn't see anything that looked immediately useful. If I had time to invest in this, I'd look into writing a dynamic program with enchant's ability to check if a word is English or not. I would have thought there'd be something to do this online; am I wrong? Answer 1: Greedy approach using trie
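The answer itself is cut off after its title. Purely as an illustration of the greedy idea (a toy sketch with a hand-made vocabulary, not the answer's actual trie code): repeatedly take the longest dictionary word that matches at the current position.

    # Toy vocabulary for illustration only; a real solution would use a large word list.
    vocab = {"image", "classification", "methods", "can", "be", "roughly",
             "divided", "into", "two", "broad", "families", "of", "approaches"}
    max_len = max(len(w) for w in vocab)

    def segment(text):
        words, i = [], 0
        while i < len(text):
            # Try the longest slice first (greedy longest match).
            for j in range(min(len(text), i + max_len), i, -1):
                if text[i:j].lower() in vocab:
                    words.append(text[i:j])
                    i = j
                    break
            else:
                # No dictionary word found: emit one character and move on.
                words.append(text[i])
                i += 1
        return words

    s = "Imageclassificationmethodscanberoughlydividedintotwobroadfamiliesofapproaches"
    print(segment(s))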

Phrase level dependency parser using java, nlp

℡╲_俬逩灬. submitted on 2019-12-12 10:22:51
Question: Can someone please elaborate on how to obtain phrase-level dependencies using Stanford's Natural Language Processing lexical parser (open-source Java code)? http://svn.apache.org/repos/asf/nutch/branches/branch-1.2/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/RobotRulesParser.java http://docs.mongodb.org/manual/reference/sql-comparison/ Phrase dependencies such as:
The accident ---------> happened
falling ---------> as
the night ----------> falling
and many more...

Algorithm to match natural text in mail

十年热恋 submitted on 2019-12-12 10:06:04
Question: I need to separate natural, coherent text/sentences in emails from lists, signatures, greetings and so on before further processing. Example:

Hi tom,
last monday we did bla bla, lore Lorem ipsum dolor sit amet, consectetur adipisici elit, sed eiusmod tempor incidunt ut labore et dolore magna aliqua.
list item 2
list item 3
list item 3
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquid x ea commodi consequat. Quis aute iure reprehenderit in voluptate velit
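The excerpt stops mid-example and contains no answer. Purely as a naive baseline to make the task concrete (my own illustration, not a method from the source), each line can be scored with simple surface heuristics such as length and final punctuation:

    import re

    def looks_like_prose(line, min_words=6):
        # Very naive heuristic: long-ish lines ending in sentence punctuation
        # are treated as natural text; short lines, greetings and list-style
        # lines are not. A real solution would need something far stronger.
        line = line.strip()
        if len(line.split()) < min_words:
            return False
        return bool(re.search(r"[.!?]$", line))

    email_lines = [
        "Hi tom,",
        "last monday we did bla bla, Lorem ipsum dolor sit amet, sed eiusmod tempor.",
        "list item 2",
        "list item 3",
        "Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris.",
    ]
    for line in email_lines:
        print(looks_like_prose(line), "|", line)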

How to extract elements from NLP Tree?

删除回忆录丶 submitted on 2019-12-12 09:57:31
Question: I am using the NLP package to parse sentences. How can I extract an element from the Tree output that is created? For example, I'd like to grab the noun phrases (NP) from the example below:

library(NLP)
library(openNLP)
s <- c(
  "Really, I like chocolate because it is good.",
  "Robots are rather evil and most are devoid of decency"
)
s <- as.String(s)
sent_token_annotator <- Maxent_Sent_Token_Annotator()
word_token_annotator <- Maxent_Word_Token_Annotator()
a2 <- annotate(s, list(sent_token
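The R code is cut off before the parse itself. For comparison only, here is the same idea in Python/NLTK rather than R/openNLP (so not the solution the question asks for): once a parse tree object exists, NP subtrees can be filtered out directly.

    from nltk import Tree

    # A small hand-written parse tree standing in for real parser output.
    t = Tree.fromstring("(S (NP (PRP I)) (VP (VBP like) (NP (NN chocolate))))")

    # Collect the leaves of every subtree labelled NP.
    nps = [" ".join(st.leaves())
           for st in t.subtrees(lambda st: st.label() == "NP")]
    print(nps)  # ['I', 'chocolate']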