nlp

baidu接口使用

感情迁移 submitted on 2019-12-12 12:30:10
1. How to use the Baidu API (https://ai.baidu.com/tech/nlp/dnnlm_cn): in the application list, choose "Create Application"; this generates a newly created application together with an AK (API Key) and SK (Secret Key). 2. How to call the API: Source: CSDN Author: liulina603 Link: https://blog.csdn.net/liulina603/article/details/103486580
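The excerpt breaks off before the actual call. The general pattern for Baidu AI services is to exchange the AK/SK for an access token and then POST to the service endpoint. A minimal Python sketch, assuming the standard OAuth token endpoint and the dnnlm_cn path (both should be verified against the official documentation):

    import requests

    AK = "your-api-key"      # from the newly created application
    SK = "your-secret-key"

    # Step 1: exchange AK/SK for an access token.
    token_resp = requests.post(
        "https://aip.baidubce.com/oauth/2.0/token",
        params={"grant_type": "client_credentials",
                "client_id": AK, "client_secret": SK},
    )
    access_token = token_resp.json()["access_token"]

    # Step 2: call the DNN language model endpoint (path assumed, check the docs).
    resp = requests.post(
        "https://aip.baidubce.com/rpc/2.0/nlp/v2/dnnlm_cn",
        params={"access_token": access_token},
        json={"text": "百度是一家高科技公司"},
    )
    print(resp.json())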

Mixing words and PoS tags in NLTK parser grammars

天大地大妈咪最大 submitted on 2019-12-12 12:22:25
Question: I've been playing with NLTK for a while and am at the point of defining a custom parser grammar for special chunking. I am following the description in http://nltk.googlecode.com/svn/trunk/doc/book/ch07.html, but what I want to do is slightly different from what is described in that chapter. For instance, in example 7.10, instead of using the following for the verb phrase: VP: {<VB.*><NP|PP|CLAUSE>+$} I would like to match only sentences that use one particular verb, not any verb.
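The excerpt stops here without an answer. One possible workaround (a sketch of my own, not necessarily the accepted answer): RegexpParser patterns only see tags, so the specific verb can be given a custom tag before chunking and the VP rule restricted to that tag.

    import nltk

    # Cascaded grammar adapted from NLTK book example 7.10; the VP rule is
    # restricted to a custom VBTARGET tag instead of any verb (<VB.*>).
    grammar = r"""
      NP: {<DT|JJ|NN.*>+}
      PP: {<IN><NP>}
      VP: {<VBTARGET><NP|PP|CLAUSE>+$}
      CLAUSE: {<NP><VP>}
    """
    cp = nltk.RegexpParser(grammar, loop=2)

    sentence = [("Mary", "NNP"), ("saw", "VBD"), ("the", "DT"), ("cat", "NN"),
                ("on", "IN"), ("the", "DT"), ("mat", "NN")]

    target_verb = "saw"  # only this verb may start a VP
    retagged = [(w, "VBTARGET") if w == target_verb else (w, t)
                for (w, t) in sentence]
    print(cp.parse(retagged))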

Python convert list of multiple words to single words

夙愿已清 submitted on 2019-12-12 11:17:56
Question: I have a list of words, for example: words = ['one','two','three four','five','six seven'] # quote was missing And I am trying to create a new list where each item in the list is just one word, so I would have: words = ['one','two','three','four','five','six','seven'] Would the best thing to do be to join the entire list into a string and then tokenize the string? Something like this: word_string = ' '.join(words) tokenize_list = nltk.tokenize(word_string) Or is there a better option? Answer 1: You can
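For completeness, a plain-Python way to get the flattened list (a small sketch, independent of whatever the truncated answer goes on to say): split each item on whitespace and flatten the result.

    words = ['one', 'two', 'three four', 'five', 'six seven']

    # Split each item on whitespace and flatten into a single list of words.
    flattened = [w for item in words for w in item.split()]
    print(flattened)  # ['one', 'two', 'three', 'four', 'five', 'six', 'seven']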

Tensorflow : ValueError: Shape must be rank 2 but is rank 3

邮差的信 submitted on 2019-12-12 10:53:41
Question: I'm new to tensorflow and I'm trying to update some code for a bidirectional LSTM from an old version of tensorflow to the newest (1.0), but I get this error: Shape must be rank 2 but is rank 3 for 'MatMul_3' (op: 'MatMul') with input shapes: [100,?,400], [400,2]. The error happens on pred_mod. _weights = { # Hidden layer weights => 2*n_hidden because of forward + backward cells 'w_emb' : tf.Variable(0.2 * tf.random_uniform([max_features,FLAGS.embedding_dim], minval=-1.0, maxval=1.0, dtype=tf
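The excerpt cuts off before pred_mod itself. As a rough illustration of what the error message implies (a TF 1.x sketch with made-up variable names and the shapes from the message, not the asker's code): tf.matmul only accepts rank-2 tensors, so the [batch, time, 2*n_hidden] RNN output has to be reduced to rank 2 first, for example by taking the last time step.

    import tensorflow as tf

    # Shapes taken from the error message: outputs is [100, ?, 400],
    # the output weight matrix is [400, 2].
    outputs = tf.placeholder(tf.float32, [100, None, 400])
    w_out = tf.Variable(tf.random_uniform([400, 2], minval=-1.0, maxval=1.0))
    b_out = tf.Variable(tf.zeros([2]))

    # Reduce the rank-3 output to rank 2 before the matmul,
    # e.g. by keeping only the last time step.
    last_step = outputs[:, -1, :]                     # shape [100, 400]
    pred_mod = tf.matmul(last_step, w_out) + b_out    # shape [100, 2]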

NLTK data out of date - Python 3.4

孤街醉人 submitted on 2019-12-12 10:37:38
Question: I'm trying to install NLTK for Python 3.4. The actual NLTK module appears to have installed fine. I then ran import nltk nltk.download() and chose to download everything. However, after it was done, the window simply says 'out of date'. I tried refreshing and downloading again, yet it stays 'out of date', as shown here: NLTK Window 1 (screenshot). I looked online and tried various fixes, but I haven't found any that helped my case yet. I also tried to manually find the missing parts, which turned out to be 'Open
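One thing worth noting (not a fix for the 'out of date' label itself, just a way to sidestep the GUI downloader): individual packages can be fetched non-interactively, either from the shell or from Python.

    # From the shell:
    #   python -m nltk.downloader punkt

    # Or from Python:
    import nltk
    nltk.download("punkt")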

TypeError: sparse matrix length is ambiguous; use getnnz() or shape[0] while using RF classifier?

穿精又带淫゛_ submitted on 2019-12-12 10:36:25
Question: I am learning about random forests in scikit-learn and as an example I would like to use a random forest classifier for text classification, with my own dataset. So first I vectorized the text with tfidf, and then, for classification: from sklearn.ensemble import RandomForestClassifier classifier=RandomForestClassifier(n_estimators=10) classifier.fit(X_train, y_train) prediction = classifier.predict(X_test) When I run the classification I got this: TypeError: A sparse matrix was passed, but dense data
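The error excerpt is cut off, but the usual cause is that older scikit-learn random forests do not accept the sparse matrices that TfidfVectorizer produces. A minimal, self-contained sketch of the common workaround (densify before fitting; assumes the matrices fit in memory):

    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.ensemble import RandomForestClassifier

    docs = ["good movie", "bad movie", "great film", "terrible film"]
    labels = [1, 0, 1, 0]

    vectorizer = TfidfVectorizer()
    X = vectorizer.fit_transform(docs)   # scipy sparse matrix

    clf = RandomForestClassifier(n_estimators=10)
    clf.fit(X.toarray(), labels)         # .toarray() densifies the sparse matrix
    print(clf.predict(vectorizer.transform(["good film"]).toarray()))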

Is there an easy way generate a probable list of words from an unspaced sentence in python?

本小妞迷上赌 submitted on 2019-12-12 10:33:24
Question: I have some text: s="Imageclassificationmethodscan beroughlydividedinto two broad families of approaches:" I'd like to parse this into its individual words. I quickly looked into enchant and nltk, but didn't see anything that looked immediately useful. If I had time to invest in this, I'd look into writing a dynamic program with enchant's ability to check if a word is English or not. I would have thought there'd be something to do this online; am I wrong? Answer 1: Greedy approach using trie
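The answer itself is cut off after its title. Purely as an illustration of the greedy idea (a toy sketch with a hand-made vocabulary, not the answer's actual trie code): repeatedly take the longest dictionary word that matches at the current position.

    # Toy vocabulary for illustration only; a real solution would use a large word list.
    vocab = {"image", "classification", "methods", "can", "be", "roughly",
             "divided", "into", "two", "broad", "families", "of", "approaches"}
    max_len = max(len(w) for w in vocab)

    def segment(text):
        words, i = [], 0
        while i < len(text):
            # Try the longest slice first (greedy longest match).
            for j in range(min(len(text), i + max_len), i, -1):
                if text[i:j].lower() in vocab:
                    words.append(text[i:j])
                    i = j
                    break
            else:
                # No dictionary word found: emit one character and move on.
                words.append(text[i])
                i += 1
        return words

    s = "Imageclassificationmethodscanberoughlydividedintotwobroadfamiliesofapproaches"
    print(segment(s))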

Phrase level dependency parser using java, nlp

℡╲_俬逩灬. submitted on 2019-12-12 10:22:51
Question: Can someone please elaborate on how to obtain phrase-level dependencies using Stanford's Natural Language Processing lexical parser (open-source Java code)? http://svn.apache.org/repos/asf/nutch/branches/branch-1.2/src/plugin/lib-http/src/java/org/apache/nutch/protocol/http/api/RobotRulesParser.java http://docs.mongodb.org/manual/reference/sql-comparison/ Phrase dependencies such as:
The accident ---------> happened
falling ---------> as
the night ----------> falling
and many more...

Algorithm to match natural text in mail

十年热恋 submitted on 2019-12-12 10:06:04
Question: I need to separate natural, coherent text/sentences in emails from lists, signatures, greetings and so on before further processing. Example:

Hi tom,
last monday we did bla bla, lore Lorem ipsum dolor sit amet, consectetur adipisici elit, sed eiusmod tempor incidunt ut labore et dolore magna aliqua.
list item 2
list item 3
list item 3
Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquid x ea commodi consequat. Quis aute iure reprehenderit in voluptate velit
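The excerpt stops mid-example and contains no answer. Purely as a naive baseline to make the task concrete (my own illustration, not a method from the source), each line can be scored with simple surface heuristics such as length and final punctuation:

    import re

    def looks_like_prose(line, min_words=6):
        # Very naive heuristic: long-ish lines ending in sentence punctuation
        # are treated as natural text; short lines, greetings and list-style
        # lines are not. A real solution would need something far stronger.
        line = line.strip()
        if len(line.split()) < min_words:
            return False
        return bool(re.search(r"[.!?]$", line))

    email_lines = [
        "Hi tom,",
        "last monday we did bla bla, Lorem ipsum dolor sit amet, sed eiusmod tempor.",
        "list item 2",
        "list item 3",
        "Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris.",
    ]
    for line in email_lines:
        print(looks_like_prose(line), "|", line)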

How to extract elements from NLP Tree?

删除回忆录丶 submitted on 2019-12-12 09:57:31
Question: I am using the NLP package to parse sentences. How can I extract an element from the Tree output that is created? For example, I'd like to grab the noun phrases (NP) from the example below:

library(NLP)
library(openNLP)
s <- c(
  "Really, I like chocolate because it is good.",
  "Robots are rather evil and most are devoid of decency"
)
s <- as.String(s)
sent_token_annotator <- Maxent_Sent_Token_Annotator()
word_token_annotator <- Maxent_Word_Token_Annotator()
a2 <- annotate(s, list(sent_token
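The R code is cut off before the parse itself. For comparison only, here is the same idea in Python/NLTK rather than R/openNLP (so not the solution the question asks for): once a parse tree object exists, NP subtrees can be filtered out directly.

    from nltk import Tree

    # A small hand-written parse tree standing in for real parser output.
    t = Tree.fromstring("(S (NP (PRP I)) (VP (VBP like) (NP (NN chocolate))))")

    # Collect the leaves of every subtree labelled NP.
    nps = [" ".join(st.leaves())
           for st in t.subtrees(lambda st: st.label() == "NP")]
    print(nps)  # ['I', 'chocolate']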