textblob

tf-idf documents of different length

隐身守侯 submitted on 2020-01-22 19:48:27
Question: I have searched the web for ways to normalize tf scores when document lengths differ widely (for example, from 500 to 2,500 words). The only normalization I have found divides the term frequency by the length of the document, which makes the document's length meaningless. This is a really bad way to normalize tf, though. If anything, it gives the tf scores for each document a very large bias (unless
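To make the concern concrete, here is a minimal sketch that compares three ways of scoring term frequency for two documents of 500 and 2,500 words. The function name `tf_variants` and the toy documents are assumptions made for illustration, and log scaling is shown only as one commonly used alternative, not as the method the asker settled on.

```python
import math
from collections import Counter


def tf_variants(tokens):
    """Return raw, length-normalized, and log-scaled tf for every term."""
    counts = Counter(tokens)
    doc_length = len(tokens)
    return {
        term: {
            "raw": count,
            # Dividing by document length removes any effect of length.
            "length_normalized": count / doc_length,
            # Log scaling dampens large counts instead of dividing them away.
            "log_scaled": 1.0 + math.log(count),
        }
        for term, count in counts.items()
    }


# Toy documents (hypothetical): same proportion of "cat", different lengths.
short_doc = ["cat"] * 5 + ["dog"] * 495      # 500 words
long_doc = ["cat"] * 25 + ["dog"] * 2475     # 2,500 words

for name, doc in (("short", short_doc), ("long", long_doc)):
    print(name, tf_variants(doc)["cat"])
```

Under length normalization both documents give "cat" the same score, while the raw and log-scaled variants still favor the longer document to different degrees.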

Apply textblob in for each row of a dataframe

我的梦境 submitted on 2019-12-30 05:08:08
Question: I have a data frame with a column that contains text. I want to apply TextBlob and calculate a sentiment value for each row.
    text             sentiment
    this is great
    great movie
    great story
When I execute the code below:
    df['sentiment'] = list(map(lambda tweet: TextBlob(tweet), df['text']))
I get the error:
    TypeError: The `text` argument passed to `__init__(text)` must be a string, not <class 'float'>
How do you apply TextBlob to each row of a column in a data frame to get the sentiment value? Answer 1: You can use .apply
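The `float` in the error message most likely comes from missing values in the column, which pandas stores as NaN. The sketch below shows one way to guard against that; `sentiment_or_none` and `toy_analyzer` are hypothetical names, and the toy analyzer only stands in for a real TextBlob call so that the example runs without third-party packages.

```python
def sentiment_or_none(value, analyzer):
    """Score real strings with `analyzer`; return None for NaN or other non-strings."""
    if not isinstance(value, str):
        return None
    return analyzer(value)


def toy_analyzer(text):
    """Stand-in scorer (hypothetical), used here instead of TextBlob."""
    return 1.0 if "great" in text else 0.0


# A column with one missing value, as pandas would represent it.
rows = ["this is great", float("nan"), "great movie"]
scores = [sentiment_or_none(row, toy_analyzer) for row in rows]
print(scores)  # [1.0, None, 1.0]
```

With pandas and TextBlob installed, the same guard could be applied per row as `df['sentiment'] = df['text'].apply(lambda t: sentiment_or_none(t, lambda s: TextBlob(s).sentiment.polarity))`.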

Trouble installing TextBlob with pip

拥有回忆 submitted on 2019-12-27 16:02:14
Question: I'm having a bit of difficulty installing TextBlob from the command line on Windows 10 using pip. According to their docs, you need to run two commands in succession:
    pip install -U textblob
    python -m textblob.download_corpora
Upon trying the first command, I get an error I have never seen before when trying to install a package:
    C:\Users\phys>pip install -U textblob
    Traceback (most recent call last):
      File "c:\program files (x86)\python37-32\lib\runpy.py", line 193, in _run_module_as_main

Trouble installing TextBlob for Python

萝らか妹 submitted on 2019-12-24 00:47:01
Question: I am new to programming, and I'm trying to install the TextBlob library for Python to help me do some stuff. Sadly, I'm having trouble installing TextBlob, let alone using it. I am using Windows, which seems to make things more difficult. I wish I could just run the Linux commands or whatever they are that everybody uses. Anyway, here is what I have done so far: Forked the TextBlob program from here. Copied the entire repository to my desktop, and opened the folder up. Using Command Prompt, ran

Error when using python textblob library tagger

房东的猫 submitted on 2019-12-23 06:20:36
Question: I had the textblob library working fine for a while, but decided to install (using easy_install) an additional library (page here) claiming faster and more accurate tagging. I couldn't get it working so I uninstalled it, but it seems to have messed with the tagging function in TextBlob. I've uninstalled and reinstalled both nltk and TextBlob numerous times with both pip and easy_install, and made sure they're up to date. Here is an example of a simple script which generates the error: from

Converting POS tags from TextBlob into Wordnet compatible inputs

守給你的承諾、 submitted on 2019-12-23 06:07:13
Question: I'm using Python and nltk + TextBlob for some text analysis. It's interesting that you can add a POS for wordnet to make your search for synonyms more specific, but unfortunately the tagging in both nltk and TextBlob isn't "compatible" with the kind of input that wordnet expects for its synset class. Example: Wordnet.synsets() requires that the POS you give it is one of n,v,a,r, like so: wn.synsets("dog", POS="n,v,a,r") But a standard POS tagging from upenn_treebank looks like JJ, VBD, VBZ,
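One common way to bridge the two tag sets is to map each Penn Treebank tag to a WordNet part-of-speech letter by its prefix. The sketch below assumes that approach; the function name `treebank_to_wordnet_pos` is hypothetical, and tags with no WordNet counterpart (determiners, prepositions, and so on) map to `None`.

```python
def treebank_to_wordnet_pos(tag):
    """Map a Penn Treebank tag to a WordNet POS letter ('a', 'v', 'n', 'r'), or None."""
    if tag.startswith("JJ"):   # JJ, JJR, JJS: adjectives
        return "a"
    if tag.startswith("VB"):   # VB, VBD, VBG, VBN, VBP, VBZ: verbs
        return "v"
    if tag.startswith("NN"):   # NN, NNS, NNP, NNPS: nouns
        return "n"
    if tag.startswith("RB"):   # RB, RBR, RBS: adverbs
        return "r"
    return None                # no WordNet equivalent for this tag


for tag in ("JJ", "VBD", "VBZ", "NNS", "RB", "DT"):
    print(tag, "->", treebank_to_wordnet_pos(tag))
```

The returned letter can then be passed as the `pos` argument of NLTK's `wn.synsets(word, pos=...)`, skipping the filter when the function returns `None`.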

Converting POS tags from TextBlob into Wordnet compatible inputs

我是研究僧i submitted on 2019-12-23 06:07:04
Question: I'm using Python and nltk + TextBlob for some text analysis. It's interesting that you can add a POS for wordnet to make your search for synonyms more specific, but unfortunately the tagging in both nltk and TextBlob isn't "compatible" with the kind of input that wordnet expects for its synset class. Example: Wordnet.synsets() requires that the POS you give it is one of n,v,a,r, like so: wn.synsets("dog", POS="n,v,a,r") But a standard POS tagging from upenn_treebank looks like JJ, VBD, VBZ,

Create wordforms using python

拜拜、爱过 submitted on 2019-12-23 05:33:05
Question: How can I get different word forms using Python? I want to create a list like the following:
    Work=['Work','Working','Works']
My code:
    raw = nltk.clean_html(html)
    cleaned = re.sub(r'& ?(ld|rd)quo ?[;\]]', '\"', raw)
    tokens = nltk.wordpunct_tokenize(cleaned)
    stemmer = PorterStemmer()
    t = [stemmer.stem(t) if t in Words else t for t in tokens]
    text = nltk.Text(t)
    word = words(n)
    Words = [stemmer.stem(e) for e in word]
    find = ' '.join(str(e) for e in Words)
    search_words = set(find.split(' '))

TextBlob installation in windows

旧街凉风 submitted on 2019-12-22 10:09:10
Question: I followed the instructions in "Trouble installing TextBlob for Python" to install TextBlob on Windows 7. It got installed, but when I open Python IDLE and type import TextBlob, it says No module named TextBlob. How can I solve this problem? Or can I place the libraries associated with the package directly in the Python Lib folder and try to import it in the program? If that is advisable, please tell me the procedure. Will it work? Any help will be highly appreciated. Answer 1: Try
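A likely cause of this error is letter case: the package is installed under the lowercase module name `textblob`, while `TextBlob` is the class inside it, so the usual import is `from textblob import TextBlob`. The sketch below checks which module names Python can actually find, using only the standard library; the helper name `is_importable` is hypothetical.

```python
import importlib.util


def is_importable(module_name):
    """Return True if Python can locate a top-level module with this exact name."""
    return importlib.util.find_spec(module_name) is not None


# Module names are case-sensitive, so these two checks can give different answers.
for name in ("textblob", "TextBlob"):
    print(name, "->", "found" if is_importable(name) else "not found")
```

If `textblob` is reported as found, `from textblob import TextBlob` should work in the same interpreter; if neither name is found, the package was probably installed for a different Python installation than the one IDLE uses.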

python textblob and text classification

☆樱花仙子☆ submitted on 2019-12-21 02:59:07
Question: I'm trying to build a text classification model with Python and TextBlob. The script is running on my server, and the idea is that in the future users will be able to submit their text and have it classified. I'm loading the training set from a CSV file:
    # -*- coding: utf-8 -*-
    import sys
    import codecs
    sys.stdout = open('yyyyyyyyy.txt',"w");
    from nltk.tokenize import word_tokenize
    from textblob.classifiers import NaiveBayesClassifier
    with open('file.csv', 'r', encoding='latin-1') as fp:
        cl =