NLP

Generating dictionaries to categorize tweets into pre-defined categories using NLTK

给你一囗甜甜゛ submitted on 2020-06-24 12:21:19
Question: I have a list of Twitter users (screen_names) and I need to categorise them into 7 pre-defined categories - Education, Art, Sports, Business, Politics, Automobiles, Technology - based on their interest area. I have extracted the last 100 tweets of each user in Python and created a corpus per user after cleaning the tweets. As mentioned in Tweet classification into multiple categories on (Unsupervised data/tweets): I am trying to generate dictionaries of common words under each category so…
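One way to approach this is to seed a small keyword dictionary per category and assign each user's corpus to the category with the highest keyword overlap. A minimal pure-Python sketch of that idea; the seed keywords below are invented placeholders, not a real lexicon, and in practice they would be built from labelled tweets or an external word list:

```python
from collections import Counter

# Hypothetical seed dictionaries for three of the seven categories.
CATEGORY_KEYWORDS = {
    "Education": {"school", "university", "students", "learning"},
    "Sports":    {"match", "goal", "team", "league"},
    "Politics":  {"election", "policy", "parliament", "vote"},
}

def categorize(corpus: str) -> str:
    """Return the category whose keyword set overlaps most with the corpus."""
    words = Counter(corpus.lower().split())
    scores = {
        cat: sum(words[w] for w in keywords)
        for cat, keywords in CATEGORY_KEYWORDS.items()
    }
    return max(scores, key=scores.get)

print(categorize("the team scored a late goal to win the match"))  # Sports
```

This counts raw keyword hits; a real system would at least weight terms (e.g. TF-IDF) and fall back to an "unknown" label when all scores are zero.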

Parsing city of origin / destination city from a string

*爱你&永不变心* submitted on 2020-06-22 06:54:13
Question: I have a pandas DataFrame where one column is a set of strings with travel details. My goal is to parse each string to extract the city of origin and the destination city (I would ultimately like two new columns titled 'origin' and 'destination'). The data:

df_col = [
    'new york to venice, italy for usd271',
    'return flights from brussels to bangkok with etihad from €407',
    'from los angeles to guadalajara, mexico for usd191',
    'fly to australia new zealand from paris from €422…
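For the regular phrasings in this data, two patterns cover most cases: "from <origin> to <destination>" and a bare "<origin> to <destination>" at the start of the string. A hedged stdlib-only sketch; note the fourth example ('fly to australia new zealand from paris…') reverses the phrasing and would be mis-parsed by rules like these, which is where NER or more rules come in:

```python
import re

# Two hypothetical patterns: "... from <origin> to <destination> ..." and a
# bare "<origin> to <destination> ..." at the start of the string.
PATTERNS = [
    re.compile(r"\bfrom\s+(?P<origin>[a-z ]+?)\s+to\s+"
               r"(?P<destination>[a-z ,]+?)(?=\s+(?:for|with|from)\b|$)"),
    re.compile(r"^(?P<origin>[a-z ]+?)\s+to\s+"
               r"(?P<destination>[a-z ,]+?)(?=\s+(?:for|with|from)\b|$)"),
]

def parse_route(text: str):
    """Return (origin, destination) for the first pattern that matches."""
    for pattern in PATTERNS:
        m = pattern.search(text)
        if m:
            return m.group("origin"), m.group("destination")
    return None, None

print(parse_route("return flights from brussels to bangkok with etihad from €407"))
# ('brussels', 'bangkok')
```

The lookahead stops the destination before trailing price/airline clauses ("for usd271", "with etihad") without consuming them.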

Extract Noun Phrases with Stanza and CoreNLPClient

吃可爱长大的小学妹 submitted on 2020-06-17 13:29:27
Question: I am trying to extract noun phrases from sentences using Stanza (with Stanford CoreNLP). This can only be done with the CoreNLPClient module in Stanza.

# Import the client module
from stanza.server import CoreNLPClient

# Construct a CoreNLPClient with some basic annotators, a memory allocation of 4GB, and port number 9001
client = CoreNLPClient(annotators=['tokenize', 'ssplit', 'pos', 'lemma', 'ner', 'parse'],
                       memory='4G', endpoint='http://localhost:9001')

Here is an example of a sentence, and I am…
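The 'parse' annotator produces a constituency tree; once you have it as a bracketed parse string, NP subtrees can be collected with no extra dependencies. A minimal sketch of that step (the bracketed tree below is a hand-written example, not actual CoreNLP output):

```python
def parse_tree(s):
    """Parse a bracketed constituency parse into (label, children) tuples."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()

    def read(i):
        # tokens[i] is "("; the next token is the constituent label
        label = tokens[i + 1]
        i += 2
        children = []
        while tokens[i] != ")":
            if tokens[i] == "(":
                child, i = read(i)
                children.append(child)
            else:                      # a leaf word
                children.append(tokens[i])
                i += 1
        return (label, children), i + 1

    return read(0)[0]

def leaves(node):
    """Collect the words under a node, left to right."""
    _, children = node
    words = []
    for c in children:
        words.extend(leaves(c) if isinstance(c, tuple) else [c])
    return words

def noun_phrases(node):
    """Return the text of every NP subtree (including nested ones)."""
    label, children = node
    nps = [" ".join(leaves(node))] if label == "NP" else []
    for c in children:
        if isinstance(c, tuple):
            nps.extend(noun_phrases(c))
    return nps

tree = parse_tree("(ROOT (S (NP (DT The) (NN dog)) (VP (VBD chased) (NP (DT the) (NN cat)))))")
print(noun_phrases(tree))  # ['The dog', 'the cat']
```

With a running server, the same subtree search can also be done server-side via the client's Tregex support, using the pattern "NP".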

Easy way to clamp Neural Network outputs between 0 and 1?

喜你入骨 submitted on 2020-06-17 00:04:28
Question: I'm working on writing a GAN, and I want to set my network's output to 0 if it is less than 0, to 1 if it is greater than 1, and leave it unchanged otherwise. I'm fairly new to TensorFlow and don't know of a TensorFlow function or activation that does this without unwanted side effects. So I wrote my loss function to calculate the loss as if the output were clamped, with this code:

def discriminator_loss(real_output, fake_output):
    real_output_clipped = min(max(real_output…
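For reference, TensorFlow has tf.clip_by_value(x, 0.0, 1.0) for exactly this, though its gradient is zero outside the clipped range, which can stall GAN training; a sigmoid output activation is the usual smooth alternative when outputs must live in [0, 1]. The clamping operation itself, shown standalone:

```python
def clamp01(x: float) -> float:
    """Clamp x into [0, 1]; element-wise this is what tf.clip_by_value(x, 0.0, 1.0) does."""
    return max(0.0, min(1.0, x))

print([clamp01(v) for v in (-0.5, 0.3, 1.7)])  # [0.0, 0.3, 1.0]
```

Python's min/max on tensors (as in the question's code) won't work element-wise, which is why the framework op is needed inside a model.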

Using regex in spaCy: matching various (different cased) words

笑着哭i submitted on 2020-06-16 07:27:33
Question: (Edited after being flagged off-topic.) I want to use a regex in spaCy to find any combination of (Accrued or accrued or Annual or annual) leave with this code:

import spacy
from spacy.matcher import Matcher

nlp = spacy.load('en_core_web_sm')
matcher = Matcher(nlp.vocab)

# Add the pattern to the matcher
matcher.add('LEAVE', None, [{'TEXT': {"REGEX": "(Accrued|accrued|Annual|annual)"}}, {'LOWER': 'leave'}])

# Call the matcher on the doc
doc = nlp('Annual leave shall be paid at the time . An employee is to receive their annual…
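In the spaCy Matcher, casing is usually handled on the LOWER token attribute rather than a case-sensitive REGEX over TEXT, e.g. [{'LOWER': {'IN': ['accrued', 'annual']}}, {'LOWER': 'leave'}]. The same matches can be checked with a plain case-insensitive regex, shown here without spaCy so it runs standalone:

```python
import re

# Case-insensitive equivalent of the spaCy pattern: (accrued|annual) + "leave"
LEAVE = re.compile(r"\b(accrued|annual)\s+leave\b", re.IGNORECASE)

text = ("Annual leave shall be paid at the time. "
        "An employee is to receive their accrued leave on termination.")
print([m.group(0) for m in LEAVE.finditer(text)])
# ['Annual leave', 'accrued leave']
```

The token-attribute form is preferable inside spaCy because it matches per token and ignores whitespace/tokenization differences that a string regex has to handle itself.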

Incompatible shapes: [128,37] vs. [128,34]

心不动则不痛 submitted on 2020-06-16 06:38:07
Question: I have added an attention layer to an LSTM encoder-decoder model. The model.fit_generator call:

history = model.fit_generator(
    generator=generate_batch(X_train, y_train, batch_size=batch_size),
    steps_per_epoch=train_samples // batch_size,
    epochs=epochs,
    validation_data=generate_batch(X_test, y_test, batch_size=batch_size),
    validation_steps=val_samples // batch_size)

And this is the error I am getting:…
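A mismatch like [128, 37] vs. [128, 34] usually means the model's output timestep count and the target batch's padded length disagree, e.g. generate_batch pads each y batch only to that batch's own longest sequence while the model (or the attention layer) expects one fixed maximum length. Padding every batch to a single global length is the common fix; a minimal sketch of the idea (generate_batch and the 37/34 lengths come from the question, the helper below is hypothetical):

```python
def pad_batch(seqs, max_len, pad_value=0):
    """Pad (or truncate) every sequence in a batch to the same global max_len."""
    return [s[:max_len] + [pad_value] * max(0, max_len - len(s)) for s in seqs]

batch = [[5, 9, 2], [7, 1], [4, 4, 4, 8]]
print(pad_batch(batch, max_len=4))
# [[5, 9, 2, 0], [7, 1, 0, 0], [4, 4, 4, 8]]
```

With Keras, the equivalent is computing max_len over the whole corpus once and passing it to the padding step for both training and validation batches, so every batch the generator yields has the same second dimension.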