NLP

Generating dictionaries to categorize tweets into pre-defined categories using NLTK

给你一囗甜甜゛ submitted on 2020-06-24 12:21:19
Question: I have a list of Twitter users (screen_names) and I need to categorise them into 7 pre-defined categories - Education, Art, Sports, Business, Politics, Automobiles, Technology - based on their interest area. I have extracted the last 100 tweets of each user in Python and created a corpus per user after cleaning the tweets. As mentioned in Tweet classification into multiple categories on (Unsupervised data/tweets): I am trying to generate dictionaries of common words under each category so…
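One way to approach this is to seed a small keyword dictionary per category and assign each user's corpus to the category with the highest keyword overlap. A minimal pure-Python sketch of that idea; the seed keywords below are invented placeholders, not a real lexicon, and in practice they would be built from labelled tweets or an external word list:

```python
from collections import Counter

# Hypothetical seed dictionaries for three of the seven categories.
CATEGORY_KEYWORDS = {
    "Education": {"school", "university", "students", "learning"},
    "Sports":    {"match", "goal", "team", "league"},
    "Politics":  {"election", "policy", "parliament", "vote"},
}

def categorize(corpus: str) -> str:
    """Return the category whose keyword set overlaps most with the corpus."""
    words = Counter(corpus.lower().split())
    scores = {
        cat: sum(words[w] for w in keywords)
        for cat, keywords in CATEGORY_KEYWORDS.items()
    }
    return max(scores, key=scores.get)

print(categorize("the team scored a late goal to win the match"))  # Sports
```

This counts raw keyword hits; a real system would at least weight terms (e.g. TF-IDF) and fall back to an "unknown" label when all scores are zero.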

Parsing city of origin / destination city from a string

*爱你&永不变心* submitted on 2020-06-22 06:54:13
Question: I have a pandas DataFrame where one column is a set of strings with travel details. My goal is to parse each string to extract the city of origin and the destination city (I would ultimately like two new columns titled 'origin' and 'destination'). The data:

df_col = [
    'new york to venice, italy for usd271',
    'return flights from brussels to bangkok with etihad from €407',
    'from los angeles to guadalajara, mexico for usd191',
    'fly to australia new zealand from paris from €422…
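For the regular phrasings in this data, two patterns cover most cases: "from <origin> to <destination>" and a bare "<origin> to <destination>" at the start of the string. A hedged stdlib-only sketch; note the fourth example ('fly to australia new zealand from paris…') reverses the phrasing and would be mis-parsed by rules like these, which is where NER or more rules come in:

```python
import re

# Two hypothetical patterns: "... from <origin> to <destination> ..." and a
# bare "<origin> to <destination> ..." at the start of the string.
PATTERNS = [
    re.compile(r"\bfrom\s+(?P<origin>[a-z ]+?)\s+to\s+"
               r"(?P<destination>[a-z ,]+?)(?=\s+(?:for|with|from)\b|$)"),
    re.compile(r"^(?P<origin>[a-z ]+?)\s+to\s+"
               r"(?P<destination>[a-z ,]+?)(?=\s+(?:for|with|from)\b|$)"),
]

def parse_route(text: str):
    """Return (origin, destination) for the first pattern that matches."""
    for pattern in PATTERNS:
        m = pattern.search(text)
        if m:
            return m.group("origin"), m.group("destination")
    return None, None

print(parse_route("return flights from brussels to bangkok with etihad from €407"))
# ('brussels', 'bangkok')
```

The lookahead stops the destination before trailing price/airline clauses ("for usd271", "with etihad") without consuming them.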

Extract Noun Phrases with Stanza and CoreNLPClient

吃可爱长大的小学妹 submitted on 2020-06-17 13:29:27
Question: I am trying to extract noun phrases from sentences using Stanza (with Stanford CoreNLP). This can only be done with the CoreNLPClient module in Stanza.

# Import the client module
from stanza.server import CoreNLPClient

# Construct a CoreNLPClient with some basic annotators, a memory allocation of 4GB, and port number 9001
client = CoreNLPClient(annotators=['tokenize', 'ssplit', 'pos', 'lemma', 'ner', 'parse'],
                       memory='4G', endpoint='http://localhost:9001')

Here is an example of a sentence, and I am…
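The 'parse' annotator produces a constituency tree; once you have it as a bracketed parse string, NP subtrees can be collected with no extra dependencies. A minimal sketch of that step (the bracketed tree below is a hand-written example, not actual CoreNLP output):

```python
def parse_tree(s):
    """Parse a bracketed constituency parse into (label, children) tuples."""
    tokens = s.replace("(", " ( ").replace(")", " ) ").split()

    def read(i):
        # tokens[i] is "("; the next token is the constituent label
        label = tokens[i + 1]
        i += 2
        children = []
        while tokens[i] != ")":
            if tokens[i] == "(":
                child, i = read(i)
                children.append(child)
            else:                      # a leaf word
                children.append(tokens[i])
                i += 1
        return (label, children), i + 1

    return read(0)[0]

def leaves(node):
    """Collect the words under a node, left to right."""
    _, children = node
    words = []
    for c in children:
        words.extend(leaves(c) if isinstance(c, tuple) else [c])
    return words

def noun_phrases(node):
    """Return the text of every NP subtree (including nested ones)."""
    label, children = node
    nps = [" ".join(leaves(node))] if label == "NP" else []
    for c in children:
        if isinstance(c, tuple):
            nps.extend(noun_phrases(c))
    return nps

tree = parse_tree("(ROOT (S (NP (DT The) (NN dog)) (VP (VBD chased) (NP (DT the) (NN cat)))))")
print(noun_phrases(tree))  # ['The dog', 'the cat']
```

With a running server, the same subtree search can also be done server-side via the client's Tregex support, using the pattern "NP".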

Easy way to clamp Neural Network outputs between 0 and 1?

喜你入骨 submitted on 2020-06-17 00:04:28
Question: I'm working on writing a GAN, and I want to set my network's output to 0 if it is less than 0, to 1 if it is greater than 1, and leave it unchanged otherwise. I'm fairly new to TensorFlow and don't know of a TensorFlow function or activation that does this without unwanted side effects. So I wrote my loss function to calculate the loss as if the output were clamped, with this code:

def discriminator_loss(real_output, fake_output):
    real_output_clipped = min(max(real_output…
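For reference, TensorFlow has tf.clip_by_value(x, 0.0, 1.0) for exactly this, though its gradient is zero outside the clipped range, which can stall GAN training; a sigmoid output activation is the usual smooth alternative when outputs must live in [0, 1]. The clamping operation itself, shown standalone:

```python
def clamp01(x: float) -> float:
    """Clamp x into [0, 1]; element-wise this is what tf.clip_by_value(x, 0.0, 1.0) does."""
    return max(0.0, min(1.0, x))

print([clamp01(v) for v in (-0.5, 0.3, 1.7)])  # [0.0, 0.3, 1.0]
```

Python's min/max on tensors (as in the question's code) won't work element-wise, which is why the framework op is needed inside a model.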

Using regex in spaCy: matching various (different cased) words

笑着哭i submitted on 2020-06-16 07:27:33
Question: (Edited after being flagged off-topic.) I want to use a regex in spaCy to find any combination of (Accrued or accrued or Annual or annual) leave with this code:

import spacy
from spacy.matcher import Matcher

nlp = spacy.load('en_core_web_sm')
matcher = Matcher(nlp.vocab)

# Add the pattern to the matcher
matcher.add('LEAVE', None, [{'TEXT': {"REGEX": "(Accrued|accrued|Annual|annual)"}}, {'LOWER': 'leave'}])

# Call the matcher on the doc
doc = nlp('Annual leave shall be paid at the time . An employee is to receive their annual…
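In the spaCy Matcher, casing is usually handled on the LOWER token attribute rather than a case-sensitive REGEX over TEXT, e.g. [{'LOWER': {'IN': ['accrued', 'annual']}}, {'LOWER': 'leave'}]. The same matches can be checked with a plain case-insensitive regex, shown here without spaCy so it runs standalone:

```python
import re

# Case-insensitive equivalent of the spaCy pattern: (accrued|annual) + "leave"
LEAVE = re.compile(r"\b(accrued|annual)\s+leave\b", re.IGNORECASE)

text = ("Annual leave shall be paid at the time. "
        "An employee is to receive their accrued leave on termination.")
print([m.group(0) for m in LEAVE.finditer(text)])
# ['Annual leave', 'accrued leave']
```

The token-attribute form is preferable inside spaCy because it matches per token and ignores whitespace/tokenization differences that a string regex has to handle itself.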

Incompatible shapes: [128,37] vs. [128,34]

心不动则不痛 submitted on 2020-06-16 06:38:07
Question: I have added an attention layer to an LSTM encoder-decoder model. The model.fit_generator call:

history = model.fit_generator(
    generator=generate_batch(X_train, y_train, batch_size=batch_size),
    steps_per_epoch=train_samples // batch_size,
    epochs=epochs,
    validation_data=generate_batch(X_test, y_test, batch_size=batch_size),
    validation_steps=val_samples // batch_size)

And this is the error I am getting:…
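A mismatch like [128, 37] vs. [128, 34] usually means the model's output timestep count and the target batch's padded length disagree, e.g. generate_batch pads each y batch only to that batch's own longest sequence while the model (or the attention layer) expects one fixed maximum length. Padding every batch to a single global length is the common fix; a minimal sketch of the idea (generate_batch and the 37/34 lengths come from the question, the helper below is hypothetical):

```python
def pad_batch(seqs, max_len, pad_value=0):
    """Pad (or truncate) every sequence in a batch to the same global max_len."""
    return [s[:max_len] + [pad_value] * max(0, max_len - len(s)) for s in seqs]

batch = [[5, 9, 2], [7, 1], [4, 4, 4, 8]]
print(pad_batch(batch, max_len=4))
# [[5, 9, 2, 0], [7, 1, 0, 0], [4, 4, 4, 8]]
```

With Keras, the equivalent is computing max_len over the whole corpus once and passing it to the padding step for both training and validation batches, so every batch the generator yields has the same second dimension.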