supervised-learning

Training Tagger with Custom Tags in NLTK

Deadly submitted on 2019-11-29 07:36:53
I have a document with tagged data in the format Hi here's my [KEYWORD phone number], let me know when you wanna hangout: [PHONE 7802708523]. I live in a [PROP_TYPE condo] in [CITY New York]. I want to train a model based on a set of documents tagged like this, and then use my model to tag new documents. Is this possible in NLTK? I have looked at chunking and NLTK-Trainer scripts, but these have a restricted set of tags and corpora, while my dataset has custom tags. As @AleksandarSavkov wrote already, this is essentially a named entity recognition (NER) task, or more generally a chunking
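The chunking route usually means converting the bracketed markup into IOB-labelled token sequences and training a sequence tagger over those labels. Below is a minimal sketch with NLTK, assuming that conversion has already been done; the two training sentences are hypothetical, and a real system would use a classifier-based or CRF tagger rather than a unigram one.

```python
import nltk

# Hypothetical training sentences in IOB form: (token, label) pairs derived
# from the bracketed markup (the conversion step is not shown here).
train_sents = [
    [("Hi", "O"), ("here's", "O"), ("my", "O"),
     ("phone", "B-KEYWORD"), ("number", "I-KEYWORD"), (",", "O")],
    [("I", "O"), ("live", "O"), ("in", "O"), ("a", "O"),
     ("condo", "B-PROP_TYPE"), ("in", "O"),
     ("New", "B-CITY"), ("York", "I-CITY"), (".", "O")],
]

# Unigram tagger over the custom IOB labels, falling back to "O" for unseen tokens.
tagger = nltk.UnigramTagger(train_sents, backoff=nltk.DefaultTagger("O"))

# Tagging a new sentence returns (token, IOB label) pairs.
print(tagger.tag(["I", "live", "in", "New", "York"]))
```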

Convolutional Neural Network (CNN) for Audio [closed]

雨燕双飞 submitted on 2019-11-28 15:48:50
I have been following the tutorials on DeepLearning.net to learn how to implement a convolutional neural network that extracts features from images. The tutorials are well explained and easy to understand and follow. I want to extend the same CNN to extract multi-modal features from videos (images + audio) at the same time. I understand that video input is nothing but a sequence of images (pixel intensities) displayed over a period of time (e.g. 30 FPS), associated with audio. However, I don't really understand what audio is, how it works, or how it is broken down to be fed into the network. I have
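One common way audio is prepared for a CNN is to convert the raw waveform into a log-mel spectrogram, a 2-D time-frequency array that can be treated like a single-channel image. A minimal sketch, assuming the librosa library is available and using a hypothetical file name:

```python
import numpy as np
import librosa

# Load the waveform and its sample rate ("clip.wav" is a placeholder path).
y, sr = librosa.load("clip.wav", sr=22050)

# Mel spectrogram, then convert power to decibels for a log-scaled "image".
mel = librosa.feature.melspectrogram(y=y, sr=sr, n_mels=128)
log_mel = librosa.power_to_db(mel, ref=np.max)   # shape: (n_mels, n_frames)

# log_mel can now be fed to the same kind of convolutional layers used for pictures.
print(log_mel.shape)
```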

Scikit-learn: How to obtain True Positive, True Negative, False Positive and False Negative

送分小仙女 submitted on 2019-11-28 03:06:29
My problem: I have a dataset which is a large JSON file. I read it and store it in the trainList variable. Next, I pre-process it in order to be able to work with it. Once I have done that, I start the classification: I use the k-fold cross-validation method in order to obtain the mean accuracy and train a classifier. I make the predictions and obtain the accuracy and confusion matrix of that fold. After this, I would like to obtain the True Positive (TP), True Negative (TN), False Positive (FP) and False Negative (FN) values. I'll use these parameters to obtain the Sensitivity and Specificity.
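For a binary problem, these four counts can be read straight off scikit-learn's confusion matrix. A minimal sketch, using hypothetical per-fold labels and predictions:

```python
from sklearn.metrics import confusion_matrix

# Hypothetical ground truth and predictions for one fold.
y_true = [0, 1, 0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1, 1, 0]

# For binary labels {0, 1}, ravel() yields the counts in this order.
tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()

sensitivity = tp / (tp + fn)   # true positive rate (recall)
specificity = tn / (tn + fp)   # true negative rate
print(tp, tn, fp, fn, sensitivity, specificity)
```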

Plot learning curves with caret package and R

天涯浪子 submitted on 2019-11-27 13:29:34
Question: I would like to study the optimal tradeoff between bias and variance for model tuning. I'm using the caret package for R, which allows me to plot a performance metric (AUC, accuracy, ...) against the hyperparameters of the model (mtry, lambda, etc.) and automatically choose the maximum. This typically returns a good model, but if I want to dig further and choose a different bias/variance tradeoff I need a learning curve, not a performance curve. For the sake of simplicity, let's say my model is a random forest,
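The question is specifically about caret in R; purely as an illustration of the learning-curve idea, here is an analogous sketch using scikit-learn's learning_curve with a random forest on synthetic data (this is not the caret workflow the asker is after):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import learning_curve

# Synthetic data standing in for the real training set.
X, y = make_classification(n_samples=1000, n_features=20, random_state=0)

# Train and cross-validated scores at increasing training-set sizes.
sizes, train_scores, val_scores = learning_curve(
    RandomForestClassifier(n_estimators=100, random_state=0),
    X, y, cv=5, scoring="roc_auc",
    train_sizes=np.linspace(0.1, 1.0, 5),
)

# A large gap between training and validation curves points to variance;
# low scores on both point to bias.
print(sizes)
print(train_scores.mean(axis=1))
print(val_scores.mean(axis=1))
```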

How should I teach machine learning algorithm using data with big disproportion of classes? (SVM)

不问归期 submitted on 2019-11-27 13:21:40
I am trying to train my SVM algorithm using data on clicks and conversions from people who see the banners. The main problem is that clicks make up only about 0.2% of all data, so there is a big class disproportion. When I use a simple SVM, in the testing phase it always predicts the "view" class and never "click" or "conversion". On average it gives 99.8% right answers (because of the disproportion), but it gives 0% right predictions if you check the "click" or "conversion" ones. How can you tune the SVM algorithm (or select another one) to take the disproportion into consideration? The most basic approach here is to use
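One widely used remedy is to weight the classes so that mistakes on the rare ones are penalised more heavily. A minimal sketch with scikit-learn's SVC; the data below is synthetic and only mimics the imbalance:

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import classification_report

# ~0.5% positives, standing in for the rare "click"/"conversion" examples.
X, y = make_classification(n_samples=5000, weights=[0.995], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)

# class_weight="balanced" scales the penalty inversely to class frequency;
# an explicit dict such as {0: 1, 1: 500} works too.
clf = SVC(kernel="rbf", class_weight="balanced")
clf.fit(X_tr, y_tr)

print(classification_report(y_te, clf.predict(X_te)))
```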

What is the difference between supervised learning and unsupervised learning?

佐手、 submitted on 2019-11-27 09:56:39
In terms of artificial intelligence and machine learning, what is the difference between supervised and unsupervised learning? Can you provide a basic, easy explanation with an example? Davide: Since you ask this very basic question, it seems worth specifying what Machine Learning itself is. Machine Learning is a class of algorithms that is data-driven, i.e. unlike "normal" algorithms, it is the data that "tells" what the "good answer" is. Example: a hypothetical non-machine-learning algorithm for face detection in images would try to define what a face is (round, skin-like-colored
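The distinction can be made concrete with two toy models: a supervised classifier that learns from labelled points, and an unsupervised clustering algorithm that sees only the points. A minimal sketch, assuming scikit-learn and made-up data:

```python
from sklearn.cluster import KMeans
from sklearn.linear_model import LogisticRegression

X = [[1, 1], [1, 2], [8, 8], [9, 8]]   # four 2-D points
y = [0, 0, 1, 1]                       # labels, used only by the supervised model

supervised = LogisticRegression().fit(X, y)            # learns from (X, y) pairs
unsupervised = KMeans(n_clusters=2, n_init=10).fit(X)  # sees only X

print(supervised.predict([[8, 9]]))   # predicts one of the known labels
print(unsupervised.labels_)           # groups the points into discovered clusters
```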
