scikit-learn

Can I use TfidfVectorizer in scikit-learn for non-English language? Also how do I read a non-English text in Python?

Deadly submitted on 2021-01-29 05:22:57
Question: I have to read a text document in Python that contains both English and non-English (specifically Malayalam) text. This is what I see:
    >>> text_english = 'Today is a good day'
    >>> text_non_english = 'ആരാണു സന്തോഷമാഗ്രഹിക്കാത്തത'
Now, if I extract the first letter of the English text, it works:
    >>> print(text_english[0])
    'T'
but when I do the same for the Malayalam text, I get a broken character:
    >>> print(text_non_english[0])
    �
To get the first letter, I have to write the following:
    >>> print(text_non_english[0:3])
    ആ
Why does this happen? My aim is to extract the
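
One likely cause, offered as an assumption because the question does not state the Python version: in Python 2 a plain string literal is a sequence of UTF-8 bytes, and each Malayalam character occupies three bytes, so index 0 is only a fragment of the first letter. The sketch below reproduces the effect in Python 3 by comparing `str` with `bytes`; the helper name `read_utf8` is hypothetical.

```python
# Python 3 str holds Unicode code points; bytes holds the UTF-8 encoding.
text_non_english = 'ആരാണു സന്തോഷമാഗ്രഹിക്കാത്തത'
raw = text_non_english.encode('utf-8')

first_char = text_non_english[0]   # indexing a str yields a whole code point
first_three_bytes = raw[0:3]       # the same letter is three bytes in UTF-8
decoded = first_three_bytes.decode('utf-8')

print(first_char)
print(len(first_three_bytes))
print(decoded == first_char)


def read_utf8(path):
    """Read a text file as Unicode, stating the encoding explicitly."""
    with open(path, encoding='utf-8') as handle:
        return handle.read()
```

Reading the file with an explicit `encoding='utf-8'` gives a `str`, so slicing then works per character rather than per byte.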

How to build a Neural Network to multiply two numbers

时光总嘲笑我的痴心妄想 submitted on 2021-01-29 05:12:33
Question: I am trying to build a neural network that multiplies two numbers, and I am using scikit-learn to do it. I am going for a neural network with 2 hidden layers, (5, 3), and ReLU as my activation function. I have defined my MLPRegressor as follows:
    X = data.drop('Product', axis=1)
    y = data['Product']
    X_train, X_test, y_train, y_test = train_test_split(X, y)
    scaler = StandardScaler()
    scaler.fit(X_train)
    X_train = scaler.transform(X_train)
    X_test = scaler.transform(X_test)
    mlp =

Why does sklearn package run in terminal but not in jupyter?

本秂侑毒 submitted on 2021-01-29 03:00:44
Question: When importing sklearn in Jupyter, the result is:
    >>> import sklearn
    ImportError: No module named 'sklearn'
I've installed scikit-learn with pip, and pip list shows that sklearn is installed. Importing sklearn works fine in the terminal, just not in Jupyter. My only thought is that they are running in different environments. In the terminal:
    >>> sys.executable
    '/Users/Victoria/anaconda3/bin/python'
However, in Jupyter:
    >>> sys.executable
    '/Users/Victoria/anaconda3/envs/py35/bin/python'
Any help
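
The two `sys.executable` values above point at different interpreters, which supports the asker's guess. A minimal sketch for checking this from inside either environment, using only the standard library; the helper name `diagnose_import` is hypothetical.

```python
import importlib.util
import sys


def diagnose_import(module_name):
    """Report which interpreter is running and whether it can find a module."""
    spec = importlib.util.find_spec(module_name)
    return {
        "executable": sys.executable,
        "found": spec is not None,
        "location": getattr(spec, "origin", None),
    }


# Run this once in the terminal session and once in a notebook cell.
print(diagnose_import("sklearn"))
```

If the notebook reports `found` as False, one common remedy is to run the interpreter shown under `executable` with `-m pip install scikit-learn`, so the package lands in the environment the kernel actually uses.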

Using Pandas and Sklearn.Neighbors

和自甴很熟 submitted on 2021-01-29 02:48:14
Question: I'm trying to fit a KNN model on a dataframe, using Python 3.5/Pandas/Sklearn.neighbors. I've imported the data and split it into training and testing data and labels, but when I try to predict with the model, I get the following error. I'm quite new to Pandas, so any help would be appreciated, thanks!
    import pandas as pd
    from sklearn import cross_validation
    import numpy as np
    from sklearn.neighbors import KNeighborsRegressor
    seeds = pd.read_csv('seeds.tsv',sep='\t',names=['Area','Perimeter',

RuntimeError: Cannot clone object: Scikit-Learn custom estimator

你说的曾经没有我的故事 submitted on 2021-01-29 02:25:23
Question: I wrote an estimator that takes a model and the model's kwargs as parameters, initializes 2 models with these kwargs (one for red wine and one for white wine), splits the data into 2 populations, runs a model on each, and then combines the results. My code works well on its own, but unfortunately using it with GridSearch fails because the sanity check on the parameters of the clone fails.
    class run_estimator(BaseEstimator, TransformerMixin):
        def __init__(self, model=None, **kwargs):
            self.model = model
            self.model
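
A possible explanation, stated as an assumption because the code above is cut off: scikit-learn's `BaseEstimator` discovers an estimator's parameters from the explicitly named arguments of `__init__`, so values passed through `**kwargs` are not visible to `get_params` and therefore not to `clone`. The sketch below is a simplified stand-in for that introspection built on `inspect`, not scikit-learn's actual code; the class names and the `model_params` argument are hypothetical.

```python
import inspect


def explicit_init_params(cls):
    """List the named __init__ arguments, skipping self, *args and **kwargs."""
    signature = inspect.signature(cls.__init__)
    return sorted(
        name
        for name, param in signature.parameters.items()
        if name != "self"
        and param.kind not in (param.VAR_POSITIONAL, param.VAR_KEYWORD)
    )


class WithKwargs:
    # Mirrors the signature in the question: extra settings arrive via **kwargs.
    def __init__(self, model=None, **kwargs):
        self.model = model
        self.kwargs = kwargs


class WithParamDict:
    # Hypothetical alternative: settings travel in one explicit argument.
    def __init__(self, model=None, model_params=None):
        self.model = model
        self.model_params = model_params


print(explicit_init_params(WithKwargs))
print(explicit_init_params(WithParamDict))
```

Under this reading, passing the wrapped model's settings as a single explicit dictionary argument, stored unmodified in `__init__`, keeps them discoverable when the estimator is cloned.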

Determine whether a model is pytorch model or a tensorflow model or scikit model

▼魔方 西西 submitted on 2021-01-29 00:39:41
Question: Is there a way to determine programmatically which framework a model was made with? I have a model in some serialized form (e.g., a pickle file). For simplicity, assume that my model can be TensorFlow's, PyTorch's, or scikit-learn's. How can I determine programmatically which of these 3 it is?
Answer 1: AFAIK, I have never heard of TensorFlow/Keras or PyTorch models being saved with pickle or joblib; these frameworks provide their
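
If the file can be unpickled at all, one hedged approach is to look at the module in which the loaded object's class (or one of its base classes) is defined. This is only a sketch: the helper name and the prefix table are assumptions, unpickling requires the originating framework to be installed, and pickle files from untrusted sources should not be loaded.

```python
# Hypothetical mapping from top-level package name to a framework label.
FRAMEWORK_PREFIXES = {
    "sklearn": "scikit-learn",
    "torch": "pytorch",
    "tensorflow": "tensorflow",
    "keras": "tensorflow",
}


def framework_of(model):
    """Guess the framework from the modules of the object's class hierarchy."""
    for cls in type(model).__mro__:
        top_level = cls.__module__.split(".")[0]
        if top_level in FRAMEWORK_PREFIXES:
            return FRAMEWORK_PREFIXES[top_level]
    return "unknown"


# Stand-in class so the sketch runs without any framework installed.
FakeForest = type(
    "RandomForestClassifier", (), {"__module__": "sklearn.ensemble._forest"}
)


class CustomForest(FakeForest):
    """User-defined subclass; its own module is not a framework package."""


print(framework_of(FakeForest()))
print(framework_of(CustomForest()))
print(framework_of([1, 2, 3]))
```

Walking the class hierarchy rather than checking only the object's own class lets user-defined subclasses be attributed to the framework they inherit from.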

All probability values are less than 0.5 on unseen data

女生的网名这么多〃 submitted on 2021-01-28 23:38:14
Question: I have 15 features with a binary response variable, and I am interested in predicting probabilities rather than 0 or 1 class labels. When I trained and tested the RF model with 500 trees, CV, balanced class weight, and balanced samples in the data frame, I achieved good accuracy and also a good Brier score. As you can see in the image, the predicted probabilities of class 1 on the test data range between 0 and 1. Here is the histogram of predicted probabilities on test data: with
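
For reference, the Brier score mentioned above is the mean squared difference between the predicted probability of class 1 and the actual 0/1 label, so lower is better. A minimal sketch with made-up numbers (not the asker's data); the function name is hypothetical.

```python
def brier_score(y_true, y_prob):
    """Mean squared error between 0/1 labels and predicted probabilities."""
    if len(y_true) != len(y_prob):
        raise ValueError("y_true and y_prob must have the same length")
    if not y_true:
        raise ValueError("at least one sample is required")
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


# Illustrative labels and class-1 probabilities, invented for this example.
labels = [1, 0, 1]
probabilities = [0.9, 0.2, 0.6]
print(brier_score(labels, probabilities))
```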
