machine-learning

Categorical and ordinal feature data difference in regression analysis?

Submitted by 谁说胖子不能爱 on 2021-02-19 05:15:46
Question: I am trying to completely understand the difference between categorical and ordinal data when doing regression analysis. For now, what is clear:

Categorical feature and data example — Color: red, white, black. Why categorical: red < white < black is logically incorrect.
Ordinal feature and data example — Condition: old, renovated, new. Why ordinal: old < renovated < new is logically correct.

Categorical-to-numeric and ordinal-to-numeric encoding methods: One-Hot encoding for categorical data; Arbitrary …
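A minimal sketch of the two encodings described above, using pandas and scikit-learn; the Color and Condition columns come from the question's example, but the rows themselves are made up:

```python
import pandas as pd
from sklearn.preprocessing import OrdinalEncoder

df = pd.DataFrame({
    "Color": ["red", "white", "black", "white"],       # categorical: no order
    "Condition": ["old", "renovated", "new", "old"],   # ordinal: old < renovated < new
})

# One-hot encode the categorical feature, so no order is implied
one_hot = pd.get_dummies(df["Color"], prefix="Color")

# Integer-encode the ordinal feature, stating the order explicitly
enc = OrdinalEncoder(categories=[["old", "renovated", "new"]])
df["Condition_code"] = enc.fit_transform(df[["Condition"]])

print(one_hot.columns.tolist())       # ['Color_black', 'Color_red', 'Color_white']
print(df["Condition_code"].tolist())  # [0.0, 1.0, 2.0, 0.0]
```

The key difference: the one-hot columns carry no ranking, while the ordinal codes deliberately preserve old < renovated < new for the regressor.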

How to fix “IndexError: list index out of range” in Tensorflow

Submitted by 半城伤御伤魂 on 2021-02-19 04:29:48
Question: I'm creating an image classifier using TensorFlow and Keras, but when I try to train my model I get an error: IndexError: list index out of range. I think the problem is with my model, because when I remove the Conv2D layers the code throws no error.

    model = Sequential()
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
    model.add(Conv2D(64, (3, 3), activation='relu', padding='same'))
    model.add(MaxPool2D((2, 2), strides=(2, 2)))
    model.add(Conv2D(128, (3, 3), activation='relu', padding= …
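A common cause of this IndexError is that the first layer never receives an input shape, or the training images lack a channel dimension, so Keras cannot build the convolution kernels. A minimal sketch of the same stack with an explicit input_shape — the 64×64×3 image size is an assumption, not taken from the question:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPool2D

model = Sequential([
    # Declare the image shape on the first layer so Keras can build the kernels
    Conv2D(64, (3, 3), activation='relu', padding='same', input_shape=(64, 64, 3)),
    Conv2D(64, (3, 3), activation='relu', padding='same'),
    MaxPool2D((2, 2), strides=(2, 2)),
])

# Conv2D expects 4-D input: (batch, height, width, channels)
x = np.random.rand(2, 64, 64, 3).astype("float32")
print(model(x).shape)  # (2, 32, 32, 64)
```

If the training arrays are greyscale images of shape (N, H, W), reshaping them to (N, H, W, 1) and setting input_shape accordingly is the usual fix.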

Why val_loss and val_acc are not displaying?

Submitted by 六月ゝ 毕业季﹏ on 2021-02-19 03:19:38
Question: When training starts, the run window displays only loss and acc; val_loss and val_acc are missing. Only at the end are these values shown.

    model.add(Flatten())
    model.add(Dense(512, activation="relu"))
    model.add(Dropout(0.5))
    model.add(Dense(10, activation="softmax"))
    model.compile(
        loss='categorical_crossentropy',
        optimizer="adam",
        metrics=['accuracy']
    )
    model.fit(
        x_train, y_train,
        batch_size=32,
        epochs=1,
        validation_data=(x_test, y_test),
        shuffle=True
    )

This is how the …
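This is expected Keras behaviour: validation metrics are computed once per epoch, after the last training batch, so the batch-by-batch progress bar can only show training loss and accuracy. With epochs=1 the val_ values therefore appear exactly once, at the end. A sketch with toy stand-in data (shapes and sizes are assumptions) showing the metrics are still recorded per epoch:

```python
import numpy as np
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense

# Toy stand-ins for the question's x_train/y_train
x_train = np.random.rand(64, 8).astype("float32")
y_train = np.random.randint(0, 2, size=(64, 1))
x_test = np.random.rand(16, 8).astype("float32")
y_test = np.random.randint(0, 2, size=(16, 1))

model = Sequential([Dense(16, activation="relu", input_shape=(8,)),
                    Dense(1, activation="sigmoid")])
model.compile(loss="binary_crossentropy", optimizer="adam",
              metrics=["accuracy"])

# Validation runs once per epoch, after the last training batch
history = model.fit(x_train, y_train, epochs=2,
                    validation_data=(x_test, y_test), verbose=0)
print(sorted(history.history))  # ['accuracy', 'loss', 'val_accuracy', 'val_loss']
```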

Adapting Tensorflow RNN Seq2Seq model code for Tensorflow 2.0

Submitted by £可爱£侵袭症+ on 2021-02-19 03:00:15
Question: I am very new to TensorFlow and have been experimenting with a simple chatbot-building project from this link. There were many warnings saying that things would be deprecated in TensorFlow 2.0 and that I should upgrade, so I did. I then used the automatic TensorFlow code upgrader to update all the necessary files to 2.0. There were a few errors with this. When processing the model.py file, it returned these warnings: 133:20: WARNING: tf.nn.sampled_softmax_loss requires manual check …
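The automatic upgrader referred to here is the tf_upgrade_v2 script that ships with TensorFlow 2 (run as `tf_upgrade_v2 --infile model.py --outfile model_v2.py`). A "requires manual check" warning means the call was rewritten but its semantics changed between versions, so it needs a human look. While porting incrementally, the 1.x signature remains reachable under tf.compat.v1, a sketch of which (assuming TensorFlow 2.x is installed):

```python
import tensorflow as tf

# TF 2.x keeps the 1.x APIs under tf.compat.v1, so legacy seq2seq code can
# keep calling the old sampled_softmax_loss signature while being ported.
legacy_loss_fn = tf.compat.v1.nn.sampled_softmax_loss
print(callable(legacy_loss_fn))  # True
```

Once the port is complete, the call site can be moved to the native tf.nn.sampled_softmax_loss, checking each argument against the 2.x documentation.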

Data Prediction using Decision Tree of rpart

Submitted by 妖精的绣舞 on 2021-02-18 22:28:47
Question: I am using R to classify a data frame called d, structured as shown below [screenshot in the original post]. The data has 576666 rows, and the column classLabel is a factor with 3 levels: ONE, TWO, THREE. I am building a decision tree using rpart:

    fitTree = rpart(d$classLabel ~ d$tripduration + d$from_station_id + d$gender + d$birthday)

And I want to predict the values of classLabel for newdata:

    newdata = data.frame(
        tripduration = c(345, 244, 543, 311),
        from_station_id = c(60, 28, 100, 56),
        gender = c("Male", "Female", …
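A side note on the R code: writing d$ prefixes inside an rpart formula is a well-known source of trouble, because predict() then cannot match the model's variables against the columns of newdata; the usual fix is `rpart(classLabel ~ tripduration + ..., data = d)`. The same fit-then-predict workflow in Python terms, as a sketch with made-up rows (column names taken from the question):

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Made-up stand-in for the data frame 'd' from the question
d = pd.DataFrame({
    "tripduration": [345, 244, 543, 311, 420, 199],
    "from_station_id": [60, 28, 100, 56, 60, 28],
    "gender": ["Male", "Female", "Male", "Female", "Male", "Female"],
    "classLabel": ["ONE", "TWO", "THREE", "ONE", "TWO", "THREE"],
})
X = pd.get_dummies(d[["tripduration", "from_station_id", "gender"]])
y = d["classLabel"]

tree = DecisionTreeClassifier(random_state=0).fit(X, y)

newdata = pd.DataFrame({
    "tripduration": [345, 244],
    "from_station_id": [60, 28],
    "gender": ["Male", "Female"],
})
# New data must be encoded into the same columns as the training data
X_new = pd.get_dummies(newdata).reindex(columns=X.columns, fill_value=0)
print(tree.predict(X_new))
```

The reindex step plays the role that consistent column names play in R: prediction only works when the new frame lines up with the features the model was fit on.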

Make better machine learning prediction thanks to negative feedback

Submitted by 冷暖自知 on 2021-02-18 18:19:07
Question: I'm currently using the sklearn library in Python for supervised machine learning. I have a list of records like this:

    [x1, x2, x3] -> [y1]

and I'm using the bag-of-words technique. It all works. Sometimes the user says the prediction is not right — something like a negative record:

    [x1, x2, x3] != [y1]

When this happens, I would like the same prediction not to appear the next time (or after many negative feedbacks).

Source: https://stackoverflow.com/questions/45545178/make
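One common way to fold negative feedback into a bag-of-words classifier is to retrain with the rejected example re-added under a corrected label, optionally weighted heavily via sample_weight so it dominates the old evidence. A sketch under that approach; all phrases and labels below are invented:

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB

texts = ["open the door", "close the door", "turn on the light"]
labels = ["door_open", "door_close", "light_on"]

vec = CountVectorizer()
clf = MultinomialNB()
clf.fit(vec.fit_transform(texts), labels)

# The user rejects the model's prediction for this phrase...
query = "switch on the light"
rejected = clf.predict(vec.transform([query]))[0]

# ...so retrain with the corrected pair added, weighted heavily
texts.append(query)
labels.append("light_on")
weights = [1.0, 1.0, 1.0, 5.0]
clf.fit(vec.fit_transform(texts), labels, sample_weight=weights)
print(clf.predict(vec.transform([query]))[0])  # 'light_on'
```

For a stream of feedback, a classifier with partial_fit (e.g. SGDClassifier) avoids retraining from scratch on every correction.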

Using multiple custom classes with Pipeline sklearn (Python)

Submitted by 元气小坏坏 on 2021-02-18 11:43:27
Question: I am putting together a Pipeline tutorial for students, but I'm stuck. I'm not an expert, but I'm trying to improve, so thank you for your indulgence. In the pipeline I try to execute several steps to prepare a dataframe for a classifier:

Step 1: Describe the dataframe
Step 2: Fill NaN values
Step 3: Transform categorical values into numbers

Here is my code:

    class Descr_df(object):
        def transform(self, X):
            print("Structure of the data: \n {}".format(X.head(5)))
            print("Features names: …
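For custom steps to work inside sklearn's Pipeline, each class must expose both fit and transform; the usual pattern is to derive from BaseEstimator and TransformerMixin. A sketch matching the three steps above — Descr_df is the class name from the question, while FillNaN and EncodeCategoricals are hypothetical names for the other two steps:

```python
import pandas as pd
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.pipeline import Pipeline

class Descr_df(BaseEstimator, TransformerMixin):
    """Step 1: print a description of the dataframe, pass it through unchanged."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        print("Structure of the data:\n{}".format(X.head(5)))
        print("Feature names: {}".format(list(X.columns)))
        return X

class FillNaN(BaseEstimator, TransformerMixin):
    """Step 2: fill missing values (hypothetical class name)."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return X.fillna(0)

class EncodeCategoricals(BaseEstimator, TransformerMixin):
    """Step 3: turn categorical columns into numbers (hypothetical class name)."""
    def fit(self, X, y=None):
        return self
    def transform(self, X):
        return pd.get_dummies(X)

pipe = Pipeline([("describe", Descr_df()),
                 ("fillna", FillNaN()),
                 ("encode", EncodeCategoricals())])

df = pd.DataFrame({"age": [25, None, 40], "color": ["red", "white", "red"]})
out = pipe.fit_transform(df)
print(out.shape)
```

Each transform returns the frame it produces, which is what the original code is missing: without a fit method and a return value, Pipeline cannot chain the steps.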

Plot k-Nearest-Neighbor graph with 8 features?

Submitted by 回眸只為那壹抹淺笑 on 2021-02-18 10:28:11
Question: I'm new to machine learning and would like to set up a small example using the k-nearest-neighbours method with the Python library scikit-learn. Transforming and fitting the data works fine, but I can't figure out how to plot a graph showing the data points surrounded by their "neighborhood". The dataset I'm using looks like this [screenshot in the original post]: there are 8 features, plus one "outcome" column. From my understanding, I get an array showing the Euclidean distances of all data points, using kneighbors_graph from …
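An 8-dimensional feature space cannot be scattered directly; the usual workaround is to project the points to 2-D (e.g. with PCA), plot the projection coloured by outcome, and draw a line segment for every neighbour pair reported by kneighbors_graph. A sketch with synthetic data, since the real dataset is not in the excerpt:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.neighbors import kneighbors_graph

rng = np.random.default_rng(0)
X = rng.normal(size=(30, 8))  # 30 samples, 8 features (made up)

# Project the 8-D points down to 2-D for plotting
X2 = PCA(n_components=2).fit_transform(X)

# Sparse connectivity matrix: entry (i, j) = 1 if j is among i's 5 neighbours
A = kneighbors_graph(X, n_neighbors=5, mode="connectivity")

# With matplotlib one would scatter X2 and draw a segment from X2[i] to X2[j]
# for every nonzero (i, j) below; each row marks one point's neighbourhood.
edges = np.argwhere(A.toarray() > 0)
print(X2.shape, len(edges))  # (30, 2) 150
```

Note the neighbourhoods are computed in the full 8-D space and only drawn in 2-D, so some neighbours may look far apart in the projection; that distortion is inherent to plotting high-dimensional data.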

sklearn multiclass svm function

Submitted by 久未见 on 2021-02-18 08:30:48
Question: I have multi-class labels and want to compute the accuracy of my model. I am somewhat confused about which sklearn function I need to use. As far as I understand, the code below is only used for binary classification.

    # dividing X, y into train and test data
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=0)

    # training a linear SVM classifier
    from sklearn.svm import SVC
    svm_model_linear = SVC(kernel='linear', C=1).fit(X_train, y_train)
    svm …
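In fact SVC is already multiclass: it applies a one-vs-one scheme internally, and accuracy_score (or the estimator's own score method) works unchanged for any number of classes. A sketch of the same code on the three-class iris dataset, standing in for the question's unspecified X and y:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC
from sklearn.metrics import accuracy_score

X, y = load_iris(return_X_y=True)  # 3 classes, so inherently multiclass
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)

# SVC handles multiclass natively via one-vs-one under the hood
svm_model_linear = SVC(kernel='linear', C=1).fit(X_train, y_train)

# accuracy_score is identical for binary and multiclass labels
acc = accuracy_score(y_test, svm_model_linear.predict(X_test))
print(round(acc, 3))
```

For per-class detail, classification_report and confusion_matrix from sklearn.metrics accept the same multiclass label arrays.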