data-science | 易学教程

SVC classifier taking too much time for training

Question: I am using an SVC classifier with a linear kernel to train my model. Training data: 42,000 records.

    model = SVC(probability=True)
    model.fit(self.features_train, self.labels_train)
    y_pred = model.predict(self.features_test)
    train_accuracy = model.score(self.features_train, self.labels_train)
    test_accuracy = model.score(self.features_test, self.labels_test)

It takes more than 2 hours to train my model. Am I doing something wrong? Also, what can be done to reduce the training time? Thanks in advance.

Answer 1: There are

Split Column into Unknown Number of Columns by Delimiter Pandas

Question: I am trying to split a column into multiple columns based on comma/space separation. My dataframe currently looks like this:

      Item  Colors
    0 ID-1  Red, Blue, Green
    1 ID-2  Red, Blue
    2 ID-3  Blue, Green
    3 ID-4  Blue
    4 ID-5  Red

I would like to transform the 'Colors' column into Red, Blue and Green columns like this:

      Item  Red  Blue  Green
    0 ID-1  1    1     1
    1 ID-2  1    1     0
    2 ID-3  0    1     1
    3 ID-4  0    1     0
    4 ID-5  1    0     1

I really have no idea how to do this. Any help would be greatly appreciated.

Answer 1: You can use get_dummies pd
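
The `get_dummies` approach named in the answer might look roughly like the sketch below. It assumes the delimiter is exactly a comma followed by one space, and it rebuilds the sample frame from the question by hand. Note that for ID-5, whose input lists only Red, this produces Green = 0.

```python
import pandas as pd

# Sample frame rebuilt from the question's input table.
df = pd.DataFrame({
    "Item": ["ID-1", "ID-2", "ID-3", "ID-4", "ID-5"],
    "Colors": ["Red, Blue, Green", "Red, Blue", "Blue, Green", "Blue", "Red"],
})

# One 0/1 indicator column per distinct colour.
# Assumption: the separator is always ", " (comma plus a single space).
dummies = df["Colors"].str.get_dummies(sep=", ")

# get_dummies orders columns alphabetically, so reorder to match the question.
result = pd.concat([df[["Item"]], dummies[["Red", "Blue", "Green"]]], axis=1)
print(result)
```

If the spacing around the commas is inconsistent, normalising the strings first (for example by stripping spaces) would be needed before splitting.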

Neural network is not giving the expected output after training in Python

Question: My neural network is not giving the expected output after training in Python. Is there an error in the code? Is there any way to reduce the mean squared error (MSE)? I tried to train the network (run the program) repeatedly, but it is not learning; instead it gives the same MSE and output. Here is the data I used: https://drive.google.com/open?id=1GLm87-5E_6YhUIPZ_CtQLV9F9wcGaTj2 Here is my code:

    # load and evaluate a saved model
    from numpy import loadtxt
    from tensorflow.keras.models

How can I merge two dictionaries while adding their values when the keys match?

Question: I have data that looks like this: current. I wrote code that returns a dictionary like this: history. I have another dictionary that looks almost the same but with more nesting, like this: latest. Given these two dictionaries, I want to merge them such that if:

    dict1 = {201: {'U': {'INR': 10203, 'SGD': 10203, 'USD': 10203, 'YEN': 10203}, 'V': {'INR': 10203, 'SGD': 10203, 'USD': 10203, 'YEN': 10203}}

and

    dict2= {201: {'X': {'GBP': 10203, 'SGD': 10203, 'USD': 10203, 'YEN': 10203},
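
The exact merge rule is cut off in the question, so the following is only a sketch of one plausible reading of the title: merge two nested dictionaries recursively and add the values wherever the same key path exists in both. The function name `merge_add` and the small sample values are hypothetical and are not taken from the question.

```python
import copy


def merge_add(a, b):
    """Return a new dict merging b into a, summing leaf values on shared keys."""
    merged = copy.deepcopy(a)  # do not mutate the inputs
    for key, b_val in b.items():
        if key not in merged:
            # Key only exists in b: copy it over unchanged.
            merged[key] = copy.deepcopy(b_val)
        elif isinstance(merged[key], dict) and isinstance(b_val, dict):
            # Both sides are dicts: recurse one level deeper.
            merged[key] = merge_add(merged[key], b_val)
        else:
            # Both sides are leaves (assumed numeric): add them.
            merged[key] = merged[key] + b_val
    return merged


# Hypothetical sample data, much smaller than the dictionaries in the question.
d1 = {201: {"U": {"INR": 100, "USD": 5}}}
d2 = {201: {"U": {"USD": 7, "SGD": 1}, "X": {"GBP": 3}}}
merged = merge_add(d1, d2)
print(merged)
```

The sketch raises a `TypeError` if a key maps to a dict on one side and a number on the other, which seems a reasonable way to surface inconsistent nesting.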

Date Difference based on matching values in two columns - Pandas

Question: I have a dataframe and I am struggling to create a column based on other columns. I will share the problem using sample data.

        Date        Target1   Close
    0   2018-05-25  198.0090  188.580002
    1   2018-05-25  197.6835  188.580002
    2   2018-05-25  198.0090  188.580002
    3   2018-05-29  196.6230  187.899994
    4   2018-05-29  196.9800  187.899994
    5   2018-05-30  197.1375  187.500000
    6   2018-05-30  196.6965  187.500000
    7   2018-05-30  196.8750  187.500000
    8   2018-05-31  196.2135  186.869995
    9   2018-05-31  196.2135  186.869995
    10  2018-05-31  196

Randomly reassign participants to groups such that participants originally from same group don't end up in same group

Question: I'm trying to do a Monte Carlo style analysis in which I randomly reassign the participants in my experiment to new groups and then reanalyze the data given the random new groups. Here is what I want to do: participants are originally grouped into eight groups of four participants each. I want to randomly reassign each participant to a new group, but I don't want any participant to end up in a new group with another participant from their original group. Here is how far
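
One simple way to approach the constraint described above is rejection sampling: shuffle everyone, cut the shuffled list into groups, and retry until no new group contains two people from the same original group. The sketch below is an illustration under assumptions, not the asker's code; the function name `reassign`, the integer participant IDs, and the retry cap are all hypothetical.

```python
import random


def reassign(original_groups, rng=random, max_tries=100_000):
    """Randomly regroup participants so no new group repeats an original group.

    original_groups: list of equally sized lists of participant IDs.
    """
    # Map each participant to the index of their original group.
    group_of = {p: g for g, members in enumerate(original_groups) for p in members}
    participants = list(group_of)
    size = len(original_groups[0])

    for _ in range(max_tries):
        rng.shuffle(participants)
        # Cut the shuffled list into consecutive chunks of the group size.
        candidate = [participants[i:i + size]
                     for i in range(0, len(participants), size)]
        # Accept only if every new group draws from distinct original groups.
        if all(len({group_of[p] for p in grp}) == len(grp) for grp in candidate):
            return candidate
    raise RuntimeError("no valid assignment found within max_tries")


# Eight original groups of four, with participants numbered 0..31 (assumed IDs).
original = [[g * 4 + i for i in range(4)] for g in range(8)]
new_groups = reassign(original, rng=random.Random(0))
print(new_groups)
```

Because every shuffle is equally likely and invalid ones are simply discarded, each valid regrouping is drawn with equal probability, which is usually what a permutation-style analysis wants. For eight groups of four, a valid shuffle typically turns up after a few hundred attempts.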

Plotting the count of occurrences per date

Question: I'm very new to pandas. I have a data frame that has a datetime column and a column that contains a string of text (headlines). Each headline is a new row. I need to plot the date on the x-axis, and the y-axis needs to show how many headlines occur on each date. So, for example, one date may contain 3 headlines. What's the simplest way to do this? I can't figure out how to do it at all. Maybe add another column with a '1' for each row? If so, how would you do this? Please point me in
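
A helper column of ones is not strictly needed: grouping by the calendar date and counting rows gives the per-date totals directly. The sketch below assumes the columns are called `date` and `headline` (hypothetical names, since the question does not show them) and uses made-up rows.

```python
import pandas as pd

# Made-up sample: three headlines on one date, one on the next.
df = pd.DataFrame({
    "date": pd.to_datetime([
        "2018-01-01 09:00", "2018-01-01 13:30",
        "2018-01-01 18:45", "2018-01-02 08:15",
    ]),
    "headline": ["first", "second", "third", "fourth"],
})

# Drop the time of day, then count the rows that fall on each date.
counts = df.groupby(df["date"].dt.date).size()
print(counts)

# With matplotlib installed, the counts can then be plotted, for example:
# counts.plot(kind="bar")
```

The result is a Series indexed by date, so its index becomes the x-axis and its values the y-axis when plotted.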

Trip Advisor Scraping 'moreLink'

Question: I've been building a web scraper in BS4 and have gotten stuck. I am using Trip Advisor as a test for other data I will be going after, but I am not able to isolate the tag of the 'entire' reviews. Here is an example: https://www.tripadvisor.com/Restaurant_Review-g56010-d470148-Reviews-Chez_Nous-Humble_Texas.html Notice that in the first review there is an icon below "the wine list is...". I am able to easily isolate the partial reviews, but have not been able to figure out a way to get BS4 to pull

R : knnImputation Giving Error

Question: I am getting the error below in R. In my Brand_X.xlsx dataset there are a few NA values, which I am trying to impute using KNN imputation, but I am getting the error below. What's wrong here? Thanks!

    > library(readxl)
    > Brand_X <- read_excel("Brand_X.xlsx")
    > str(Brand_X)
    Classes ‘tbl_df’, ‘tbl’ and 'data.frame': 101 obs. of 8 variables:
     $ Rel_price_lag5: num 108 111 105 103 109 104 110 114 103 108 ...
     $ Rel_price_lag1: num 110 109 217 241 855 271 234 297 271 999 ...
     $ Rel_Price : num 122 110 109 217