sklearn-pandas

How to fix "Found input variables with inconsistent numbers of samples: [219, 247]"

时间秒杀一切 submitted on 2021-01-28 08:32:13
Question: As the title says, when running the following code I get the error "Found input variables with inconsistent numbers of samples: [219, 247]". I have read that the problem should be in the np.array setup for X and y, but I cannot track it down, because there is a price for every date, so I don't see why it is happening. Any help will be appreciated, thanks! import pandas as pd import quandl, math, datetime import numpy as np from sklearn import preprocessing, svm, model_selection from sklearn.linear
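This mismatch almost always means X and y ended up with different row counts after the label column was shifted for forecasting. A minimal sketch of the usual fix, assuming the common Quandl-style setup where the label is the price shifted forward; the frame, column name, and forecast_out value here are hypothetical stand-ins for the truncated code above:

import numpy as np
import pandas as pd
from sklearn import model_selection

# Hypothetical frame standing in for the Quandl data in the question.
df = pd.DataFrame({'Adj. Close': np.random.rand(247)})

forecast_out = 28                                    # rows lost to the shift
df['label'] = df['Adj. Close'].shift(-forecast_out)  # last 28 labels are NaN

X = np.array(df.drop(columns=['label']))
X = X[:-forecast_out]               # trim X by the same amount...
y = np.array(df.dropna()['label'])  # ...so it lines up with the non-NaN labels

print(len(X), len(y))               # both 219 now; the lengths must match
X_train, X_test, y_train, y_test = model_selection.train_test_split(
    X, y, test_size=0.2)

The key point is that whatever rows the shift turns into NaN labels must be removed from X as well, so both arrays are passed to train_test_split with the same length.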

Consistent ColumnTransformer for intersecting lists of columns

ε祈祈猫儿з submitted on 2021-01-24 08:17:31
Question: I want to use sklearn.compose.ColumnTransformer sequentially (not in parallel; the second transformer should be executed only after the first) on intersecting lists of columns, in this way: log_transformer = p.FunctionTransformer(lambda x: np.log(x)) df = pd.DataFrame({'a': [1,2, np.NaN, 4], 'b': [1,np.NaN, 3, 4], 'c': [1 ,2, 3, 4]}) compose.ColumnTransformer(n_jobs=1, transformers=[ ('num', impute.SimpleImputer() , ['a', 'b']), ('log', log_transformer, ['b', 'c']), ('scale', p
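ColumnTransformer applies its transformers in parallel to the original input, so intersecting column lists do not chain. One possible workaround, sketched under the assumption that chaining two ColumnTransformers inside a Pipeline is acceptable; the to_frame step and its hard-coded column order are my own addition, not part of the question's code:

import numpy as np
import pandas as pd
from sklearn import compose, impute, preprocessing as p
from sklearn.pipeline import Pipeline

df = pd.DataFrame({'a': [1, 2, np.nan, 4],
                   'b': [1, np.nan, 3, 4],
                   'c': [1, 2, 3, 4]})

log_transformer = p.FunctionTransformer(np.log)

# Step 1: impute 'a' and 'b', pass 'c' through untouched.
impute_step = compose.ColumnTransformer(
    [('num', impute.SimpleImputer(), ['a', 'b'])],
    remainder='passthrough')

# ColumnTransformer outputs a plain ndarray, so restore the column names
# (imputed columns first, then the passthrough column) before selecting by name again.
to_frame = p.FunctionTransformer(
    lambda X: pd.DataFrame(X, columns=['a', 'b', 'c']))

# Step 2: log-transform 'b' and 'c' on the already-imputed data.
log_step = compose.ColumnTransformer(
    [('log', log_transformer, ['b', 'c'])],
    remainder='passthrough')

pipe = Pipeline([('impute', impute_step),
                 ('to_frame', to_frame),
                 ('log', log_step)])

print(pipe.fit_transform(df))

Because the second ColumnTransformer runs on the output of the first, column 'b' is imputed before it is log-transformed, which is the sequential behaviour asked about.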

Collate model coefficients across multiple test-train splits from sklearn

我的梦境 submitted on 2021-01-07 02:42:38
Question: I would like to combine the model/feature coefficients from multiple (random) test-train splits into a single dataframe in Python. Currently, my approach is to generate the model coefficients for each test-train split one at a time and then combine them at the end of the code. While this works, it is excessively verbose and does not scale to a very large number of test-train splits. Can somebody simplify my approach, perhaps with a simple for loop? My inelegant, excessively verbose,
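One way to collapse that repetition is to run the split inside a loop and append one row of coefficients per split. A minimal sketch, assuming a plain LinearRegression on synthetic data; the dataset, feature names, and number of splits are placeholders for the question's truncated code:

import pandas as pd
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=200, n_features=5, noise=10, random_state=0)
feature_names = [f'x{i}' for i in range(X.shape[1])]   # hypothetical names

rows = []
for seed in range(10):                                 # 10 random splits
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.3, random_state=seed)
    model = LinearRegression().fit(X_train, y_train)
    rows.append(pd.Series(model.coef_, index=feature_names, name=seed))

coef_df = pd.DataFrame(rows)   # one row per split, one column per feature
print(coef_df)

Increasing the number of splits then only means changing the range, not copying another block of code.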

How to view cluster centroids for each iteration of n_init using sklearn's KMeans

血红的双手。 submitted on 2020-12-31 05:15:58
Question: I am currently trying to view the centroids (cluster centers) produced by each of the n_init runs of KMeans. At the moment I can only view the final result, but I would like to see the centroids for each run so that I can report how KMeans differs when using init='random' versus preset cluster centers. The following is a brief example of what I currently have: #Creating model for Kmeans Model=[] Model=KMeans(n_clusters=5,max_iter=10,n
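KMeans only keeps the centroids of the best of its n_init runs, so a common workaround (not a built-in option) is to perform the n_init loop manually with n_init=1 and a different random_state per run, which makes every run's centroids visible. A minimal sketch on synthetic data; the blob dataset and the ten runs are my own assumptions:

import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_blobs

X, _ = make_blobs(n_samples=300, centers=5, random_state=0)  # hypothetical data

centroids_per_run, inertias = [], []
for run in range(10):                       # stands in for n_init=10
    km = KMeans(n_clusters=5, max_iter=10, n_init=1,
                init='random', random_state=run)
    km.fit(X)
    centroids_per_run.append(km.cluster_centers_)  # centroids of this run
    inertias.append(km.inertia_)

best = int(np.argmin(inertias))   # the run sklearn itself would have kept
print('best run:', best)
print(centroids_per_run[best])

The same loop can be repeated with init set to a preset array of cluster centers, so the per-run centroids of both strategies can be compared side by side.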
