check_arrays() limiting array dimensions in scikit-learn?

Posted by 冷暖自知 on 2019-12-13 06:47:17

Question


I would like to use sklearn.learning_curves.py, which is available in scikit-learn 0.15. After I cloned this version, several functions stopped working because check_arrays() limits arrays to at most 2 dimensions.

>>> from sklearn import metrics
>>> from sklearn.cross_validation import train_test_split
>>> import numpy as np
>>> X = np.random.random((10, 2, 2, 2))
>>> y = np.random.random((10, 2, 2, 2))
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=3)
ValueError: Found array with dim 4. Expected <= 2

Using the same X and y I get the same error.

>>> mse = metrics.mean_squared_error
>>> mse(X,y)
ValueError: Found array with dim 4. Expected <= 2

If I go to sklearn.utils.validation.py and comment out lines 272, 273, and 274 as shown below everything works just fine.

# if array.ndim >= 3:
#     raise ValueError("Found array with dim %d. Expected <= 2" %
#                      array.ndim)

Why are the dimensions of the arrays being limited to 2?


Answer 1:


Because scikit-learn follows a 2-d convention (n_samples × n_features) for all feature data. If a function or method lets a higher-dimensional array through, that is usually just an oversight, and you can't rely on it.
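One way to work within that convention, rather than patching validation.py, is to flatten every axis after the first so that each sample becomes one row of features. The sketch below uses only NumPy and is an illustration, not part of the original answer: the names X_2d, y_2d, train_idx and test_idx are made up for the example, and the index-based split merely stands in for train_test_split. The resulting 2-d arrays should be the kind of input that train_test_split and mean_squared_error expect.

```python
import numpy as np

rng = np.random.RandomState(3)
X = rng.random_sample((10, 2, 2, 2))
y = rng.random_sample((10, 2, 2, 2))

# Flatten everything after the sample axis: (10, 2, 2, 2) -> (10, 8).
X_2d = X.reshape(len(X), -1)
y_2d = y.reshape(len(y), -1)

# Split sample indices (hypothetical stand-in for train_test_split),
# so the same indices can be applied to the 2-d or the 4-d arrays.
indices = rng.permutation(len(X))
train_idx, test_idx = indices[:5], indices[5:]
X_train, X_test = X_2d[train_idx], X_2d[test_idx]
y_train, y_test = y_2d[train_idx], y_2d[test_idx]

# The element-wise mean squared error is unchanged by flattening.
mse_2d = np.mean((X_2d - y_2d) ** 2)
mse_4d = np.mean((X - y) ** 2)

# Restore the original per-sample shape when it is needed again.
X_test_4d = X_test.reshape((-1,) + X.shape[1:])
```

Reshaping only changes how the same values are indexed, so nothing is lost: the last line recovers the original per-sample layout for any subset of rows.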



Source: https://stackoverflow.com/questions/23790091/check-arrays-limiting-array-dimensions-in-scikit-learn
