Question
I would like to use the sklearn.learning_curves.py available in scikit-learn 0.15. After I cloned this version, several functions stopped working because check_arrays() limits arrays to 2 dimensions.
>>> from sklearn import metrics
>>> from sklearn.cross_validation import train_test_split
>>> import numpy as np
>>> X = np.random.random((10,2,2,2))
>>> y = np.random.random((10,2,2,2))
>>> X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.5, random_state=3)
ValueError: Found array with dim 4. Expected <= 2
Passing the same X and y to a metric raises the same error:
>>> mse = metrics.mean_squared_error
>>> mse(X,y)
ValueError: Found array with dim 4. Expected <= 2
If I open sklearn.utils.validation.py and comment out lines 272, 273, and 274, as shown below, everything works just fine.
# if array.ndim >= 3:
#     raise ValueError("Found array with dim %d. Expected <= 2" %
#                      array.ndim)
Why are the dimensions of the arrays being limited to 2?
Answer 1:
Because scikit-learn uses a 2-d convention (n_samples × n_features) for all feature data. If a function or method lets a higher-dimensional array through, that is usually just an oversight, and you can't really rely on it.
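Given that convention, one workaround that avoids patching the library is to flatten every sample into a single feature row before handing the data to scikit-learn, then restore the original layout afterwards. The snippet below is only a sketch of that idea using NumPy; `flatten_samples` is a hypothetical helper name, not part of scikit-learn, and it assumes the first axis indexes samples.

```python
import numpy as np


def flatten_samples(a):
    """Reshape (n_samples, d1, d2, ...) into (n_samples, n_features).

    Hypothetical helper: assumes axis 0 is the sample axis.
    """
    a = np.asarray(a)
    return a.reshape(a.shape[0], -1)


rng = np.random.RandomState(3)
X = rng.random_sample((10, 2, 2, 2))
y = rng.random_sample((10, 2, 2, 2))

# Each sample's 2x2x2 block becomes one row of 8 features.
X2d = flatten_samples(X)
y2d = flatten_samples(y)

# Mean squared error averages over every element, so the reshape
# does not change its value.
mse_4d = np.mean((X - y) ** 2)
mse_2d = np.mean((X2d - y2d) ** 2)

# The original layout can be recovered after any 2-d-only step.
X_restored = X2d.reshape(X.shape)
```

The flattened `X2d` and `y2d` have the (n_samples, n_features) shape the answer describes, so they should satisfy the dimension check in `train_test_split` and `mean_squared_error`; the rows returned by a split can be reshaped back the same way.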
Source: https://stackoverflow.com/questions/23790091/check-arrays-limiting-array-dimensions-in-scikit-learn