Dimension of data before and after performing PCA
Question

I'm attempting kaggle.com's digit recognizer competition using Python and scikit-learn.

After removing the labels from the training data, I add each row of the CSV to a list like this:

    for row in csv:
        train_data.append(np.array(np.int64(row)))

I do the same for the test data.

I pre-process this data with PCA in order to perform dimension reduction (and feature extraction?):

    def preprocess(train_data, test_data, pca_components=100):
        # convert to matrix
        train_data = np.mat(train_data)

        # reduce both
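
The preprocessing snippet above is cut off, so the following is only a minimal sketch of the usual pattern, not the asker's actual code: fit scikit-learn's `PCA` on the training rows, reuse the fitted object to project the test rows, and print the shapes before and after. The random arrays, the 784-column width (28x28 pixel images), the row counts, and the helper name `reduce_with_pca` are assumptions made for illustration.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the digit data (assumption): 784 pixel columns per row.
rng = np.random.default_rng(0)
train_data = rng.integers(0, 256, size=(500, 784)).astype(np.float64)
test_data = rng.integers(0, 256, size=(200, 784)).astype(np.float64)


def reduce_with_pca(train, test, pca_components=100):
    """Hypothetical helper: fit PCA on the training rows, then project both sets."""
    pca = PCA(n_components=pca_components)
    train_reduced = pca.fit_transform(train)  # learns the components from train
    test_reduced = pca.transform(test)        # reuses the same components
    return train_reduced, test_reduced


train_reduced, test_reduced = reduce_with_pca(train_data, test_data)

# The row count is unchanged; only the column count drops to pca_components.
print(train_data.shape, "->", train_reduced.shape)
print(test_data.shape, "->", test_reduced.shape)
```

In this sketch the number of rows stays the same and only the number of columns changes, from 784 to `pca_components`. Fitting on the training rows alone and calling `transform` on the test rows keeps both sets in the same reduced feature space.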