Performing PCA on a large sparse matrix using sklearn
Question

I am trying to apply PCA to a huge sparse matrix. The following link says that sklearn's RandomizedPCA can handle sparse matrices in scipy sparse format:

Apply PCA on very large sparse matrix

However, I always get an error. Can someone point out what I am doing wrong?

The input matrix 'X_train' contains float64 numbers:

    >>> type(X_train)
    <class 'scipy.sparse.csr.csr_matrix'>
    >>> X_train.shape
    (2365436, 1617899)
    >>> X_train.ndim
    2
    >>> X_train[0]
    <1x1617899 sparse matrix of type '<type 'numpy
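
For anyone who wants to reproduce the setup at a small scale, here is a minimal sketch. It makes two assumptions that are not in the question: `X_demo` is a hypothetical random stand-in for `X_train` (far smaller than 2365436 x 1617899), and the reduction uses sklearn's `TruncatedSVD` rather than the RandomizedPCA call being asked about. `TruncatedSVD` accepts scipy sparse input directly, but it does not center the data, so the result is only PCA-like.

```python
import numpy as np
from scipy import sparse
from sklearn.decomposition import TruncatedSVD

# Hypothetical stand-in for X_train: a small random CSR matrix of float64.
X_demo = sparse.random(
    1000, 500, density=0.01, format="csr", dtype=np.float64, random_state=0
)

# Same checks as in the question's session.
print(type(X_demo), X_demo.shape, X_demo.ndim)

# TruncatedSVD works on the sparse matrix without densifying it.
# Unlike PCA, it does not subtract the column means first.
svd = TruncatedSVD(n_components=10, algorithm="randomized", random_state=0)
X_reduced = svd.fit_transform(X_demo)

# The output is a dense array with one row per sample.
print(X_reduced.shape)
```

The number of components (10 here) is arbitrary. On a matrix as large as the one in the question, memory use and run time would need to be checked separately.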