PCA

How do I find the angles between an original and a rotated PCA loadings matrix?

Submitted by 只谈情不闲聊 on 2019-12-11 11:07:47
Question: Suppose I have two matrices of PCA loadings, loa.orig and loa.rot, and I know that loa.rot is a rotation (by hand or otherwise) of loa.orig. (loa.orig might itself already have been orthogonally rotated by varimax or similar, but I don't think that matters.) I now want to know the angles by which loa.orig has been rotated to arrive at loa.rot. I understand from this comment on another question that "rotations don't commute", and therefore the order of pair-wise (plane-wise) rotations …
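Since loa.rot = loa.orig %*% R, the rotation matrix R can be recovered by least squares (with orthonormal loadings this reduces to t(loa.orig) %*% loa.rot), and for a single plane rotation the angle can be read off its 2x2 block; with more than two components, R factors into plane rotations whose order matters, which is the non-commutativity mentioned above. A minimal numpy sketch of the idea, with made-up loadings rather than the questioner's data:

    import numpy as np

    # Made-up orthonormal loadings (5 variables, 2 components),
    # rotated by 30 degrees in the plane of the two components.
    rng = np.random.default_rng(0)
    loa_orig, _ = np.linalg.qr(rng.normal(size=(5, 2)))
    theta = np.radians(30)
    R_true = np.array([[np.cos(theta), -np.sin(theta)],
                       [np.sin(theta),  np.cos(theta)]])
    loa_rot = loa_orig @ R_true

    # Recover R from loa_rot = loa_orig @ R by least squares.
    R = np.linalg.lstsq(loa_orig, loa_rot, rcond=None)[0]

    # For a single plane rotation, the angle falls out of the 2x2 block.
    angle = np.degrees(np.arctan2(R[1, 0], R[0, 0]))
    print(round(angle, 2))  # ~30.0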

Scores not being generated from princomp, unable to generate biplot

Submitted by 你离开我真会死。 on 2019-12-11 10:57:24
Question: I am having issues with princomp, specifically biplot, when I want to use a covariance or correlation matrix not generated by princomp itself. For simplicity I will use a much smaller dataset than the one I am dealing with:

    cr <- cov.wt(USArrests)
    biplot(princomp(data = USArrests, covmat = cr))

gives me the error

    Error in biplot.princomp(princomp(data = USArrests, covmat = cr)) :
      object 'princomp(data = USArrests, covmat = cr)' has no scores

Seems like something simple is going on here, but …

Apply PCA to the test data

Submitted by 走远了吗. on 2019-12-11 07:32:26
Question: I am trying to perform the Python implementation of PCA using sklearn. I have created the following function:

    def dimensionality_reduction(train_dataset_mod1, train_dataset_mod2, test_dataset_mod1, test_dataset_mod2):
        pca = PCA(n_components=200)
        pca.fit(train_dataset_mod1.transpose())
        mod1_features_train = pca.components_
        pca2 = PCA(n_components=200)
        pca2.fit(train_dataset_mod2.transpose())
        mod2_features_train = pca2.components_
        mod1_features_test = pca.transform(test_dataset_mod1)
        mod2 …
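Two things in that function look off: the PCA is fitted on the transposed training data but applied to the untransposed test data, and pca.components_ (the principal axes) is taken as the reduced training features instead of the projected scores. The conventional sklearn pattern, shown here as a minimal sketch with made-up array shapes, is to keep samples as rows throughout, take fit_transform on the training set, and reuse the same fitted object on the test set:

    import numpy as np
    from sklearn.decomposition import PCA

    # Made-up data: rows are samples, columns are features, as sklearn expects.
    rng = np.random.default_rng(0)
    train = rng.normal(size=(500, 1000))
    test = rng.normal(size=(100, 1000))

    pca = PCA(n_components=200)
    train_reduced = pca.fit_transform(train)  # fit on train, project train
    test_reduced = pca.transform(test)        # project test with the same fit

    print(train_reduced.shape, test_reduced.shape)  # (500, 200) (100, 200)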

Keep CSV feature labels for LDA PCA

Submitted by 半城伤御伤魂 on 2019-12-11 07:25:14
Question: I am trying to use the 2000 topics' top-20 frequency data at https://github.com/wwbp/facebook_topics/tree/master/csv. I would like to perform RandomizedPCA on the data. From the documentation, X needs to be array-like, shape (n_samples, n_features). I have imported the file with

    LDA_topics = pd.read_csv(r'2000topics.top20freqs.keys.csv', header=None, index_col=0, error_bad_lines=False)

however, this is not the right format for the following line:

    pca2 = sklearn.decomposition.RandomizedPCA(n …
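PCA wants a plain numeric array, so one workable route is to keep the labels in the DataFrame index, fit on the numeric values, and reattach the index to the reduced output. A minimal sketch, assuming the first column holds the topic labels and the remaining columns are numeric (adjust to the actual CSV layout); note that RandomizedPCA has since been removed from sklearn in favour of PCA(svd_solver='randomized'):

    import pandas as pd
    from sklearn.decomposition import PCA

    # index_col=0 keeps the topic labels as the index, out of the data matrix.
    df = pd.read_csv('2000topics.top20freqs.keys.csv', header=None, index_col=0)
    X = df.select_dtypes('number').values  # numeric matrix for sklearn

    pca = PCA(n_components=10, svd_solver='randomized')
    scores = pca.fit_transform(X)

    # Reattach the labels from the index to the reduced data.
    reduced = pd.DataFrame(scores, index=df.index)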

scikit-learn PCA dimension reduction - data with many features and few samples

Submitted by 旧城冷巷雨未停 on 2019-12-11 07:08:52
Question: I am trying to do a dimension reduction using PCA from scikit-learn. My data set has around 300 samples and 4096 features. I want to reduce the dimensions to 400 and to 40, but when I call the algorithm, the resulting data has at most "number of samples" features.

    from sklearn.decomposition import PCA
    pca = PCA(n_components = 400)
    trainData = pca.fit_transform(trainData)
    testData = pca.transform(testData)

where the initial shape of trainData is 300x4096 and the resulting data shape is 300x300.
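This behaviour is expected: PCA can return at most min(n_samples, n_features) components, because a 300x4096 matrix has rank at most 300, so the centered covariance has at most 299 nonzero eigenvalues and there are simply no further directions of variance to extract. Reducing to 40 works; 400 cannot. Depending on the sklearn version, asking for 400 is either silently capped at 300 (as observed here) or rejected with a ValueError. A minimal sketch with stand-in random data:

    import numpy as np
    from sklearn.decomposition import PCA

    X = np.random.default_rng(0).normal(size=(300, 4096))

    # 40 <= min(300, 4096), so this is fine.
    print(PCA(n_components=40).fit_transform(X).shape)  # (300, 40)

    # n_components=400 exceeds min(n_samples, n_features)=300 and cannot
    # yield new components; recent sklearn raises ValueError for it.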

3D Biplot in plotly - R

Submitted by 拟墨画扇 on 2019-12-11 06:09:15
Question: I want to build a 3D PCA biplot using the plotly package, because the graph is nice and interactive in HTML format (something that I need). My difficulty is adding the loadings: I want the loadings to be drawn as straight lines from the point (0,0,0) (i.e. the equivalent of a 2D biplot). So, all in all, I don't know how to add straight lines starting from the centre of the 3D graph. I have calculated the scores and loadings using the PCA function:

    pca1 <- PCA(dat1, graph = F)

for scores:

    ind1 <- …
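The general plotly trick, in any language binding, is to add one line trace per loading whose coordinates run from the origin to the loading point. A minimal sketch in Python plotly with made-up loadings (the R API mirrors it via add_trace(type = "scatter3d", mode = "lines")):

    import plotly.graph_objects as go

    # Made-up loadings for three variables on the first three PCs.
    loadings = {"var1": (0.8, 0.1, 0.2),
                "var2": (-0.3, 0.7, 0.1),
                "var3": (0.2, -0.4, 0.6)}

    fig = go.Figure()
    for name, (x, y, z) in loadings.items():
        # One line per loading, anchored at (0, 0, 0), labelled at the tip.
        fig.add_trace(go.Scatter3d(x=[0, x], y=[0, y], z=[0, z],
                                   mode="lines+text", text=["", name],
                                   showlegend=False))
    fig.show()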

Saving a PCA object in OpenCV

Submitted by 流过昼夜 on 2019-12-11 04:24:33
Question: I'm working on a face recognition project in which we are using PCA to reduce the feature vector size of an image. The trouble is that, during training, I create the PCA object from all the training images. Then, during testing, I need the PCA object obtained earlier, and I cannot figure out how to write the PCA object to a file so that I can use it during testing. One alternative is to write its eigenvectors to the file, but it would be so much more convenient to write the object …
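In recent OpenCV (3.x and later) the C++ cv::PCA class has write()/read() methods that serialize it through a cv::FileStorage. From Python, where cv2 exposes PCA as functions rather than a persistent object, a workable alternative is to persist the arrays that constitute the PCA state. A minimal sketch with made-up image dimensions:

    import cv2
    import numpy as np

    # Made-up training matrix: one flattened face image per row.
    train = np.random.default_rng(0).normal(size=(50, 4096)).astype(np.float32)

    # PCACompute2 returns the pieces of the PCA "object" as numpy arrays.
    mean, eigvecs, eigvals = cv2.PCACompute2(train, mean=None, maxComponents=20)

    # Persisting those arrays persists the PCA state.
    np.savez("pca_state.npz", mean=mean, eigvecs=eigvecs, eigvals=eigvals)

    # At test time, reload and project with PCAProject.
    state = np.load("pca_state.npz")
    test_vec = train[:1]  # stand-in for a test image
    projected = cv2.PCAProject(test_vec, state["mean"], state["eigvecs"])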

Scikit-learn (sklearn) PCA throws TypeError on sparse matrix

Submitted by 我们两清 on 2019-12-11 03:25:38
Question: From the documentation of sklearn's RandomizedPCA, sparse matrices are accepted as input. However, when I called it with a sparse matrix, I got a TypeError:

    > sklearn.__version__
    '0.16.1'
    > pca = RandomizedPCA(n_components=2)
    > pca.fit(my_sparce_mat)
    TypeError: A sparse matrix was passed, but dense data is required. Use X.toarray() to convert to a dense numpy array.

I obtained the same error using fit_transform. Any suggestion on how to get it to work?

Answer 1: The answer is that it is not possible …
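The cut-off answer is right that RandomizedPCA itself cannot do it: PCA in sklearn mean-centers the data, and centering a sparse matrix would densify it. The workaround generally suggested is TruncatedSVD, which accepts sparse input precisely because it skips the centering step. A minimal sketch with a stand-in sparse matrix:

    from scipy.sparse import random as sparse_random
    from sklearn.decomposition import TruncatedSVD

    # Stand-in for the question's my_sparce_mat.
    X = sparse_random(1000, 500, density=0.01, format="csr", random_state=0)

    # TruncatedSVD works on sparse input because it does no mean-centering.
    svd = TruncatedSVD(n_components=2, random_state=0)
    X_reduced = svd.fit_transform(X)
    print(X_reduced.shape)  # (1000, 2)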

Why are there differences in psych::principal between “Varimax” and “varimax”?

Submitted by 百般思念 on 2019-12-11 03:16:37
Question: In a related question, I have asked why there are differences between stats::varimax and GPArotation::Varimax, both of which psych::principal calls depending on the option set for rotate =. The differences between these two (see the other question) account for some, but not all, of the differences from psych::principal. It appears that these differences somehow get exacerbated by psych::principal. (I have a simple theory why, and I'd like to get that confirmed.)

    library(GPArotation)
    library …

sklearn StandardScaler difference between “with_std=False or True” and “with_mean=False or True”

Submitted by 时光毁灭记忆、已成空白 on 2019-12-11 02:24:42
Question: I am trying to standardize some data in order to apply PCA to it. I am using sklearn.preprocessing.StandardScaler, and I am having trouble understanding the difference between using True or False for the parameters with_mean and with_std. Here is the description of the command: https://scikit-learn.org/stable/modules/generated/sklearn.preprocessing.StandardScaler.html Can someone give a more extended explanation? Thank you very much!

Answer 1: I have provided more details in this post: https:/ …
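In short: with_mean=True subtracts each column's mean, with_std=True divides each column by its standard deviation, and the two flags act independently. For PCA, the centering is usually the essential part, since PCA assumes mean-centered data. A minimal sketch on a toy column (mean 2, population std ≈ 0.816):

    import numpy as np
    from sklearn.preprocessing import StandardScaler

    X = np.array([[1.0], [2.0], [3.0]])

    print(StandardScaler(with_mean=True,  with_std=False).fit_transform(X).ravel())
    # [-1.  0.  1.]            (centered only)
    print(StandardScaler(with_mean=False, with_std=True).fit_transform(X).ravel())
    # [1.22  2.45  3.67]       (scaled only, not centered)
    print(StandardScaler(with_mean=True,  with_std=True).fit_transform(X).ravel())
    # [-1.22  0.    1.22]      (z-scores: centered and scaled)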