PCA on transposed data

Posted by 大憨熊 on 2020-12-13 10:36:23

Question


I am using R to run a PCA. Everything worked fine until I realized I should be working with the transpose of my data set. When I tried to run the PCA on the transposed data, however, I could not get it to work:

> sum(is.na(data_t))
[1] 1367
> dim(data_t)
[1]  599 9505
> data_t[1:4,1:4]
                             2'-PDE    7A5      A1BG     A2M
TCGA.A1.A0SD.01A.11R.A115.07  0.0153750 2.4105 0.9493333 0.24200
TCGA.A1.A0SE.01A.11R.A084.07  0.4669375 0.3635 0.2798333 1.03850
TCGA.A1.A0SH.01A.11R.A084.07 -0.0295625 1.8550 0.7486667 1.16050
TCGA.A1.A0SJ.01A.11R.A084.07  0.7919375 1.4080 0.7500000 1.67775

> pca2<-princomp(~.,data=data_t, na.action=na.omit)
 Error in `[.data.frame`(mf, , x) : undefined columns selected

> pca2<-princomp(data_t, na.action=na.omit)
 Error in princomp.default(data_t, na.action = na.omit) : 
  'princomp' can only be used with more units than variables

It turns out that you cannot use princomp when you have more variables than units, but you can use prcomp instead (see R - 'princomp' can only be used with more units than variables). I still get errors with that, though:

> pca2<-prcomp(data_t,na.action=na.omit)
 Error in svd(x, nu = 0) : infinite or missing values in 'x'

> pca2<-prcomp(~ ., data=data_t, na.action=na.omit, scale=TRUE)
 Error in `[.data.frame`(mf, , x) : undefined columns selected
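
The two default-method errors above can be reproduced on toy data. The following is a minimal sketch, not the poster's data: `wide` and `wide_na` are made-up matrices with more columns than rows. It illustrates that princomp refuses such input while prcomp accepts it, and that prcomp's default (matrix) method fails as soon as the matrix contains an NA. As far as I can tell, na.action is only honoured by the formula method, which would explain why passing it alongside a plain matrix does not help.

```r
set.seed(1)

# Hypothetical toy data: 5 units (rows), 20 variables (columns)
wide <- matrix(rnorm(5 * 20), nrow = 5)

# princomp needs more units than variables, so this call errors;
# capture the message instead of stopping the script
princomp_msg <- tryCatch(princomp(wide),
                         error = function(e) conditionMessage(e))

# prcomp is SVD-based and handles the wide case
pca <- prcomp(wide)

# A single NA is enough to make the default method fail
wide_na <- wide
wide_na[1, 1] <- NA
na_msg <- tryCatch(prcomp(wide_na),
                   error = function(e) conditionMessage(e))

print(princomp_msg)
print(na_msg)
print(dim(pca$x))
```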

Answer 1:


I had the same problem. For me, it worked when I assigned column names (other than numeric) to my data.frame. For example, when colnames(mydf) was (1,2,3,4,5), I got this error:

Error in `[.data.frame`(mf, , x) : undefined columns selected

What I did was:

colnames(mydf) <- paste("var", 1:5, sep="")

and then ran the princomp function:

mypca <- princomp(~ ., data=mydf, cor=FALSE, na.action=na.exclude)

and had no problems.
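
For names that are not simply 1 to 5 (the question's data has a column called 2'-PDE, for instance), base R's make.names() can generate syntactically valid names in one step. This is only a sketch on made-up data (`mydf` here is a random 10 x 5 data frame, not the poster's), and it assumes that non-syntactic column names are what trips up the formula interface:

```r
set.seed(1)

# Hypothetical toy data frame with purely numeric column names
mydf <- as.data.frame(matrix(rnorm(10 * 5), nrow = 10))
colnames(mydf) <- as.character(1:5)

# Turn every name into a syntactically valid, unique R name
colnames(mydf) <- make.names(colnames(mydf), unique = TRUE)
print(colnames(mydf))

# The formula interface can now resolve every column
mypca <- princomp(~ ., data = mydf, cor = FALSE, na.action = na.exclude)
print(summary(mypca))
```

Keeping a copy of the original names before renaming makes it possible to map the loadings back to the real variable labels afterwards.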




Answer 2:


Seems like R does not like it when there is missing data and you try to use a formula with all of the variables. So this ended up working:

pca2<-prcomp(na.omit(data_t), scale=TRUE)

Of course, na.omit drops every row (here, every sample) that contains a missing value, so the PCA runs only on the complete rows.
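
To see what na.omit() actually removes, and to compare it with dropping the incomplete columns instead, here is a small sketch on made-up data. The matrix `toy` and its row and column labels are hypothetical; which of the two strategies is appropriate depends on whether you can better afford to lose samples or variables.

```r
set.seed(42)

# Hypothetical toy matrix: 6 samples (rows) x 8 variables (columns)
toy <- matrix(rnorm(6 * 8), nrow = 6,
              dimnames = list(paste0("s", 1:6), paste0("g", 1:8)))
toy[2, 3] <- NA
toy[5, 7] <- NA

# na.omit() removes the ROWS that contain an NA (s2 and s5 here)
by_rows <- na.omit(toy)

# Alternative: keep all rows and remove the COLUMNS that contain an NA
by_cols <- toy[, colSums(is.na(toy)) == 0, drop = FALSE]

print(dim(by_rows))
print(dim(by_cols))

# Either complete matrix can be passed to prcomp
pca_rows <- prcomp(by_rows, scale. = TRUE)
pca_cols <- prcomp(by_cols, scale. = TRUE)
```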



Source: https://stackoverflow.com/questions/14325880/pca-on-transposed-data
