How to extract components after performing principal component regression for further analysis in R caret package

Posted by 眉间皱痕 on 2019-12-09 01:48:50

Question


I have a dataset with 151 variables that turned out to be highly collinear, so I performed principal component regression on it as follows:

library(caret)
ctrl <- trainControl(method = "repeatedcv", repeats = 10, savePredictions = TRUE)
model <- train(RT..seconds. ~ ., data = cadets100, method = "pcr", trControl = ctrl)

which gives me RMSE = 65.7 and R-squared = 0.443.

How can I extract these components afterwards so that I can apply further analysis to them (e.g. fit an SVM or a random forest)?


Answer 1:


If you want to fit an SVM, a random forest, or any other second-stage model on top of the scores of your PCs, there is a shortcut for that, so you don't have to reinvent what the caret package already does.

You can do the following:

library(kernlab)  # provides sigest()

set.seed(1)
# Estimate a plausible range for the RBF kernel's sigma from the data
sigDist <- sigest(RT..seconds. ~ ., data = cadets100, frac = 1)

svmGrid <- expand.grid(.sigma = sigDist, .C = 2^(-2:7))
set.seed(2)
svmPCAFit <- train(RT..seconds. ~ .,
                   data = cadets100,
                   method = "svmRadial",
                   tuneGrid = svmGrid,
                   preProcess = c("center", "scale", "pca"), # if center and scale needed
                   trControl = ctrl)

This way, PCA is estimated on the training portion of each resample and applied to the corresponding held-out portion, and the scores are used instead of the original predictors for the SVM. So you don't need to do it yourself; caret does it for you automatically. Everything you pass in preProcess is also applied to new data, whether that is a held-out CV fold or a separate test set.
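If you do want the component scores themselves in hand, for inspection or for a model outside train(), one possible approach is to call caret's preProcess() directly and apply it with predict(). The following is only a minimal sketch: the built-in mtcars data stands in for cadets100 (which is not available here), mpg plays the role of RT..seconds., and the names predictors, pp, scores and pcData are made up for illustration.

```r
library(caret)

# Assumption: mtcars is a stand-in for cadets100, with mpg as the outcome
predictors <- mtcars[, setdiff(names(mtcars), "mpg")]

# Estimate centering, scaling and PCA on the predictors only;
# thresh = 0.95 keeps enough components to explain 95% of the variance
pp <- preProcess(predictors, method = c("center", "scale", "pca"), thresh = 0.95)

# predict() applies the stored transformation and returns the PC scores
scores <- predict(pp, predictors)
head(scores)

# The loadings (rotation matrix) are stored on the preProcess object
pp$rotation

# Combine the scores with the outcome to feed any other model
pcData <- data.frame(mpg = mtcars$mpg, scores)
```

Note that scores computed once on the full data set like this have already seen every observation, so if you later cross-validate a model on pcData the resampling estimates will tend to be optimistic. Passing preProcess to train(), as above, avoids that.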

However, if you want to perform PLS, which is a supervised method (unlike PCA), before passing the scores to the next model, then you have to write a custom model for caret (see here). For more examples, you can also study the code here, where you will find two custom models, one for PLS-RF and one for PLS-LDA.



Source: https://stackoverflow.com/questions/20505613/how-to-extract-components-after-performing-principal-component-regression-for-fu
