Question
I have a fairly simple R script. It loads two data frames and then performs rCCA with mixOmics:
system('defaults write org.R-project.R force.LANG en_US.UTF-8')
## install.packages("mixOmics")
library(mixOmics)
TCIA <- read.csv("/Users/kimrants/Desktop/Data_for_R/TCIA",
                 header = TRUE,
                 sep = ",",
                 stringsAsFactors = FALSE)
TCGA <- read.csv("/Users/kimrants/Desktop/Data_for_R/TCGA",
                 header = TRUE,
                 sep = ",",
                 stringsAsFactors = FALSE)
# Remove first column (of ID)
df_TCGA <- TCGA[,-1]
df_TCIA <- TCIA[,-1]
data.shrink <- rcc(X=df_TCIA, Y=df_TCGA, ncomp = 5, method = 'shrinkage')
plot(data.shrink, scree.type = "barplot")
grid1 <- seq(0, 0.2, length = 5)
grid2 <- seq(0.0001, 0.2, length = 5)
cv <- tune.rcc(df_TCIA, df_TCGA,
               grid1 = grid1, grid2 = grid2, validation = "loo")
result <- rcc(df_TCIA, df_TCGA, ncomp = 5,
              lambda1 = cv$opt.lambda1, lambda2 = cv$opt.lambda2)
However, when running the second-to-last line, I get this error:
Error in chol.default(Cxx) : the leading minor of order 4 is not positive definite
I have visited the documentation for similar errors: http://mixomics.org/faq/parameters-tuning/
Here, it says: "This is mostly likely to occur when encountering singular matrices, where the total number of variables from both data sets is much larger than the number of samples. We suggest using regularized CCA" ... But I am already using rCCA, so I don't know how to fix this.
Answer 1:
The error you are seeing occurs when some of the eigenvalues of the matrix you are trying to operate on are not positive (typically they'll be zero, or below some very small threshold); this means, essentially, that your data are too noisy/small to estimate a full covariance matrix.
Regularizing means (approximately) adding a penalty term to push your estimates away from zero (in this case, pushing your matrices away from having non-positive eigenvalues). If your regularization parameters (lambda1, lambda2) are too small, then you'll still get the error. Since your grid1 and grid2 sequences start from zero or very small values, rCCA will choke on these too-small values.
Try setting your grid1 and grid2 sequences to start at a larger value, e.g.
grid1 <- grid2 <- seq(0.05, 0.2, length=5)
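To see why a too-small lambda fails, here is a minimal base-R sketch on simulated data (not your TCIA/TCGA files). It assumes that ridge-style regularization amounts to adding lambda to the diagonal of the covariance matrix, which is my understanding of what rcc does, and `chol_ok` is a hypothetical helper written only for this illustration.

```r
set.seed(1)

# Simulated stand-in for the real data: 10 samples, 30 variables (n < p)
X <- matrix(rnorm(10 * 30), nrow = 10, ncol = 30)
Cxx <- var(X)  # 30 x 30 sample covariance; its rank is at most 9

# With fewer samples than variables, the smallest eigenvalue is zero
# up to rounding error (it may even come out slightly negative)
min_eig <- min(eigen(Cxx, symmetric = TRUE, only.values = TRUE)$values)

# Hypothetical helper: does chol() succeed once lambda is added
# to the diagonal of S?
chol_ok <- function(S, lambda) {
  ridge <- S + diag(lambda, ncol(S))
  !inherits(try(chol(ridge), silent = TRUE), "try-error")
}

ok_zero  <- chol_ok(Cxx, 0)     # no regularization
ok_ridge <- chol_ok(Cxx, 0.05)  # lambda taken from the suggested grid

print(min_eig)
print(c(lambda_0 = ok_zero, lambda_0.05 = ok_ridge))
```

With lambda = 0 the factorization will typically fail with the same "leading minor ... not positive" error, because the matrix is singular. Adding 0.05 to the diagonal shifts every eigenvalue up by 0.05, so the Cholesky factorization goes through. How large lambda needs to be for your own data depends on its scale and on how the variables are correlated, so treat 0.05 as a starting point rather than a rule.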
Source: https://stackoverflow.com/questions/51064686/error-in-chol-defaultcxx-the-leading-minor-of-order-is-not-positive-definite