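Assume we have two numeric vectors x and y. The Pearson correlation coefficient between x and y is given by

$$r_{xy} = \frac{\sum_i (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_i (x_i - \bar{x})^2}\,\sqrt{\sum_i (y_i - \bar{y})^2}}$$

where $\bar{x}$ and $\bar{y}$ are the sample means of x and y.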
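As a quick sanity check of the definition, the coefficient can be computed by hand (sum of products of the centered values, divided by the square root of the product of the two sums of squares) and compared with R's built-in cor(). This is only a sketch; pearson_manual is a hypothetical helper name, not something from the question.

```r
# Pearson correlation computed directly from its definition.
# pearson_manual is an illustrative helper, not a base R function.
pearson_manual <- function(x, y) {
  dx <- x - mean(x)  # centered x
  dy <- y - mean(y)  # centered y
  sum(dx * dy) / sqrt(sum(dx^2) * sum(dy^2))
}

x <- cars$dist
y <- cars$speed

r_manual  <- pearson_manual(x, y)
r_builtin <- cor(x, y)  # method = "pearson" is the default

# TRUE if both agree up to floating-point tolerance
all.equal(r_manual, r_builtin)
```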
You might try bootstrapping your data (here, repeatedly drawing random 90% subsamples without replacement) to find the highest correlation coefficient, e.g.:
x <- cars$dist
y <- cars$speed
percent <- 0.9 # given in the question above
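n <- 1000 # number of resamples
# sample(N, size = k) draws k distinct indices from 1..N; sample(k) alone would only permute 1..k
boot.cor <- replicate(n, {tmp <- sample(length(x), size = round(length(x) * percent), replace = FALSE); cor(x[tmp], y[tmp])})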
Afterwards, run max(boot.cor). Do not be disappointed if the correlation coefficients all turn out to be nearly the same :)
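If you also want to know which rows produced the highest coefficient, one option is to store the sampled indices instead of discarding them. The following is a minimal sketch under that assumption; the names k, idx, sub.cor and best, and the seed value, are illustrative choices and not part of the question.

```r
set.seed(1)  # arbitrary seed, only so the run is reproducible

x <- cars$dist
y <- cars$speed
percent <- 0.9
n <- 1000
k <- round(length(x) * percent)  # subset size (45 of the 50 rows in cars)

# Each column of idx holds the row indices of one random subset of size k
idx <- replicate(n, sample(length(x), size = k, replace = FALSE))

# Correlation of x and y within each subset
sub.cor <- apply(idx, 2, function(i) cor(x[i], y[i]))

best <- which.max(sub.cor)
max(sub.cor)        # highest correlation among the n subsets
sort(idx[, best])   # rows of cars that make up that subset
range(sub.cor)      # spread of the coefficients across subsets
```

Keep in mind that the maximum over many subsets is a selected value, so it will generally overstate the correlation you would see in new data.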