My simple question is: How do you do a ks.test between two data frames column by column?
E.g., we have two data frames:
D1 <- data.fra
I created two data frames, D1 and D2, filled with random numbers and sharing the same column names.
set.seed(12)
D1 = data.frame(A=rnorm(n = 30,mean = 5,sd = 2.5),B=rnorm(n = 30,mean = 4.5,sd = 2.2),C=rnorm(n = 30,mean = 2.5,sd = 12))
D2 = data.frame(A=rnorm(n = 30,mean = 5,sd = 2.49),B=rnorm(n = 30,mean = 4.4,sd = 2.2),C=rnorm(n = 30,mean = 2,sd = 12))
Now we can loop over the column names and, for each name, pull that column from both D1 and D2 and run ks.test on the pair.
col.names = colnames(D1)
lapply(col.names,function(t,d1,d2){ks.test(d1[,t],d2[,t])},D1,D2)
#[[1]]
#Two-sample Kolmogorov-Smirnov test
#data: d1[, t] and d2[, t]
#D = 0.167, p-value = 0.81
#alternative hypothesis: two-sided
#[[2]]
#Two-sample Kolmogorov-Smirnov test
#data: d1[, t] and d2[, t]
#D = 0.233, p-value = 0.39
#alternative hypothesis: two-sided
#[[3]]
#Two-sample Kolmogorov-Smirnov test
#data: d1[, t] and d2[, t]
#D = 0.2, p-value = 0.59
#alternative hypothesis: two-sided
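If you only need the statistic and p-value per column rather than the full printed test objects, the results can be condensed into a small matrix. This is just a sketch, assuming the same D1 and D2 as above (recreated here so the snippet runs on its own); the name ks.summary is my own choice, not anything standard.

```r
# Recreate the example data (same calls as above, arguments passed by position)
set.seed(12)
D1 = data.frame(A=rnorm(30, 5, 2.5), B=rnorm(30, 4.5, 2.2), C=rnorm(30, 2.5, 12))
D2 = data.frame(A=rnorm(30, 5, 2.49), B=rnorm(30, 4.4, 2.2), C=rnorm(30, 2, 12))

# For each column name, run the two-sample test and keep only D and the p-value.
# sapply() returns one column per name, so t() flips it to one row per column.
ks.summary = t(sapply(colnames(D1), function(nm) {
  res = ks.test(D1[[nm]], D2[[nm]])
  c(D = unname(res$statistic), p.value = res$p.value)
}))

print(ks.summary)
```

Because the rows are labelled with the column names, ks.summary["B", "p.value"] picks out a single result directly.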
With the names used in your question (D and S), the same approach should work:
col.names = colnames(S)
lapply(col.names,function(t,d1,d2){ks.test(d1[,t],d2[,t])},D,S)
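One caveat: this assumes every column name in S also exists in D. If that might not hold, a possible variant is to test only the shared columns. The D and S below are made-up stand-ins, since I don't have your actual data, and the variable names shared and res are just illustrative.

```r
# Hypothetical stand-ins for the question's D and S (not the real data)
set.seed(1)
D = data.frame(A=rnorm(20), B=rnorm(20), X=rnorm(20))
S = data.frame(B=rnorm(20), A=rnorm(20), Y=rnorm(20))

# Keep only the column names present in both data frames
shared = intersect(colnames(D), colnames(S))

# Run ks.test on each shared column, matching by name rather than position
res = lapply(shared, function(nm) ks.test(D[[nm]], S[[nm]]))
names(res) = shared  # label each result with its column name
```

Matching by name also means the column order in the two data frames does not matter.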