Question
I have this list, lst:
lst <- list(a=c(2.5,9.8,5.0,6.7,6.5,5.2,34.4, 4.2,39.5, 1.3,0.0,0.0,4.1,0.0,0.0,25.5,196.5, 0.0,104.2,0.0,0.0,0.0,0.0,0.0),b=c(147.4,122.9,110.2,142.3))
I would like to calculate a z-score for every value in each element of the list (a and b) as (x[i] - mean(x)) / sd(x), where x is all the values of one list element taken together and x[i] is a single value within that element.
I tried with lapply:
lapply(lst,function (x) as.data.frame(apply(x,2, function(y)- lapply(lst,mean)/lapply(lst,sd))))
but there is an error. Maybe a for loop would work, such as:
lst.new <- vector("list",1)
for (i in 1:length(lst)){
  for (j in 1:dim(data.frame(lst[i]))[1]){
    res[j] <- (as.numeric(unlist(lst[i]))[j]-mean(as.numeric(unlist(lst[i]))))/
      sd(as.numeric(unlist(lst[i])))
    lst.new[[i]] <- res
  }
}
but the result is strange (I'm sure I've got the lst.new output wrong):
[[1]]
[1] -0.3635464 -0.1982809 -0.3069486 -0.2684621 -0.2729899 -0.3024208 0.3586413 -0.3250599 0.4741007 -0.3907133
[11] -0.4201442 -0.4201442 -0.3273238 -0.4201442 -0.4201442 0.1571532 4.0284412 -0.4201442 1.9388512 -0.4201442
[21] -0.4201442 -0.4201442 -0.4201442 -0.4201442
[[2]]
[1] 0.9671130 -0.4517055 -1.1871746 0.6717671 -0.2729899 -0.3024208 0.3586413 -0.3250599 0.4741007 -0.3907133
[11] -0.4201442 -0.4201442 -0.3273238 -0.4201442 -0.4201442 0.1571532 4.0284412 -0.4201442 1.9388512 -0.4201442
[21] -0.4201442 -0.4201442 -0.4201442 -0.4201442
The expected result can be a list or a data frame with elements of different lengths, such as:
a        b
-0.36    0.967113
-0.19    -0.45
[...]    [...]
and so on.
P.S.:
-0.36 == (2.5 - mean(unlist(lst[1])))/sd(unlist(lst[1]))
0.967113 == (147.4 - mean(unlist(lst[2])))/sd(unlist(lst[2]))
I would prefer to solve this with lapply (or another function from its family).
Answer 1:
Just for completeness' sake, if the scale function @akrun pointed out didn't exist, your code should have been:
lapply(lst, function(x) (x - mean(x)) / sd(x))
All those lapply calls nested within apply calls mean you're trying to calculate the mean and sd of individual values.
Let's work through it step by step.
lapply takes lst and breaks it down into elements. Each element in turn is given as the argument to your anonymous function, so the function receives a vector of numbers. Then, using R's vectorization, we calculate for every element of that vector the element minus the mean of the whole vector, divided by the sd of the whole vector.
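As a minimal sketch of the steps just described, the snippet below applies the corrected anonymous function to the lst from the question and looks at the first z-score of each element. The name z is introduced here purely for illustration.

```r
# The list from the question
lst <- list(
  a = c(2.5, 9.8, 5.0, 6.7, 6.5, 5.2, 34.4, 4.2, 39.5, 1.3, 0.0, 0.0,
        4.1, 0.0, 0.0, 25.5, 196.5, 0.0, 104.2, 0.0, 0.0, 0.0, 0.0, 0.0),
  b = c(147.4, 122.9, 110.2, 142.3)
)

# lapply hands each element (a whole numeric vector) to the function.
# mean(x) and sd(x) are single numbers, so the subtraction and division
# are applied to every value of x at once.
z <- lapply(lst, function(x) (x - mean(x)) / sd(x))

# Each result keeps the length of its input vector (24 for a, 4 for b)
lengths(z)

# First z-score of each element, to compare with the values quoted
# in the question
z$a[1]
z$b[1]
```

Note the parentheses around x - mean(x): without them, operator precedence makes R compute x - (mean(x) / sd(x)), which is not a z-score.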
Compare that with what happens in your code:
lapply(lst,function (x) as.data.frame(apply(x,2, function(y)- lapply(lst,mean)/lapply(lst,sd))))
So the first lapply breaks lst apart and sends the vectors one at a time to your function.
The function then has to break the vector down by columns (apply with dimension argument 2), which is where it throws the error. But even if it had succeeded in breaking the vector down into elements, you would then have two more lapply calls that break down that single element and calculate the mean and sd for it individually.
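The two problems described above can be reproduced in isolation. This is only a small sketch; the names v and err are chosen here for illustration and do not appear in the original code.

```r
v <- c(2.5, 9.8, 5.0)   # a plain numeric vector, like one element of lst

# A plain vector has no dim attribute, so apply() cannot split it by column
is.null(dim(v))                        # TRUE

# Capture the error message instead of stopping the script
err <- tryCatch(apply(v, 2, mean), error = function(e) conditionMessage(e))
err

# Statistics of a single value say nothing about the whole vector
mean(v[1])                             # 2.5, the value itself
sd(v[1])                               # NA, sd needs at least two values
```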
Answer 2:
Based on the input and the expected output, scale should work:
lapply(lst, scale)
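One detail worth knowing with this approach: scale returns a one-column matrix with the centring and scaling values attached as attributes, not a plain vector. The sketch below assumes plain numeric vectors are wanted in the end. The names z_mat and z_vec are illustrative, and the element a is a shortened stand-in for the one in the question.

```r
# b is taken from the question; a is cut down to four values for brevity
lst <- list(a = c(2.5, 9.8, 5.0, 6.7), b = c(147.4, 122.9, 110.2, 142.3))

z_mat <- lapply(lst, scale)
dim(z_mat$b)                         # 4 1: a one-column matrix
attr(z_mat$b, "scaled:center")       # the mean that was subtracted
attr(z_mat$b, "scaled:scale")        # the standard deviation used

# Drop the matrix structure and attributes to get plain vectors
z_vec <- lapply(lst, function(x) as.vector(scale(x)))

# Same numbers as the hand-written formula
all.equal(z_vec, lapply(lst, function(x) (x - mean(x)) / sd(x)))
```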
Source: https://stackoverflow.com/questions/53597621/lapply-and-apply-for-each-component-and-element-of-a-list-r