Question
Here is a simple R question which, I think, comes down to understanding list syntax correctly. I have a series of matrices loaded into a list (following some preliminary calculations), on which I then want to do some basic block averaging. My basic workflow is as follows:
1) Round the length of each vector in the list up to the nearest multiple of the block size I want to average over.
2) Pad each vector in the list with NAs to this new length.
3) Convert each vector in the list to a matrix, to which I will then apply colMeans, ignoring NAs.
This very basic workflow follows the simple approach shown here for a vector: http://www.cookbook-r.com/Manipulating_data/Averaging_a_sequence_in_blocks/
However, I have a list of vectors and not just a single vector. For example, for blocks of two:
test1 <- list(a=c(1,2,3,4), b=c(2,4,6,8,10), c=c(3,6))
# Round up the length of each vector to the nearest multiple of 2
newlength <- lapply(test1, function(x) {ceiling(length(x)/2)*2})
Now to my problem. If this were a single vector outside a list, I would normally pad it with NAs as follows:
test1[newlength] <- NA
But how do I do this using lapply (or something similar, such as mapply)? I am obviously not thinking about the syntax correctly here:
lapply(test1, function(x) {x[newlength] <- NA})
This returns the error:
Error in x[newlength] <- NA : invalid subscript type 'list'
since newlength is itself a list and cannot be used as a subscript. So how should I do this correctly?
Just to finish the process, in case there is an entirely better way of doing this: at the end I would normally do the following to a vector:
# Convert to a matrix with 2 rows
test1 <- matrix(test1, nrow=2)
# Take the means of the columns, and ignore any NA's
colMeans(test1, na.rm=TRUE)
Would I be better off moving out of the list structure first? My reason for using a list is that I have a large dataset, and a list seemed the more elegant approach. I am open to suggestions and more logical approaches, however. Thanks.
Answer 1:
There are lots of ways to fix your problem, but I think there are two important improvements to make. The first is to do all of this in a single call to lapply(). The second is that the function() in the call that returns the error never returns the padded vector (sorry, on a tablet, difficult to copy and paste). So you pad out "x" OK, but what do you tell function() to return? Its last expression is the assignment, so all it gives back is the assigned NA, not "x".
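To make the return-value point concrete, here is a minimal sketch. The helper names pad_no_return and pad_with_return, and the fixed target length of 6, are made up for illustration and are not part of the original question.

```r
test1 <- list(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8, 10), c = c(3, 6))

# Hypothetical helper: the last expression is an assignment, so the
# function's value is the assigned NA, not the padded vector.
pad_no_return <- function(x) {
  x[6] <- NA
}

# Hypothetical helper: ending with `x` makes the padded vector the
# function's value.
pad_with_return <- function(x) {
  x[6] <- NA
  x
}

res_no_return <- lapply(test1, pad_no_return)
res_with_return <- lapply(test1, pad_with_return)
```

With these definitions, every element of res_no_return should be a single NA, while every element of res_with_return should be a vector of length 6.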
Here is one solution that does both these things, if I understand you correctly:
lapply(test1, function(x) {
  newlength <- ceiling(length(x) / 2) * 2
  if (newlength != length(x)) {
    x[newlength] <- NA
  }
  colMeans(matrix(x, nrow = 2), na.rm = TRUE)
})
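As a rough check, the same steps can be wrapped in a function that takes the block size as an argument. This is only a sketch: block_means and its block parameter are names I am introducing here, not something from the question.

```r
test1 <- list(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8, 10), c = c(3, 6))

# block_means() and `block` are illustrative names, not from the post.
block_means <- function(x, block = 2) {
  newlength <- ceiling(length(x) / block) * block
  # Pad only when needed: x[newlength] <- NA on a vector that already
  # has the right length would overwrite its last element.
  if (newlength != length(x)) {
    x[newlength] <- NA
  }
  # One block per column, then average each column, skipping the padding.
  colMeans(matrix(x, nrow = block), na.rm = TRUE)
}

result <- lapply(test1, block_means)
# result$a is c(1.5, 3.5), result$b is c(3, 7, 10), result$c is 4.5
```

The if() guard matters here. Without it, a vector whose length is already a multiple of the block size would lose its final value to NA.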
Answer 2:
It sounds like you want:
mapply(function(x, y) {
  # x[y] <- NA     # OP's proposed strategy
  length(x) <- y   # Roland's better suggestion
  return(x)
}, test1, newlength)
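One thing to watch: mapply() simplifies its result by default, so if every padded vector happens to have the same length you get a matrix back rather than a list. A possible variant, sketched below, uses Map() (which is mapply() without simplification) and then finishes the block averaging from the question. The variable names padded and means are mine.

```r
test1 <- list(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8, 10), c = c(3, 6))
newlength <- lapply(test1, function(x) ceiling(length(x) / 2) * 2)

# Map() never simplifies, so the result is always a list.
padded <- Map(function(x, y) {
  length(x) <- y  # extends x with NA; no change if x already has length y
  x
}, test1, newlength)

# Block means of two, ignoring the NA padding.
means <- lapply(padded, function(x) {
  colMeans(matrix(x, nrow = 2), na.rm = TRUE)
})
```

Because assigning to length(x) leaves a vector untouched when it already has the target length, no if() guard is needed in this version.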
Source: https://stackoverflow.com/questions/17804389/pad-each-element-in-a-list-to-specific-length-in-r