Pad each element in a list to specific length in R

Question

Here is a simple R question which, I think, basically comes down to understanding list syntax correctly. I have a series of matrices loaded into a list (following some preliminary calculations) on which I then want to do some basic block averaging. My basic workflow is as follows:

1) Round the length of each vector in the list up to the nearest multiple of the block size I want to average over.

2) Pad each vector in the list to this new length.

3) Convert each element in the list to a new matrix, to which I then apply colMeans, ignoring NAs.

This very basic workflow follows the simple approach shown here for a vector: http://www.cookbook-r.com/Manipulating_data/Averaging_a_sequence_in_blocks/

However, I have a list of vectors and not just a single vector. For example, for blocks of two:

test1 <- list(a=c(1,2,3,4), b=c(2,4,6,8,10), c=c(3,6))
# Round the length of each vector up to the nearest multiple of 2
newlength <- lapply(test1, function(x) {ceiling(length(x)/2)*2})

Now to my problem. If these were matrices outside a list I would normally pad their length with NAs as follows:

test1[newlength] <- NA

But how do I do this using lapply (or something akin to it, such as mapply)? I am obviously not thinking about the syntax correctly here:

lapply(test1, function(x) {x[newlength] <- NA})

This obviously returns the error:

Error in x[newlength] <- NA : invalid subscript type 'list'

since the syntax for a list is incorrect. So how should I do this correctly?

Just to finish the process, in case there is an entirely better way of doing this: at the end I would normally do the following to a vector:

# Convert to a matrix with 2 rows
test1 <- matrix(test1, nrow=2)
# Take the means of the columns, and ignore any NA's
colMeans(test1, na.rm=TRUE)

Would I be better off moving out of the list first? My reason for using a list is that I have a large dataset and a list seemed the more elegant approach. I am open to suggestions and more logical approaches, however. Thanks.

Answer 1:

There are lots of ways to fix your problem, but I think there are two important improvements to make. The first is to do all this in a single call to lapply(). The other main problem is that the function() in the call that gives the error never returns the padded vector (sorry, on a tablet, difficult to copy and paste). You pad out "x" OK, but the last expression in the function is the assignment itself, so what comes back is the assigned NA rather than "x".
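As a small illustration of the return-value point, here is a sketch comparing two padding helpers. The names pad_no_return and pad_with_return are made up for this example and do not appear in the question. It relies on R returning the value of a function's last expression, and on the value of an assignment being its right-hand side.

```r
x <- c(2, 4, 6, 8, 10)

# Hypothetical helper: the last expression is the assignment, so the
# function (invisibly) returns the assigned value NA, not the padded vector.
pad_no_return <- function(x, n) { x[n] <- NA }

# Hypothetical helper: same padding, but x itself is the last expression.
pad_with_return <- function(x, n) { x[n] <- NA; x }

res1 <- pad_no_return(x, 6)    # a single NA
res2 <- pad_with_return(x, 6)  # the original five values followed by NA

print(res1)
print(res2)
```

Note that this is separate from the "invalid subscript type 'list'" error in the question, which comes from indexing with the whole newlength list rather than with a single number.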

Here is one solution that does both these things, if I understand you correctly:

lapply(test1, function(x){
  newlength <- ceiling(length(x)/2)*2
  if(newlength!=length(x)){x[newlength] <- NA}
  colMeans(matrix(x, nrow=2), na.rm=TRUE)
})
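If the block size may vary, the same idea could be wrapped in a small helper. This is only a sketch: block_average and its argument k are names introduced here for illustration, not part of the original answer. It assumes each list element is a plain numeric vector.

```r
test1 <- list(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8, 10), c = c(3, 6))

# Hypothetical helper: average a numeric vector in blocks of size k.
block_average <- function(x, k) {
  newlength <- ceiling(length(x) / k) * k
  # Assigning past the end of x fills any gap with NA as well
  if (newlength != length(x)) x[newlength] <- NA
  # One block per column; NAs from the padding are ignored
  colMeans(matrix(x, nrow = k), na.rm = TRUE)
}

means2 <- lapply(test1, block_average, k = 2)
means3 <- lapply(test1, block_average, k = 3)

str(means2)  # a: 1.5 3.5   b: 3 7 10   c: 4.5
str(means3)  # a: 2 4       b: 4 9      c: 4.5
```

With k = 2 this should give the same result as the lapply() call above.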

Answer 2:

It sounds like you want:

mapply(function(x, y) {
  # x[y] <- NA # OP's proposed strategy
  length(x) <- y # Roland's better suggestion
  return(x)
}, test1, newlength)
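One thing to watch with mapply(): by default (SIMPLIFY = TRUE) it tries to simplify its result, so if every padded vector happens to have the same length you get a matrix back instead of a list. A possible variant, sketched here using the question's test data, uses Map(), which always returns a list, and then finishes the block averaging. The names padded and means are introduced only for this example.

```r
test1 <- list(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8, 10), c = c(3, 6))
newlength <- lapply(test1, function(x) ceiling(length(x) / 2) * 2)

# Map() walks over test1 and newlength in parallel and never simplifies
padded <- Map(function(x, y) {
  length(x) <- y  # extending the length pads the vector with NA
  x
}, test1, newlength)

# Block averages: one block of two per column, ignoring the NA padding
means <- lapply(padded, function(x) {
  colMeans(matrix(x, nrow = 2), na.rm = TRUE)
})

str(padded)
str(means)
```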

Source: https://stackoverflow.com/questions/17804389/pad-each-element-in-a-list-to-specific-length-in-r

Tags

lapply