Pad each element in a list to a specific length in R

Anonymous (unverified), submitted 2019-12-03 01:34:02

Question:

Here is a simple R question which, I think, mostly comes down to understanding list syntax correctly. I have a series of matrices loaded into a list (following some preliminary calculations) on which I then want to do some basic block averaging. My basic workflow is as follows:

1) Rounding the length of each vector in the list up to a multiple of the block size I am interested in averaging over.

2) Padding each vector in a list to this new length.

3) Conversion of each element of the list to a new matrix to which I will then apply colMeans, ignoring NAs.

This very basic workflow follows the simple approach shown here for a vector: http://www.cookbook-r.com/Manipulating_data/Averaging_a_sequence_in_blocks/
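For a single vector, the three steps above can be sketched as follows. This is only an illustration: the vector `x` and the block size `block` are example values chosen here, and padding is done with `length(x) <- ...`, which fills the new positions with NA.

```r
# Example vector and block size (illustrative values)
x <- c(2, 4, 6, 8, 10)
block <- 2

# 1) Round the length up to the nearest multiple of the block size
newlength <- ceiling(length(x) / block) * block

# 2) Pad with NA up to the new length (length<- fills with NA)
length(x) <- newlength

# 3) One block per column, then average each column, ignoring NAs
block_means <- colMeans(matrix(x, nrow = block), na.rm = TRUE)
block_means
```

Here the blocks are (2, 4), (6, 8) and (10, NA), so the block means are 3, 7 and 10.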

However I have a list of vectors and not just a vector. For example for blocks of two:

test1 <- list(a=c(1,2,3,4), b=c(2,4,6,8,10), c=c(3,6))
# Round up the length of each vector to the nearest 2
newlength <- lapply(test1, function(x) {ceiling(length(x)/2)*2})

Now to my problem. If this were a single vector outside a list, I would normally pad its length with NAs as follows:

test1[newlength] <- NA

But how do I do this using lapply (or something similar, perhaps mapply)? I am obviously not thinking about the syntax correctly here:

lapply(test1, function(x) {x[newlength] <- NA})

This obviously returns the error:

Error in x[newlength] <- NA : invalid subscript type 'list'

since the syntax for a list is incorrect. So how should I do this correctly?
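A short sketch of where the error comes from, reusing `test1` and `newlength` from the question: `newlength` is itself a list (one target length per element), and a list cannot be used as a subscript for a vector. A single numeric length taken from that list works. The variable `err` is just a helper introduced here to capture the message.

```r
test1 <- list(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8, 10), c = c(3, 6))
newlength <- lapply(test1, function(x) ceiling(length(x) / 2) * 2)

# newlength is a list, not a number
is.list(newlength)

# Indexing a vector with the whole list fails, as in the question
x <- test1$b
err <- tryCatch({
  x[newlength] <- NA
  NULL
}, error = function(e) conditionMessage(e))
err

# Indexing with one numeric length pads the vector with NA
x[newlength$b] <- NA
x
```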

Just to finish the process, in case there is an entirely better way of doing this: at the end I would normally do the following to a vector:

# Convert to a matrix with 2 rows
test1 <- matrix(test1, nrow=2)
# Take the means of the columns, and ignore any NA's
colMeans(test1, na.rm=TRUE)

Would I be better off leaving the list first? My reason for using a list is that I have a large dataset and a list seemed a more elegant approach. I am open to suggestions and more logical approaches, however. Thanks.

Answer 1:

There are lots of ways to fix your problem, but I think there are two important improvements to make. The first is to do all this in a single call to lapply(). The second is that the function() in the call that produces the error never returns the padded vector (sorry, on a tablet, difficult to copy and paste). Even once "x" is padded correctly, the last expression in the function is the assignment itself, so what comes back is the assigned NA, not "x".
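To see the return-value point in isolation: an R function returns the value of its last expression, and the value of an assignment is the assigned right-hand side. The two helper names below are hypothetical and exist only for this comparison.

```r
# Mirrors the question's function body: the last expression is the assignment
pad_no_return <- function(x, n) { x[n] <- NA }

# Same body, but with the padded vector as the last expression
pad_with_return <- function(x, n) { x[n] <- NA; x }

res1 <- pad_no_return(c(3, 6, 9), 4)    # just the assigned NA
res2 <- pad_with_return(c(3, 6, 9), 4)  # the padded vector: 3 6 9 NA
```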

Here is one solution that does both these things, if I understand you correctly:

lapply(test1, function(x){
  newlength <- ceiling(length(x)/2)*2
  if(newlength!=length(x)){x[newlength] <- NA}
  colMeans(matrix(x, nrow=2), na.rm=TRUE)
})
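As a sketch of how this might be generalised to other block sizes, the same logic can be wrapped in a small function. `block_means` and its `block` argument are names made up for this example, and the padding uses `length(x) <- ...` in place of the `if` with `x[newlength] <- NA`; both pad with NA.

```r
test1 <- list(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8, 10), c = c(3, 6))

# block_means() is a hypothetical helper; block is the block size
block_means <- function(x, block = 2) {
  # Pad with NA up to the next multiple of the block size (no-op if already there)
  length(x) <- ceiling(length(x) / block) * block
  # One block per column, then average each column, ignoring NAs
  colMeans(matrix(x, nrow = block), na.rm = TRUE)
}

result <- lapply(test1, block_means)
result
```

With `test1` from the question this gives 1.5 and 3.5 for `a`; 3, 7 and 10 for `b`; and 4.5 for `c`. A different block size can be passed through, for example `lapply(test1, block_means, block = 3)`.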


Answer 2:

It sounds like you want:

mapply(function(x,y) {
     # x[y] <- NA # OP's proposed strategy
     length(x) <- y # Roland's better suggestion
     return(x)
     }, test1, newlength)
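One caveat worth keeping in mind: mapply() tries to simplify its result, so if every padded vector happened to have the same length it would return a matrix instead of a list. A possible variant, sketched here with `test1` and `newlength` from the question, uses Map(), which always returns a list, and then finishes the block averaging. The names `padded` and `means` are introduced only for this example.

```r
test1 <- list(a = c(1, 2, 3, 4), b = c(2, 4, 6, 8, 10), c = c(3, 6))
newlength <- lapply(test1, function(x) ceiling(length(x) / 2) * 2)

# Map() never simplifies, so the result is always a list
padded <- Map(function(x, y) {
  length(x) <- y  # pad with NA up to the target length
  x
}, test1, newlength)

# Finish the workflow: 2-row matrix per element, then column means
means <- lapply(padded, function(x) colMeans(matrix(x, nrow = 2), na.rm = TRUE))
means
```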

