Apply function too slow in R

Posted by 折月煮酒 on 2019-12-11 02:52:56

Question


For a large number of species, I have to evaluate a formula on every row of a data frame. Each abundance value in the row is multiplied by the value in the same column of the last row of the data frame, and these products are then summed.

My current script uses apply, which appears to be as slow as the for loop I started with. I simplified the problem in the following script, using a small data frame called az:

az=data.frame(c(1,2,10),c(2,4,20),c(3,6,30))
colnames(az)=c("a","b","c")


# Initial for loop
prov=0 # prov for provisional number
for (i in 1:nrow(az)){
    for (j in 1:ncol(az)){
        prov=prov+az[i,j]*az[nrow(az),j]
    }
    print(prov)
    prov=0
}

# Apply solution
apply(az[,], 1, function(x) {sum(x*az[nrow(az),], na.rm=TRUE)})

Both solutions work but they are quite slow (with my original df) and I have to repeat the operation for a huge number of species. Thus, I was wondering if anyone has a more efficient solution, maybe using vectorized expressions.

Kind regards.


Answer 1:


The fastest solution is probably matrix algebra:

apply(az[,], 1, function(x) {sum(x*az[nrow(az),], na.rm=TRUE)})
#[1]  140  280 1400

m <- as.matrix(az)
m[is.na(m)] <- 0 # replace NA with 0 so it contributes nothing to the sums
as.vector(m %*% m[nrow(m),])
#[1]  140  280 1400
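
To get a feel for the difference, here is a minimal sketch that times both approaches on a larger, randomly generated data frame. The size (1000 rows by 20 columns), the seed, and the names big, by_apply and by_matrix are assumptions made up for illustration, not part of the question; the actual timings depend on the machine and the data.

```r
# Hypothetical test data: 1000 rows x 20 columns of random abundances.
set.seed(1)
big <- as.data.frame(matrix(runif(1000 * 20), nrow = 1000))

# Row-wise apply, written the same way as in the question.
by_apply <- function(df) {
  apply(df, 1, function(x) sum(x * df[nrow(df), ], na.rm = TRUE))
}

# Matrix product: every row times the last row, in a single call.
by_matrix <- function(df) {
  m <- as.matrix(df)
  m[is.na(m)] <- 0              # NA contributes nothing, like na.rm = TRUE
  as.vector(m %*% m[nrow(m), ])
}

t_apply  <- system.time(res_apply  <- by_apply(big))
t_matrix <- system.time(res_matrix <- by_matrix(big))

cat("apply :", t_apply[["elapsed"]], "s\n")
cat("matrix:", t_matrix[["elapsed"]], "s\n")

# Both approaches should agree up to floating-point tolerance.
print(all.equal(unname(res_apply), res_matrix))
```

The apply version calls an R function once per row, while the matrix version hands the whole computation to a single compiled routine, which is where the speedup is expected to come from.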



Answer 2:


Try

  rowSums(az*unlist(az[nrow(az),])[col(az)], na.rm=TRUE)

A slightly faster option is to use rep, repeating each last-row value once per row so that it lines up with the data frame's columns:

  rowSums(az*rep(unlist(az[nrow(az),]),each=nrow(az)), na.rm=TRUE)
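
Because az is square, it cannot reveal a mix-up between row and column counts. As a sanity check, the sketch below compares the vectorized forms with a row-wise apply on a non-square data frame that contains an NA. The data frame bz and its values are made up for illustration. Multiplying a data frame by a plain vector recycles the vector column by column, so with rep the each argument has to be the number of rows.

```r
# Hypothetical non-square data: 4 rows x 2 columns, with one missing value.
bz <- data.frame(a = c(1, 2, NA, 10), b = c(2, 4, 6, 20))
last <- unlist(bz[nrow(bz), ])   # weights from the last row: a = 10, b = 20

# Reference result: row-wise apply.
ref <- apply(bz, 1, function(x) sum(x * last, na.rm = TRUE))

# Vectorized variants.
via_col <- rowSums(bz * last[col(bz)], na.rm = TRUE)
via_rep <- rowSums(bz * rep(last, each = nrow(bz)), na.rm = TRUE)

# Matrix product, with NA replaced by 0 first.
m <- as.matrix(bz)
m[is.na(m)] <- 0
via_mat <- as.vector(m %*% m[nrow(m), ])

print(unname(ref))   # expected: 50 100 120 500
print(all.equal(unname(ref), unname(via_col)))
print(all.equal(unname(ref), unname(via_rep)))
print(all.equal(unname(ref), via_mat))
```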


Source: https://stackoverflow.com/questions/29125609/apply-function-too-slow-in-r
