Question
I have a large dataset in which I want to compute, for each row, its value minus the value of the following row, except for every fifth row, which should keep its own value. This is fairly simple with a for loop, but on my large dataset it takes over an hour. I've been told that apply with a function is MUCH faster, but I don't know how to write a complicated function and I can't find examples of similar problems.
#set up matrix
x=matrix(0,15,2)
x[,1]=c(1, 5, 4, 3, 4, 2, 4, 3, 7, 8, 3, 2, 9, 7, 3)
#run for loop
for (i in c(0:((nrow(x)/5)-1)*5)){
  x[i+1,2]<-x[i+1,1]-x[i+2,1]
  x[i+2,2]<-x[i+2,1]-x[i+3,1]
  x[i+3,2]<-x[i+3,1]-x[i+4,1]
  x[i+4,2]<-x[i+4,1]-x[i+5,1]
  x[i+5,2]<-x[i+5,1]
}
I got as far as this using apply but it doesn't even work the way I thought it would...
apply(x, FUN=function(i) x[i]-x[i+1], MARGIN=1)
EDIT: I figured out how to rewrite the for loop with an if ... else statement inside it, which might be one step toward writing the function.
for (i in 1:nrow(x)){
  if (i%%5==0){ # for those rows that are a multiple of five
    x[i,2]<-x[i,1]
  }else{ # for all other rows
    x[i,2]<-x[i,1]-x[i+1,1]
  }
}
Answer 1:
You can do this with a vectorized calculation. The code below hard-codes 15 rows; it scales to any size if you use nrow(x) instead of 15.
# set up indexes for the 5, 10, ...
index.fifth<-seq(5,15,5)
# set up indexes for 1:4,6:9,11:14,...
# basically delete the ones for every fifth one
index.rest<-seq_len(15)[-index.fifth]
# calculate subtractions first
x[index.rest,2]<-x[index.rest,1]-x[index.rest+1,1]
# set 5, 10, ... to their values
x[index.fifth,2]<-x[index.fifth,1]
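As a minimal sketch of how the same indexing idea might be wrapped up for any number of rows, the helper below takes the row count from the data instead of hard-coding 15. The name block_diff and its block argument are hypothetical, not part of the original answer, and the sketch assumes the number of rows is a positive multiple of the block size, as in the question.

```r
# Hypothetical helper (not from the original answer): within blocks of
# `block` rows, each row becomes its value minus the next row's value;
# the last row of every block keeps its own value.
block_diff <- function(v, block = 5) {
  n <- length(v)
  # Assumption: the data splits into complete blocks, as in the question
  stopifnot(n > 0, n %% block == 0)
  out <- v                                        # block-end rows keep their value
  rest <- seq_len(n)[-seq(block, n, by = block)]  # every row except 5, 10, ...
  out[rest] <- v[rest] - v[rest + 1]              # vectorized subtraction
  out
}

# Same example data as in the question
x <- matrix(0, 15, 2)
x[, 1] <- c(1, 5, 4, 3, 4, 2, 4, 3, 7, 8, 3, 2, 9, 7, 3)
x[, 2] <- block_diff(x[, 1])
```

On the example data this should fill the second column with the same values as the for loop in the question. Because the subtraction is a single vector operation rather than one assignment per row, it avoids the per-iteration overhead that makes the loop slow on a large dataset.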
Source: https://stackoverflow.com/questions/48548550/use-function-instead-of-for-loop-in-r-substracting-previous-rows-iteratively-wi