Exit a function within *apply

Submitted by ◇◆丶佛笑我妖孽 on 2019-12-12 05:39:52

Question


sapply and replicate (etc.) run a specified number of times: sapply(1:N, function(n){expr}) will execute expr N times. Suppose I want sapply to stop after m runs. Is this possible without raising an error? break doesn't work, and a for or while loop would be too slow in my context.

Something akin to:

sapply(1:N, function(n){
  #some expression
  if(identical(n, m)) break
})

except that break doesn't work.

What I'm trying to do:

I'm creating a function to read in large (binary) data files of a defined structure but unknown length. Using replicate with array(readBin(...), ...) is the best way I've found to do this, but I'd like it to stop when NA starts to be returned (i.e. when the end of the file is reached).


Answer 1:


A partial workaround is to use a global control variable:

i <- TRUE
unlist(sapply(1:10, function(x) {
  if (i) { if (x >= 4) i <<- FALSE; 2 * x }
}))

Though it still calls the function N times, at least it skips the operation once the flag has been cleared, which spares resources. unlist is needed because, after the flag is cleared, the function returns NULL, so sapply cannot simplify the results to a vector and returns a list instead.
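The behaviour of this workaround can be checked with a small sketch. The names calls, active and res are introduced here only for illustration; they are not part of the answer's code.

```r
calls <- 0       # counts how often the function body is entered
active <- TRUE   # control flag, playing the role of i in the answer

res <- sapply(1:10, function(x) {
  calls <<- calls + 1
  if (active) {
    if (x >= 4) active <<- FALSE   # clear the flag once the limit is hit
    2 * x
  }                                # returns NULL once the flag is cleared
})

calls         # the function is still called once per element
is.list(res)  # NULL results prevent simplification to a vector
unlist(res)   # unlist drops the NULL entries
```

So the flag saves the cost of the expensive expression, not the cost of the remaining function calls.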




Answer 2:


Aside from the for vs *apply battle, if your problem is to use readBin until the end of file is reached, keep in mind that:

  • you can use a (small) overestimate for n (the number of elements to read);
  • you can get the size of the file through file.info(filename)$size, and from it estimate how many elements the file contains.

For instance, say that you are reading integer (four bytes) values. Just try:

readBin(con,"int",n=file.info(filename)$size/4+10)

to read the whole file in one shot. The +10 is a small deliberate overestimate.
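As a minimal sketch of this approach, the example below writes 100 integers to a temporary file and reads them back with an overestimated n. The file contents and the names tmp, n_est and values are assumptions made for the demonstration.

```r
# Write 100 four-byte integers to a temporary binary file (demo data only).
tmp <- tempfile(fileext = ".bin")
writeBin(1:100, tmp, size = 4L)

# Estimate the element count from the file size, plus a small overestimate.
n_est <- file.info(tmp)$size / 4 + 10

con <- file(tmp, "rb")
values <- readBin(con, "integer", n = n_est, size = 4L)
close(con)
unlink(tmp)

length(values)  # readBin returns only the elements actually present
```

Because readBin treats n as a maximum, asking for slightly too many elements is harmless.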




Answer 3:


Thought I'd post what I came up with for this. cbb is the array being built. In the bad example, cbb is progressively built up as an array using abind. At each step the whole of cbb has to be rebuilt, so each successive step is slower: the classic growing-object problem. In the good example, cbb is built up as a list and a new list entry is added at each step, so R does not need to rebuild the existing entries each time. The array is bound together once at the end with do.call(abind, c(cbb, list(along = 4))).

One thing I found helpful was to put cat(".") within the loop. If the dot printing slows down, that might indicate a growing object. With the good example, dots are printed at a more or less constant rate. The good example is about ten times faster than the bad.

BAD:

library(abind)  # provides abind()
cbb <- array(NA, c(N1, N2, N3, 0))
repeat{ 
    sptsnew <- readBin(to.read, "integer", 2L, 4L)
    if(identical(sptsnew, integer(0))){cat("\nend of file\n"); break}
    ... #reading array metadata
    cbb <- abind(cbb, array(readBin(to.read, "double", N1*N2*N3, 4L), c(N1, N2, N3, 1)), along = 4)
    cat(".")
}

GOOD:

library(abind)  # provides abind()
i <- 1
cbb <- list()
repeat{ 
    sptsnew <- readBin(to.read, "integer", 2L, 4L)
    if(identical(sptsnew, integer(0))){cat("\nend of file\n"); break}
    ... #reading array metadata
    cbb[[i]] <- array(readBin(to.read, "double", N1*N2*N3, 4L), c(N1, N2, N3, 1))
    i <- i + 1
    cat(".")
}
cbb <- do.call(abind, c(cbb, list(along = 4)))
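For a version that runs end to end, here is a self-contained sketch of the same collect-in-a-list pattern. It makes several assumptions: the dimensions, the record count and the file layout (two integers followed by one block of 4-byte doubles) are invented for the demo, and the final bind uses base R's array(unlist(...)) in place of abind so that no extra package is needed.

```r
# Hypothetical dimensions and record count, chosen only for this demo.
N1 <- 2L; N2 <- 3L; N3 <- 4L
n_records <- 5L
block_len <- N1 * N2 * N3

# Build a small test file: each record is 2 integers, then one data block.
tmp <- tempfile(fileext = ".bin")
out <- file(tmp, "wb")
for (k in seq_len(n_records)) {
  writeBin(c(k, block_len), out, size = 4L)
  writeBin(as.double(rep(k, block_len)), out, size = 4L)
}
close(out)

# Read until the header comes back empty, storing each block in a list.
to.read <- file(tmp, "rb")
cbb <- list()
i <- 1
repeat {
  sptsnew <- readBin(to.read, "integer", 2L, 4L)
  if (length(sptsnew) == 0L) break   # end of file reached
  block <- readBin(to.read, "double", block_len, 4L)
  cbb[[i]] <- array(block, c(N1, N2, N3))
  i <- i + 1
}
close(to.read)
unlink(tmp)

# Bind once at the end along a fourth dimension.
result <- array(unlist(cbb), c(N1, N2, N3, length(cbb)))
dim(result)
```

The loop never needs to know the number of records in advance, and the only large copy happens in the final bind.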


Source: https://stackoverflow.com/questions/34220552/exit-a-function-within-apply
