R: lapply function - skipping the current function loop

Submitted by 不问归期 on 2019-12-12 09:34:59

Question


I am using lapply over a list of files. Is there a way to skip the current file without returning anything and move straight on to the next file in the list?

To be precise, my function has an if statement that checks a condition, and I would like to skip to the next file when that condition evaluates to FALSE.


Answer 1:


lapply always returns a list of the same length as the X it is given, so an element cannot be skipped outright. Instead, return a placeholder value for the items you want to skip and filter those out afterwards.

For example if you have the function parsefile

parsefile <- function(x) {
  if (x >= 0) {
    x
  } else {
    NULL
  }
}

and you run it on the vector runif(10, -5, 5)

result <- lapply(runif(10, -5, 5), parsefile)

then the list will hold a value for every non-negative input and NULL for every negative one.

You can then drop the NULLs with:

result[!vapply(result, is.null, logical(1))]
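
The same filtering can also be written with base R's Filter() and Negate(). What follows is only a minimal sketch on fixed inputs, so that the outcome is reproducible; keep_non_negative is a hypothetical stand-in for the parsefile function above.

```r
# Hypothetical stand-in for parsefile(): keep non-negative values, return NULL otherwise
keep_non_negative <- function(x) {
  if (x >= 0) x else NULL
}

inputs <- c(-2, 3, -1, 4)
result <- lapply(inputs, keep_non_negative)

# lapply keeps one slot per input, including the NULL placeholders
length(result)  # 4

# Filter() keeps the elements for which the predicate is TRUE;
# Negate(is.null) is TRUE for everything that is not NULL
kept <- Filter(Negate(is.null), result)
length(kept)    # 2
```

Both forms give the same elements; Filter() is simply shorter to read, while the vapply() version makes the logical index explicit.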



Answer 2:


As already answered by the others, I do not think you can proceed to the next iteration without returning something using the *apply family of functions.

In such cases, I use Dean MacGregor's method, with a small change: I use NA instead of NULL, which makes filtering the results easier.

files <- list("file1.txt", "file2.txt", "file3.txt")

parse_file <- function(file) {
  if(file.exists(file)) {
    readLines(file)
  } else {
    NA
  }
}

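results <- lapply(files, parse_file)
# is.na() on a list is TRUE only for elements that are a single NA,
# so multi-line results from readLines() are kept
results <- results[!is.na(results)]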
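
When the skip condition depends only on the input, as it does with file.exists() here, another option is to filter the inputs before calling lapply, so no placeholder is needed at all. This is a different technique from the one described above, sketched under the assumption that the check is cheap to run up front; the temporary files exist only to make the example runnable.

```r
# Create one real temporary file; the second path is deliberately never created
existing <- tempfile(fileext = ".txt")
writeLines(c("first line", "second line"), existing)
missing <- tempfile(fileext = ".txt")

files <- list(existing, missing)

# Keep only the paths that exist, then process just those
present <- Filter(file.exists, files)
results <- lapply(present, readLines)

length(results)  # 1

# Clean up the temporary file
unlink(existing)
```

This does not help when the decision to skip can only be made partway through processing a file; in that case a placeholder such as NA or NULL is still needed.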

A quick benchmark

res_na   <- list("a",   NA, "c")
res_null <- list("a", NULL, "c")
microbenchmark::microbenchmark(
  na = res_na[!is.na(res_na)],
  null = res_null[!vapply(res_null, is.null, logical(1))]
)

illustrates that, on this three-element list, the NA solution is considerably faster than the solution that uses NULL:

Unit: nanoseconds
expr  min   lq    mean median   uq   max neval
  na    0    1  410.78    446  447  5355   100
null 3123 3570 5283.72   3570 4017 75861   100



Answer 3:


You can define a custom function to use in your call to lapply(). Here is some sample code that iterates over a list of files and processes a file only if its name does not contain the digit 3 (a bit contrived, but hopefully it gets the point across):

files <- as.list(c("file1.txt", "file2.txt", "file3.txt"))

fun <- function(x) {
    test <- grep("3", x)                     # check for files with "3" in their name
    if (length(test) == 0) {                 # replace with your own condition here
        # process the file here
    }
    # otherwise the file is not processed and fun() returns NULL
}

result <- lapply(files, fun)                 # call lapply with the custom function
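
Note that a function whose if has no else branch still returns a value for the skipped files, namely NULL, so the result keeps one slot per input. A minimal sketch of this, assuming a hypothetical processing step (upper-casing the name via toupper) in place of real file handling:

```r
files <- list("file1.txt", "file2.txt", "file3.txt")

# Hypothetical processing step: upper-case the name instead of reading a real file
process_unless_3 <- function(x) {
  if (!grepl("3", x, fixed = TRUE)) {
    toupper(x)
  }
  # no else branch: the if expression yields NULL when the condition is FALSE
}

result <- lapply(files, process_unless_3)

# Mark the slots that were skipped, then keep only the processed ones
is_skipped <- vapply(result, is.null, logical(1))
processed <- result[!is_skipped]
```

Here is_skipped marks only the third file, and processed holds the two upper-cased names.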


Source: https://stackoverflow.com/questions/31543307/r-lapply-function-skipping-the-current-function-loop
