Access lapply index names inside FUN

自闭症患者 2020-11-22 15:04

Is there a way to get the list index name in my lapply() function?

n = names(mylist)
lapply(mylist, function(list.elem) { cat("What is the name of this list


        
12 Answers
  •  难免孤独
    2020-11-22 15:25

    @ferdinand-kraft gave us a great trick and then told us we shouldn't use it, because it's undocumented and because of the performance overhead.

    I can't argue much with the first point, but I'd like to note that the overhead should rarely be a concern.

    Let's define helper functions so that, instead of writing the complex expression parent.frame()$i[], we only have to call .i(). We will also create .n() to access the name, which should work for both base and purrr functionals (and probably most others as well).

    .i <- function() parent.frame(2)$i[]
    # looks for X OR .x to handle base and purrr functionals
    .n <- function() {
      env <- parent.frame(2)
      names(c(env$X,env$.x))[env$i[]]
    }
    
    sapply(cars, function(x) paste(.n(), .i()))
    #>     speed      dist 
    #> "speed 1"  "dist 2"
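
    To illustrate why the helpers look two frames up, here is a minimal sketch using a hand-written functional instead of lapply(). The names my_apply and .idx are hypothetical and exist only for this illustration; unlike lapply(), whose counter is an internal detail, my_apply keeps its counter in an ordinary local variable.

```r
# Hypothetical functional (not from the original answer): the loop counter `i`
# is a plain local variable of my_apply()'s frame.
my_apply <- function(X, FUN) {
  out <- vector("list", length(X))
  for (i in seq_along(X)) {
    out[[i]] <- FUN(X[[i]])
  }
  out
}

# One frame up from .idx() is the anonymous function that called it;
# two frames up is my_apply(), which is where `i` lives.
.idx <- function() parent.frame(2)$i

res <- my_apply(c("a", "b"), function(x) paste(x, .idx()))
unlist(res)
#> [1] "a 1" "b 2"
```

    The .i() helper above relies on the same lookup, except that the counter it reads belongs to the functional's own implementation rather than to code you control.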
    

    Now let's benchmark a simple function that pastes the items of a vector to their index, using different approaches (this operation can of course be vectorized using paste(vec, seq_along(vec)), but that's not the point here).
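
    As a quick sanity check, the documented base R approaches used in the benchmark should return the same strings as the vectorized call. A small sketch, using a made-up three-letter input rather than the benchmark data:

```r
vec <- c("a", "b", "c")  # illustrative input, not taken from the benchmark

vectorized <- paste(vec, seq_along(vec))
by_index   <- lapply(seq_along(vec), function(i) paste(vec[[i]], i))
by_mapply  <- mapply(function(x, y) paste(x, y), vec, seq_along(vec),
                     SIMPLIFY = FALSE)

# mapply() uses a character first argument as names, so drop them before comparing
stopifnot(identical(unlist(by_index), vectorized))
stopifnot(identical(unname(unlist(by_mapply)), vectorized))
```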

    We define a benchmarking function and a plotting function, and plot the results below:

    library(purrr)
    library(ggplot2)
    benchmark_fun <- function(n){
      vec <- sample(letters,n, replace = TRUE)
      mb <- microbenchmark::microbenchmark(unit="ms",
                                          lapply(vec, function(x)  paste(x, .i())),
                                          map(vec, function(x) paste(x, .i())),
                                          lapply(seq_along(vec), function(x)  paste(vec[[x]], x)),
                                          mapply(function(x,y) paste(x, y), vec, seq_along(vec), SIMPLIFY = FALSE),
                                          imap(vec, function(x,y)  paste(x, y)))
      cbind(summary(mb)[c("expr","mean")], n = n)
    }
    
    benchmark_plot <- function(data, title){
      ggplot(data, aes(n, mean, col = expr)) + 
        geom_line() +
        ylab("mean time in ms") +
        ggtitle(title) +
        theme(legend.position = "bottom",legend.direction = "vertical")
    }
    
    plot_data <- map_dfr(2^(0:15), benchmark_fun)
    benchmark_plot(plot_data[plot_data$n <= 100,], "simplest call for low n")
    

    benchmark_plot(plot_data,"simplest call for higher n")
    

    Created on 2019-11-15 by the reprex package (v0.3.0)

    The drop at the start of the first chart is a fluke; please ignore it.

    We see that the chosen answer is indeed faster, and that for a decent number of iterations our .i() solutions are indeed slower. The overhead compared to the chosen answer is about 3 times the overhead of using purrr::imap(), and amounts to about 25 ms for 30k iterations. So I lose about 1 ms per 1000 iterations, or 1 second per million. That's a small cost for convenience, in my opinion.
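
    The per-iteration figures follow from simple arithmetic on the rough numbers read off the chart (about 25 ms over about 30k iterations). A sketch of the calculation, with those two values as the only inputs:

```r
overhead_ms <- 25     # approximate extra time read off the chart
iterations  <- 30000  # "30k iterations"

per_iteration_ms <- overhead_ms / iterations

per_1000_ms   <- per_iteration_ms * 1e3         # about 0.83 ms per 1000 iterations
per_million_s <- per_iteration_ms * 1e6 / 1e3   # about 0.83 s per million iterations
```

    Both values round to the "about 1 ms per 1000 iterations, 1 second per million" quoted above.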
