Is there a way to get the list index name in my lapply() function?
n = names(mylist)
lapply(mylist, function(list.elem) { cat("What is the name of this list
@ferdinand-kraft gave us a great trick and then told us we shouldn't use it, because it's undocumented and because of the performance overhead.
I can't argue much with the first point, but I'd like to note that the overhead should rarely be a concern.
Let's define active functions so we don't have to call the complex expression parent.frame()$i[] but only .i(). We will also create .n() to access the name, which should work for both base and purrr functionals (and probably most others as well).
# index of the current iteration, read from the frame of the calling functional
.i <- function() parent.frame(2)$i[]
# looks for X OR .x to handle base and purrr functionals
.n <- function() {
  env <- parent.frame(2)
  names(c(env$X, env$.x))[env$i[]]
}
sapply(cars, function(x) paste(.n(), .i()))
#> speed dist
#> "speed 1" "dist 2"
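Applied to the named list from the question, the helpers could be used as in the minimal sketch below. The contents of mylist are hypothetical, chosen only for illustration, and the whole thing relies on the same undocumented behaviour of lapply() (a loop counter named i living in its frame), so treat it as an assumption rather than a guarantee.

```r
# Helpers as defined above (repeated so the block is self-contained)
.i <- function() parent.frame(2)$i[]
.n <- function() {
  env <- parent.frame(2)
  names(c(env$X, env$.x))[env$i[]]
}

# Hypothetical example data, not taken from the question
mylist <- list(a = 10, b = 20)

# Name, index and value of each element, using the helpers
with_helpers <- lapply(mylist, function(list.elem) {
  paste(.n(), .i(), list.elem)
})

# The same index obtained with the raw expression, called directly in FUN
raw_index <- lapply(mylist, function(list.elem) parent.frame()$i[])

print(with_helpers)
print(raw_index)
```

Note that .i() uses parent.frame(2) because it sits one call deeper than the raw expression, which is evaluated directly inside the function passed to lapply().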
Now let's benchmark a simple function which pastes the items of a vector to their index, using different approaches (this operation can of course be vectorized using paste(vec, seq_along(vec)), but that's not the point here).
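For reference, here is a small sketch of the vectorized form mentioned in the parenthesis, next to the explicit-index lapply() variant used in the benchmark. The three-letter input is an arbitrary example.

```r
# Arbitrary example input
vec <- c("a", "b", "c")

# Vectorized: paste() recycles over both vectors at once
vectorized <- paste(vec, seq_along(vec))

# Loop-style: iterate over the indices and look each element up
looped <- unlist(lapply(seq_along(vec), function(x) paste(vec[[x]], x)))

print(vectorized)
print(identical(vectorized, looped))
```

Both produce the same character vector; the loop-style versions are only being compared with each other in the benchmark.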
We define a benchmarking function and a plotting function, and plot the results below:
library(purrr)
library(ggplot2)
benchmark_fun <- function(n){
  vec <- sample(letters, n, replace = TRUE)
  mb <- microbenchmark::microbenchmark(unit = "ms",
    lapply(vec, function(x) paste(x, .i())),
    map(vec, function(x) paste(x, .i())),
    lapply(seq_along(vec), function(x) paste(vec[[x]], x)),
    mapply(function(x, y) paste(x, y), vec, seq_along(vec), SIMPLIFY = FALSE),
    imap(vec, function(x, y) paste(x, y)))
  # keep only the expression and its mean time, tagged with the vector length
  cbind(summary(mb)[c("expr", "mean")], n = n)
}
benchmark_plot <- function(data, title){
  ggplot(data, aes(n, mean, col = expr)) +
    geom_line() +
    ylab("mean time in ms") +
    ggtitle(title) +
    theme(legend.position = "bottom", legend.direction = "vertical")
}
plot_data <- map_dfr(2^(0:15), benchmark_fun)
benchmark_plot(plot_data[plot_data$n <= 100,], "simplest call for low n")
benchmark_plot(plot_data,"simplest call for higher n")
Created on 2019-11-15 by the reprex package (v0.3.0)
The drop at the start of the first chart is a fluke; please ignore it.
We see that the chosen answer is indeed faster, and that for a decent number of iterations our .i() solutions are indeed slower. The overhead compared to the chosen answer is about 3 times the overhead of using purrr::imap(), and amounts to about 25 ms for 30k iterations, so I lose about 1 ms per 1000 iterations, or 1 sec per million. That's a small cost for convenience in my opinion.
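The per-iteration figures can be checked with a few lines of arithmetic. This sketch takes the roughly 25 ms over 30k iterations quoted above as given; the variable names are just for illustration.

```r
# Approximate figures read off the benchmark above
overhead_ms <- 25      # extra time compared to the chosen answer
iterations  <- 30000   # number of iterations at that point

# Overhead per 1000 iterations, in milliseconds
per_1000_ms <- overhead_ms / iterations * 1000

# Overhead per million iterations, converted to seconds
per_million_s <- overhead_ms / iterations * 1e6 / 1000

print(round(per_1000_ms, 2))
print(round(per_million_s, 2))
```

Both come out at a little under 1, which is where the "about 1 ms per 1000 iterations, 1 sec per million" rounding comes from.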