Why use purrr::map instead of lapply?

借酒劲吻你 2020-11-27 09:15

Is there any reason why I should use

map(, function(x) )

instead of

lapply(

        
3 Answers
  •  伪装坚强ぢ
    2020-11-27 09:32

    Comparing purrr and lapply boils down to convenience and speed.


    1. purrr::map is syntactically more convenient than lapply

    To extract the second element of each element of a list:

    map(list, 2)  
    

    which as @F. Privé pointed out, is the same as:

    map(list, function(x) x[[2]])
    

    With lapply, the same shortcut fails:

    lapply(list, 2) # doesn't work
    

    we need to pass an anonymous function...

    lapply(list, function(x) x[[2]])  # now it works
    

    ...or as @RichScriven pointed out, we pass [[ as an argument into lapply

    lapply(list, `[[`, 2)  # a bit simpler syntactically
    

    So if you find yourself applying functions to many lists with lapply, and tire of either defining a custom function or writing an anonymous one, convenience is one reason to favor purrr.
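As a minimal sketch of the shortcuts above, the snippet below runs them side by side on a small made-up list (`people` and its fields are hypothetical sample data, not part of the original answer). It assumes purrr is installed. It also shows purrr's string and formula shortcuts, which follow the same idea as the integer one.

```r
library(purrr)

# Hypothetical sample data: a list of three small named lists
people <- list(
  list(name = "Ann", age = 31L),
  list(name = "Bob", age = 45L),
  list(name = "Cy",  age = 27L)
)

by_position <- map(people, 2)            # integer shortcut: second element of each
by_name     <- map(people, "age")        # string shortcut: element named "age"
by_formula  <- map(people, ~ .x$age)     # formula shorthand for function(x) x$age
base_anon   <- lapply(people, function(x) x[[2]])  # base R, anonymous function
base_op     <- lapply(people, `[[`, 2)   # base R, passing `[[` directly

# All five approaches should return the same list of ages
identical(by_position, base_op)
```

All five calls return a plain list; only the amount of typing differs.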

    2. Type-specific map functions simplify many lines of code

    • map_chr()
    • map_lgl()
    • map_int()
    • map_dbl()
    • map_df()

    Each of these type-specific map functions returns an atomic vector of the named type (or, in the case of map_df(), a data frame), rather than the lists returned by map() and lapply(). If you're dealing with nested lists of vectors, you can use them to pull out the values directly as int, dbl, or chr vectors. The base R version would look something like as.numeric(sapply(...)), as.character(sapply(...)), etc.

    The map_ functions also have the useful quality that if they cannot return an atomic vector of the indicated type, they fail. This is useful when defining strict control flow, where you want a function to fail if it [somehow] generates the wrong object type.
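The following is a small sketch of the type-specific variants and their fail-fast behavior, again on made-up data (`people` is hypothetical). It assumes purrr is installed, and it wraps the failing call in `tryCatch()` only so the script keeps running.

```r
library(purrr)

# Hypothetical sample data
people <- list(
  list(name = "Ann", age = 31L),
  list(name = "Bob", age = 45L),
  list(name = "Cy",  age = 27L)
)

ages_int     <- map_int(people, "age")   # integer vector
ages_dbl     <- map_dbl(people, "age")   # double vector
person_names <- map_chr(people, "name")  # character vector

# Base R counterpart mentioned above: sapply() plus an explicit coercion
ages_base <- as.numeric(sapply(people, function(x) x$age))

# Asking for integers where the values are character strings is an error,
# which is the strictness described above
result <- tryCatch(
  map_int(people, "name"),
  error = function(e) "type check failed"
)
```

Here `sapply()` silently returns whatever type it ends up with, whereas `map_int()` stops as soon as a result is not an integer.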

    3. Convenience aside, lapply is [slightly] faster than map

    As @F. Privé pointed out, purrr's convenience functions slow down processing a bit. Let's race the four approaches presented above.

    # devtools::install_github("jennybc/repurrrsive")
    library(repurrrsive)
    library(purrr)
    library(microbenchmark)
    library(ggplot2)
    
    # Time each approach 100 times on the first four Game of Thrones characters
    mbm <- microbenchmark(
      lapply       = lapply(got_chars[1:4], function(x) x[[2]]),
      lapply_2     = lapply(got_chars[1:4], `[[`, 2),
      map_shortcut = map(got_chars[1:4], 2),
      map          = map(got_chars[1:4], function(x) x[[2]]),
      times        = 100
    )
    autoplot(mbm)
    

    And the winner is....

    lapply(list, `[[`, 2)
    

    In sum, if raw speed is what you're after: base::lapply (although it's not that much faster)

    For simple syntax and expressibility: purrr::map


    This excellent purrr tutorial highlights the convenience of not having to explicitly write out anonymous functions when using purrr, and the benefits of type-specific map functions.
