Question
This dataframe contains what I'll call the "data":
library(tidyverse)
df_d <- data_frame(key = c("cat", "cat", "dog", "dog"),
value_1 = c(1,2,3,4),
value_2 = c(2,4,6,8))
Here is a dataframe that I intend to use as something like a function look-up table. f is a single variable function and f2 is a multivariable function:
df_f <- data_frame(key = c("cat", "dog"),
f = c(function(x) x^2, function(x) sqrt(x)),
f2 = c(function(x) (x[1]+x[2])^2, function(x) sqrt(x[1]+x[2])))
I can easily make a dataframe so that any cat row gets the cat functions and any dog row gets the dog functions:
df_both <- left_join(df_d, df_f)
I was able to figure out how to apply each of the f functions to, say, the value_1 column to get:
df_both %>% mutate(result = invoke_map_dbl(f, value_1))
#> # A tibble: 4 x 6
#> key value_1 value_2 f f2 result
#> <chr> <dbl> <dbl> <list> <list> <dbl>
#> 1 cat 1.00 2.00 <fn> <fn> 1.00
#> 2 cat 2.00 4.00 <fn> <fn> 4.00
#> 3 dog 3.00 6.00 <fn> <fn> 1.73
#> 4 dog 4.00 8.00 <fn> <fn> 2.00
My question is: how can I create a column result2 that takes each function in f2 and uses c(value_1, value_2) as its input? If re-defining the functions in f2 to be explicitly functions of two variables makes things much easier, that's fine too.
Desired output:
#> # A tibble: 4 x 7
#> key value_1 value_2 f f2 result result2
#> <chr> <dbl> <dbl> <list> <list> <dbl> <dbl>
#> 1 cat 1.00 2.00 <fn> <fn> 1.00 9.00
#> 2 cat 2.00 4.00 <fn> <fn> 4.00 36.0
#> 3 dog 3.00 6.00 <fn> <fn> 1.73 3.00
#> 4 dog 4.00 8.00 <fn> <fn> 2.00 3.46
(Question motivated by an unfortunately self-deleted question from earlier today.)
Answer 1:
"If re-defining the functions in f2 to be explicitly functions of two variables makes things much easier, that's fine too."
Yes, that would be the more natural setup here, I think. Otherwise each function's arguments are stored rowwise across columns, and the data should possibly be reshaped.
Redefining your functions:
df_f <- data_frame(key = c("cat", "dog"),
f = c(function(x) x^2, function(x) sqrt(x)),
f2 = c(function(x, y) (x + y)^2, function(x, y) sqrt(x + y)))
df_both <- left_join(df_d, df_f)
Now you again use invoke_map_dbl, passing .x as a list, although you need to turn the lists inside out using transpose:
mutate(
df_both,
result = invoke_map_dbl(f, value_1),
result2 = invoke_map_dbl(f2, transpose(list(value_1, value_2)))
)
# A tibble: 4 x 7
  key   value_1 value_2 f      f2     result result2
  <chr>   <dbl>   <dbl> <list> <list>  <dbl>   <dbl>
1 cat        1.      2. <fn>   <fn>     1.00    9.00
2 cat        2.      4. <fn>   <fn>     4.00   36.0
3 dog        3.      6. <fn>   <fn>     1.73    3.00
4 dog        4.      8. <fn>   <fn>     2.00    3.46
A set of three-argument functions would then simply extend to invoke_map_dbl(f3, transpose(list(value_1, value_2, value_3))).
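As a rough sketch of that three-argument case, here is one way it might look with pmap_dbl instead of invoke_map_dbl plus transpose. The list f3, the vector value_3, and the function bodies are all made up for illustration; they are not part of the original question.

```r
library(purrr)

# Hypothetical three-argument functions (not from the original post)
f3 <- list(
  function(x, y, z) x + y + z,
  function(x, y, z) x * y * z
)

# Hypothetical inputs; value_3 does not exist in the original data
value_1 <- c(1, 2)
value_2 <- c(3, 4)
value_3 <- c(5, 6)

# pmap_dbl walks the four lists in parallel: one function and three
# arguments per "row", and calls the function on those arguments
result3 <- pmap_dbl(
  list(f3, value_1, value_2, value_3),
  function(fn, x, y, z) fn(x, y, z)
)
```

Because pmap_dbl iterates over the columns in parallel, no transpose step is needed in this variant.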
Note that this kind of approach will not scale well to large datasets, since each function is called once per row instead of once on a whole vector.
A more scalable alternative may involve nesting, where you at least apply each function once within each group:
df_both %>%
group_by(key) %>%
nest() %>%
mutate(data = map(
data,
~mutate(., result = first(f)(value_1), result2 = first(f2)(value_1, value_2))
)) %>%
unnest()
Which gives the same result.
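The same per-group idea can probably also be written with dplyr's group_modify, which avoids the explicit nest/unnest round trip. This is only a sketch under the assumption that the two-argument, vectorized versions of f2 from this answer are used; it rebuilds the data with tibble() and an explicit by = "key" so it can run on its own.

```r
library(dplyr)

df_d <- tibble(key = c("cat", "cat", "dog", "dog"),
               value_1 = c(1, 2, 3, 4),
               value_2 = c(2, 4, 6, 8))

# Two-argument versions of f2, as redefined in this answer
df_f <- tibble(key = c("cat", "dog"),
               f = list(function(x) x^2, function(x) sqrt(x)),
               f2 = list(function(x, y) (x + y)^2, function(x, y) sqrt(x + y)))

df_both <- left_join(df_d, df_f, by = "key")

# Within each group every row carries the same function, so take the
# first one and call it once on the whole column(s) of that group
out <- df_both %>%
  group_by(key) %>%
  group_modify(~ mutate(.x,
                        result = f[[1]](value_1),
                        result2 = f2[[1]](value_1, value_2))) %>%
  ungroup()
```

As with the nested version, this relies on every function being vectorized over its inputs.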
Answer 2:
We could use pmap:
df_both %>%
mutate(result = invoke_map_dbl(f, value_1),
result2 = pmap_dbl(.[c('value_1', 'value_2', 'f2')], ~(..3)(c(..1, ..2))))
# A tibble: 4 x 7
# key value_1 value_2 f f2 result result2
# <chr> <dbl> <dbl> <list> <list> <dbl> <dbl>
#1 cat 1.00 2.00 <fun> <fun> 1.00 9.00
#2 cat 2.00 4.00 <fun> <fun> 4.00 36.0
#3 dog 3.00 6.00 <fun> <fun> 1.73 3.00
#4 dog 4.00 8.00 <fun> <fun> 2.00 3.46
Here, we don't change the OP's functions: f2 still takes a single vector argument, so c(..1, ..2) builds that vector from value_1 and value_2 for each row.
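For readers on purrr 1.0.0 or later, where the invoke_map family is deprecated, the same idea can be sketched with map2_dbl and pmap_dbl only. This is an assumption-laden variant rather than the answer's own code: it swaps data_frame() for tibble(), wraps the functions in list() explicitly, names the join key, and uses named arguments (fn, x, y are illustrative names) instead of ..1, ..2, ..3.

```r
library(dplyr)
library(purrr)

df_d <- tibble(key = c("cat", "cat", "dog", "dog"),
               value_1 = c(1, 2, 3, 4),
               value_2 = c(2, 4, 6, 8))

# The OP's original functions: f2 takes one vector of length two
df_f <- tibble(key = c("cat", "dog"),
               f = list(function(x) x^2, function(x) sqrt(x)),
               f2 = list(function(x) (x[1] + x[2])^2,
                         function(x) sqrt(x[1] + x[2])))

df_both <- left_join(df_d, df_f, by = "key")

res <- df_both %>%
  mutate(
    # one function and one value per row
    result = map2_dbl(f, value_1, function(fn, x) fn(x)),
    # one function and two values per row, packed into a vector for f2
    result2 = pmap_dbl(list(f2, value_1, value_2),
                       function(fn, x, y) fn(c(x, y)))
  )
```

Passing the columns directly inside mutate also avoids subsetting the `.` placeholder.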
Source: https://stackoverflow.com/questions/49266076/apply-data-frame-with-list-variable-of-multivariable-functions-to-a-data-frame-w