How to use sample and seq in a dplyr pipeline?

Submitted by 徘徊边缘 on 2020-01-15 10:21:27

Question


I have a data frame with two columns, low and high. Using dplyr, I would like to create a new variable that is a randomly selected value between low and high (inclusive, with equal probability). I have tried

library(tidyverse)

tibble(low = 1:10, high = 11) %>%
    mutate(rand_btwn = base::sample(seq(low, high, by = 1), size = 1))

which gives me an error since seq expects scalar arguments.

I then tried again using a vectorized version of seq

seq2 <- Vectorize(seq.default, vectorize.args = c("from", "to"))

tibble(low = 1:10, high = 11) %>%
    mutate(rand_btwn = base::sample(seq2(low, high, by = 1), size = 1))

but this does not give me the desired result either.


Answer 1:


To avoid the rowwise() pattern, I usually prefer to map() in mutate(), like:

set.seed(123)
tibble(low = 1:10, high = 11) %>%
  mutate(rand_btwn = map_int(map2(low, high, seq), sample, size = 1))
# # A tibble: 10 x 3
#      low  high rand_btwn
#    <int> <dbl>     <int>
#  1     1    11         4
#  2     2    11         9
#  3     3    11         6
#  4     4    11        11
#  5     5    11        11
#  6     6    11         6
#  7     7    11         9
#  8     8    11        11
#  9     9    11        10
# 10    10    11        10

or:

set.seed(123)
tibble(low = 1:10, high = 11) %>%
  mutate(rand_btwn = map2_int(low, high, ~ sample(seq(.x, .y), 1)))
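
If building each sequence is not needed, the same uniform draw can probably be done with plain vectorized arithmetic. The following is only a sketch, not part of the original answer: it assumes low and high are whole numbers with low <= high, and the name res is made up for illustration. Because it draws from runif() rather than sample(), the numbers will differ from the outputs shown above even with the same seed.

```r
library(dplyr)

set.seed(123)

# tibble() is the current replacement for the deprecated data_frame()
res <- tibble(low = 1:10, high = 11) %>%
  mutate(
    # runif() gives one value in (0, 1) per row; scaling by the number of
    # candidates (high - low + 1) and flooring yields an offset 0..(high - low)
    rand_btwn = as.integer(low + floor(runif(n()) * (high - low + 1)))
  )

res
```

This avoids calling seq() and sample() once per row, which may matter for large data frames.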

Your Vectorize() approach also works:

sample_v <- Vectorize(function(x, y) sample(seq(x, y), 1))

set.seed(123)
tibble(low = 1:10, high = 11) %>%
  mutate(rand_btwn = sample_v(low, high))
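
One caveat applies to all of the sample(seq(...), 1) variants above: when low equals high, seq() returns a single number, and sample() treats a single number x >= 1 as shorthand for 1:x, so the result can fall below low. The example data never hits this case (low stops at 10), but a more defensive version might look like the sketch below. The helper name sample_between is hypothetical, and integer inputs are assumed.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: draw one integer from low..high, inclusive.
sample_between <- function(low, high) {
  values <- seq(low, high)
  # Sample a position and index into the vector, so that a length-one
  # `values` is returned as-is instead of being expanded to 1:values
  values[sample.int(length(values), 1)]
}

set.seed(123)

# The last row has low == high, the case that trips up sample(seq(...), 1)
res <- tibble(low = 1:11, high = 11L) %>%
  mutate(rand_btwn = map2_int(low, high, sample_between))

res
```

With this helper, the row where low and high are both 11 always yields 11.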


Source: https://stackoverflow.com/questions/47519201/how-to-use-sample-and-seq-in-a-dplyr-pipline
