tidyverse

using purrr to extract elements from multiple lists starting with a common letter

岁酱吖の submitted on 2020-01-16 05:42:07
Question: I have a list of lists. One element in each list has a name beginning with "n_". How do I extract these elements and store them in a separate list? Can I use a combination of map and starts_with? E.g.: m1 <- list(n_age = c(19,40,39), names = c("a", "b", "c")) m2 <- list(n_gender = c("m","f","f"), names = c("f", "t", "d")) nice_list <- list(m1, m2) I was hoping that something like the following would work (it doesn't!): output <- map(nice_list, starts_with("n_")) Answer 1: How about this? map(nice
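The answer excerpt above is cut off, so the following is only one possible approach, not necessarily the one the answer used. starts_with() is a tidyselect helper meant for selecting data frame columns, so this sketch filters each inner list by name with base R's startsWith() inside map().

```r
library(purrr)

m1 <- list(n_age = c(19, 40, 39), names = c("a", "b", "c"))
m2 <- list(n_gender = c("m", "f", "f"), names = c("f", "t", "d"))
nice_list <- list(m1, m2)

# For each inner list, keep only the elements whose name starts with "n_".
output <- map(nice_list, function(x) x[startsWith(names(x), "n_")])

str(output)
```

Each element of output is still a list (holding n_age or n_gender); to flatten the result into a single named list, do.call(c, output) can be applied afterwards.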

How to use sample and seq in a dplyr pipeline?

徘徊边缘 submitted on 2020-01-15 10:21:27
Question: I have a data frame with two columns, low and high. I would like to create a new variable that is a randomly selected value between low and high (inclusive, with equal probability) using dplyr. I have tried library(tidyverse) data_frame(low = 1:10, high = 11) %>% mutate(rand_btwn = base::sample(seq(low, high, by = 1), size = 1)) which gives me an error since seq expects scalar arguments. I then tried again using a vectorized version of seq seq2 <- Vectorize(seq.default, vectorize.args = c("from"
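A minimal sketch of one way to draw a value per row, assuming a row-wise iteration with purrr::map2_dbl() is acceptable; it is not the approach from the truncated attempt above. It uses tibble() because data_frame() is deprecated, and set.seed() is included only to make the illustration reproducible.

```r
library(dplyr)
library(purrr)

set.seed(1)  # only for a reproducible illustration

df <- tibble(low = 1:10, high = 11)

result <- df %>%
  mutate(rand_btwn = map2_dbl(low, high, function(lo, hi) {
    candidates <- seq(lo, hi, by = 1)
    # Index explicitly: sample(x, 1) would treat a length-one x as 1:x.
    candidates[sample.int(length(candidates), 1)]
  }))

result
```

Because seq() is called once per row with scalar arguments, the error described in the question does not arise.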

How to give dplyr a SQL query and have it return a remote tbl object?

爷,独闯天下 submitted on 2020-01-15 06:50:09
Question: Say I have a remote tbl open using dbplyr, and I want to use a SQL query on it (maybe because there's no dbplyr translation for what I want to do). How do I supply the query so that it returns a remote tbl object? The DBI::dbGetQuery() function allows you to send a query to the db, but it returns a data frame in memory, and not a remote tbl object. For example, say you already have a connection con open to a db, you can create a table like this: library(tidyverse) x_df <- expand.grid(A = c('a','b','c'
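As a hedged sketch of the idea in the question: dplyr's tbl() accepts a query wrapped in sql(), which yields a lazy remote table rather than an in-memory data frame. The in-memory SQLite connection, the table name demo, and its contents are assumptions made for illustration (they require the RSQLite package) and are not the table from the truncated example above.

```r
library(dplyr)
library(dbplyr)

# Assumption: an in-memory SQLite database stands in for the real connection.
con <- DBI::dbConnect(RSQLite::SQLite(), ":memory:")

# Hypothetical table used only for this sketch.
DBI::dbWriteTable(con, "demo", data.frame(A = c("a", "b", "c"), B = 1:3))

# Wrapping the query in sql() gives a remote (lazy) tbl.
remote <- tbl(con, sql("SELECT A, B * 2 AS B2 FROM demo WHERE B > 1"))

is_remote <- inherits(remote, "tbl_lazy")

# Further dplyr verbs are translated to SQL and run in the database
# until collect() pulls the result into memory.
local_copy <- remote %>% filter(B2 > 4) %>% collect()

DBI::dbDisconnect(con)
```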

loop to multiply across columns

梦想的初衷 submitted on 2020-01-15 03:12:34
Question: I have a data frame with columns labeled sales1, sales2, price1, price2 and I want to calculate revenues by multiplying sales1 * price1 and so on across each number in an iterative fashion. data <- data_frame( "sales1" = c(1, 2, 3), "sales2" = c(2, 3, 4), "price1" = c(3, 2, 2), "price2" = c(3, 3, 5)) data # A tibble: 3 x 4 # sales1 sales2 price1 price2 # <dbl> <dbl> <dbl> <dbl> #1 1 2 3 3 #2 2 3 2 3 #3 3 4 2 5 Why doesn't the following code work? data %>% mutate ( for (i in seq_along(1:2))
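The attempted code above is cut off, but one relevant fact is that a for loop in R returns NULL invisibly, so placing one inside mutate() gives mutate() no column to add. A minimal sketch of an alternative, assuming the output columns are to be named revenue1 and revenue2 (hypothetical names), builds each column by name in an ordinary loop:

```r
library(tibble)

data <- tibble(
  sales1 = c(1, 2, 3),
  sales2 = c(2, 3, 4),
  price1 = c(3, 2, 2),
  price2 = c(3, 3, 5)
)

# Multiply the matching sales/price pair for each suffix and store it
# under a new, hypothetical column name "revenue<i>".
for (i in 1:2) {
  data[[paste0("revenue", i)]] <-
    data[[paste0("sales", i)]] * data[[paste0("price", i)]]
}

data
```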

iteratively apply ggplot function within a map function

送分小仙女□ submitted on 2020-01-14 19:01:30
Question: I would like to generate a series of histograms for all variables in a dataset, but I am clearly not preparing the data correctly for use in the map function. library(tidyverse) mtcars %>% select(wt, disp, hp) %>% map(., function(x) ggplot(aes(x = x)) + geom_histogram() ) I can accomplish this task with a for loop, but am trying to do the same thing within the tidyverse. foo <- function(df) { nm <- names(df) for (i in seq_along(nm)) { print( ggplot(df, aes_string(x = nm[i])) + geom
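In the map() attempt above, ggplot() receives the aes() mapping as its first argument and no data frame at all. One possible fix, sketched here under the assumption that iterating over column names is acceptable, passes mtcars explicitly and looks each column up through the .data pronoun; bins = 30 is an arbitrary choice for the illustration.

```r
library(purrr)
library(ggplot2)

vars <- c("wt", "disp", "hp")

# Build one histogram per column name, keeping the name as the axis label.
plots <- map(vars, function(v) {
  ggplot(mtcars, aes(x = .data[[v]])) +
    geom_histogram(bins = 30) +
    labs(x = v)
})
names(plots) <- vars

# To draw them all, for example: walk(plots, print)
```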

Create a new column based on an index column

雨燕双飞 submitted on 2020-01-14 13:53:12
Question: I have a dataset containing n observations and a column containing observation indices, e.g. col1 col2 col3 ID 12 0 4 1 6 5 3 1 5 21 42 2 and want to create a new column based on my index like col1 col2 col3 ID col_new 12 0 4 1 12 6 5 3 1 6 5 21 42 2 21 without for loops. Currently I'm doing col_new <- rep(NA, length(ID)) for (i in 1:length(ID)) { col_new[i] <- df[i, ID[i]] } Is there a better (or tidyverse) way? Answer 1: We can use row/column indexing from base R which should be very fast df1
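The answer above is cut off after mentioning base R row/column indexing. A sketch of that idea, not necessarily the answer's exact code, indexes a matrix with a two-column matrix of (row, column) pairs:

```r
df <- data.frame(
  col1 = c(12, 6, 5),
  col2 = c(0, 5, 21),
  col3 = c(4, 3, 42),
  ID   = c(1, 1, 2)
)

# Matrix of the value columns only, so ID indexes into col1..col3.
value_cols <- as.matrix(df[c("col1", "col2", "col3")])

# Each row of the index matrix is a (row number, column number) pair.
df$col_new <- value_cols[cbind(seq_len(nrow(df)), df$ID)]

df
```

This picks one cell per row in a single vectorized step, which replaces the explicit loop from the question.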

Installing tidyverse on Ubuntu 18.x & R 3.4.4/3.5.1

。_饼干妹妹 submitted on 2020-01-13 10:08:21
Question: I attempted to install tidyverse (and the packages that make up tidyverse) and got the following output: > install.packages('tidyverse', dependencies=TRUE, type="source") Installing package into ‘/home/aos11409/R/x86_64-pc-linux-gnu-library/3.4’ (as ‘lib’ is unspecified) also installing the dependencies ‘dbplyr’, ‘modelr’ trying URL 'https://cloud.r-project.org/src/contrib/dbplyr_1.2.2.tar.gz' Content type 'application/x-gzip' length 263687 bytes (257 KB) =====================================

How to separate a column in dplyr based on regex

若如初见. submitted on 2020-01-12 10:11:42
Question: I have the following data frame: df <- structure(list(X2 = c("BB_137.HVMSC", "BB_138.combined.HVMSC", "BB_139.combined.HVMSC", "BB_140.combined.HVMSC", "BB_141.HVMSC", "BB_142.combined.HMSC-bm")), .Names = "X2", row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame")) Which looks like this > df # A tibble: 6 x 1 X2 <chr> 1 BB_137.HVMSC 2 BB_138.combined.HVMSC 3 BB_139.combined.HVMSC 4 BB_140.combined.HVMSC 5 BB_141.HVMSC 6 BB_142.combined.HMSC-bm What I want to do is to separate into
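The desired output is cut off in the excerpt above, so the target columns here are purely an assumption: the sketch splits X2 into the part before the first dot and the part after the last dot, under the hypothetical column names id and cell_type. It illustrates regex-based splitting inside mutate() and may not match what the asker wanted.

```r
library(dplyr)

df <- tibble(X2 = c(
  "BB_137.HVMSC", "BB_138.combined.HVMSC", "BB_139.combined.HVMSC",
  "BB_140.combined.HVMSC", "BB_141.HVMSC", "BB_142.combined.HMSC-bm"
))

result <- df %>%
  mutate(
    # Hypothetical column: drop everything from the first dot onward.
    id = sub("\\..*$", "", X2),
    # Hypothetical column: drop everything up to and including the last dot.
    cell_type = sub("^.*\\.", "", X2)
  )

result
```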

In nested data frame, pass information from one list column to function applied in another

会有一股神秘感。 submitted on 2020-01-11 12:33:36
Question: I am working on a report for which I have to export a large number of similar data frames into nice-looking tables in Word. My goal is to achieve this in one go, using flextable to generate the tables and purrr / tidyverse to apply all the formatting procedures to all rows in a nested data frame. This is what my data frame looks like: df <- data.frame(school = c("A", "B", "A", "B", "A", "B"), students = c(round(runif(6, 1, 10), 0)), grade = c(1, 1, 2, 2, 3, 3)) I want to generate separate
