tidyverse | 易学教程

using purrr to extract elements from multiple lists starting with a common letter

Question: I have a list of lists. One element in each list has a name beginning with "n_". How do I extract these elements and store them in a separate list? Can I use a combination of map and starts_with? E.g.: m1 <- list(n_age = c(19,40,39), names = c("a", "b", "c")) m2 <- list(n_gender = c("m","f","f"), names = c("f", "t", "d")) nice_list <- list(m1, m2) I was hoping that something like the following would work (it doesn't!): output <- map(nice_list, starts_with("n_")) Answer 1: How about this? map(nice
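A minimal sketch of one possible approach, not necessarily what the truncated answer above goes on to suggest. starts_with() is a tidyselect helper meant for use inside column-selecting verbs such as select(), so it cannot be handed to map() as a function. Subsetting each inner list by name with base R's startsWith() is one alternative:

```r
library(purrr)

m1 <- list(n_age = c(19, 40, 39), names = c("a", "b", "c"))
m2 <- list(n_gender = c("m", "f", "f"), names = c("f", "t", "d"))
nice_list <- list(m1, m2)

# For each inner list, keep only the elements whose name starts with "n_".
output <- map(nice_list, function(x) x[startsWith(names(x), "n_")])

str(output)
```

Each element of output is still a list, so the original element names (n_age, n_gender) are preserved.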

How to use sample and seq in a dplyr pipeline?

Question: I have a data frame with two columns, low and high. I would like to create a new variable that is a randomly selected value between low and high (inclusive and with equal probability) using dplyr. I have tried library(tidyverse) data_frame(low = 1:10, high = 11) %>% mutate(rand_btwn = base::sample(seq(low, high, by = 1), size = 1)), which gives me an error since seq expects scalar arguments. I then tried again using a vectorized version of seq: seq2 <- Vectorize(seq.default, vectorize.args = c("from"
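A hedged sketch of one way to sidestep the scalar-argument problem. It swaps sample() and seq() for plain vectorized arithmetic on runif(), and uses tibble() in place of the deprecated data_frame(). It assumes low and high hold whole numbers with low <= high:

```r
library(dplyr)

set.seed(1)  # only so the sketch is reproducible

df <- tibble(low = 1:10, high = 11) %>%
  # runif() lies strictly between 0 and 1, so the floor() term is a whole
  # number from 0 to (high - low), each value equally likely.
  mutate(rand_btwn = low + floor(runif(n()) * (high - low + 1)))

df
```

Because every step is vectorized, one draw is made per row without any explicit loop.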

How to give dplyr a SQL query and have it return a remote tbl object?

Question: Say I have a remote tbl open using dbplyr, and I want to run a SQL query on it (maybe because there's no dbplyr translation for what I want to do). How do I pass the query so that it returns a remote tbl object? The DBI::dbGetQuery() function allows you to send a query to the db, but it returns a data frame in memory, not a remote tbl object. For example, say you already have a connection con open to a db; you can create a table like this: library(tidyverse) x_df <- expand.grid(A = c('a','b','c'

loop to multiply across columns

Question: I have a data frame with columns labeled sales1, sales2, price1, price2 and I want to calculate revenues by multiplying sales1 * price1 and so on across each number in an iterative fashion. data <- data_frame( "sales1" = c(1, 2, 3), "sales2" = c(2, 3, 4), "price1" = c(3, 2, 2), "price2" = c(3, 3, 5))

data
# A tibble: 3 x 4
#   sales1 sales2 price1 price2
#    <dbl>  <dbl>  <dbl>  <dbl>
#1       1      2      3      3
#2       2      3      2      3
#3       3      4      2      5

Why doesn't the following code work? data %>% mutate ( for (i in seq_along(1:2))
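The attempted code is cut off above, so the following is an assumption about the likely cause: a for loop in R evaluates to NULL, so placing one inside mutate() gives mutate() nothing to turn into columns. A minimal sketch of one alternative builds the column names with paste0() and assigns each product directly. The output names revenue1 and revenue2 are hypothetical, chosen only for illustration:

```r
library(dplyr)

data <- tibble(
  sales1 = c(1, 2, 3), sales2 = c(2, 3, 4),
  price1 = c(3, 2, 2), price2 = c(3, 3, 5)
)

# Loop over the numeric suffixes and multiply the matching columns.
for (i in 1:2) {
  data[[paste0("revenue", i)]] <-
    data[[paste0("sales", i)]] * data[[paste0("price", i)]]
}

data
```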

iteratively apply ggplot function within a map function

Question: I would like to generate a series of histograms for all variables in a dataset, but I am clearly not preparing the data correctly for use in the map function. library(tidyverse) mtcars %>% select(wt, disp, hp) %>% map(., function(x) ggplot(aes(x = x)) + geom_histogram() ) I can accomplish this task with a for loop, but I am trying to do the same thing within the tidyverse. foo <- function(df) { nm <- names(df) for (i in seq_along(nm)) { print( ggplot(df, aes_string(x = nm[i])) + geom
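In the map() attempt above, ggplot() receives aes(x = x) as its first argument, which is the data slot, so no data frame ever reaches the plot. A hedged sketch of one alternative maps over the column names and looks each column up with the .data pronoun. The bins = 30 setting and the axis label are illustrative choices, not taken from the question:

```r
library(ggplot2)
library(purrr)

vars <- c("wt", "disp", "hp")

# One histogram per column name; set_names() makes the result a named list.
plots <- map(set_names(vars), function(v) {
  ggplot(mtcars, aes(x = .data[[v]])) +
    geom_histogram(bins = 30) +
    labs(x = v)
})

names(plots)
```

Individual plots can then be drawn with, for example, print(plots$wt).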

Create a new column based on an index column

Question: I have a dataset containing n observations and a column containing observation indices, e.g.

col1 col2 col3 ID
12   0    4    1
6    5    3    1
5    21   42   2

and want to create a new column based on my index, like

col1 col2 col3 ID col_new
12   0    4    1  12
6    5    3    1  6
5    21   42   2  21

without for loops. Currently I'm doing col_new <- rep(NA, length(ID)) for (i in 1:length(ID)) { col_new[i] <- df[i, ID[i]] } Is there a better (or tidyverse) way? Answer 1: We can use row/column indexing from base R which should be very fast df1
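The answer is truncated, so the following is only a sketch of what base R row/column indexing can look like for the example data, not a reconstruction of the answer's code. Indexing a matrix with a two-column matrix of (row, column) pairs picks one cell per row:

```r
df1 <- data.frame(
  col1 = c(12, 6, 5),
  col2 = c(0, 5, 21),
  col3 = c(4, 3, 42),
  ID   = c(1, 1, 2)
)

# Value columns as a matrix, so (row, column) pairs can index single cells.
values <- as.matrix(df1[c("col1", "col2", "col3")])

# Pair each row number with the column number stored in ID.
df1$col_new <- values[cbind(seq_len(nrow(df1)), df1$ID)]

df1
```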

Installing tidyverse on Ubuntu 18.x & R 3.4.4/3.5.1

Question: I attempted to install tidyverse (and the packages that make up tidyverse) and got the following output: > install.packages('tidyverse', dependencies=TRUE, type="source") Installing package into ‘/home/aos11409/R/x86_64-pc-linux-gnu-library/3.4’ (as ‘lib’ is unspecified) also installing the dependencies ‘dbplyr’, ‘modelr’ trying URL 'https://cloud.r-project.org/src/contrib/dbplyr_1.2.2.tar.gz' Content type 'application/x-gzip' length 263687 bytes (257 KB) =====================================

How to separate a column in dplyr based on regex

Question: I have the following data frame: df <- structure(list(X2 = c("BB_137.HVMSC", "BB_138.combined.HVMSC", "BB_139.combined.HVMSC", "BB_140.combined.HVMSC", "BB_141.HVMSC", "BB_142.combined.HMSC-bm")), .Names = "X2", row.names = c(NA, -6L), class = c("tbl_df", "tbl", "data.frame")) which looks like this:

> df
# A tibble: 6 x 1
  X2
  <chr>
1 BB_137.HVMSC
2 BB_138.combined.HVMSC
3 BB_139.combined.HVMSC
4 BB_140.combined.HVMSC
5 BB_141.HVMSC
6 BB_142.combined.HMSC-bm

What I want to do is to separate into

In nested data frame, pass information from one list column to function applied in another

Question: I am working on a report for which I have to export a large number of similar data frames into nice-looking tables in Word. My goal is to achieve this in one go, using flextable to generate the tables and purrr / tidyverse to apply all the formatting procedures to all rows in a nested data frame. This is what my data frame looks like: df <- data.frame(school = c("A", "B", "A", "B", "A", "B"), students = c(round(runif(6, 1, 10), 0)), grade = c(1, 1, 2, 2, 3, 3)) I want to generate separate