tidyverse | 易学教程

Using tidy eval for multiple dplyr filter conditions

Question: I'm new to tidy eval and trying to write generic functions. One thing I'm struggling with right now is writing multiple filter conditions for categorical variables. This is what I'm using right now: create_expr <- function(name, val){ if(!is.null(val)) val <- paste0("c('", paste0(val, collapse = "','"), "')") paste(name, "%in%", val) } my_filter <- function(df, cols, conds){ # Args: # df: dataframe which is to be filtered # cols: list of column names which are to be filtered # conds:
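One way to avoid pasting strings together is to build each condition as a quoted expression and splice the whole list into filter(). The sketch below rests on assumptions, because the description of conds is cut off above: conds is taken to be a list of value vectors in the same order as cols, and a NULL entry is treated as "no condition on that column". This my_filter() is a hypothetical rewrite, not the asker's final code.

```r
library(dplyr)
library(purrr)
library(rlang)

# Hypothetical rewrite of my_filter(); assumes `conds` is a list of value
# vectors parallel to `cols`, with NULL meaning "do not filter this column".
my_filter <- function(df, cols, conds) {
  keep <- !map_lgl(conds, is.null)            # drop columns without a condition
  exprs <- map2(cols[keep], conds[keep], function(col, val) {
    expr(.data[[!!col]] %in% !!val)           # e.g. .data[["cyl"]] %in% c(4, 6)
  })
  filter(df, !!!exprs)                        # spliced conditions are ANDed
}

res <- my_filter(mtcars, list("cyl", "gear", "carb"), list(c(4, 6), 4, NULL))
```

Because the values are unquoted with !! rather than pasted into a string, this also works for values that contain quotes or are not character vectors.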

How to use dplyr programming syntax to create and evaluate variable names

Question: I would like to dynamically input a variable name using dplyr programming syntax; however, as many have described, this can be quite confusing. I've played around with various combinations of quo/enquo, !!, etc. to no avail. Here is the simplest form of my code: library(tidyverse) df <- tibble( color1 = c("blue", "blue", "blue", "blue", "blue"), color2 = c("black", "black", "black", "black", "black"), value = 1:5 ) num <- 2 df %>% mutate(color3 = !!(paste0("color", num))) #> # A tibble: 5 x 4 #>
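The attempt above unquotes a character string, so color3 is filled with the literal text "color2" rather than the contents of that column. A minimal sketch of two common fixes follows, with a shortened but equivalent df; the names col_name, out1 and out2 are introduced here for illustration only.

```r
library(dplyr)
library(rlang)

df <- tibble(
  color1 = rep("blue", 5),
  color2 = rep("black", 5),
  value  = 1:5
)
num <- 2
col_name <- paste0("color", num)   # the string "color2"

# Option 1: look the column up by name through the .data pronoun
out1 <- df %>% mutate(color3 = .data[[col_name]])

# Option 2: turn the string into a symbol, then unquote it
out2 <- df %>% mutate(color3 = !!sym(col_name))
```

Both give a color3 column holding the values of color2.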

Creating models and augmenting data without losing additional columns in dplyr/broom

Consider the following data / example. Each dataset contains a number of samples with one observation and one estimate: library(tidyverse) library(broom) data = read.table(text = ' dataset sample_id observation estimate A A1 4.8 4.7 A A2 4.3 4.5 A A3 3.1 2.9 A A4 2.1 2 A A5 1.1 1 B B1 4.5 4.3 B B2 3.9 4.1 B B3 2.9 3 B B4 1.8 2 B B5 1 1.2 ', header = TRUE) I want to calculate a linear model per dataset to remove any linear bias between observation and estimate, and get the fitted values next to the original ones: data %>% group_by(dataset) %>% do(lm(observation ~ estimate, data = .) %>% augment
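One possible approach, sketched here assuming the goal is to keep sample_id alongside the fitted values: nest the rows of each dataset, fit one model per nested table, and pass the original rows to augment() through its data argument so that no column is dropped. The column names model and augmented and the result name fitted are made up for this illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

data <- read.table(text = '
dataset sample_id observation estimate
A A1 4.8 4.7
A A2 4.3 4.5
A A3 3.1 2.9
A A4 2.1 2
A A5 1.1 1
B B1 4.5 4.3
B B2 3.9 4.1
B B3 2.9 3
B B4 1.8 2
B B5 1 1.2
', header = TRUE)

fitted <- data %>%
  group_by(dataset) %>%
  nest() %>%                                   # one row per dataset, rows in `data`
  mutate(
    model     = map(data, ~ lm(observation ~ estimate, data = .x)),
    augmented = map2(model, data, ~ augment(.x, data = .y))  # keeps sample_id
  ) %>%
  select(dataset, augmented) %>%
  unnest(augmented) %>%
  ungroup()
```

The result has one row per sample, with .fitted and .resid next to the original columns.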

R: create dummy variables based on a categorical variable of lists [duplicate]

Question: This question already has answers here: How can I split a character string into column vectors with a 1/0 value flag? (7 answers) Closed 7 months ago. I have a data frame with a categorical variable holding lists of strings of variable length (this matters, because otherwise this question would be a duplicate of this or this), e.g.: df <- data.frame(x = 1:5) df$y <- list("A", c("A", "B"), "C", c("B", "D", "C"), "E") df x y 1 1 A 2 2 A, B 3 3 C 4 4 B, D, C 5 5 E And the desired form is
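The desired output is cut off above, so the following is only a sketch under the assumption that one 0/1 indicator column per distinct string is wanted. It uses base R only; lvls, dummies and out are names introduced for the example.

```r
df <- data.frame(x = 1:5)
df$y <- list("A", c("A", "B"), "C", c("B", "D", "C"), "E")

# All distinct strings that occur anywhere in the list column
lvls <- sort(unique(unlist(df$y)))

# For each row, flag which of the distinct strings it contains.
# sapply() returns one column per row of df, so transpose the result.
dummies <- t(sapply(df$y, function(v) as.integer(lvls %in% v)))
colnames(dummies) <- lvls

out <- cbind(df["x"], dummies)
```

Each row of out then carries a 1 in the columns named after the strings in its list and a 0 elsewhere.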

Add row in each group using dplyr and add_row()

Question: If I add a new row to the iris dataset with: iris <- as_tibble(iris) > iris %>% add_row(.before=0) # A tibble: 151 × 5 Sepal.Length Sepal.Width Petal.Length Petal.Width Species <dbl> <dbl> <dbl> <dbl> <chr> 1 NA NA NA NA <NA> <--- Good! 2 5.1 3.5 1.4 0.2 setosa 3 4.9 3.0 1.4 0.2 setosa It works. So, why can't I add a new row on top of each "subset" with: iris %>% group_by(Species) %>% add_row(.before=0) Error: is.data.frame(df) is not TRUE Answer 1: If you want to use a grouped operation, you need
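The answer is cut off above, so what follows is not necessarily what it went on to recommend. As one possible sketch, group_modify() from dplyr applies a function to each group's rows as an ordinary tibble, which lets add_row() run once per group. It uses .before = 1 to put the empty row first in each group.

```r
library(dplyr)
library(tibble)

out <- as_tibble(iris) %>%
  group_by(Species) %>%
  # .x is the ungrouped tibble for one species (without the Species column);
  # add_row() with no values inserts a row of NAs
  group_modify(~ add_row(.x, .before = 1)) %>%
  ungroup()
```

Species is re-attached by group_modify(), so the inserted rows keep their group label while the measurement columns are NA.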

Separate contents of field

I'm sure this is very simple, and I think it's a case of using separate and gather. I have a single field in a dataframe, authorlist, an edited export of a PubMed search. It contains the authors of the publications, so it can hold either a single author or several collaborating authors. For example, this is just a selection of the options available: Author Drijgers RL, Verhey FR, Leentjens AF, Kahler S, Aalten P. What I'd like to do is create a single list of ALL authors so that I'd have something like Author Drijgers RL Verhey FR Leentjens AF Kahler S Aalten P How do I do that? I
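A minimal sketch of one way to get one author per row, assuming authorlist is a data frame with a single character column named Author and that authors are separated by a comma and optional spaces. It uses separate_rows() from tidyr instead of the separate/gather combination mentioned above.

```r
library(dplyr)
library(tidyr)
library(stringr)

authorlist <- tibble(
  Author = "Drijgers RL, Verhey FR, Leentjens AF, Kahler S, Aalten P."
)

authors <- authorlist %>%
  mutate(Author = str_remove(Author, "\\.$")) %>%   # drop the trailing period
  separate_rows(Author, sep = ",\\s*")              # one row per author
```

With several publications in authorlist, adding distinct(Author) at the end would reduce the result to unique names.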

counting the number of times a value appears in a column in relation to other columns in r

I am new to R and I have a dataframe very close to the one below. I would love to find a general way to get, for each country (Intro4) and id, the number of times the value "0" appears, plus 1. Intro4 number id 221 TAN 0 19 222 TAN 0 73 223 TAN 0 73 224 TOG 0 37 225 TOG 0 58 226 UGA 0 96 227 UGA 0 112 228 UGA 0 96 229 ZAM 0 40 230 ZAM 0 99 231 ZAM 0 139 I can do it by hand, but it is a big data frame and that would take forever. count() gives me the frequency but doesn't divide it between different countries. I have found a way to do it but I will have to select and filter for each individual
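count() accepts several grouping columns, so the frequency can be split by country and id in one call. The sketch below rebuilds the rows shown above and reads "how many times plus 1" as the count of zeros plus one, which is an assumption; n_zero and n_plus_one are names chosen here.

```r
library(dplyr)

df <- data.frame(
  Intro4 = c("TAN", "TAN", "TAN", "TOG", "TOG", "UGA",
             "UGA", "UGA", "ZAM", "ZAM", "ZAM"),
  number = 0,
  id     = c(19, 73, 73, 37, 58, 96, 112, 96, 40, 99, 139)
)

counts <- df %>%
  filter(number == 0) %>%                    # keep only the zeros
  count(Intro4, id, name = "n_zero") %>%     # one row per country/id pair
  mutate(n_plus_one = n_zero + 1)            # assumed meaning of "plus 1"
```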

How to apply same operation to multiple data frames in dplyr-R?

I would like to apply the same operation to multiple data frames in R but cannot work out how to do it. This is an example of a pipe operation in dplyr: library(dplyr) iris %>% mutate(Sepal=rowSums(select(.,starts_with("Sepal"))), Length=rowSums(select(.,ends_with("Length"))), Width=rowSums(select(.,ends_with("Width")))) iris2 <- iris iris3 <- iris Could you suggest how to apply the same pipe function to iris, iris2 and iris3? I need to use the dplyr piping operation. I suppose the map function may help, but as I have not fully understood the concept, I get errors when I try to apply it. Sample
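A common pattern, sketched here, is to wrap the pipeline in a function and map it over a named list of data frames. The helper name add_sums is made up for this example; the mutate() logic is the one from the question.

```r
library(dplyr)
library(purrr)

iris2 <- iris
iris3 <- iris

# Hypothetical helper wrapping the mutate() step from the question
add_sums <- function(d) {
  d %>%
    mutate(
      Sepal  = rowSums(select(d, starts_with("Sepal"))),
      Length = rowSums(select(d, ends_with("Length"))),
      Width  = rowSums(select(d, ends_with("Width")))
    )
}

# A named list keeps track of which result belongs to which input
results <- list(iris = iris, iris2 = iris2, iris3 = iris3) %>%
  map(add_sums)
```

Individual results are then available as results$iris, results$iris2 and results$iris3.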

Using nested function with lapply

This code works (it takes the hours, minutes and seconds and converts them to seconds only): library(lubridate) library(tidyverse) original_date_time<-"2018-01-31 11:59:59" period_to_seconds(hms(paste(hour(original_date_time), minute(original_date_time),second(original_date_time), sep = ":"))) I have this tibble: df<-data.frame("id"=c(1,2,3,4,5), "Time"=c("1999-12-31 10:10:10","1999-12-31 09:05:13","1999-12-31 00:05:25","1999-12-31 07:04","1999-12-31 03:05:07")) tib<-as_tibble(df) tib result: # A tibble: 5 x 2 id Time <dbl> <fct> 1 1 1999-12-31 10:10:10 2 2 1999-12-31 09:05:13 3 3 1999-12-31 00:05:25 4 4
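The lubridate accessors are vectorised, so a column can be converted inside mutate() without lapply. The sketch below assumes the goal is seconds since midnight for every row. It parses with ymd_hms(truncated = 1) so that the entry with no seconds ("07:04") is still read; parsed and seconds are names introduced here.

```r
library(dplyr)
library(lubridate)

tib <- tibble(
  id   = 1:5,
  Time = c("1999-12-31 10:10:10", "1999-12-31 09:05:13",
           "1999-12-31 00:05:25", "1999-12-31 07:04",
           "1999-12-31 03:05:07")
)

tib_secs <- tib %>%
  mutate(
    # as.character() guards against Time being stored as a factor
    parsed  = ymd_hms(as.character(Time), truncated = 1),
    seconds = hour(parsed) * 3600 + minute(parsed) * 60 + second(parsed)
  )
```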