dplyr | 易学教程

Conditionally replace values of multiple columns, from values of other multiple columns

Question: Suppose I have this dataset:

    set.seed(1234); data.frame(cbind(a=rep(c("si","no"),30),b=rnorm(60)), c=rep(c("d","e","f"),20)) %>% head()

Then I want to add many columns (only two in this example) to identify the distinct cases within each group (here, column "a"):

    set.seed(1234); data.frame(cbind(a=rep(c("si","no"),30),b=rnorm(60)),c=rep(c("d","e","f"),20)) %>% group_by(a) %>% dplyr::mutate_at(vars(c(b,c)), .funs= list(dups_hash_ing= ~n_distinct(.)))

This code leaves the ...

How to merge two different groupings if they are not disjoint with dplyr

Question: Suppose that I have two sets of identifiers, id1 and id2, in a data frame. How can I create a new identifier id3 that works as follows? I consider id1 the stricter key, so observations are grouped first by id1 and then by id2. If two sets of rows have different values of id2 but some of their elements share the same id1, these two sets should get the same value of id3 (the exact value of id3 doesn't matter much).

    df <- data.frame(id1 = c(1, 1, 2, 2, 5, 6), id2 = c(4, 3, ...

How to filter dataframe with multiple conditions?

Question: I have this data frame that I'd like to subset (if possible, with dplyr or base R functions):

    df <- data.frame(x = c(1,1,1,2,2,2), y = c(30,10,8,10,18,5))

    x  y
    1 30
    1 10
    1  8
    2 10
    2 18
    2  5

Assuming x is a factor (so two conditions/levels), how can I subset/filter this data frame so that I get only the df$y values greater than 15 for df$x == 1, and the df$y values greater than 5 for df$x == 2? This is what I'd like to get:

    df2 <- data.frame(x = c(1,2,2), y = c(30,10,18))

    x  y
    1 30
    2 10
    2 ...
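One way to express the per-group thresholds described in the question is a single compound condition. The sketch below is only an illustration, not the answer from the original thread; it assumes x is stored as numeric, as in the question's `data.frame()` call, and the result names `df2_dplyr` and `df2_base` are made up for this example.

```r
library(dplyr)

# Data frame from the question
df <- data.frame(x = c(1, 1, 1, 2, 2, 2), y = c(30, 10, 8, 10, 18, 5))

# dplyr: keep rows where y exceeds the threshold that applies to that x
df2_dplyr <- df %>%
  filter((x == 1 & y > 15) | (x == 2 & y > 5))

# Base R: the same logical condition used as a row index
df2_base <- df[(df$x == 1 & df$y > 15) | (df$x == 2 & df$y > 5), ]

print(df2_dplyr)
```

If x were a factor, the same comparisons would be written against the level labels, for example `x == "1"`.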

Error with select function from dplyr

Question: When I use the select function from dplyr, it doesn't work and gives me an error stating that the column names I want to select are unused arguments. However, if I specify dplyr before the function call, like so: "dplyr::select", then it works as normal. Here is a sample df:

    sampledf <- structure(list(CRN = c(5497L, 6515L, 7248L, 36956L, 37021L), varA = structure(c(2L, 2L, 2L, 2L, 2L), .Label = c("A", "B"), class = "factor"), varB = c(NA_integer_, NA_integer_, NA_integer_, NA_integer_, NA ...

Difference between Distinct vs Unique

Question: What are the differences between distinct and unique in R when using dplyr, with regard to: speed; capabilities (valid inputs, parameters, etc.) and uses; output? For example:

    library(dplyr)
    data(iris)
    # creating data with duplicates
    iris_dup <- bind_rows(iris, iris)
    d <- distinct(iris_dup)
    u <- unique(iris_dup)
    all(d==u) # returns TRUE

In this example distinct and unique perform the same function. Are there examples of times you should use one but not the other? Are there any tricks or common uses ...
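As one illustration of a capability difference (a sketch, not the answer from the original thread): `distinct()` accepts column names and a `.keep_all` argument, whereas `unique()` on a data frame compares whole rows. The names `by_species` and `by_species_base` are made up for this example.

```r
library(dplyr)

# Same duplicated data as in the question
iris_dup <- bind_rows(iris, iris)

# distinct() can deduplicate on selected columns; .keep_all = TRUE keeps
# the remaining columns from the first row of each distinct value
by_species <- distinct(iris_dup, Species, .keep_all = TRUE)

# unique() on a data frame compares whole rows, so a base R counterpart
# indexes with duplicated() on the chosen column instead
by_species_base <- iris_dup[!duplicated(iris_dup$Species), ]

nrow(by_species)       # one row per species
nrow(by_species_base)
```

Which of the two is faster depends on the data and package versions, so any speed claim is better checked with a benchmark on your own data.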
