dplyr

write table in database with dplyr

Submitted by 旧时模样 on 2021-02-07 12:45:28
Question: Is there a way to make dplyr, hooked up to a database, pipe data to a new table within that database without ever downloading the data locally? I'd like to do something along the lines of: tbl(con, "mytable") %>% group_by(dt) %>% tally() %>% write_to(name = "mytable_2", schema = "transformed") Answer 1: While I wholeheartedly agree with the suggestion to learn SQL, you can take advantage of the fact that dplyr doesn't pull data until it absolutely has to and build the query using dplyr, add the TO TABLE
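The truncated answer builds on dplyr's lazy evaluation; a minimal sketch of that idea (assuming the DBI, dbplyr, and RSQLite packages, with an illustrative in-memory database standing in for the real connection) is dbplyr's compute(), which materializes the lazy query into a new table server-side without collecting rows into R:

```r
library(DBI)
library(dplyr)
library(dbplyr)

# Illustrative in-memory database; in practice `con` is the
# question's existing connection.
con <- dbConnect(RSQLite::SQLite(), ":memory:")
dbWriteTable(con, "mytable", data.frame(dt = c("a", "a", "b")))

# compute() runs CREATE TABLE ... AS inside the database,
# so the grouped tally never travels to the R session.
tbl(con, "mytable") %>%
  group_by(dt) %>%
  tally() %>%
  compute(name = "mytable_2", temporary = FALSE)

result <- dbReadTable(con, "mytable_2")
dbDisconnect(con)
```

On backends with schema support, passing `name = in_schema("transformed", "mytable_2")` should target the schema from the question.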

How to mimic geom_boxplot() with outliers using geom_boxplot(stat = “identity”)

Submitted by 霸气de小男生 on 2021-02-07 12:38:21
Question: I would like to pre-compute by-variable summaries of data (with plyr, passing a quantile function) and then plot with geom_boxplot(stat = "identity"). This works great except that it (a) does not plot outliers as points and (b) extends the whiskers to the max and min of the data being plotted. Example: library(plyr) library(ggplot2) set.seed(4) df <- data.frame(fact = sample(letters[1:2], 12, replace = TRUE), val = c(1:10, 100, 101)) df # fact val # 1 b 1 # 2 a 2 # 3 a 3 # 4 a 4 # 5 b 5 # 6
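One way to recover both missing pieces, sketched below under the assumption that the standard 1.5 * IQR whisker rule is wanted (dplyr shown in place of plyr): compute the five boxplot statistics per group, cap the whiskers at the most extreme observations inside the fences, and layer the outliers on with a separate geom_point():

```r
library(dplyr)
library(ggplot2)

set.seed(4)
df <- data.frame(fact = sample(letters[1:2], 12, replace = TRUE),
                 val  = c(1:10, 100, 101))

stats <- df %>%
  group_by(fact) %>%
  summarise(lower  = quantile(val, 0.25),
            middle = quantile(val, 0.50),
            upper  = quantile(val, 0.75),
            iqr    = upper - lower,
            # whiskers end at the extreme points inside the fences,
            # mimicking geom_boxplot()'s default behaviour
            ymin   = min(val[val >= lower - 1.5 * iqr]),
            ymax   = max(val[val <= upper + 1.5 * iqr]),
            .groups = "drop")

# points beyond the whiskers, drawn as a separate layer
outliers <- df %>%
  left_join(stats, by = "fact") %>%
  filter(val < ymin | val > ymax)

p <- ggplot(stats, aes(x = fact)) +
  geom_boxplot(aes(ymin = ymin, lower = lower, middle = middle,
                   upper = upper, ymax = ymax),
               stat = "identity") +
  geom_point(data = outliers, aes(x = fact, y = val))
```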

In dplyr, what are the intrinsic differences between setdiff and anti_join?

Submitted by 放肆的年华 on 2021-02-07 12:17:07
Question: I'm still working through the lessons on DataCamp for R, so please forgive me if this question seems naïve. Consider the following (very contrived) sample: library(dplyr) library(tibble) type <- c("Dog", "Cat", "Cat", "Cat") name <- c("Ella", "Arrow", "Gabby", "Eddie") pets = tibble(name, type) name <- c("Ella", "Arrow", "Dog") type <- c("Dog", "Cat", "Calvin") favorites = tibble(name, type) anti_join(favorites, pets, by = "name") setdiff(favorites, pets, by = "name") Both of these return
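The heart of the difference, sketched on the question's own data: anti_join() filters on the columns named in `by` only, whereas setdiff() compares complete rows (and quietly ignores a `by` argument):

```r
library(dplyr)
library(tibble)

pets <- tibble(name = c("Ella", "Arrow", "Gabby", "Eddie"),
               type = c("Dog", "Cat", "Cat", "Cat"))
favorites <- tibble(name = c("Ella", "Arrow", "Dog"),
                    type = c("Dog", "Cat", "Calvin"))

# anti_join: drop favorites whose *name* appears in pets
aj <- anti_join(favorites, pets, by = "name")

# setdiff: drop favorites whose *entire row* appears in pets
sd <- setdiff(favorites, pets)

# Both happen to return the single row ("Dog", "Calvin") here,
# but a favorites row sharing a pet's name with a different type
# would survive setdiff while anti_join would drop it.
```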

Using cummean with group_by and ignoring NAs

Submitted by 送分小仙女□ on 2021-02-07 10:28:10
Question: df <- data.frame(category=c("cat1","cat1","cat2","cat1","cat2","cat2","cat1","cat2"), value=c(NA,2,3,4,5,NA,7,8)) I'd like to add a new column to the above dataframe that takes the cumulative mean of the value column, not taking NAs into account. Is it possible to do this with dplyr? I've tried df <- df %>% group_by(category) %>% mutate(new_col=cummean(value)) but cummean just doesn't know what to do with NAs. EDIT: I do not want to count NAs as 0. Answer 1: You could use ifelse to treat NAs as
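A sketch of one way to honour the EDIT and skip NAs rather than zero them out: divide an NA-free running sum by the running count of non-missing values. Rows before a group's first observed value come out NaN, since there is nothing to average yet:

```r
library(dplyr)

df <- data.frame(category = c("cat1","cat1","cat2","cat1","cat2","cat2","cat1","cat2"),
                 value    = c(NA, 2, 3, 4, 5, NA, 7, 8))

df <- df %>%
  group_by(category) %>%
  # coalesce() zeroes NAs in the numerator only; the denominator
  # counts non-missing values, so NAs are genuinely skipped
  mutate(new_col = cumsum(coalesce(value, 0)) / cumsum(!is.na(value))) %>%
  ungroup()
```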

Count values less than x and find nearest values to x by multiple groups

Submitted by 痞子三分冷 on 2021-02-07 08:53:43
Question: Sample data frame:

          uid      bas_id  dist2mouth  type
2020     2019  W3A9101601    2.413629     1
2021     2020  W3A9101601    2.413629     1
2022     2021  W3A9101602    2.413629     1
2023     2022  W3A9101602    3.313893     1
2032     2031  W3A9101602    3.313893     1
2033     2032  W3A9101602    3.313893     1
2034     2033  W3A9101602    3.313893     1
15023   15022  W3A9101601    1.349000     2
15025   15024  W3A9101601    3.880000     2
15026   15025  W3A9101602    3.880000     2
15027   15026  W3A9101602    0.541101     2
16106   17097  W3A9101602    1.349000     2

For each row I'd like to calculate how many rows of
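The question is cut off mid-sentence, so the exact condition is an assumption; as a hedged sketch, one plausible reading (for each row, count how many rows in the same bas_id group have a smaller dist2mouth) can be done with sapply() inside a grouped mutate:

```r
library(dplyr)

# abbreviated version of the sample data
data <- data.frame(
  uid        = c(2019, 2020, 2021, 2022, 15022, 15024),
  bas_id     = c("W3A9101601", "W3A9101601", "W3A9101602",
                 "W3A9101602", "W3A9101601", "W3A9101601"),
  dist2mouth = c(2.413629, 2.413629, 2.413629, 3.313893, 1.349, 3.88),
  type       = c(1, 1, 1, 1, 2, 2)
)

data <- data %>%
  group_by(bas_id) %>%
  # per group: compare each value against the whole group vector
  mutate(n_less = sapply(dist2mouth, function(x) sum(dist2mouth < x))) %>%
  ungroup()
```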

remove everything after the last underscore of a column in R [duplicate]

Submitted by 时光怂恿深爱的人放手 on 2021-02-07 06:55:47
Question: This question already has answers here: R regex find last occurrence of delimiter (4 answers). Closed 4 years ago. I have a dataframe, and for a particular column I want to strip out everything after the last underscore. So: test <- data.frame(label=c('test_test_test', 'test_tom_cat', 'tset_eat_food', 'tisk - tisk'), stuff=c('blah', 'blag', 'gah', 'nah'), numbers=c(1,2,3,4)) should become result <- data.frame(label=c('test_test', 'test_tom', 'tset_eat', 'tisk - tisk'), stuff=c('blah', 'blag
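One standard way (a sketch, not necessarily the linked answers' exact code) is a regex anchored at the end of the string:

```r
test <- data.frame(label = c("test_test_test", "test_tom_cat",
                             "tset_eat_food", "tisk - tisk"),
                   stuff = c("blah", "blag", "gah", "nah"),
                   numbers = c(1, 2, 3, 4))

# "_[^_]*$" matches the last underscore plus everything after it;
# strings with no underscore are returned unchanged.
test$label <- sub("_[^_]*$", "", test$label)
test$label
# "test_test" "test_tom" "tset_eat" "tisk - tisk"
```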

tidyverse: row wise calculations by group

Submitted by 纵然是瞬间 on 2021-02-07 06:25:40
Question: I am trying to do an inventory calculation in R that requires a row-wise calculation for each Mat-Plant combination. Here's a test data set: df <- structure(list(Mat = c("A", "A", "A", "A", "A", "A", "B", "B"), Plant = c("P1", "P1", "P1", "P2", "P2", "P2", "P1", "P1"), Day = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L), UU = c(0L, 10L, 0L, 0L, 0L, 120L, 10L, 0L), CumDailyFcst = c(11L, 22L, 33L, 0L, 5L, 10L, 20L, 50L)), .Names = c("Mat", "Plant", "Day", "UU", "CumDailyFcst"), class = "data.frame", row
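The question body is truncated, so the asker's exact inventory rule is unknown; as an illustrative sketch of the general pattern (a per-group running balance where each day depends on the previous day's result), purrr::accumulate() inside a grouped mutate handles the row-wise carry-over. The floor-at-zero rule below is a hypothetical stand-in, not the asker's formula:

```r
library(dplyr)
library(purrr)

df <- data.frame(
  Mat   = c("A", "A", "A", "A", "A", "A", "B", "B"),
  Plant = c("P1", "P1", "P1", "P2", "P2", "P2", "P1", "P1"),
  Day   = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L),
  UU    = c(0L, 10L, 0L, 0L, 0L, 120L, 10L, 0L),
  CumDailyFcst = c(11L, 22L, 33L, 0L, 5L, 10L, 20L, 50L)
)

df <- df %>%
  group_by(Mat, Plant) %>%
  arrange(Day, .by_group = TRUE) %>%
  # recover the daily forecast from the cumulative column, then
  # carry yesterday's balance forward, never dropping below zero
  mutate(daily_fcst = CumDailyFcst - lag(CumDailyFcst, default = 0L),
         balance = accumulate(UU - daily_fcst,
                              ~ max(.x + .y, 0),
                              .init = 0)[-1]) %>%
  ungroup()
```

The `.init = 0` seeds each group's balance and `[-1]` drops that seed so the result lines up one value per row.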