dplyr

tidyverse: row wise calculations by group

北城以北 submitted on 2021-02-07 06:25:12
Question: I am trying to do an inventory calculation in R that requires a row-wise calculation for each Mat-Plant combination. Here is a test data set:
    df <- structure(list(Mat = c("A", "A", "A", "A", "A", "A", "B", "B"), Plant = c("P1", "P1", "P1", "P2", "P2", "P2", "P1", "P1"), Day = c(1L, 2L, 3L, 1L, 2L, 3L, 1L, 2L), UU = c(0L, 10L, 0L, 0L, 0L, 120L, 10L, 0L), CumDailyFcst = c(11L, 22L, 33L, 0L, 5L, 10L, 20L, 50L)), .Names = c("Mat", "Plant", "Day", "UU", "CumDailyFcst"), class = "data.frame", row

Calculate all the absolute differences between 6 columns of a table using mutate? [duplicate]

眉间皱痕 submitted on 2021-02-07 04:13:28
Question: This question already has answers here: Pairwise subtraction in a dataframe R (2 answers). Closed 7 months ago. I have a table with 6 columns, Z1 to Z6, and I want to calculate the absolute value of the difference between each pair of these columns. So far, I enumerate all the differences in a mutate command:
    FactArray <- FactArray %>% mutate(diff12 = abs(Z1-Z2), diff13 = abs(Z1-Z3), diff14 = abs(Z1-Z4), diff15 = abs(Z1-Z5), diff16 = abs(Z1-Z6), diff23 = abs(Z2-Z3), diff24 = abs(Z2-Z4), diff25 =
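One way to avoid enumerating every pair by hand is to generate the column pairs with combn() and build the difference columns in a loop. The following is only a sketch: the values in FactArray are made up for illustration, and the diffXY naming simply follows the pattern used in the question.

```r
library(dplyr)

# Hypothetical stand-in for the FactArray table in the question.
FactArray <- tibble(
  Z1 = c(1, 5), Z2 = c(4, 2), Z3 = c(0, 7),
  Z4 = c(3, 3), Z5 = c(9, 1), Z6 = c(2, 8)
)

z_cols <- paste0("Z", 1:6)
# All unordered pairs of column names: (Z1, Z2), (Z1, Z3), ..., (Z5, Z6).
pairs <- combn(z_cols, 2, simplify = FALSE)

# One absolute-difference vector per pair.
diffs <- lapply(pairs, function(p) abs(FactArray[[p[1]]] - FactArray[[p[2]]]))

# Name each result diff12, diff13, ... from the digits of the two column names.
names(diffs) <- vapply(
  pairs,
  function(p) paste0("diff", sub("Z", "", p[1]), sub("Z", "", p[2])),
  character(1)
)

# Attach the 15 new columns to the original 6.
result <- bind_cols(FactArray, as.data.frame(diffs))
```

With 6 columns this produces choose(6, 2) = 15 difference columns, so result has 21 columns in total.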

Complex cumulative sum with double resets

て烟熏妆下的殇ゞ submitted on 2021-02-07 04:08:50
Question: I'm trying to follow some rules about when to group data to chart. How would I go from this data frame:
    # A tibble: 11 x 8
      assay      year   qtr invalid valid total_assays    hfr predicted_inv
      <chr>     <dbl> <dbl>   <dbl> <dbl>        <dbl>  <dbl>         <dbl>
    1 test_case 2016.    1.      2.   36.          38. 0.0350         1.33
    2 test_case 2016.    2.      1.   34.          35. 0.0350         1.23
    3 test_case 2016.    3.      0.   25.          25. 0.0350         0.875
    4 test_case 2016.    4.      2.   23.          25. 0.0350         0.875
    5 test_case 2017.    1.      1.   29.          30. 0.0350         1.05
    6 test_case 2017.    2.      2.   24.          26. 0.0350         0.910

dplyr arrange() function sort by missing values

喜欢而已 submitted on 2021-02-06 15:28:31
Question: I am attempting to work through Hadley Wickham's R for Data Science and have gotten tripped up on the following question: "How could you use arrange() to sort all missing values to the start? (Hint: use is.na())" I am using the flights dataset included in the nycflights13 package. Given that arrange() sorts all unknown values to the bottom of the data frame, I am not sure how one would do the opposite across the missing values of all variables. I realize that this question can be answered with
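The hint in the exercise can be sketched on a single column as follows. The small tibble here is a made-up stand-in for flights, not the real data set. Because is.na() returns TRUE or FALSE, sorting on desc(is.na(col)) places the TRUE (missing) rows first.

```r
library(dplyr)

# Hypothetical miniature stand-in for the flights data.
df <- tibble(
  id = 1:5,
  dep_time = c(517L, NA, 533L, NA, 542L)
)

# Missing values first, then the remaining rows in ascending dep_time order.
sorted <- df %>% arrange(desc(is.na(dep_time)), dep_time)
```

This handles one variable only. Extending it to the missing values of all variables, as the question asks, would need a sort key per column or a per-row count of missing values.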

What is the difference between . and .data?

て烟熏妆下的殇ゞ submitted on 2021-02-06 11:11:29
Question: I'm trying to develop a deeper understanding of using the dot (".") with dplyr and using the .data pronoun with dplyr. The code I was writing that motivated this post looked something like this:
    cat_table <- tibble(
      variable = vector("character"),
      category = vector("numeric"),
      n = vector("numeric")
    )
    for (i in c("cyl", "vs", "am")) {
      cat_stats <- mtcars %>%
        count(.data[[i]]) %>%
        mutate(variable = names(.)[1]) %>%
        rename(category = 1)
      cat_table <- bind_rows(cat_table, cat_stats)
    }
    # A tibble:
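As a rough illustration of the distinction, and not a full answer: the dot is the magrittr pipe's placeholder for the whole object on the left-hand side, while .data is a pronoun that looks up a column of the data frame a dplyr verb is currently working on. The variable name col below is an assumption made for this sketch.

```r
library(dplyr)

col <- "cyl"  # column name held in a string (hypothetical helper variable)

# .data[[col]] looks up the column named by the string inside a dplyr verb.
six_cyl <- mtcars %>% filter(.data[[col]] == 6)

# `.` stands for the entire left-hand-side object (here, all of mtcars).
# Nested inside nrow(), it does not stop mtcars from also being passed as
# the first argument, so this is head(mtcars, nrow(mtcars) / 2).
first_half <- mtcars %>% head(nrow(.) / 2)
```

In short, the dot belongs to the pipe and refers to the whole piped object, whereas .data belongs to tidy evaluation and refers to columns within the verb's current data.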

R dplyr left join - multiple returned values and new rows: how to ask for the first match only?

时间秒杀一切 submitted on 2021-02-06 09:34:24
Question: Let's say I have a list of suburb names and crime rates in one table, and their council names in a separate table. I know that left_join(table1, table2, by=Suburb) will return the table with newly added rows due to the multiple matches for council. The problem is that suburbs 3 and 4 overlap into two councils. Is there a way to get the left join to return only the first match, rather than creating new rows to accommodate the extra ones? In addition, on Table 2, is there a function to only keep the
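One possible approach, sketched here with made-up tables, is to reduce the lookup table to one row per suburb before joining, so the join cannot multiply rows. The column names CrimeRate and Council and all values are assumptions for illustration. Note that by takes a quoted column name.

```r
library(dplyr)

# Hypothetical versions of the two tables described in the question.
table1 <- tibble(
  Suburb = c("S1", "S2", "S3", "S4"),
  CrimeRate = c(10, 20, 30, 40)
)
table2 <- tibble(
  Suburb = c("S1", "S2", "S3", "S3", "S4", "S4"),
  Council = c("C1", "C1", "C1", "C2", "C2", "C3")
)

# Keep only the first row for each suburb in the lookup table.
first_match <- table2 %>% distinct(Suburb, .keep_all = TRUE)

# The join now returns exactly one row per row of table1.
joined <- left_join(table1, first_match, by = "Suburb")
```

Which council counts as "first" depends entirely on the row order of table2, so it may be worth sorting that table explicitly before calling distinct().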