tidyverse | 易学教程

dplyr - sum of multiple columns using regular expressions

Question: For the dataset mtcars2: mtcars2 = mtcars; mtcars2 = mtcars2 %>% mutate(cyl9=cyl, disp9=disp, gear2=gear). I want to get a new column which is the sum of multiple columns, using regular expressions to capture the pattern. This is a solution, but it is hard-coded: select(mtcars2, cyl9) + select(mtcars2, disp9) + select(mtcars2, gear2). I tried something like this, but it gives me a single number instead of a vector: mtcars2 %>% select(matches("[0-9]")) %>% sum. Please dplyr solutions only,
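One possible approach, sketched here under the assumption that dplyr 1.0 or later is available (for `across()`): `sum` collapses the whole selection to a single number, whereas `rowSums()` adds across the selected columns within each row. The column name `total` is a hypothetical choice for illustration.

```r
library(dplyr)

# Rebuild the example data from the question.
mtcars2 <- mtcars %>%
  mutate(cyl9 = cyl, disp9 = disp, gear2 = gear)

# across() selects every column whose name contains a digit;
# rowSums() then returns one sum per row instead of one grand total.
# `total` is a hypothetical name for the new column.
result <- mtcars2 %>%
  mutate(total = rowSums(across(matches("[0-9]"))))

head(result$total)
```

In this data only `cyl9`, `disp9` and `gear2` contain a digit in their names, so the new column should equal the hard-coded sum from the question.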

Calculate the mean between several columns of df2 that can vary according to the variable `var1` of df1 and add the value to a new variable in df1

Question: I have a data frame df1 that summarises the depth of different fishes over time at different places. On the other hand, I have df2 that summarises the intensity of the currents over time (EVERY THREE HOURS) from the surface to 39 meters depth at intervals of 8 meters (m0-7, m8-15, m16-23, m24-31 and m32-39) in a specific place. As an example: df1<-data.frame(Datetime=c("2016-08-01 15:34:07","2016-08-01 16:25:16","2016-08-01 17:29:16","2016-08-01 18:33:16","2016-08-01 20:54:16","2016-08

How to fix 'Quosures can only be unquoted within a quasiquotation context' error in R function

Question: I am trying to write my first function using rlang and I am having some trouble fixing the following error. I've read the vignette, but didn't see a good example of what I'm trying to do. library(babynames) library(tidyverse) name_graph <- function(data, name, sex){ name <- enquo(name) sex <- enquo(sex) data %>% filter_(name == !!name, sex == !!sex) %>% select(year, prop) %>% ggplot()+ geom_line(mapping = aes(year, prop)) } name_graph(babynames, Robert, M) I'm expecting my distribution graph,

Left joining in R between two timestamps

Question: My goal is to perform a left join on intervals where the bike_id matches and the created_at timestamp in records is BETWEEN start and end in the intervals table. > class(records) [1] "data.table" "data.frame" > class(intervals) [1] "data.table" "data.frame" > records bike_id created_at resolved_at 1 28780 2019-05-03 08:29:18 2019-05-03 08:35:37 2 28780 2019-05-03 21:05:28 2019-05-03 21:07:28 3 28780 2019-05-04 21:13:39 2019-05-04 21:15:40 4 28780 2019-05-07 17:24:20 2019-05-07 17:26:39 5 28780

Passing column names through multiple functions with dplyr

Question: I wrote a simple function to create tables of percentages in dplyr: library(dplyr) df = tibble( Gender = sample(c("Male", "Female"), 100, replace = TRUE), FavColour = sample(c("Red", "Blue"), 100, replace = TRUE) ) quick_pct_tab = function(df, col) { col_quo = enquo(col) df %>% count(!! col_quo) %>% mutate(Percent = (100 * n / sum(n))) } df %>% quick_pct_tab(FavColour) # Output: # A tibble: 2 x 3 FavColour n Percent <chr> <int> <dbl> 1 Blue 58 58 2 Red 42 42 This works great. However, when I

mutate_at evaluation error when using group_by

Question: mutate_at() shows an evaluation error when used with group_by() and when inputting a numerical vector for column position as the first (.vars) argument. The issue shows up when using R 3.4.2 and dplyr 0.7.4. It works fine when using R 3.3.2 and dplyr 0.5.0. It works fine if .vars is a character vector (column name). Example: # Create example dataframe Id <- c('10_1', '10_2', '11_1', '11_2', '11_3', '12_1') Month <- c(2, 3, 4, 6, 7, 8) RWA <- c(0, 0, 0, 1.579, NA, 0.379) dftest = data.frame(Id, Month,

Non-standard eval in dplyr::mutate

Question: In theory this should work, as I've read the tidyverse guide on NSE, but it throws an error, as shown at the bottom of this example. Why is this? I understand how to do a simple quasiquotation of an object, but I do not understand how to evaluate a fraction of two quasiquoted objects. Can anyone help with this? tmp <- structure(list(qa11a = structure(c(1616, 7293, 1528, 1219, 2049, 286), label = "Total voters removed from Nov. 2008 to Nov. 2010", class = c("labelled","numeric")), state_abbv

Separate string after last underscore

Question: This is indeed a duplicate of the question r-split-string-using-tidyrseparate, but I cannot use the MWE for my purpose, because I do not know how to adjust the regular expression. I basically want the same thing, but split the variable after the last underscore. Reason: I have data where some columns show up several times for the same factor/type. I figured I can melt the data, separate the value variable before the type string, and spread it out again to a wide format with fewer columns. My
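As a rough sketch of splitting at the last underscore with `tidyr::separate()`: a lookahead in the `sep` regular expression can match only the underscore that is followed by no further underscores. The question's data is not shown, so the tibble `long` and the column names `variable`, `name` and `type` below are hypothetical.

```r
library(tidyr)
library(tibble)

# Hypothetical stand-in for the melted data; the real data is not shown.
long <- tibble(variable = c("a_b_type1", "c_type2"))

# "_(?=[^_]+$)" matches an underscore only when everything after it,
# up to the end of the string, contains no other underscore,
# i.e. the last underscore.
split_last <- long %>%
  separate(variable, into = c("name", "type"), sep = "_(?=[^_]+$)")

split_last
```

Everything before the last underscore ends up in `name` and everything after it in `type`, regardless of how many underscores the prefix contains.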

Struggling to Create a Pivot Table in R

Question: I am very, very new to any type of coding language. I am used to pivot tables in Excel, and I am trying to replicate in R a pivot I have done in Excel. I have spent a long time searching the internet and YouTube, but I just can't get it to work. I am looking to produce a table in which the left-hand column shows a number of locations, and across the top of the table it shows different pages that have been viewed. I want to show in the table the number of views per location which each of these

Match character vector in a dataframe with another character vector and trim character

Question: Here is a dataframe and a vector. df1 <- tibble(var1 = c("abcd", "efgh", "ijkl", "qrst")) vec <- c("abcd", "mnop", "ijkl") Now, for all the values in var1 that match values in vec, keep only the first 3 characters in var1, such that the desired solution is: df2 <- tibble(var1 = c("abc", "efgh", "ijk", "qrst")) Since "abcd" matches, we keep only 3 characters, i.e. "abc" in df2, but "efgh" doesn't exist in vec, so we keep it as is, i.e. "efgh" in df2. How can I use dplyr and/or stringr to
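A minimal sketch of one way to do this with dplyr and stringr, not necessarily the only or best one: test membership with `%in%` and truncate only the matching values with `str_sub()`.

```r
library(dplyr)
library(stringr)

# Data from the question.
df1 <- tibble(var1 = c("abcd", "efgh", "ijkl", "qrst"))
vec <- c("abcd", "mnop", "ijkl")

# Where var1 appears in vec, keep its first three characters;
# otherwise leave the value unchanged.
df2 <- df1 %>%
  mutate(var1 = if_else(var1 %in% vec, str_sub(var1, 1, 3), var1))

df2
```

`if_else()` is used rather than base `ifelse()` because it checks that both branches have the same type, which guards against accidental coercion.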