tidyverse | 易学教程

Apply a list to a function that outputs a dataframe


My single-argument function outputs a data frame:

    library(tidyverse)
    myfun <- function(x) {
      mtcars %>%
        filter_(x) %>%
        group_by(cyl) %>%
        summarise(mean(disp), mean(drat)) %>%
        mutate(group = x)
    }

When given a single argument, it returns a data frame, as expected:

    myfun('mpg>15')
    cyl mean(disp) mean(drat) group
      4        105       4.07 mpg>15
      6        183       3.59 mpg>15
      8        105       3.20 mpg>15

How can I apply such a function to a list of arguments so that the output is one data frame combining the results for every element of the list? For example, I'd like to apply myfun to a list c('mpg>15', 'drat>4.2') and, as the result, to
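A minimal sketch of one way to do this, assuming current dplyr, purrr and rlang: call the function once per condition with purrr::map() and stack the results with dplyr::bind_rows(). The names myfun2, conditions and combined are illustrative and not from the question, and filter_() is swapped for filter() with rlang::parse_expr() because filter_() is deprecated.

```r
library(dplyr)
library(purrr)
library(rlang)

# Same idea as myfun, but the condition string is parsed explicitly.
# myfun2 is a hypothetical name used only for this sketch.
myfun2 <- function(x) {
  mtcars %>%
    filter(!!parse_expr(x)) %>%
    group_by(cyl) %>%
    summarise(mean_disp = mean(disp), mean_drat = mean(drat)) %>%
    mutate(group = x)
}

conditions <- c("mpg>15", "drat>4.2")

# Call the function once per condition, then stack the data frames row-wise.
combined <- conditions %>%
  map(myfun2) %>%
  bind_rows()
```

The group column records which condition produced each row, so the stacked result stays traceable.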

Replace NA in all columns of a dplyr chain


Question

The question "replace NA in a dplyr chain" leads to the solution

    dt %.% group_by(a) %.% mutate(b = ifelse(is.na(b), mean(b, na.rm = T), b))

with dplyr. I want to impute all columns within a dplyr chain. There is no single column to group by; rather, I want every numeric column to have its NAs replaced by the column mean. What is the most elegant way to replace all NAs with column means using tidyverse/dplyr?

Answer 1:

We can use mutate_all with ifelse:

    dt %>% group_by(a) %>% mutate_all(funs
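The answer above is cut off, so the following is only a sketch of the same idea, assuming dplyr 1.0.0 or later, where across() supersedes mutate_all() and funs(). The data frame dt here is made up for illustration.

```r
library(dplyr)

# Hypothetical data with NAs in two numeric columns.
dt <- tibble(
  a = c("x", "x", "y", "y"),
  b = c(1, NA, 3, 5),
  c = c(NA, 2, 4, NA)
)

# Replace each NA in every numeric column with that column's mean.
imputed <- dt %>%
  mutate(across(where(is.numeric),
                ~ ifelse(is.na(.x), mean(.x, na.rm = TRUE), .x)))
```

Adding group_by(a) before the mutate() would impute with per-group means instead of whole-column means.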

How do you use approx() inside of mutate_at()?


I'm having trouble getting approx() to work inside mutate_at(). I did manage to get what I want using a very long mutate() call, but for future reference I was wondering whether there is a more graceful mutate_at() way to do this that involves less copy-pasting. The overarching problem is merging a dataset with data at 1-year intervals with one at 3-year intervals, and interpolating the years that have no data in the 3-year dataset. There are missing values in between the years, and one year requires some form of extrapolation.

    library("tidyverse")
    demodf <- data.frame(groupvar =
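Since the question's data is cut off, here is a hedged sketch on made-up data (demo, v1 and v2 are hypothetical names), assuming dplyr 1.0.0 or later so that across() can stand in for mutate_at(). Note that rule = 2 makes approx() repeat the nearest observed value outside the observed range, which is constant rather than linear extrapolation.

```r
library(dplyr)

# Hypothetical data: yearly rows, values observed only every third year.
demo <- tibble(
  groupvar = "a",
  year = 2000:2006,
  v1 = c(10, NA, NA, 16, NA, NA, 22),
  v2 = c(1, NA, NA, 4, NA, NA, NA)
)

# Interpolate each value column over year within each group.
# approx() drops the NA pairs and interpolates between the remaining points.
filled <- demo %>%
  group_by(groupvar) %>%
  mutate(across(c(v1, v2), ~ approx(year, .x, xout = year, rule = 2)$y)) %>%
  ungroup()
```

Each group needs at least two non-missing values per column for approx() to interpolate.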

Correlation matrix with dplyr, tidyverse and broom - P-value matrix


I want to obtain the p-values from a correlation matrix using the dplyr and/or broom packages, testing multiple variables at the same time. I'm aware of other methods, but dplyr seems easier and more intuitive to me. In addition, dplyr will need to correlate each variable to obtain the specific p-value, which makes the process easier and faster. I checked other links, but they did not work for this question (example 1, example 2, example 3). When I use this code, the correlation coefficients are reported. However, the p-values are not.

    agreg_base_tipo_a %>% dplyr::select(S2.RT, BIS
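One possible approach, sketched on mtcars because the question's data is not shown: run cor.test() on every pair of variables and collect the tidied results with broom::tidy(). The column choice and the names vars, var_pairs and p_values are assumptions made for illustration.

```r
library(dplyr)
library(purrr)
library(broom)

vars <- c("mpg", "disp", "hp")  # stand-in columns from mtcars

# Every unordered pair of variables, one pair per list element.
var_pairs <- combn(vars, 2, simplify = FALSE)

# Run cor.test() on each pair and keep the estimate and p-value.
p_values <- map(var_pairs, function(p) {
  tidy(cor.test(mtcars[[p[1]]], mtcars[[p[2]]])) %>%
    mutate(var1 = p[1], var2 = p[2]) %>%
    select(var1, var2, estimate, p.value)
}) %>%
  bind_rows()
```

The result is a long table with one row per pair; it can be reshaped into a matrix afterwards if that layout is needed.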

r tidyverse spread() using multiple key value pairs not collapsing rows


Question

I am trying to spread() a couple of key/value pairs, but the common value column does not collapse. This may have to do with some previous processing or, more likely, I do not know the right way to spread two or more key/value pairs to get the result I expect. I'm starting with this data set:

    library(tidyverse)
    df <- tibble(order = 1:7,
                 line_1 = c(23,8,21,45,68,31,24),
                 line_2 = c(63,25,25,24,48,24,63),
                 line_3 = c(62,12,10,56,67,25,35))

There are 2 pre-spread steps to define order of

Aggregating if each observation can belong to multiple groups


I want to aggregate Date by group. However, each observation can belong to several groups (e.g. observation 1 belongs to groups A and B). I could not find a nice way to achieve this with data.table. Currently, for each possible group, I create a logical variable that takes the value TRUE if the observation belongs to that group. I am looking for a better way to do this than the one presented below. I would also like to know how I could achieve this with the tidyverse.

    library(data.table)
    # Data
    set.seed(1)
    TF <- c(TRUE, FALSE)
    time <- rep(1:4, each = 5)
    df <- data.table(time = time, x = rnorm
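For the tidyverse part of the question, one hedged sketch is to reshape the logical membership flags into long form, keep only the TRUE rows, and aggregate as usual. The data below is made up (obs, A, B and mean_x are hypothetical names), and dplyr 1.0.0 or later is assumed for the .groups argument.

```r
library(dplyr)
library(tidyr)

# Hypothetical data: one logical membership flag per group.
obs <- tibble(
  time = c(1, 1, 2, 2),
  x = c(10, 20, 30, 40),
  A = c(TRUE, TRUE, FALSE, TRUE),
  B = c(TRUE, FALSE, TRUE, TRUE)
)

# One row per (observation, group) membership, then a normal grouped summary.
by_group <- obs %>%
  pivot_longer(c(A, B), names_to = "group", values_to = "member") %>%
  filter(member) %>%
  group_by(group, time) %>%
  summarise(mean_x = mean(x), .groups = "drop")
```

An observation that belongs to two groups contributes to both summaries, which is the behaviour the question describes.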

Separate string after last underscore


This is indeed a duplicate of the question r-split-string-using-tidyrseparate, but I cannot use the MWE for my purpose, because I do not know how to adjust the regular expression. I basically want the same thing, but to split the variable after the last underscore. Reason: I have data where some columns show up several times for the same factor/type. I figured I could melt the data, separate the value variable before the type string, and spread it out again to a wide format with fewer columns. My problem is that my variable names have differing numbers of underscores, and I would like to learn how to
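A minimal sketch of the split described above, using tidyr::separate() with a lookahead regular expression. The data frame and the column names variable, name and type are made up for illustration.

```r
library(tidyr)

# Hypothetical variable names with differing numbers of underscores.
df <- data.frame(variable = c("a_b_type1", "long_var_name_type2", "x_type1"))

parts <- separate(
  df, variable,
  into = c("name", "type"),
  # Match an underscore that is followed only by non-underscore
  # characters up to the end of the string, i.e. the last underscore.
  sep = "_(?=[^_]+$)"
)
```

The lookahead consumes only the final underscore, so any earlier underscores stay inside the name column.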

Struggling to Create a Pivot Table in R


I am very, very new to any type of coding language. I am used to pivot tables in Excel, and I am trying to replicate in R a pivot I have done in Excel. I have spent a long time searching the internet and YouTube, but I just can't get it to work. I am looking to produce a table in which the left-hand column lists a number of locations and the top row lists the different pages that have been viewed. The table should show the number of views per location for each of these pages. The data frame 'specificreports' shows all views over the past year for different pages on an
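The real columns of 'specificreports' are not shown, so this sketch assumes one row per view with hypothetical columns location and page. It counts views per combination with dplyr::count() and spreads the pages across the top with tidyr::pivot_wider().

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in for 'specificreports': one row per page view.
specificreports <- tibble(
  location = c("North", "North", "South", "South", "South", "East"),
  page = c("home", "report", "home", "home", "report", "home")
)

# Count views per location and page, then put one column per page.
pivot <- specificreports %>%
  count(location, page) %>%
  pivot_wider(names_from = page, values_from = n, values_fill = 0)
```

values_fill = 0 puts a zero wherever a location never viewed a page, much as an Excel pivot can show empty cells as 0.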

mutate_at evaluation error when using group_by


mutate_at() throws an evaluation error when used with group_by() and when a numeric vector of column positions is supplied as the first (.vars) argument.

The issue shows up with R 3.4.2 and dplyr 0.7.4.
It works fine with R 3.3.2 and dplyr 0.5.0.
It works fine if .vars is a character vector (column names).

Example:

    # Create example dataframe
    Id <- c('10_1', '10_2', '11_1', '11_2', '11_3', '12_1')
    Month <- c(2, 3, 4, 6, 7, 8)
    RWA <- c(0, 0, 0, 1.579, NA, 0.379)
    dftest = data.frame(Id, Month, RWA)
    # Define column to fill NAs
    nacol = c('RWA')
    # Fill NAs with last period
    dftest_2 <- dftest %>%
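The rest of the pipeline is cut off, so the following is only a sketch of a name-based alternative, not the original code. It swaps mutate_at() for tidyr::fill(), selects the column by name with all_of(nacol) rather than by position, and assumes a hypothetical grouping key grp taken from the Id prefix.

```r
library(dplyr)
library(tidyr)

# Example data from the question.
Id <- c('10_1', '10_2', '11_1', '11_2', '11_3', '12_1')
Month <- c(2, 3, 4, 6, 7, 8)
RWA <- c(0, 0, 0, 1.579, NA, 0.379)
dftest <- data.frame(Id, Month, RWA)
nacol <- c('RWA')

filled <- dftest %>%
  mutate(grp = sub("_.*", "", Id)) %>%          # hypothetical grouping key
  group_by(grp) %>%
  fill(all_of(nacol), .direction = "down") %>%  # carry the last value forward within each group
  ungroup()
```

Selecting by name avoids the question of how column positions are counted once grouping columns are involved.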

tidyr - unique way to get combinations (using tidyverse only)


I want to get all unique pairwise combinations of a unique string column of a data frame, ideally using only the tidyverse. Here is a dummy example:

    library(tidyverse)
    a <- letters[1:3] %>% tibble::as_tibble()
    a
    #> # A tibble: 3 x 1
    #>   value
    #>   <chr>
    #> 1 a
    #> 2 b
    #> 3 c
    tidyr::crossing(a, a) %>% magrittr::set_colnames(c("words1", "words2"))
    #> # A tibble: 9 x 2
    #>   words1 words2
    #>   <chr>  <chr>
    #> 1 a      a
    #> 2 a      b
    #> 3 a      c
    #> 4 b      a
    #> 5 b      b
    #> 6 b      c
    #> 7 c      a
    #> 8 c      b
    #> 9 c      c

Is there a way to remove 'duplicate' combinations here? That is, have the output be the following in this example:

    # A tibble:
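The desired output is cut off, so this sketch assumes that mirrored pairs (b, a) and self-pairs (a, a) should both be dropped. It keeps only the rows where the first word sorts before the second; unique_pairs is an illustrative name.

```r
library(dplyr)
library(tidyr)

a <- tibble(value = letters[1:3])

# Build all ordered pairs, then keep one representative of each unordered pair.
unique_pairs <- crossing(words1 = a$value, words2 = a$value) %>%
  filter(words1 < words2)
```

Changing the filter to words1 <= words2 would keep the self-pairs as well, if those are wanted.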