tidyverse | 易学教程

combine rows in data frame containing NA to make complete row

I know this is a duplicate question, but I can't seem to find the post again. Using the following data:

    df <- data.frame(A=c(1,1,2,2),B=c(NA,2,NA,4),C=c(3,NA,NA,5),D=c(NA,2,3,NA),E=c(5,NA,NA,4))

    A  B  C  D  E
    1 NA  3 NA  5
    1  2 NA  2 NA
    2 NA NA  3 NA
    2  4  5 NA  4

Grouping by A, I'd like the following output using a tidyverse solution:

    A B C D E
    1 2 3 2 5
    2 4 5 3 4

I have many groups in A. I think I saw an answer using coalesce but am unsure how to get it to work. I'd like a solution that works with characters as well. Thanks!

Not tidyverse, but here's one base R solution:

    df <- data.frame(A=c(1,1),B=c(NA,2),C=c(3,NA)
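One possible tidyverse approach to the requested output is sketched here. It does not use coalesce; instead it keeps the first non-missing value of each column within each group, which also works for character columns. It assumes dplyr 1.0 or later (for across()), and `first_non_na` is a hypothetical helper name, not something from the original post.

```r
library(dplyr)

df <- data.frame(
  A = c(1, 1, 2, 2),
  B = c(NA, 2, NA, 4),
  C = c(3, NA, NA, 5),
  D = c(NA, 2, 3, NA),
  E = c(5, NA, NA, 4)
)

# Hypothetical helper: first non-missing value of a vector (NA if there is none).
first_non_na <- function(x) x[!is.na(x)][1]

# Collapse each group of A into one row, column by column.
combined <- df %>%
  group_by(A) %>%
  summarise(across(everything(), first_non_na), .groups = "drop")

print(combined)
```

If a group holds more than one non-missing value in a column, this sketch silently keeps the first one, so it only suits data where each group has at most one value per column.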

Efficient way to Fill Time-Series per group

Question: I was looking for a way to fill a time-series data set by time, per group. The very inefficient approach I was using was to split the data set by group and apply a custom time-series fill function (create a sequence between the min and max, then merge) to every element of that list. Needless to say, this operation would not get past the splitting step. My dataset looks like this:

       source                 grp cnt
    1:     83 2017-06-06 13:00:00   1
    2:     83 2017-06-06 23:00:00   1
    3:     83 2017-06-07 03:00:00   1
    4:     83 2017-06-07 07:00:00   2
    5:
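The "sequence between min and max, then merge" idea can be done per group without splitting, for instance with tidyr::complete() on a grouped data frame. This is only a rough sketch: it uses just the four rows visible in the excerpt, and the hourly step and the zero fill for cnt are assumptions, since the post's desired output is not shown.

```r
library(dplyr)
library(tidyr)

# Only the four rows visible in the excerpt; the rest of the data is not shown.
dat <- tibble(
  source = 83,
  grp = as.POSIXct(
    c("2017-06-06 13:00:00", "2017-06-06 23:00:00",
      "2017-06-07 03:00:00", "2017-06-07 07:00:00"),
    tz = "UTC"
  ),
  cnt = c(1, 1, 1, 2)
)

# For each source, expand grp to every hour between its min and max.
# Assumptions: hourly resolution, and missing counts are filled with 0.
filled <- dat %>%
  group_by(source) %>%
  complete(grp = seq(min(grp), max(grp), by = "hour"), fill = list(cnt = 0)) %>%
  ungroup()

print(filled)
```

Because complete() evaluates the sequence inside each group, every source gets its own time range rather than one global range.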

Spread with duplicate identifiers (using tidyverse and %>%) [duplicate]

This question already has an answer here: Reshaping data in R with “login” “logout” times (5 answers)

My data looks like this:

I am trying to make it look like this:

I would like to do this in tidyverse using %>%-chaining.

    df <- structure(list(
      id = c(2L, 2L, 4L, 5L, 5L, 5L, 5L),
      start_end = structure(c(2L, 1L, 2L, 2L, 1L, 2L, 1L),
        .Label = c("end", "start"), class = "factor"),
      date = structure(c(6L, 7L, 3L, 8L, 9L, 10L, 11L),
        .Label = c("1979-01-03", "1979-06-21", "1979-07-18", "1989-09-12",
                   "1991-01-04", "1994-05-01", "1996-11-04", "2005-02-01",
                   "2009-09-17", "2010-10-01", "2012-10-06"), class
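A common way around duplicate identifiers when spreading is to number the repeated rows within each key first. The sketch below assumes the goal is one row per start/end pair, which the excerpt does not show. The data is rebuilt from the factor codes and labels visible above, `spell` is a hypothetical helper column, and pivot_wider() (tidyr 1.0 or later) stands in for spread().

```r
library(dplyr)
library(tidyr)

# Values decoded from the factor codes and labels visible in the excerpt.
df <- tibble(
  id = c(2L, 2L, 4L, 5L, 5L, 5L, 5L),
  start_end = c("start", "end", "start", "start", "end", "start", "end"),
  date = c("1994-05-01", "1996-11-04", "1979-07-18", "2005-02-01",
           "2009-09-17", "2010-10-01", "2012-10-06")
)

wide <- df %>%
  group_by(id, start_end) %>%
  mutate(spell = row_number()) %>%  # hypothetical column: nth start/end per id
  ungroup() %>%
  pivot_wider(names_from = start_end, values_from = date)

print(wide)
```

With `spell` in place, each (id, spell) combination is unique, so the reshaping no longer hits the duplicate-identifier problem. An id with a start but no end, such as id 4, gets NA in the end column.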

summarise_at using different functions for different variables

Question: When I use group_by and summarise in dplyr, I can naturally apply different summary functions to different variables. For instance:

    library(tidyverse)
    df <- tribble(
      ~category, ~x, ~y, ~z,
      #----------------------
      'a', 4, 6, 8,
      'a', 7, 3, 0,
      'a', 7, 9, 0,
      'b', 2, 8, 8,
      'b', 5, 1, 8,
      'b', 8, 0, 1,
      'c', 2, 1, 1,
      'c', 3, 8, 0,
      'c', 1, 9, 1
    )
    df %>%
      group_by(category) %>%
      summarize(
        x=mean(x),
        y=median(y),
        z=first(z)
      )

results in output:

    # A tibble: 3 x 4
      category     x     y     z
      <chr>    <dbl> <dbl> <dbl>
    1 a
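summarise_at() has since been superseded by across(), and several across() calls can sit side by side in one summarise(), each with its own function. The following is a sketch of that pattern on the data above, assuming dplyr 1.0 or later. It is not necessarily the answer given in the original thread.

```r
library(dplyr)
library(tibble)

df <- tribble(
  ~category, ~x, ~y, ~z,
  "a", 4, 6, 8,
  "a", 7, 3, 0,
  "a", 7, 9, 0,
  "b", 2, 8, 8,
  "b", 5, 1, 8,
  "b", 8, 0, 1,
  "c", 2, 1, 1,
  "c", 3, 8, 0,
  "c", 1, 9, 1
)

# One across() call per set of columns, each with its own summary function.
summary_tbl <- df %>%
  group_by(category) %>%
  summarise(
    across(x, mean),
    across(y, median),
    across(z, first),
    .groups = "drop"
  )

print(summary_tbl)
```

Each across() call accepts any tidyselect expression, so `x` could be replaced by something like `starts_with("x")` to cover many columns with the same function.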

data.table equivalent of tidyr::complete()

Question: tidyr::complete() adds rows to a data.frame for combinations of column values that are missing from the data. Example:

    library(dplyr)
    library(tidyr)
    df <- data.frame(person = c(1,2,2), observation_id = c(1,1,2), value = c(1,1,1))
    df %>% tidyr::complete(person, observation_id, fill = list(value=0))

yields

    # A tibble: 4 × 3
      person observation_id value
       <dbl>          <dbl> <dbl>
    1      1              1     1
    2      1              2     0
    3      2              1     1
    4      2              2     1

where the value of the combination person == 1 and observation_id == 2 that is missing in df
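A usual data.table idiom for this is a cross join of the key columns with CJ(), followed by a join back onto the table. The sketch below reproduces the example above under that assumption. `grid` and `completed` are illustrative names, and the zero fill mirrors the `fill = list(value=0)` argument in the question.

```r
library(data.table)

dt <- data.table(person = c(1, 2, 2), observation_id = c(1, 1, 2), value = c(1, 1, 1))

# All combinations of the unique person and observation_id values.
grid <- CJ(person = dt$person, observation_id = dt$observation_id, unique = TRUE)

# Join dt onto the grid; combinations absent from dt get NA in value.
completed <- dt[grid, on = .(person, observation_id)]

# Replace the NAs introduced by the join with 0, by reference.
completed[is.na(value), value := 0]

print(completed)
```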

Print tibble with column breaks as in v1.3.0

Question: With the latest version of tibble, wide tibbles are not displayed properly when setting width = Inf. Based on my tests with previous versions, wide tibbles printed nicely up to and including version 1.3.0. This is what I would like the output to look like: ...but this is what it looks like using the latest version of tibble: I tinkered around with the old sources, but to no avail. I would like to incorporate this in a package, so the solution should pass R CMD check. When

using lm in list column to predict new values using purrr

Question: I am trying to add a column of predictions to a dataframe that has a list column that contains an lm model. I adopted some of the code from this post. I have made a toy example here:

    library(dplyr)
    library(purrr)
    library(tidyr)
    library(broom)
    set.seed(1234)
    exampleTable <- data.frame(
      ind = c(rep(1:5, 5)),
      dep = rnorm(25),
      groups = rep(LETTERS[1:5], each = 5)
    ) %>%
      group_by(groups) %>%
      nest(.key=the_data) %>%
      mutate(model = the_data %>% map(~lm(dep ~ ind, data = .))) %>%
      mutate(Pred = map2
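The excerpt cuts off at the map2() call, so what follows is only a guess at the general pattern, not the poster's code: pair each fitted model with the data to predict on and call predict() on the pair. It assumes tidyr 1.0 or later, where `nest(the_data = c(ind, dep))` replaces the older `.key` argument, and it predicts on the training data purely for illustration.

```r
library(dplyr)
library(purrr)
library(tidyr)

set.seed(1234)
exampleTable <- data.frame(
  ind = rep(1:5, 5),
  dep = rnorm(25),
  groups = rep(LETTERS[1:5], each = 5)
) %>%
  # One row per group, with that group's data in a list column.
  nest(the_data = c(ind, dep)) %>%
  mutate(
    # Fit one linear model per group.
    model = map(the_data, ~ lm(dep ~ ind, data = .x)),
    # Walk over (model, data) pairs; .x is the model, .y is the new data.
    Pred = map2(model, the_data, ~ predict(.x, newdata = .y))
  )

print(exampleTable)
```

To predict on new values, replace `the_data` in the map2() call with a list column of new data frames that contain an `ind` column.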

Duplicating (and modifying) discrete axis in ggplot2

Question: I want to duplicate the left-side Y-axis of a ggplot2 plot on the right side, and then change the tick labels for a discrete (categorical) axis. I've read the answer to this question; however, as can be seen on the package's repo page, the switch_axis_position() function has been removed from the cowplot package (the author cited (forthcoming?) native functionality in ggplot2). I've seen the reference page on secondary axes in ggplot2; however, all the examples in that document use scale_y

Removing NA observations with dplyr::filter()

Question: My data looks like this:

    library(tidyverse)
    df <- tribble(
      ~a, ~b, ~c,
      1, 2, 3,
      1, NA, 3,
      NA, 2, 3
    )

I can remove all NA observations with drop_na():

    df %>% drop_na()

Or remove all NA observations in a single column (a, for example):

    df %>% drop_na(a)

Why can't I just use a regular != filter pipe?

    df %>% filter(a != NA)

Why do we have to use a special function from tidyr to remove NAs?

Answer 1: For example, you can use:

    df %>% filter(!is.na(a))

to remove the NAs in column a.

Answer 2: From @Ben Bolker
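The behaviour asked about comes from how R treats NA: any comparison with NA, including `NA != NA`, evaluates to NA rather than TRUE or FALSE, and filter() keeps only rows where the condition is TRUE. A small sketch on the data from the question (`n_compare` and `n_is_na` are illustrative names):

```r
library(dplyr)
library(tibble)

df <- tribble(
  ~a, ~b, ~c,
  1, 2, 3,
  1, NA, 3,
  NA, 2, 3
)

# Comparing against NA yields NA for every row, never TRUE.
print(df$a != NA)

# filter() drops rows whose condition is NA, so nothing survives here.
n_compare <- nrow(filter(df, a != NA))

# is.na() returns a proper TRUE/FALSE, so negating it keeps the non-missing rows.
n_is_na <- nrow(filter(df, !is.na(a)))

print(c(n_compare = n_compare, n_is_na = n_is_na))
```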
