tidyverse

combine rows in data frame containing NA to make complete row

喜欢而已 posted on 2019-11-27 09:29:56
I know this is a duplicate question, but I can't seem to find the post again. Using the following data:

    df <- data.frame(A=c(1,1,2,2),B=c(NA,2,NA,4),C=c(3,NA,NA,5),D=c(NA,2,3,NA),E=c(5,NA,NA,4))

    A  B  C  D  E
    1 NA  3 NA  5
    1  2 NA  2 NA
    2 NA NA  3 NA
    2  4  5 NA  4

Grouping by A, I'd like the following output using a tidyverse solution:

    A B C D E
    1 2 3 2 5
    2 4 5 3 4

I have many groups in A. I think I saw an answer using coalesce but am unsure how to get it to work. I'd like a solution that works with characters as well. Thanks!

Not tidyverse, but here's one base R solution:

    df <- data.frame(A=c(1,1),B=c(NA,2),C=c(3,NA)

Efficient way to Fill Time-Series per group

梦想与她 posted on 2019-11-27 07:19:57
Question: I was looking for a way to fill a time-series data set by time, per group. The very inefficient approach I was using was to split the data set by group and apply a custom time-series fill function (create a sequence between the min and max, then merge) to every element of the resulting list. Needless to say, this operation would not get past the splitting step. My data set looks like this:

       source                 grp cnt
    1:     83 2017-06-06 13:00:00   1
    2:     83 2017-06-06 23:00:00   1
    3:     83 2017-06-07 03:00:00   1
    4:     83 2017-06-07 07:00:00   2
    5:

Spread with duplicate identifiers (using tidyverse and %>%) [duplicate]

淺唱寂寞╮ posted on 2019-11-27 04:53:34
This question already has an answer here: Reshaping data in R with “login” “logout” times (5 answers)

My data looks like this: [image]

I am trying to make it look like this: [image]

I would like to do this in tidyverse using %>%-chaining.

    df <- structure(list(id = c(2L, 2L, 4L, 5L, 5L, 5L, 5L),
      start_end = structure(c(2L, 1L, 2L, 2L, 1L, 2L, 1L), .Label = c("end", "start"), class = "factor"),
      date = structure(c(6L, 7L, 3L, 8L, 9L, 10L, 11L), .Label = c("1979-01-03", "1979-06-21", "1979-07-18", "1989-09-12", "1991-01-04", "1994-05-01", "1996-11-04", "2005-02-01", "2009-09-17", "2010-10-01", "2012-10-06" ), class

summarise_at using different functions for different variables

北城以北 posted on 2019-11-27 01:58:12
Question: When I use group_by and summarise in dplyr, I can naturally apply different summary functions to different variables. For instance:

    library(tidyverse)

    df <- tribble(
      ~category, ~x, ~y, ~z,
      #----------------------
      'a', 4, 6, 8,
      'a', 7, 3, 0,
      'a', 7, 9, 0,
      'b', 2, 8, 8,
      'b', 5, 1, 8,
      'b', 8, 0, 1,
      'c', 2, 1, 1,
      'c', 3, 8, 0,
      'c', 1, 9, 1
    )

    df %>%
      group_by(category) %>%
      summarize(
        x=mean(x),
        y=median(y),
        z=first(z)
      )

results in output:

    # A tibble: 3 x 4
      category     x     y     z
      <chr>    <dbl> <dbl> <dbl>
    1 a

data.table equivalent of tidyr::complete()

僤鯓⒐⒋嵵緔 posted on 2019-11-26 23:07:58
Question: tidyr::complete() adds rows to a data.frame for combinations of column values that are missing from the data. Example:

    library(dplyr)
    library(tidyr)

    df <- data.frame(person = c(1,2,2), observation_id = c(1,1,2), value = c(1,1,1))

    df %>%
      tidyr::complete(person, observation_id, fill = list(value=0))

yields

    # A tibble: 4 × 3
      person observation_id value
       <dbl>          <dbl> <dbl>
    1      1              1     1
    2      1              2     0
    3      2              1     1
    4      2              2     1

where the value of the combination person == 1 and observation_id == 2 that is missing in df
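One way the behaviour described above might be reproduced in data.table is sketched below. It is an illustration under assumptions, not the accepted answer to the question: it builds the full grid of key combinations with `CJ()`, joins the original table onto it, and then fills the gaps. The names `dt`, `all_combos`, and `completed` are introduced here for the example.

```r
library(data.table)

# Same data as in the question, as a data.table
dt <- data.table(person = c(1, 2, 2),
                 observation_id = c(1, 1, 2),
                 value = c(1, 1, 1))

# CJ() is a cross join: every combination of the unique key values
all_combos <- CJ(person = dt$person,
                 observation_id = dt$observation_id,
                 unique = TRUE)

# Join dt onto the full grid; combinations absent from dt get NA in value
completed <- dt[all_combos, on = .(person, observation_id)]

# Fill the missing values with 0, mirroring fill = list(value = 0)
completed[is.na(value), value := 0]

print(completed)
```

The join returns one row per row of `all_combos`, so the result has four rows, with `value` equal to 0 for person 1, observation_id 2.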

Print tibble with column breaks as in v1.3.0

我们两清 posted on 2019-11-26 21:26:52
Question: With the latest version of tibble, wide tibbles are not displayed properly when setting width = Inf. Based on my tests with previous versions, wide tibbles printed nicely up to version 1.3.0 and stopped doing so in later versions. This is how I would like the output to be printed: [image] ...but this is what it looks like using the latest version of tibble: [image] I tinkered with the old sources, but to no avail. I would like to incorporate this in a package, so the solution should pass R CMD check. When

using lm in list column to predict new values using purrr

送分小仙女□ posted on 2019-11-26 21:03:37
Question: I am trying to add a column of predictions to a data frame that has a list column containing an lm model. I adapted some of the code from this post. I have made a toy example here:

    library(dplyr)
    library(purrr)
    library(tidyr)
    library(broom)

    set.seed(1234)

    exampleTable <- data.frame(
      ind = c(rep(1:5, 5)),
      dep = rnorm(25),
      groups = rep(LETTERS[1:5], each = 5)
    ) %>%
      group_by(groups) %>%
      nest(.key=the_data) %>%
      mutate(model = the_data %>% map(~lm(dep ~ ind, data = .))) %>%
      mutate(Pred = map2

Duplicating (and modifying) discrete axis in ggplot2

99封情书 posted on 2019-11-26 17:10:07
Question: I want to duplicate the left-side Y-axis of a ggplot2 plot onto the right side, and then change the tick labels for a discrete (categorical) axis. I've read the answer to this question; however, as can be seen on the package's repo page, the switch_axis_position() function has been removed from the cowplot package (the author cited (forthcoming?) native functionality in ggplot2). I've seen the reference page on secondary axes in ggplot2; however, all the examples in that document use scale_y

Removing NA observations with dplyr::filter()

我怕爱的太早我们不能终老 posted on 2019-11-26 16:58:31
Question: My data looks like this:

    library(tidyverse)

    df <- tribble(
      ~a, ~b, ~c,
      1, 2, 3,
      1, NA, 3,
      NA, 2, 3
    )

I can remove all NA observations with drop_na():

    df %>% drop_na()

Or remove all NA observations in a single column (a, for example):

    df %>% drop_na(a)

Why can't I just use a regular != filter pipe?

    df %>% filter(a != NA)

Why do we have to use a special function from tidyr to remove NAs?

Answer 1: For example, you can use:

    df %>% filter(!is.na(a))

to remove the NA in column a.

Answer 2: From @Ben Bolker
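Why the `!=` filter in the question returns nothing can be sketched as follows. Any comparison with NA evaluates to NA rather than TRUE or FALSE, and `filter()` keeps only rows where the condition is TRUE. The example reuses the `df` from the question; `res_neq` and `res_isna` are names introduced here for illustration.

```r
library(dplyr)
library(tibble)

df <- tribble(
  ~a, ~b, ~c,
  1,  2,  3,
  1,  NA, 3,
  NA, 2,  3
)

# Comparing anything with NA gives NA, never TRUE or FALSE
print(1 != NA)
print(NA != NA)

# The condition is NA for every row, and filter() drops NA rows,
# so no rows are kept
res_neq <- df %>% filter(a != NA)
print(nrow(res_neq))

# is.na() returns a proper TRUE/FALSE vector, so the filter works
res_isna <- df %>% filter(!is.na(a))
print(nrow(res_isna))
```

Here `res_neq` is empty, while `res_isna` keeps the two rows where `a` is not missing.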

combine rows in data frame containing NA to make complete row

血红的双手。 posted on 2019-11-26 14:45:48
Question: I know this is a duplicate question, but I can't seem to find the post again. Using the following data:

    df <- data.frame(A=c(1,1,2,2),B=c(NA,2,NA,4),C=c(3,NA,NA,5),D=c(NA,2,3,NA),E=c(5,NA,NA,4))

    A  B  C  D  E
    1 NA  3 NA  5
    1  2 NA  2 NA
    2 NA NA  3 NA
    2  4  5 NA  4

Grouping by A, I'd like the following output using a tidyverse solution:

    A B C D E
    1 2 3 2 5
    2 4 5 3 4

I have many groups in A. I think I saw an answer using coalesce but am unsure how to get it to work. I'd like a solution that works with characters as
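A minimal tidyverse sketch of the grouping described in the question follows. It is not the coalesce approach the asker mentions. Instead, it takes the first non-missing value of each column within each group of A, which assumes every group has at most one distinct non-NA value per column (as in the sample data) and needs dplyr 1.0 or later for `across()`. The name `combined` is introduced here.

```r
library(dplyr)

df <- data.frame(A = c(1, 1, 2, 2),
                 B = c(NA, 2, NA, 4),
                 C = c(3, NA, NA, 5),
                 D = c(NA, 2, 3, NA),
                 E = c(5, NA, NA, 4))

# For each group in A, keep the first non-missing value of every other column.
# Plain subsetting is used, so the same code works for character columns;
# a column that is entirely NA within a group stays NA.
combined <- df %>%
  group_by(A) %>%
  summarise(across(everything(), ~ .x[!is.na(.x)][1]), .groups = "drop")

print(combined)
```

On the sample data this gives one row per group: A = 1 with B = 2, C = 3, D = 2, E = 5, and A = 2 with B = 4, C = 5, D = 3, E = 4.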