tidyverse

How to unnest column-list?

China☆狼群 posted on 2019-11-29 08:25:04
I have a tibble like:

    tibble(a = c('first', 'second'), b = list(c('colA' = 1, 'colC' = 2), c('colA'= 3, 'colB'=2)))

    # A tibble: 2 x 2
      a      b
      <chr>  <list>
    1 first  <dbl [2]>
    2 second <dbl [2]>

Which I would like to turn into this form:

    # A tibble: 2 x 4
      a       colA  colB  colC
      <chr>  <dbl> <dbl> <dbl>
    1 first     1.    NA    2.
    2 second    3.    2.    NA

I tried to use unnest(), but I am having issues preserving the elements' names from the nested values.

You can do this by coercing the elements in the list column to data frames arranged as you like, which will unnest nicely:

    library(tidyverse)
    tibble(a = c('first', 'second
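A minimal sketch of the coercion idea described in the answer above, assuming tidyr 1.0 or later. The object names `nested` and `wide` are illustrative and not from the original post: each named vector is turned into a one-row tibble, so the element names become column names and missing names are filled with NA when unnesting.

```r
library(tibble)
library(dplyr)
library(purrr)
library(tidyr)

nested <- tibble(
  a = c("first", "second"),
  b = list(c(colA = 1, colC = 2), c(colA = 3, colB = 2))
)

# Convert each named vector to a one-row tibble; its names become column names.
wide <- nested %>%
  mutate(b = map(b, ~ as_tibble(as.list(.x)))) %>%
  unnest(b)

print(wide)
```

Columns appear in order of first appearance, so a final `select()` may be needed if a specific column order matters.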

Importing multiple .csv files with variable column types into R

試著忘記壹切 posted on 2019-11-28 13:14:37
How can I properly build an lapply to read all the .csv files from one directory, load all the columns as strings, and then bind them into one data frame? Per this, I have a way to get all the .csv files loaded and bound into a data frame. Unfortunately, the binding gets hung up on the variability in how the columns are type cast, giving me this error:

    Error: Can not automatically convert from character to integer in column

I have tried supplementing the code with the arguments for data type and am trying to just keep everything as characters; I am getting stuck now on being
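One way this could look, sketched with readr and dplyr (an assumption, since the excerpt does not show which reader the asker used). The temporary directory and the two tiny files are made up for illustration. Forcing every column to character with `cols(.default = col_character())` avoids the type clash when binding.

```r
library(readr)
library(dplyr)

# Hypothetical demo directory with two files whose columns would be guessed differently.
csv_dir <- tempfile("csv_demo_")
dir.create(csv_dir)
writeLines(c("id,value", "1,10"), file.path(csv_dir, "a.csv"))
writeLines(c("id,value", "x2,abc"), file.path(csv_dir, "b.csv"))

files <- list.files(csv_dir, pattern = "\\.csv$", full.names = TRUE)

# Read every column of every file as character, then stack the results.
all_rows <- lapply(files, read_csv, col_types = cols(.default = col_character())) %>%
  bind_rows()

print(all_rows)
```

Columns can be converted afterwards, for example with `readr::type_convert()`, once everything is in a single data frame.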

round_any equivalent for dplyr?

爷,独闯天下 posted on 2019-11-28 07:30:13
Question: I am trying to switch to the "new" tidyverse ecosystem and avoid loading the old packages from Wickham et al. that I used to rely on. I found the round_any function from plyr useful in many cases where I needed custom rounding for plots, tables, etc. E.g.

    x <- c(1.1, 1.0, 0.99, 0.1, 0.01, 0.001)
    library(plyr)
    round_any(x, 0.1, floor)
    # [1] 1.1 1.0 0.9 0.1 0.0 0.0

Is there an equivalent of the round_any function from the plyr package in the tidyverse?

Answer 1: ggplot2::cut_width as
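For numeric input, the rounding itself needs no package. The sketch below defines a small helper, here called `round_to` (a hypothetical name), that divides by the accuracy, applies the rounding function, and scales back. To my understanding this matches the arithmetic plyr uses for numeric vectors, but treat that as an assumption.

```r
# Hypothetical helper: round x to a multiple of `accuracy` using rounding function `f`.
round_to <- function(x, accuracy, f = round) {
  f(x / accuracy) * accuracy
}

x <- c(1.1, 1.0, 0.99, 0.1, 0.01, 0.001)
rounded_down <- round_to(x, 0.1, floor)
print(rounded_down)
```

Because it is a plain vectorised function, it can be used inside `dplyr::mutate()` like any other.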

Filling missing dates in a grouped time series - a tidyverse-way?

僤鯓⒐⒋嵵緔 posted on 2019-11-28 03:18:21
Question: Given a data.frame that contains a time series and one or more grouping fields, we have several time series, one for each grouping combination. But some dates are missing. So, what's the easiest (in terms of the most "tidyverse way") way of adding these dates with the right grouping values? Normally I would say I generate a data.frame with all dates and do a full_join with my time series. But now we have to do it for each combination of grouping values, and fill in the grouping values. Let's
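One possible approach, sketched with `tidyr::complete()` on a grouped data frame. The data and the names `series`, `grp`, and `filled` are invented for illustration. Within each group, the date column is expanded to a full daily sequence between that group's first and last date, and the grouping value is carried along automatically.

```r
library(dplyr)
library(tidyr)

# Made-up example: group A is missing 2019-01-02, group B is missing 2019-01-03.
series <- tibble::tibble(
  grp   = c("A", "A", "B", "B", "B"),
  date  = as.Date(c("2019-01-01", "2019-01-03",
                    "2019-01-01", "2019-01-02", "2019-01-04")),
  value = c(1, 2, 3, 4, 5)
)

# Expand each group to its own full daily range; new rows get NA in `value`.
filled <- series %>%
  group_by(grp) %>%
  complete(date = seq(min(date), max(date), by = "day")) %>%
  ungroup()

print(filled)
```

To use one common date range for all groups instead, the sequence could be computed once from the ungrouped data and passed to `complete()` together with the grouping column.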

using lm in list column to predict new values using purrr

浪子不回头ぞ posted on 2019-11-27 23:23:28
I am trying to add a column of predictions to a dataframe that has a list column that contains an lm model. I adopted some of the code from this post. I have made a toy example here:

    library(dplyr)
    library(purrr)
    library(tidyr)
    library(broom)
    set.seed(1234)
    exampleTable <- data.frame(
      ind = c(rep(1:5, 5)),
      dep = rnorm(25),
      groups = rep(LETTERS[1:5], each = 5)
    ) %>%
      group_by(groups) %>%
      nest(.key=the_data) %>%
      mutate(model = the_data %>% map(~lm(dep ~ ind, data = .))) %>%
      mutate(Pred = map2(model, the_data, predict))
    exampleTable <- exampleTable %>% mutate(ind=row_number())

that gives me a
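A hedged sketch of the same pattern written against the tidyr 1.0+ `nest()` interface (the `.key` argument in the post belongs to the older interface). Names such as `fitted_models` and `predictions` are illustrative. Each group's model is paired with a data frame via `map2()`, and `predict()` is called with that data frame as `newdata`.

```r
library(dplyr)
library(purrr)
library(tidyr)

set.seed(1234)
example <- tibble::tibble(
  ind    = rep(1:5, 5),
  dep    = rnorm(25),
  groups = rep(LETTERS[1:5], each = 5)
)

# One row per group: the nested data, a fitted model, and its predictions.
fitted_models <- example %>%
  nest(data = c(ind, dep)) %>%
  mutate(
    model = map(data, ~ lm(dep ~ ind, data = .x)),
    pred  = map2(model, data, ~ unname(predict(.x, newdata = .y)))
  )

# Back to one row per observation, with the prediction alongside the inputs.
predictions <- fitted_models %>%
  select(groups, data, pred) %>%
  unnest(c(data, pred))

print(predictions)
```

To predict for new values, the second argument of `map2()` can be replaced by a list column of new data frames that contain an `ind` column.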

Print tibble with column breaks as in v1.3.0

最后都变了- posted on 2019-11-27 23:17:22
Using the latest version of tibble, the output of wide tibbles is not properly displayed when setting width = Inf. Based on my tests with previous versions, wide tibbles were printed nicely up to and including version 1.3.0. This is what I would like the output to be printed like: ...but this is what it looks like using the latest version of tibble: I tinkered around with the old sources but to no avail. I would like to incorporate this in a package, so the solution should pass R CMD check. When I just copied a load of functions from tibble v1.3.0 I managed to restore the old behavior but could

data.table equivalent of tidyr::complete()

狂风中的少年 posted on 2019-11-27 22:16:27
tidyr::complete() adds rows to a data.frame for combinations of column values that are missing from the data. Example:

    library(dplyr)
    library(tidyr)
    df <- data.frame(person = c(1,2,2), observation_id = c(1,1,2), value = c(1,1,1))
    df %>% tidyr::complete(person, observation_id, fill = list(value=0))

yields

    # A tibble: 4 × 3
      person observation_id value
       <dbl>          <dbl> <dbl>
    1      1              1     1
    2      1              2     0
    3      2              1     1
    4      2              2     1

where the value of the combination person == 1 and observation_id == 2 that is missing in df has been filled in with a value of 0. What would be the equivalent of this in data.table? I reckon that
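A sketch of one way to get the same result in data.table, not necessarily the approach from the original answers. `CJ()` builds the cross join of the unique key values, a join on that grid adds the missing combinations, and the resulting NA values are then filled. The names `grid` and `completed` are illustrative.

```r
library(data.table)

dt <- data.table(person = c(1, 2, 2),
                 observation_id = c(1, 1, 2),
                 value = c(1, 1, 1))

# All combinations of the observed persons and observation ids.
grid <- CJ(person = unique(dt$person),
           observation_id = unique(dt$observation_id))

# Join dt onto the grid: combinations absent from dt get NA in `value`.
completed <- dt[grid, on = c("person", "observation_id")]

# Fill the new rows with 0, mirroring fill = list(value = 0).
completed[is.na(value), value := 0]

print(completed)
```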

Mutate multiple variable to create multiple new variables

泄露秘密 posted on 2019-11-27 16:13:27
Let's say I have a tibble where I need to take multiple variables and mutate them into multiple new variables. As an example, here is a simple tibble:

    tb <- tribble(
      ~x, ~y1, ~y2, ~y3, ~z,
      1,2,4,6,2,
      2,1,2,3,3,
      3,6,4,2,1
    )

I want to subtract variable z from every variable with a name starting with "y", and mutate the results as new variables of tb. Also, suppose I don't know how many "y" variables I have. I want the solution to fit nicely within a tidyverse / dplyr workflow. In essence, I don't understand how to mutate multiple variables into multiple new variables. I'm not sure if you can
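With dplyr 1.0 or later (an assumption; the post predates it), this can be sketched with `across()`. `starts_with("y")` picks up however many "y" columns exist, and `.names` controls the names of the new columns. The `_minus_z` suffix is an illustrative choice.

```r
library(dplyr)

tb <- tibble::tribble(
  ~x, ~y1, ~y2, ~y3, ~z,
   1,   2,   4,   6,  2,
   2,   1,   2,   3,  3,
   3,   6,   4,   2,  1
)

# Subtract z from every column starting with "y" and keep the originals.
result <- tb %>%
  mutate(across(starts_with("y"), ~ .x - z, .names = "{.col}_minus_z"))

print(result)
```

Omitting `.names` would overwrite the original "y" columns instead of adding new ones.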

Duplicating (and modifying) discrete axis in ggplot2

大憨熊 posted on 2019-11-27 15:29:16
I want to duplicate the left-side Y-axis on a ggplot2 plot onto the right side, and then change the tick labels for a discrete (categorical) axis. I've read the answer to this question; however, as can be seen on the package's repo page, the switch_axis_position() function has been removed from the cowplot package (the author cited (forthcoming?) native functionality in ggplot2). I've seen the reference page on secondary axes in ggplot2; however, all the examples in that document use scale_y_continuous rather than scale_y_discrete. And, indeed, when I try to use the discrete function, I get
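One workaround, sketched here as an assumption and not taken from the original answers, is to plot the factor's integer codes on a continuous scale. The left axis is relabelled with the factor levels, and `dup_axis()` mirrors it on the right with different labels. The data and the right-hand labels are made up.

```r
library(ggplot2)

df <- data.frame(
  category = factor(c("low", "mid", "high"), levels = c("low", "mid", "high")),
  value    = c(3, 5, 2)
)

lvls <- levels(df$category)

# Treat the discrete axis as continuous so that a secondary axis is available.
p <- ggplot(df, aes(x = value, y = as.numeric(category))) +
  geom_point() +
  scale_y_continuous(
    name     = "category",
    breaks   = seq_along(lvls),
    labels   = lvls,                      # left axis: original level names
    sec.axis = dup_axis(
      name   = NULL,
      labels = c("L", "M", "H")           # right axis: modified tick labels
    )
  )

print(p)
```

The trade-off is that the axis is no longer truly discrete, so anything that relies on a discrete scale, such as automatic dodging, would need to be handled manually.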

Removing NA observations with dplyr::filter()

99封情书 posted on 2019-11-27 14:58:21
My data looks like this:

    library(tidyverse)
    df <- tribble(
      ~a, ~b, ~c,
      1, 2, 3,
      1, NA, 3,
      NA, 2, 3
    )

I can remove all NA observations with drop_na():

    df %>% drop_na()

Or remove all NA observations in a single column (a for example):

    df %>% drop_na(a)

Why can't I just use a regular != filter pipe?

    df %>% filter(a != NA)

Why do we have to use a special function from tidyr to remove NAs?

JeffZheng

For example, you can use:

    df %>% filter(!is.na(a))

to remove the NA in column a.

emehex

From @Ben Bolker: [T]his has nothing specifically to do with dplyr::filter() From @Marat Talipov: [A]ny
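A short illustration of why the `!=` version fails, using the data from the question (the result names are illustrative). Any comparison with NA yields NA, not TRUE or FALSE, and `filter()` keeps only rows where the condition is TRUE, so `a != NA` keeps nothing.

```r
library(dplyr)

df <- tibble::tribble(
  ~a, ~b, ~c,
   1,  2,  3,
   1, NA,  3,
  NA,  2,  3
)

# Comparing anything with NA gives NA, never TRUE.
comparison <- df$a != NA

# filter() drops rows whose condition is NA, so this returns zero rows.
none_kept <- df %>% filter(a != NA)

# is.na() returns proper TRUE/FALSE values, so this keeps the two non-missing rows.
non_missing <- df %>% filter(!is.na(a))

print(comparison)
print(non_missing)
```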