tidyverse

Joining list of data.frames from map() call

谁都会走 submitted on 2019-12-07 08:04:09
Question: Is there a "tidyverse" way to join a list of data.frames (a la full_join(), but for >2 data.frames)? I have a list of data.frames as a result of a call to map(). I've used Reduce() to do something like this before, but would like to merge them as part of a pipeline, and just haven't found an elegant way to do that. Toy example: library(tidyverse) ## Function to make a data.frame with an ID column and a random variable column with mean = df_mean make.df <- function(df_mean){ data.frame(id = 1
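One common pattern for this kind of merge, offered as a minimal sketch rather than as the accepted answer: purrr::reduce() folds a list of data frames pairwise with dplyr::full_join(), so the join can sit inside a pipeline. The make_df() helper, the vector of means, and the column names are illustrative assumptions, since the original toy example is cut off.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: one data frame per mean, sharing an `id` key,
# with a value column whose name encodes the mean.
make_df <- function(df_mean) {
  out <- data.frame(id = 1:5, x = rnorm(5, mean = df_mean))
  names(out)[2] <- paste0("mean_", df_mean)
  out
}

joined <- c(0, 10, 100) %>%
  map(make_df) %>%              # list of three data frames
  reduce(full_join, by = "id")  # fold the list with successive full joins
```

Extra arguments to reduce() (here by = "id") are passed on to full_join() at every step, so all joins use the same key.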

Computation failed in `stat_smooth()`: object 'C_crspl' not found

ⅰ亾dé卋堺 submitted on 2019-12-07 04:45:06
Question: I am trying to add a geom_smooth() to a qplot() with the following code: library(ggplot2) library(ggplot2movies) qplot(votes, rating, data = movies) + geom_smooth() However, the smoother is missing from the plot. I also receive the following warning message: Computation failed in stat_smooth(): object 'C_crspl' not found Does anybody know what is wrong here? This is my setup: > sessionInfo() R version 3.4.1 (2017-06-30) Platform: x86_64-pc-linux-gnu (64-bit) Running under: Ubuntu 16.04.1 LTS

Applying group_by and summarise(sum) but keep columns with non-relevant conflicting data?

微笑、不失礼 submitted on 2019-12-07 01:34:22
Question: My question is very similar to "Applying group_by and summarise on data while keeping all the columns' info", but I would like to keep columns which get excluded because they conflict after grouping. Label <- c("203c","203c","204a","204a","204a","204a","204a","204a","204a","204a") Type <- c("wholefish","flesh","flesh","fleshdelip","formula","formuladelip", "formula","formuladelip","wholefish", "wholefishdelip") Proportion <- c(1,1,0.67714,0.67714,0.32285,0.32285,0.32285, 0.32285, 0.67714,0.67714

Opposite of unnest_tokens

☆樱花仙子☆ submitted on 2019-12-07 01:26:42
Question: This is most likely a stupid question, but I've googled and googled and can't find a solution. I think it's because I don't know the right way to word my question to search. I have a data frame that I have converted to tidy text format in R to get rid of stop words. I would now like to 'untidy' that data frame back to its original format. What's the opposite / inverse command of unnest_tokens? Edit: here is what the data I'm working with look like. I'm trying to replicate analyses from Silge
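A rough sketch of one way to reverse the tokenization: group by a document identifier and paste the tokens back together. The tidy_words table and its doc and word columns are assumptions standing in for the output of unnest_tokens(), because the asker's data is not shown in this excerpt.

```r
library(dplyr)

# Hypothetical tokenized data: one row per word, keyed by a document id.
tidy_words <- tibble(
  doc  = c(1, 1, 1, 2, 2),
  word = c("tidy", "text", "mining", "stop", "words")
)

# Collapse the words of each document back into a single string.
untidied <- tidy_words %>%
  group_by(doc) %>%
  summarise(text = paste(word, collapse = " "))
```

This only rebuilds text from the tokens that remain. Anything dropped along the way (stop words, punctuation, original casing) is not recovered.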

Calculating Occupancy in hospital from dates with time.

放肆的年华 submitted on 2019-12-06 15:13:58
I am looking to calculate occupancy in an emergency department (ED) with tidyverse. Occupancy is understood in this particular problem as: admitted, but did not leave the hospital within the same hour they were admitted. A clearer example: if I arrived at the ED at 12:00:00 and did not leave within the hour I was admitted, then I am occupying the hospital. So for this I need to create a new column, Occupancy. (A little insight: I want to plot occupancy by hour of the day. I know how to plot this, but do not know how to calculate occupancy. Thus no need for you to be bogged down

dplyr - sum of multiple columns using regular expressions

二次信任 submitted on 2019-12-06 14:55:39
For the dataset mtcars2: mtcars2 = mtcars mtcars2 = mtcars2 %>% mutate(cyl9=cyl, disp9=disp, gear2=gear) I want to get a new column which is the sum of multiple columns, by using regular expressions to capture the pattern. This is a solution, but it is done by hard-coding: select(mtcars2, cyl9) + select(mtcars2, disp9) + select(mtcars2, gear2) I tried something like this, but it gives me a single number instead of a vector: mtcars2 %>% select(matches("[0-9]")) %>% sum Please, dplyr solutions only, since I need to apply these functions to a SQL table later on. Thanks! Update: I need the solution to
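For an in-memory data frame, one possible sketch is to select the columns by regular expression inside across() and sum them row-wise with rowSums(). This assumes dplyr 1.0 or later, and total is a made-up column name. It may not carry over to a SQL backend, since rowSums() is a base R function and is not guaranteed to be translated.

```r
library(dplyr)

mtcars2 <- mtcars %>%
  mutate(cyl9 = cyl, disp9 = disp, gear2 = gear)

# Row-wise sum of every column whose name contains a digit
# (here cyl9, disp9 and gear2).
mtcars2 <- mtcars2 %>%
  mutate(total = rowSums(across(matches("[0-9]"))))
```

Unlike piping the selection into sum(), which collapses everything to one number, rowSums() returns one value per row.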

Calculate the mean between several columns of df2 that can vary according to the variable `var1` of df1 and add the value to a new variable in df1

匆匆过客 submitted on 2019-12-06 14:53:11
I have a data frame df1 that summarises the depth of different fishes over time at different places. On the other hand, I have df2 that summarises the intensity of the currents over time (EVERY THREE HOURS) from the surface to 39 meters depth at intervals of 8 meters (m0-7, m8-15, m16-23, m24-31 and m32-39) in a specific place. As an example: df1<-data.frame(Datetime=c("2016-08-01 15:34:07","2016-08-01 16:25:16","2016-08-01 17:29:16","2016-08-01 18:33:16","2016-08-01 20:54:16","2016-08-01 22:48:16"),Site=c("BD","HG","BD","BD","BD","BD"),Ind=c(16,17,19,16,17,16), Depth=c(5.3,24,36.4,42,NA

How to conditionally mutate multiple columns using “contains” and “ifelse”?

流过昼夜 submitted on 2019-12-06 10:03:43
I want to mutate multiple columns containing the string "account". Specifically, I want these columns to take "NA" when a certain condition is met, and another value when the condition is not met. Below I present my attempt, inspired by here and here. So far, unsuccessful. I am still trying; nevertheless, any help would be much appreciated. My data: df<-as.data.frame(structure(list(low_account = c(1, 1, 0.5, 0.5, 0.5, 0.5), high_account = c(16, 16, 56, 56, 56, 56), mid_account_0 = c(8.5, 8.5, 28.25, 28.25, 28.25, 28.25), mean_account_0 = c(31.174, 30.1922101449275, 30.1922101449275, 33.3055555555556,
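A minimal sketch of the general pattern, assuming dplyr 1.0 or later: across() with contains("account") applies the same ifelse() to every matching column. The data and the flag column are hypothetical stand-ins, since the actual condition is not visible in this excerpt.

```r
library(dplyr)

# Hypothetical data: two "account" columns, one unrelated column, and a
# logical `flag` column standing in for the real (elided) condition.
df <- tibble(
  low_account  = c(1, 1, 0.5),
  high_account = c(16, 16, 56),
  other        = c(10, 20, 30),
  flag         = c(TRUE, FALSE, TRUE)
)

# Set every "account" column to NA where the condition holds;
# keep the existing value otherwise.
result <- df %>%
  mutate(across(contains("account"), ~ ifelse(flag, NA, .x)))
```

Columns that do not match contains("account"), such as other, are left untouched.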

Create a variable in `df1` depending on one variable of `df1` (`df1$var1`) and one variable of `df2` that is changeable depending on `df1$var1`

荒凉一梦 submitted on 2019-12-06 08:36:16
I have data frame df1 that summarises fish depths over time. df1$Site tells you the site where the fish was, df1$Ind tells you the individual and df1$Depth tells you the depth where the fish was at a specific df1$Datetime. On the other hand, I have df2 that summarises the intensity of the currents over time (EVERY THREE HOURS) from the surface to 39 meters depth at intervals of 8 meters (m0-7, m8-15, m16-23, m24-31 and m32-39). As an example: df1<-data.frame(Datetime=c("2016-08-01 15:34:07","2016-08-01 16:25:16","2016-08-01 17:29:16","2016-08-01 18:33:16","2016-08-01 20:54:16","2016-08

Passing column names through multiple functions with dplyr

我们两清 submitted on 2019-12-06 07:02:15
I wrote a simple function to create tables of percentages in dplyr: library(dplyr) df = tibble( Gender = sample(c("Male", "Female"), 100, replace = TRUE), FavColour = sample(c("Red", "Blue"), 100, replace = TRUE) ) quick_pct_tab = function(df, col) { col_quo = enquo(col) df %>% count(!! col_quo) %>% mutate(Percent = (100 * n / sum(n))) } df %>% quick_pct_tab(FavColour) # Output: # A tibble: 2 x 3 FavColour n Percent <chr> <int> <dbl> 1 Blue 58 58 2 Red 42 42 This works great. However, when I tried to build on top of this, writing a new function that calculated the same percentages with
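The question is cut off before the second function appears, so the following is only a sketch of the general technique: an unquoted column name can be forwarded through nested functions with the embrace operator {{ }} (rlang 0.4 or later, re-exported by dplyr). The rounded_pct_tab() wrapper and its digits argument are hypothetical.

```r
library(dplyr)

set.seed(1)  # only to make the sampled toy data reproducible
df <- tibble(
  Gender    = sample(c("Male", "Female"), 100, replace = TRUE),
  FavColour = sample(c("Red", "Blue"), 100, replace = TRUE)
)

# Same table as the original function, written with {{ }} in place of
# the enquo() / !! pair.
quick_pct_tab <- function(df, col) {
  df %>%
    count({{ col }}) %>%
    mutate(Percent = 100 * n / sum(n))
}

# Hypothetical wrapper: forwards `col` untouched to the inner function,
# then rounds the percentages.
rounded_pct_tab <- function(df, col, digits = 1) {
  df %>%
    quick_pct_tab({{ col }}) %>%
    mutate(Percent = round(Percent, digits))
}

tab <- df %>% rounded_pct_tab(FavColour)
```

The enquo() style shown in the question should work the same way if the outer function captures the column with enquo() and passes it to the inner one with !!.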