tidyverse | 易学教程

how to efficiently subset a dataframe into several chunks to be passed to a list of lists

Question: I would appreciate any help to efficiently subset a data frame into several chunks to be passed to a list of lists, based on the variable imput. My code below works for a few subsets, but I have 100 subsets to create, and the code becomes too long and difficult to handle. I therefore need a more efficient approach that achieves the same outcome with less code. The approach imputation_groups <- split(dat, dat$imput) discussed here allows me to split my data into a list of several
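
A minimal sketch of one way to scale the split(dat, dat$imput) idea from the question to many groups. The toy values are made up, and the exact "list of lists" shape the asker needs is not visible in the excerpt, so converting each chunk with as.list is only an assumption:

```r
# Toy data: the column name `imput` comes from the question, the values are invented.
dat <- data.frame(
  imput = rep(1:3, each = 2),
  x     = c(0.5, 1.2, 2.1, 3.3, 4.8, 5.0)
)

# One data frame per level of `imput`, named after that level.
imputation_groups <- split(dat, dat$imput)

# Hypothetical target shape: each chunk becomes a plain list of its columns,
# giving a list of lists without writing one block of code per subset.
imputation_lists <- lapply(imputation_groups, as.list)

names(imputation_lists)
length(imputation_lists)
```

The same two calls apply unchanged to 100 levels of imput, since split() creates one element per level.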

R ggplot connecting one point on a map with multiple points on the same map

Question: I am trying to connect one location on a map of the US with multiple other locations on the same map. library(tidyverse) library(nycflights13) First, I select all flights from Newark airport (EWR) on January 1, 2013 and grab the geographic coordinates of the destinations (if available in the tibble airports), adding EWR's location on top: ewr <- flights %>% filter(year == 2013, month == 1, day == 1, origin == "EWR") %>% select(dest) %>% distinct(dest) %>% arrange(dest) # Join it with
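
A hedged sketch of how the lines from EWR to each destination might be drawn with geom_segment(). It assumes the flights and airports tibbles come from the nycflights13 package, and it leaves out the US basemap; the helper column names origin_lon and origin_lat are hypothetical:

```r
library(dplyr)
library(ggplot2)
library(nycflights13)

# Coordinates of the single origin airport.
ewr <- airports %>%
  filter(faa == "EWR") %>%
  select(lon, lat)

# Destinations flown from EWR on 2013-01-01, keeping only airports with known coordinates.
dests <- flights %>%
  filter(year == 2013, month == 1, day == 1, origin == "EWR") %>%
  distinct(dest) %>%
  inner_join(airports, by = c("dest" = "faa")) %>%
  select(dest, lon, lat) %>%
  # Repeat the origin coordinates on every row so each row describes one segment.
  mutate(origin_lon = ewr$lon, origin_lat = ewr$lat)

p <- ggplot(dests) +
  geom_segment(aes(x = origin_lon, y = origin_lat, xend = lon, yend = lat),
               colour = "grey50") +
  geom_point(aes(x = lon, y = lat)) +
  coord_quickmap()
```

Printing p draws one segment per destination; a state or country outline layer could be added underneath if a map package is available.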

Calculating Occupancy in hospital from dates with time.

Question: I am looking to calculate occupancy in an emergency department (ED) with the tidyverse. For this problem, a patient counts as occupying the hospital if they were admitted but did not leave within the same hour they were admitted. For example, if I arrived at the ED at 12:00:00 and did not leave within that hour, then I am occupying the hospital. So I need to create a new column, Occupancy. (For context: I want to plot occupancy by hour of the day. Yet I know

How to conditionally mutate multiple columns using “contains” and “ifelse”?

Question: I want to mutate multiple columns whose names contain the string "account". Specifically, I want these columns to take "NA" when a certain condition is met, and another value when it is not. Below is my attempt, inspired by the answers here and here. So far it has been unsuccessful, so any help would be much appreciated. My data: df<-as.data.frame(structure(list(low_account = c(1, 1, 0.5, 0.5, 0.5, 0.5), high_account = c(16, 16, 56, 56, 56, 56), mid_account_0 = c(8.5, 8.5, 28.25,
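
One possible pattern for this kind of conditional update combines across(), contains() and ifelse(). The excerpt does not show the actual condition, so the logical column flag below is a hypothetical stand-in, and the data frame reuses only the first three values visible in the question:

```r
library(dplyr)

df <- data.frame(
  low_account  = c(1, 1, 0.5),
  high_account = c(16, 16, 56),
  flag         = c(TRUE, FALSE, TRUE),  # hypothetical condition column
  other        = c(10, 20, 30)          # hypothetical column that must stay untouched
)

# For every column whose name contains "account":
# set the value to NA where `flag` is TRUE, keep the original value otherwise.
result <- df %>%
  mutate(across(contains("account"), ~ ifelse(flag, NA, .x)))

result
```

Columns that do not match contains("account") are left as they are, so the selection helper is what limits the scope of the update.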

Create a variable in `df1` depending on one variable of `df1` (`df1$var1`) and one variable of `df2` that is changeable depending on `df1$var1`

Question: I have a data frame df1 that summarises fish depths over time. df1$Site gives the site where the fish was, df1$Ind gives the individual, and df1$Depth gives the depth of the fish at a specific df1$Datetime. I also have df2, which summarises the intensity of the currents over time (EVERY THREE HOURS) from the surface down to 39 meters depth, in 8-meter intervals (m0-7, m8-15, m16-23, m24-31 and m32-39). As an example: df1<-data.frame(Datetime=c("2016-08-01 15:34

tidyr - unique way to get combinations (using tidyverse only)

Question: I want to get all unique pairwise combinations of a unique string column of a data frame using the tidyverse (ideally). Here is a dummy example: library(tidyverse) a <- letters[1:3] %>% tibble::as_tibble() a #> # A tibble: 3 x 1 #> value #> <chr> #> 1 a #> 2 b #> 3 c tidyr::crossing(a, a) %>% magrittr::set_colnames(c("words1", "words2")) #> # A tibble: 9 x 2 #> words1 words2 #> <chr> <chr> #> 1 a a #> 2 a b #> 3 a c #> 4 b a #> 5 b b #> 6 b c #> 7 c a #> 8 c b #> 9 c c Is there a way to
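
The excerpt cuts off before the exact request, so the following is only a sketch under the assumption that the goal is to keep each unordered pair once and drop self-pairs. It builds the full cross product with tidyr::crossing(), as in the question, and then keeps the rows where the first word sorts before the second:

```r
library(dplyr)
library(tidyr)

words <- letters[1:3]

# All 9 ordered pairs, then only those with words1 strictly before words2,
# which removes both the (a, a) style rows and the mirrored duplicates.
pairs <- crossing(words1 = words, words2 = words) %>%
  filter(words1 < words2)

pairs
```

For three values this leaves three rows: a-b, a-c and b-c.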

Aggregating if each observation can belong to multiple groups

Question: I want to aggregate Date by group. However, each observation can belong to several groups (e.g. observation 1 belongs to groups A and B). I could not find a nice way to achieve this with data.table. Currently, I create a logical variable for each possible group, which is TRUE if the observation belongs to that group. I am looking for a better way to do this than the one presented below. I would also like to know how I could achieve this with the tidyverse. library(data.table) #
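
A tidyverse sketch for the layout described above, where each possible group has its own logical column. The data are invented for illustration (the question's own data are cut off), and the column names id, value, A and B are hypothetical:

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: one logical membership column per group.
obs <- tibble(
  id    = 1:3,
  value = c(10, 20, 30),
  A     = c(TRUE, TRUE, FALSE),
  B     = c(TRUE, FALSE, TRUE)
)

# Reshape to one row per (observation, group) pair, keep actual memberships,
# then aggregate as usual. Observation 1 contributes to both A and B.
by_group <- obs %>%
  pivot_longer(c(A, B), names_to = "group", values_to = "member") %>%
  filter(member) %>%
  group_by(group) %>%
  summarise(total = sum(value), n = n())

by_group
```

The long format means adding another group only adds a column to the pivot_longer() selection, not another aggregation step.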

How do pipes work with purrr map() function and the “.” (dot) symbol

Question: When using both pipes and the map() function from purrr, I am confused about how data and variables are passed along. For instance, this code works as I expect: library(tidyverse) cars %>% select_if(is.numeric) %>% map(~hist(.)) Yet when I try something similar using ggplot, it behaves strangely. cars %>% select_if(is.numeric) %>% map(~ggplot(cars, aes(.)) + geom_histogram()) I'm guessing this is because the "." in this case is passing a vector to aes(), which is expecting a column
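
Assuming the aim is one histogram per numeric column, one way to sidestep the question of what "." holds is to map over the column names instead of the column vectors, and look each column up inside aes() with the .data pronoun. This is a sketch of that alternative, not a diagnosis of the original code; bins = 10 is an arbitrary choice:

```r
library(ggplot2)
library(purrr)

# Names of the numeric columns in the built-in `cars` data set.
num_cols <- names(cars)[map_lgl(cars, is.numeric)]

# Map over names: `.x` is a string such as "speed", and `.data[[.x]]`
# refers to that column of the data frame passed to ggplot().
plots <- num_cols %>%
  set_names() %>%
  map(~ ggplot(cars, aes(x = .data[[.x]])) +
        geom_histogram(bins = 10) +
        labs(x = .x))

names(plots)
```

Each element of plots is a separate plot object, for example plots$speed, and the axis label is set explicitly from the column name.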

tidyverse interfering with ggplot2? cannot access map_data

Question: Running these commands in the console, the output is: > cty0 = ggplot2::map_data("county") > library(tidyverse) Loading tidyverse: ggplot2 Loading tidyverse: tibble Loading tidyverse: tidyr Loading tidyverse: readr Loading tidyverse: purrr Loading tidyverse: dplyr Conflicts with tidy packages ----------------------------------------------------------------------------------------------- filter(): dplyr, stats lag(): dplyr, stats map(): purrr, maps > cty0 = ggplot2::map_data("county") Error: