tidyr | 易学教程

Converting a long-formatted dataframe to wide format with tidyverse [duplicate]

Question: This question already has answers here: How to reshape data from long to wide format (12 answers). Closed 3 months ago.

Below I first successfully convert my dat to long format, but when I try to convert it back to its original wide format I don't get the same output. Is there a fix for this?

    library(tidyverse)
    ACGR <- read.csv('https://raw.githubusercontent.com/rnorouzian/e/master/ACGR%202010-11%20to%202016-17.csv', na = "---")
    dat <- ACGR %>% pivot_longer(names_to = "year", values_to = "grad_rate",
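The excerpt above is cut off before the reshaping call is complete, so the following is only a minimal sketch of a long-to-wide round trip on made-up data. The `school` column and the toy values are assumptions for illustration and are not taken from the ACGR file; `names_to`/`values_to` mirror the names used in the question.

```r
library(tidyr)

# Hypothetical wide data: one row per school, one column per school year
wide <- tibble::tibble(
  school    = c("A", "B"),
  `2010-11` = c(80, 85),
  `2011-12` = c(82, 88)
)

# Wide -> long: every year column becomes a (year, grad_rate) pair
long <- wide %>%
  pivot_longer(-school, names_to = "year", values_to = "grad_rate")

# Long -> wide: the inverse call names the same two columns
back <- long %>%
  pivot_wider(names_from = year, values_from = grad_rate)
```

The round trip is only exact when the remaining columns (here `school`) uniquely identify each row; otherwise `pivot_wider()` produces list-columns or extra rows.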

R fill missing dates by category

Question:

    x <- data.frame(product = c(rep("A", 3), rep("B", 4)),
                    xdate = as.Date(c("2020-01-01", "2020-01-02", "2020-01-04",
                                      '2020-01-02', '2020-01-04', '2020-01-07', '2020-01-08')),
                    number = sample(1:10, 7))

In the sample data I want to fill in the missing dates by category. That is, for category A I want all missing dates between its minimum date 2020-01-01 and its maximum date 2020-01-04, and the same logic for category B. I am aware of the function complete, but it seems insufficient for what I am looking
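One possible approach, sketched here under the assumption that grouping before `complete()` is acceptable: on a grouped data frame, `complete()` is evaluated per group, so the date sequence can run from each product's own minimum to its own maximum. The `number` values are fixed here (instead of `sample()`) only to keep the example reproducible.

```r
library(dplyr)
library(tidyr)

x <- data.frame(
  product = c(rep("A", 3), rep("B", 4)),
  xdate   = as.Date(c("2020-01-01", "2020-01-02", "2020-01-04",
                      "2020-01-02", "2020-01-04", "2020-01-07", "2020-01-08")),
  number  = c(3, 7, 1, 9, 4, 6, 2)  # fixed stand-in for sample(1:10, 7)
)

filled <- x %>%
  group_by(product) %>%
  # Within each product, expand xdate to every day between its min and max
  complete(xdate = seq(min(xdate), max(xdate), by = "day")) %>%
  ungroup()
```

Rows that did not exist in the input get `NA` in `number`; the `fill` argument of `complete()` can supply a different default if needed.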

Convert nested data.frame to a hierarchical list

Question: Is there a neat way to convert a nested data.frame to a hierarchical list? I do it below with a for loop, but ideally there is a neater solution that generalizes to an arbitrary number of nested columns.

    nested_df <- expand.grid(V1 = c('a','b','c'), V2 = c('z','y')) %>%
      group_by_all() %>%
      do(x = runif(10)) %>%
      ungroup
    nested_ls <- list()
    for (v1 in unique(nested_df$V1)) {
      for (v2 in unique(nested_df$V2)) {
        nested_ls[[v1]][[v2]] <- nested_df %>%
          filter(V1 == v1 & V2 == v2) %>%
          pull(x) %>%
          unlist
      }
    }
    str
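A recursive `split()` is one way to generalize the double loop to any number of key columns. The sketch below uses base R only; the helper name `nest_list` and its arguments are hypothetical, and the sample data is built with `lapply()` rather than `do()` so that it has no package dependencies.

```r
# Hypothetical helper: split on the first key column, then recurse on the rest
nest_list <- function(df, keys, value) {
  if (length(keys) == 0) {
    # No keys left: return the contents of the list column as a plain vector
    return(unlist(df[[value]]))
  }
  # as.character() avoids empty groups from unused factor levels
  pieces <- split(df, as.character(df[[keys[1]]]))
  lapply(pieces, nest_list, keys = keys[-1], value = value)
}

# Sample data shaped like the question's: two key columns and a list column x
nested_df <- expand.grid(V1 = c("a", "b", "c"), V2 = c("z", "y"),
                         stringsAsFactors = FALSE)
nested_df$x <- lapply(seq_len(nrow(nested_df)), function(i) runif(10))

nested_ls <- nest_list(nested_df, keys = c("V1", "V2"), value = "x")
str(nested_ls)
```

Note that `split()` orders the list elements alphabetically by key, which may differ from the order produced by the loop over `unique()` values.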

`unnest_wider` multiple columns

Question: I have a tibble with multiple list columns that I'd like to unnest_wider.

    df1 <- tibble(
      gr = c('a', 'b', 'c'),
      values1 = list(1:2, 3:4, 5:6),
      values2 = list(1:2, 3:4, 5:6)
    )

I have tried many approaches that have not worked, including passing a vector to col

    df1 %>%
      # unnest_wider doesn't take multiple inputs
      unnest_wider(col = c(values, values2), names_sep = c("_1", "_2"), names_repair = "unique")

and trying mutate_at

    df1 %>%
      # mutate_at doesn't send data
      mutate_at(vars
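In more recent tidyr releases (1.2.0 and later, as far as I know) `unnest_wider()` accepts a tidyselect expression naming several columns, so the case above may reduce to a single call. This sketch assumes such a version is installed and uses a single `names_sep` string, which prefixes each new column with the name of the column it came from.

```r
library(tidyr)

df1 <- tibble(
  gr      = c("a", "b", "c"),
  values1 = list(1:2, 3:4, 5:6),
  values2 = list(1:2, 3:4, 5:6)
)

# Assumes tidyr >= 1.2.0: several list columns unnested in one call
wide <- df1 %>%
  unnest_wider(c(values1, values2), names_sep = "_")

names(wide)
```

On older tidyr versions, chaining one `unnest_wider()` call per column with the same `names_sep` should give the same layout.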

proportion of factors and dummies

Question: I have a data set full of factors and dummies, and I want to see the proportion of each value after dplyr::group_by(cyl).

    mtcars; rownames(mtcars) <- NULL
    df <- mtcars[,c(2,8,9)]
    head(df)
      cyl vs am
    1   6  0  1
    2   6  0  1
    3   4  1  1
    4   6  1  0
    5   8  0  0
    6   6  1  0

Expected answer: cyl has four rows with the value 6; in the vs column two of them are 1 and two of them are 0.

         1    0
    6  50%  50%
    4 100%   0%
    8   0% 100%

The same as this for column am too.

Answer 1: Here's a first crack:

    (df
      %>% pivot_longer(-cyl) ## spread out variables (vs, am)
      %>% group_by(cyl,name
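The answer excerpt above is cut off, so the following is not the original answer's code, only a sketch of one way to get per-group proportions. It assumes the six rows shown by `head(df)` as input and returns proportions in long form (one row per `cyl`, variable, and value) rather than the wide percentage table in the question.

```r
library(dplyr)
library(tidyr)

# The six rows shown in the question
df <- head(mtcars[, c("cyl", "vs", "am")])
rownames(df) <- NULL

props <- df %>%
  pivot_longer(-cyl) %>%            # stack vs and am into name/value pairs
  count(cyl, name, value) %>%       # rows per cyl x variable x value
  group_by(cyl, name) %>%
  mutate(prop = n / sum(n)) %>%     # share within each cyl x variable
  ungroup()

props
```

A further `pivot_wider(names_from = value, values_from = prop, values_fill = 0)` after dropping `n` would bring the result closer to the table layout asked for.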

How to separate values in a column and convert to numeric values?

Question: I have a dataset where the values are collapsed, so each row has multiple inputs in one column. For example:

    Gene   Score1
    Gene1  NA, NA, NA, 0.03, -0.3
    Gene2  NA, 0.2, 0.1

I am trying to unpack this and then select the maximum absolute value per row for the Score1 column, and also keep track of whether the maximum absolute value was previously negative by creating a new column. So the output for the example is:

    Gene   Score1  Negatives1
    Gene1  0.3     1
    Gene2  0.2     0

#Score1 is now the maximum absolute value and if
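One way to sketch the unpacking described above is to split the collapsed strings into rows, convert them to numbers, and summarise per gene. The example assumes the values are separated by a comma plus optional whitespace and that each gene has at least one non-missing score.

```r
library(dplyr)
library(tidyr)

genes <- tibble(
  Gene   = c("Gene1", "Gene2"),
  Score1 = c("NA, NA, NA, 0.03, -0.3", "NA, 0.2, 0.1")
)

result <- genes %>%
  separate_rows(Score1, sep = ",\\s*") %>%     # one row per individual value
  mutate(Score1 = as.numeric(Score1)) %>%      # the string "NA" becomes NA
  group_by(Gene) %>%
  summarise(
    # Computed first, while Score1 still refers to the per-row values
    Negatives1 = as.integer(Score1[which.max(abs(Score1))] < 0),
    Score1     = max(abs(Score1), na.rm = TRUE)
  ) %>%
  select(Gene, Score1, Negatives1)

result
```

The order inside `summarise()` matters: once `Score1` is overwritten with the maximum, later expressions in the same call see the summarised value rather than the original rows.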

accessing nested lists in R

Question: I have created a doubly nested structure for some data. How can I access the data on the 2nd level (or, for that matter, the nth level)?

    library(gapminder)
    library(purrr)
    library(tidyr)
    gapminder
    nest_data <- gapminder %>% group_by(continent) %>% nest(.key = by_continent)
    nest_2 <- nest_data %>%
      mutate(by_continent = map(by_continent, ~.x %>% group_by(country) %>% nest(.key = by_country)))

How can I now get the data for China into a dataframe or tibble from nest_2? I can get the data for all of

Optimize the runtime: changing the weight of edges in an igraph takes a long time. Is there a way to optimize it?

Question: I am searching for a set of edges in an igraph built from an osmar object and would like to change their weights. Since my graph is quite big, this task takes quite a long time, and because I run this function in a loop the runtime grows even larger. Is there a way I could optimize this? Here is the code:

    library(osmar)
    library(igraph)
    library(tidyr)
    library(dplyr)
    ### Get data ----
    src <- osmsource_api(url = "https://api.openstreetmap.org/api/0.6/")
    muc_bbox <- center_bbox(11.575278, 48

Recode using dplyr::mutate(across()) not working in a function

Question: I'm trying to use dplyr::mutate(across()) to recode specified columns in a tbl. Using these on their own works fine, but I can't get them to work in a function:

    library(dplyr)
    library(tidyr)
    df1 <- tibble(Q7_1=1:5,
                  Q7_1_TEXT=c("let's","see","grogu","this","week"),
                  Q8_1=6:10,
                  Q8_1_TEXT=rep("grogu",5),
                  Q8_2=11:15,
                  Q8_2_TEXT=c("grogu","is","the","absolute","best"))
    # this works
    df2 <- df1 %>%
      mutate(across(starts_with("Q8") & ends_with("TEXT"), ~recode(., "grogu"="mando")))
    # runs without error