tidyr | 易学教程

Spread with duplicate identifiers for rows [duplicate]

This question already has an answer here: Using spread with duplicate identifiers for rows (3 answers)

There have been questions on this topic before, but I am still struggling with spreading this. I would like each state to have its own column of temperature values. Here is a dput() of my data, which I'll call df:

    structure(list(date = c("2018-01-21", "2018-01-21", "2018-01-20", "2018-01-20",
    "2018-01-19", "2018-01-19", "2018-01-18", "2018-01-18", "2018-01-17", "2018-01-17",
    "2018-01-16", "2018-01-16", "2018-01-15", "2018-01-15", "2018-01-14", "2018-01-14",
    "2018-01-12", "2018-01-12",
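The excerpt above is cut off before the full data appear, so the following is only a sketch of one common workaround for the "duplicate identifiers" problem: number the repeated observations within each date/state pair so that every row gets a unique identifier before spreading. The data frame below is a hypothetical stand-in, and the column names `state` and `temp` are assumptions, not the asker's actual names.

```r
library(dplyr)
library(tidyr)

# Hypothetical stand-in data; `state` and `temp` are assumed column names.
df <- data.frame(
  date  = c("2018-01-21", "2018-01-21", "2018-01-21", "2018-01-20", "2018-01-20"),
  state = c("AK", "AK", "HI", "AK", "HI"),
  temp  = c(10, 12, 75, 11, 74),
  stringsAsFactors = FALSE
)

wide <- df %>%
  group_by(date, state) %>%
  mutate(obs = row_number()) %>%  # index repeated date/state pairs
  ungroup() %>%
  spread(state, temp)             # one temperature column per state

print(wide)
```

Rows that have no matching observation for a state are filled with NA. Whether an index like `obs` is meaningful depends on the real data; if the duplicates should instead be aggregated, summarising before spreading would be the alternative.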

tidyr spread function generates sparse matrix when compact vector expected

I'm learning dplyr, having come from plyr, and I want to generate (per group) columns (per interaction) from the output of xtabs. Short summary: I'm getting

    A  B
    1  NA
    NA 2

when I wanted

    A B
    1 2

xtabs data looks like this:

    > xtabs(data=data.frame(P=c(F,T,F,T,F),A=c(F,F,T,T,T)))
           A
    P       FALSE TRUE
      FALSE     1    2
      TRUE      1    1

Now do() wants its data in data frames, like this:

    > xtabs(data=data.frame(P=c(F,T,F,T,F),A=c(F,F,T,T,T))) %>% as.data.frame
          P     A Freq
    1 FALSE FALSE    1
    2  TRUE FALSE    1
    3 FALSE  TRUE    2
    4  TRUE  TRUE    1

Now I want a single row output with columns being the interaction of levels. Here's what I'm

tidyr: multiple unnesting with varying NA counts

I'm confused about some tidyr behavior. I can unnest a single response like this:

    library(tidyr)
    resp1 <- c("A", "B; A", "B", NA, "B")
    resp2 <- c("C; D; F", NA, "C; F", "D", "E")
    resp3 <- c(NA, NA, "G; H; I", "H; I", "I")
    data <- data.frame(resp1, resp2, resp3, stringsAsFactors = F)
    tidy <- data %>%
      transform(resp1 = strsplit(resp1, "; ")) %>%
      unnest()
    # Source: local data frame [6 x 3]
    #
    #     resp2   resp3 resp1
    #     (chr)   (chr) (chr)
    # 1 C; D; F      NA     A
    # 2      NA      NA     B
    # 3      NA      NA     A
    # 4    C; F G; H; I     B
    # 5       D    H; I    NA
    # 6       E       I     B

But I need to unnest multiple columns in my dataset, and the columns have varying

tidyr separate only first n instances [duplicate]

This question already has an answer here: How to strsplit different number of strings in certain column by do function (1 answer)

I have a data.frame in R, which, for simplicity, has one column that I want to separate. It looks like this:

    V1
    Value_is_the_best_one
    This_is_the_prettiest_thing_I've_ever_seen
    Here_is_the_next_example_of_what_I_want

My real data is very large (millions of rows), so I'd like to use tidyr's separate function (because it's amazingly fast) to separate out JUST the first few instances. I'd like the result to be the following:

    V1    V2 V3  V4
    Value is the best_one
    This  is the
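One way to split on only the first few separators, sketched here under the assumption that the separator is the underscore and that four output columns are wanted, is the `extra = "merge"` argument of `separate()`: it splits at most `length(into)` times and leaves the remainder in the last column. The data frame is rebuilt from the three example strings in the excerpt.

```r
library(tidyr)

# Rebuilt from the three example rows quoted in the question.
df <- data.frame(
  V1 = c("Value_is_the_best_one",
         "This_is_the_prettiest_thing_I've_ever_seen",
         "Here_is_the_next_example_of_what_I_want"),
  stringsAsFactors = FALSE
)

# extra = "merge" keeps everything after the third "_" together in V4.
out <- separate(df, V1, into = c("V1", "V2", "V3", "V4"),
                sep = "_", extra = "merge")

print(out)
```

Whether this is fast enough for millions of rows is not something this sketch measures.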

Unnesting a list of lists in a data frame column

Question: To unnest a data frame I can use:

    df <- data_frame( x = 1, y = list(a = 1, b = 2) )
    tidyr::unnest(df)

But how can I unnest a list inside of a list inside of a data frame column?

    df <- data_frame( x = 1, y = list(list(a = 1, b = 2)) )
    tidyr::unnest(df)
    Error: Each column must either be a list of vectors or a list of data frames [y]

Answer 1: With purrr, which is nice for lists,

    library(purrr)
    df %>% dmap(unlist)
    ## # A tibble: 2 x 2
    ##       x     y
    ##   <dbl> <dbl>
    ## 1     1     1
    ## 2     1     2

which is more or less
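The quoted answer relies on `dmap()`, which has since been moved out of purrr into the purrrlyr package, and `data_frame()` is deprecated in favour of `tibble()`. As a sketch only, and assuming tidyr 1.0.0 or later is available, `unnest_longer()` is one alternative that is not part of the original answer: it turns each element of the inner list into its own row.

```r
library(tibble)
library(tidyr)

# Same structure as the second example in the question, built with tibble().
df <- tibble(x = 1, y = list(list(a = 1, b = 2)))

# Each element of the inner list becomes a row; the element names
# are kept in an extra id column.
long <- unnest_longer(df, y)

print(long)
```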

Comparison between dplyr::do / purrr::map, what advantages? [closed]

When using broom, I used to combine dplyr::group_by and dplyr::do to perform actions on grouped data, thanks to @drob. For example, fitting a linear model to cars depending on their gear system:

    library("dplyr")
    library("tidyr")
    library("broom")
    # using do()
    mtcars %>%
      group_by(am) %>%
      do(tidy(lm(mpg ~ wt, data = .)))
    # Source: local data frame [4 x 6]
    # Groups: am [2]
    #      am        term  estimate std.error statistic      p.value
    #   (dbl)       (chr)     (dbl)     (dbl)     (dbl)        (dbl)
    # 1     0 (Intercept) 31.416055 2.9467213 10.661360 6.007748e-09
    # 2     0          wt -3.785908 0.7665567 -4.938848 1.245595e-04
    # 3     1 (Intercept) 46.294478
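For comparison with the do() version, a purrr::map formulation of the same per-group fit might look like the sketch below. It assumes tidyr 1.0.0 or later (for `nest()` and `unnest()` on grouped data) and is not taken from the original question; the list-column name `fit` is chosen here for illustration.

```r
library(dplyr)
library(tidyr)
library(purrr)
library(broom)

res <- mtcars %>%
  group_by(am) %>%
  nest() %>%                                   # one row per am, data in a list-column
  mutate(fit = map(data, ~ tidy(lm(mpg ~ wt, data = .x)))) %>%
  select(am, fit) %>%
  unnest(fit) %>%                              # back to one row per model term
  ungroup()

print(res)
```

One practical difference is that the intermediate list-column keeps the nested data (and could keep the model objects) next to each group, so several summaries can be derived from the same fit without refitting.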

Using tidyr spread function to create columns with binary value

I am aware of the spread function in the tidyr package, but this is something I am unable to achieve. I have a data.frame with 2 columns as defined below. I need to transpose the column Subject into binary columns with 1 and 0. Below is the data.frame:

    studentInfo <- data.frame(StudentID = c(1,1,1,2,3,3), Subject = c("Maths", "Science", "English", "Maths", "History", "History"))
    > studentInfo
      StudentID Subject
    1         1   Maths
    2         1 Science
    3         1 English
    4         2   Maths
    5         3 History
    6         3 History

And the output I am expecting is:

      StudentID Maths Science English History
    1         1     1       1       1       0
    2         2     1       0       0       0
    3         3     0       0       0       1

Please assist how
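A minimal sketch of one way to get the binary layout: drop the repeated (StudentID, Subject) pair, add a constant indicator column, and spread with `fill = 0`. The helper column name `present` is an assumption introduced for this example.

```r
library(dplyr)
library(tidyr)

studentInfo <- data.frame(
  StudentID = c(1, 1, 1, 2, 3, 3),
  Subject   = c("Maths", "Science", "English", "Maths", "History", "History")
)

wide <- studentInfo %>%
  distinct() %>%                       # student 3 lists History twice
  mutate(present = 1) %>%              # indicator to spread
  spread(Subject, present, fill = 0)   # missing combinations become 0

print(wide)
```

The subject columns come out in alphabetical order rather than in the order shown in the expected output; `select()` can reorder them if that matters.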

Separate a column into multiple columns using tidyr::separate with sep=""

    df <- data.frame(category = c("X", "Y"), sequence = c("AAT.G", "CCG-T"), stringsAsFactors = FALSE)
    df
      category sequence
    1        X    AAT.G
    2        Y    CCG-T

I want to separate the column sequence into 5 columns (one for each character). I tried to do that with tidyr::separate but it internally uses stringi::stri_split_regex which doesn't accept an empty string as a separator (although the sep argument should take a regex).

    library(tidyr)
    separate(df, sequence, into = paste0("V", 1:5), sep="")
    Error: Values not split into 5 pieces at 1, 2
    In addition: Warning messages:
    1: In stringi::stri_split_regex(value, sep
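Because every sequence here has the same fixed width, one possible workaround, sketched below, is to pass `sep` as a numeric vector of split positions instead of a regular expression; `separate()` then cuts after the 1st, 2nd, 3rd and 4th character. This assumes all values are exactly five characters long.

```r
library(tidyr)

df <- data.frame(category = c("X", "Y"),
                 sequence = c("AAT.G", "CCG-T"),
                 stringsAsFactors = FALSE)

# Numeric sep is interpreted as character positions to split at.
out <- separate(df, sequence, into = paste0("V", 1:5), sep = 1:4)

print(out)
```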

How to use the spread function properly in tidyr

How do I change the following table from:

    Type  Name   Answer n
    TypeA Apple  Yes    5
    TypeA Apple  No     10
    TypeA Apple  DK     8
    TypeA Apple  NA     20
    TypeA Orange Yes    6
    TypeA Orange No     11
    TypeA Orange DK     8
    TypeA Orange NA     23

Change to:

    Type  Name   Yes No DK NA
    TypeA Apple  5   10 8  20
    TypeA Orange 6   11 8  23

I used the following code to get the first table.

    df_1 <- df %>% group_by(Type, Name, Answer) %>% tally()

Then I tried to use the spread command to get to the 2nd table, but I got the following error message: "Error: All columns must be named"

    df_2 <- spread(df_1, Answer)

Following on the comment from ayk, I'm
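The call in the excerpt passes only a key to `spread()`, which needs both a key column and a value column. A sketch of the two-argument form follows; the summary table is retyped from the question, and storing `Answer` as plain strings (including the literal "NA") is an assumption made here so the example is self-contained.

```r
library(tidyr)

# Summary table retyped from the question; Answer is assumed to hold strings.
df_1 <- data.frame(
  Type   = "TypeA",
  Name   = rep(c("Apple", "Orange"), each = 4),
  Answer = rep(c("Yes", "No", "DK", "NA"), times = 2),
  n      = c(5, 10, 8, 20, 6, 11, 8, 23),
  stringsAsFactors = FALSE
)

# key = column whose values become headers, value = column that fills the cells
df_2 <- spread(df_1, key = Answer, value = n)

print(df_2)
```

If `Answer` contains real missing values rather than the string "NA", the resulting column name will differ, so that case would need checking against the actual data.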

From long to wide data with multiple columns

Suggestions for how to smoothly get from foo to foo2 (preferably with tidyr or reshape2 packages)? This is kind of like this question, but not exactly I think, because I don't want to auto-number columns, just widen multiple columns. It's also kind of like this question, but again, I don't think I want the columns to vary with a row value as in that answer. Or, a valid answer to this question is to convince me it's exactly like one of the others. The solution in the second question of "two dcasts plus a merge" is the most attractive right now, because it is comprehensible to me. foo: foo =