tidyr | 易学教程

Splitting column by separator from right to left in R

Question: I'm working on a dataset where one column (Place) consists of a location sentence.
    library(tidyverse)
    example <- tibble(
      Datum = c("October 1st 2017", "October 2st 2017", "October 3rd 2017"),
      Place = c("Tabiyyah Jazeera village, 20km south east of Deir Ezzor, Deir Ezzor Governorate, Syria",
                "Abu Kamal, Deir Ezzor Governorate, Syria",
                "شارع القطار al Qitar [train] street, al-Tawassiya area, north of Raqqah city centre, Raqqah governorate, Syria"))
I would like to split the Place column by
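The excerpt is cut off, but going by the title, one possible approach is `tidyr::separate()` with `fill = "left"`, which pads rows that have fewer pieces with `NA` on the left so that the rightmost pieces line up. The following is only a sketch: it uses the first two rows of the example, and the output column names (`Detail1`, `Detail2`, `Governorate`, `Country`) are hypothetical.

```r
library(tidyr)
library(tibble)

# First two rows of the example data from the question.
example <- tibble(
  Datum = c("October 1st 2017", "October 2st 2017"),
  Place = c(
    "Tabiyyah Jazeera village, 20km south east of Deir Ezzor, Deir Ezzor Governorate, Syria",
    "Abu Kamal, Deir Ezzor Governorate, Syria"
  )
)

# Split on a comma followed by optional whitespace.
# fill = "left" puts the NA padding on the left for rows with fewer
# pieces, so the last pieces always land in the right-most columns.
split_right <- separate(
  example, Place,
  into = c("Detail1", "Detail2", "Governorate", "Country"),  # hypothetical names
  sep = ",\\s*",
  fill = "left"
)
```

Rows with more pieces than names in `into` would need either more names or the `extra` argument; that choice depends on the part of the question that is not shown.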

In R: get multiple rows by splitting a column using tidyr and reshape2 [duplicate]

Question: This question already has an answer here: Split comma-separated strings in a column into separate rows (5 answers). What is the simplest way, using tidyr or reshape2, to turn this data:
    data <- data.frame(
      A=c(1,2,3),
      B=c("b,g","g","b,g,q"))
into the following (i.e. make a row for each comma-separated value in variable B):
      A B
    1 1 b
    2 1 g
    3 2 g
    4 3 b
    5 3 g
    6 3 q
Answer 1: Try
    library(splitstackshape)
    cSplit(data, 'B', ',', 'long')
Or using base R:
    lst <- setNames(strsplit(as.character(data$B), ','), data$A)
    stack
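Since the question asks for tidyr specifically, here is a minimal sketch using `tidyr::separate_rows()`. It is not part of the truncated answer above, and `stringsAsFactors = FALSE` is added on the assumption that B should be a plain character column.

```r
library(tidyr)

data <- data.frame(
  A = c(1, 2, 3),
  B = c("b,g", "g", "b,g,q"),
  stringsAsFactors = FALSE
)

# One output row per comma-separated value in B; the matching A value
# is repeated for each piece.
long <- separate_rows(data, B, sep = ",")
```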

SparklyR separate one Spark DataFrame column into two columns

Question: I have a dataframe containing a column named COL which is structured in this way: VALUE1###VALUE2. The following code works:
    library(sparklyr)
    library(tidyr)
    library(dplyr)
    mParams <- collect(filter(input_DF, TYPE == ('MIN')))
    mParams <- separate(mParams, COL, c('col1','col2'), '\\###', remove=FALSE)
If I remove the collect, I get this error:
    Error in UseMethod("separate_") : no applicable method for 'separate_' applied to an object of class "c('tbl_spark', 'tbl_sql', 'tbl_lazy', 'tbl')"
Is

How to split column into two in R using separate [duplicate]

Question: This question already has an answer here: Split data frame string column into multiple columns (14 answers). I have a dataset with a column of locations like this: (41.797634883, -87.708426986). I'm trying to split it into latitude and longitude. I tried using the separate method from the tidyr package:
    library(dplyr)
    library(tidyr)
    df <- data.frame(x = c('(4, 9)', '(9, 10)', '(20, 100)', '(100, 200)'))
    df %>% separate(x, c('Latitude', 'Longitude'))
but I'm getting this error:
    Error: Values not
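The default `sep` of `separate()` matches any run of non-alphanumeric characters, so the parentheses produce extra empty pieces in addition to the two numbers. One possible workaround, sketched here with `tidyr::extract()` and a regular expression that is my own assumption about the format, is to capture the two numbers directly:

```r
library(tidyr)

df <- data.frame(x = c("(4, 9)", "(9, 10)", "(20, 100)", "(100, 200)"))

# Capture the two (optionally negative, optionally decimal) numbers
# between the parentheses. convert = TRUE turns the captured strings
# into numeric columns.
coords <- tidyr::extract(
  df, x,
  into = c("Latitude", "Longitude"),
  regex = "\\(\\s*(-?[0-9.]+)\\s*,\\s*(-?[0-9.]+)\\s*\\)",
  convert = TRUE
)
```

The `tidyr::` prefix avoids a clash with `magrittr::extract()` if both packages are attached.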

Changing Values from Wide to Long: 1) Group_By, 2) Spread/Dcast [duplicate]

Question: This question already has answers here: Transpose / reshape dataframe without "timevar" from long to wide format (6 answers). Closed 5 months ago. I've got a list of names and phone numbers, which I want to group by name and bring from a long format to a wide one, with the phone numbers filling across the columns:
    Name      Phone_Number
    John Doe  0123456
    John Doe  0123457
    John Doe  0123458
    Jim Doe   0123459
    Jim Doe   0123450
    Jane Doe  0123451
    Jill Doe  0123457
    Name Phone_Number1 Phone_Number2 Phone
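A possible way to get there, sketched with `pivot_wider()` (the newer replacement for `spread()`, assuming tidyr 1.0.0 or later): number the phone entries within each name, then spread that index across columns. The helper column `idx` is a hypothetical name.

```r
library(dplyr)
library(tidyr)

phones <- data.frame(
  Name = c("John Doe", "John Doe", "John Doe", "Jim Doe",
           "Jim Doe", "Jane Doe", "Jill Doe"),
  Phone_Number = c("0123456", "0123457", "0123458", "0123459",
                   "0123450", "0123451", "0123457"),
  stringsAsFactors = FALSE
)

wide <- phones %>%
  group_by(Name) %>%
  mutate(idx = row_number()) %>%  # 1, 2, 3, ... within each name
  ungroup() %>%
  pivot_wider(
    names_from = idx,
    values_from = Phone_Number,
    names_prefix = "Phone_Number"
  )
```

Names with fewer numbers than the maximum get `NA` in the unused columns.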

Using tidyr::complete with group_by

Question: Does anyone know if tidyr::complete() supports grouping via group_by()? To be precise, I have a data frame that looks like this:
    df <- data.frame(
      "ID" = rep(1:2, each = 2),
      "Col1" = c("A", NA, "AA", NA),
      "Col2" = c("B", "C", "BB", "CC"))
Now I'd like to use complete() and group_by() to compute all possible combinations per group:
    df %>% group_by(ID) %>% complete(Col1, Col2)
    Error in .Call("dplyr_left_join_impl", PACKAGE = "dplyr", x, y, by_x, : negative length vectors are not allowed

Gather multiple date/value columns using tidyr

Question: I have a data set containing (amongst others) multiple columns with dates and corresponding values (repeated measurements). Is there a way to turn this into a long data set containing (the others and) only two columns, one for dates and one for values, using tidyr? The following code produces an example data frame:
    df <- data.frame(
      id = 1:10,
      age = sample(100, 10),
      date1 = as.Date('2015-09-22') - sample(100, 10),
      value1 = sample(100, 10),
      date2 = as.Date('2015-09-22') - sample(100, 10),
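One way this could be done, assuming tidyr 1.0.0 or later, is `pivot_longer()` with the special `".value"` name, which keeps the `date` and `value` parts as separate output columns. Because the example data frame is cut off, the sketch below uses a smaller hypothetical one with only two measurement pairs; the column name `measurement` is also an assumption.

```r
library(tidyr)

# Small hypothetical stand-in for the truncated example data.
df <- data.frame(
  id = 1:3,
  age = c(30, 41, 57),
  date1 = as.Date("2015-09-22") - c(10, 20, 30),
  value1 = c(5, 6, 7),
  date2 = as.Date("2015-09-22") - c(1, 2, 3),
  value2 = c(8, 9, 10)
)

# names_pattern splits e.g. "date1" into "date" and "1".
# ".value" means the first part becomes an output column name (date, value),
# while the second part goes into the "measurement" column.
long <- pivot_longer(
  df,
  cols = -c(id, age),
  names_to = c(".value", "measurement"),
  names_pattern = "([a-z]+)([0-9]+)"
)
```

The result has one row per id and measurement, with `id` and `age` carried along unchanged.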

How do I use tidyr to fill in completed rows within each value of a grouping variable?

Question: Say I have data on people who choose between several options. I have one row per person, and I want to have one row per person and choice option. So, if I have 10 people who have 3 choices, right now I have 10 rows, and I want to have 30. All of the other variables should be copied to each of the new rows. So, for example, if I have a variable for gender, that should be constant within ID. (I am setting my data up this way to analyze with mnlogit.) This seems like the situation that two
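The excerpt ends before naming the functions the asker had in mind, so the following is only one possible sketch: `tidyr::uncount()` repeats every row a fixed number of times and can record the repetition index. The data frame `people` and the column names `gender` and `choice` are hypothetical.

```r
library(tidyr)

# Hypothetical data: 10 people, one row each.
people <- data.frame(
  id = 1:10,
  gender = rep(c("f", "m"), times = 5),
  stringsAsFactors = FALSE
)

# Repeat every row 3 times; .id = "choice" adds a 1..3 counter within
# each original row, while all other columns are copied unchanged.
person_choice <- uncount(people, 3, .id = "choice")
```

This gives 30 rows, with `gender` constant within each `id`.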

tidyr spread function generates sparse matrix when compact vector expected

Question: I'm learning dplyr, having come from plyr, and I want to generate (per group) columns (per interaction) from the output of xtabs. Short summary: I'm getting
     A  B
     1 NA
    NA  2
when I wanted
    A B
    1 2
The xtabs data looks like this:
    > xtabs(data=data.frame(P=c(F,T,F,T,F),A=c(F,F,T,T,T)))
           A
    P       FALSE TRUE
      FALSE     1    2
      TRUE      1    1
Now do() wants its data in data frames, like this:
    > xtabs(data=data.frame(P=c(F,T,F,T,F),A=c(F,F,T,T,T))) %>% as.data.frame
          P     A Freq
    1 FALSE FALSE    1
    2  TRUE FALSE    1
    3 FALSE  TRUE    2
    4  TRUE

Tidy data.frame with repeated column names

I have a program that gives me data in this format:
    toy
                    file_path Condition Trial.Num A B  C  ID A B  C   ID  A B  C    ID
    1     root/some.extension  Baseline         1 2 3  5 car 2 1  7 bike  4 9  0 plane
    2    root/thing.extension  Baseline         2 3 6 45 car 5 4  4 bike  9 5  4 plane
    3     root/else.extension  Baseline         3 4 4  6 car 7 5  4 bike 68 7 56 plane
    4 root/uniquely.extension Treatment         1 5 3  7 car 1 7 37 bike  9 8  7 plane
    5  root/defined.extension Treatment         2 6 7  3 car 4 6  8 bike  9 0  8 plane
My goal is to tidy the format into something that at least can be easier to finally tidy with reshape, having unique column names:
    tidy_toy file