tidyr

Gather multiple columns with tidyr [duplicate]

不羁的心 submitted on 2019-12-01 10:11:45
This question already has answers here: Gather multiple sets of columns [duplicate] (5 answers); Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers).

I have shopping cart data that looks like the sample data frame below:

    sample_df <- data.frame(
      clientid = 1:10,
      ProductA = c("chair", "table", "plate", "plate", "table", "chair", "table", "plate", "chair", "chair"),
      QuantityA = c(1, 2, 1, 1, 1, 1, 2, 3, 1, 2),
      ProductB = c("table", "doll", "shoes", "", "door", "", "computer", "computer", "", "plate"),
      QuantityB = c(3, 1, 2, "", 2, "", 1, 1, "", 1)
    )

    # sample data frame
    clientid ProductA
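The excerpt is cut off before the desired output, so the following is only a sketch under an assumption: that the goal is one row per client and product slot, with Product and Quantity as columns. It uses `pivot_longer()` from tidyr 1.0.0 or later. The `cart` data frame and the `slot` column name are hypothetical. `cart` is a trimmed version of `sample_df` with missing items stored as `NA`, because the original `QuantityB` mixes numbers with `""`, which makes it a character column that would need converting before it can be stacked with the numeric `QuantityA`.

```r
library(tidyr)

# Hypothetical, trimmed version of sample_df; missing items are NA, not "".
cart <- data.frame(
  clientid  = 1:3,
  ProductA  = c("chair", "table", "plate"),
  QuantityA = c(1, 2, 1),
  ProductB  = c("table", "doll", NA),
  QuantityB = c(3, 1, NA),
  stringsAsFactors = FALSE
)

# ".value" keeps the first captured group ("Product" / "Quantity") as output
# columns; the second group (the trailing letter) goes into "slot".
cart_long <- pivot_longer(
  cart,
  cols = -clientid,
  names_to = c(".value", "slot"),
  names_pattern = "^(Product|Quantity)(.)$"
)

print(cart_long)
```

Each client contributes one row per slot (A and B), so the three clients here produce six rows.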

Gather multiple columns with tidyr [duplicate]

两盒软妹~` submitted on 2019-12-01 07:54:12
Question: This question already has answers here: Gather multiple sets of columns [duplicate] (5 answers); Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers). Closed 3 years ago.

I have shopping cart data that looks like the sample data frame below:

    sample_df <- data.frame(
      clientid = 1:10,
      ProductA = c("chair", "table", "plate", "plate", "table", "chair", "table", "plate", "chair", "chair"),
      QuantityA = c(1, 2, 1, 1, 1, 1, 2, 3, 1, 2),
      ProductB = c("table", "doll", "shoes",

Fit a different model for each row of a list-columns data frame

核能气质少年 submitted on 2019-12-01 05:48:15
What is the best way to fit model formulae that vary by row of a data frame with the list-columns data structure in the tidyverse? In R for Data Science, Hadley presents a terrific example of how to use the list-columns data structure to fit many models easily (http://r4ds.had.co.nz/many-models.html#gapminder). I am trying to find a way to fit many models with slightly different formulae. In the example below, adapted from his original, what is the best way to fit a different model for each continent?

    library(gapminder)
    library(dplyr)
    library(tidyr)
    library(purrr)
    library

Use put two value columns in spread() function in R [duplicate]

橙三吉。 submitted on 2019-12-01 04:41:20
This question already has answers here: Transpose / reshape dataframe without “timevar” from long to wide format (6 answers); Convert data from long format to wide format with multiple measure columns (5 answers).

I recently posted a question asking how to reshape data from a long table to a wide table, and found that spread() is a handy function for this. Now I need to take my previous post a step further. Let's suppose we have a table like this:

    id1 | id2 | info  | action_time | action_comment |
    1   | a   | info1 | time1       | comment1       |
    1   | a   | info1 | time2       | comment2       |
    1   | a   |

Separating column using separate (tidyr) via dplyr on a first encountered digit

江枫思渺然 submitted on 2019-12-01 03:11:48
I'm trying to separate a rather messy column into two columns containing period and description. My data resembles the extract below:

    set.seed(1)
    dta <- data.frame(
      indicator = c("someindicator2001", "someindicator2011",
                    "some text 20022008", "another indicator 2003"),
      values = runif(n = 4)
    )

Desired results

The desired results should look like this:

              indicator   period    values
    1     someindicator     2001 0.2655087
    2     someindicator     2011 0.3721239
    3         some text 20022008 0.5728534
    4 another indicator     2003 0.9082078

Characteristics

Indicator descriptions are in one column
Numeric values (counting from first digit
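One possible way to get the desired table above is to capture the two parts with a regular expression rather than split on a separator. This is a sketch, not the accepted answer. It assumes every `indicator` value ends in a run of digits and contains no digits earlier in the string. Rows that do not match the pattern would come back as `NA`. The `result` name is illustrative.

```r
library(tidyr)

set.seed(1)
dta <- data.frame(
  indicator = c("someindicator2001", "someindicator2011",
                "some text 20022008", "another indicator 2003"),
  values = runif(n = 4),
  stringsAsFactors = FALSE
)

# Group 1: the shortest run of non-digits (the description).
# Any whitespace before the digits is matched but not captured.
# Group 2: the trailing run of digits (the period).
result <- tidyr::extract(
  dta,
  col = indicator,
  into = c("indicator", "period"),
  regex = "^(\\D*?)\\s*(\\d+)$"
)

print(result)
```

`period` comes back as a character column. Passing `convert = TRUE` to `extract()` would turn it into a number if that is preferred.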

Reshaping data in R with “login” “logout” times

时光怂恿深爱的人放手 submitted on 2019-12-01 02:36:58
I'm new to R, and am working on a side project for my own purposes. I have this data (a reproducible dput of this is at the end of the question):

         X            datetime  user  state
    1    1 2016-02-19 19:13:26 User1 joined
    2    2 2016-02-19 19:21:18 User2 joined
    3    3 2016-02-19 19:21:33 User1 joined
    4    4 2016-02-19 19:35:38 User1 joined
    5    5 2016-02-19 19:44:15 User1 joined
    6    6 2016-02-19 19:48:55 User1 joined
    7    7 2016-02-19 19:52:40 User1 joined
    8    8 2016-02-19 19:53:15 User3 joined
    9    9 2016-02-19 20:02:34 User3 joined
    10  10 2016-02-19 20:13:48 User3 joined
    19 637 2016-02-19 19:13:32 User1   left
    20 638 2016-02-19 19:25

Use put two value columns in spread() function in R [duplicate]

与世无争的帅哥 submitted on 2019-12-01 02:00:38
Question: This question already has answers here: Transpose / reshape dataframe without “timevar” from long to wide format (6 answers); Convert data from long format to wide format with multiple measure columns (5 answers). Closed last year.

I recently posted a question asking how to reshape data from a long table to a wide table, and found that spread() is a handy function for this. Now I need to take my previous post a step further. Let's suppose we have a table like this:

    id1

Sparklyr: how to explode a list column into their own columns in Spark table?

心不动则不痛 submitted on 2019-12-01 00:22:55
My question is similar to the one here, but I'm having problems implementing the answer, and I cannot comment in that thread. I have a big CSV file that contains nested data: two columns separated by whitespace (say the first column is Y and the second is X). Column X is itself a comma-separated list of values.

    21.66 2.643227,1.2698358,2.6338573,1.8812188,3.8708665,...
    35.15 3.422151,-0.59515584,2.4994135,-0.19701914,4.0771823,...
    15.22 2.8302398,1.9080592,-0.68780196,3.1878228,4.6600842,...
    ...

I want to read this CSV into 2 different Spark tables using sparklyr. So far this

counting values after and before change in value, within groups, generating new variables for each unique shift

◇◆丶佛笑我妖孽 submitted on 2019-11-30 23:54:17
I am looking for a way to count, within id groups, unique occurrences of value shifts in TF in the data tbl. I want to count both forward and backward from the point where TF changes between 1 and 0 or 0 and 1. The counts are to be stored in new variables PM##, so that the PM## variables hold each unique shift in TF, in both plus and minus. The MWE below leads to an outcome with 7 PM variables, but my production data can have 15 or more shifts. If a TF value does not change between NAs, I want to mark it 0. This question is similar to a question I previously asked, but the last part about TF standing

tidyr use separate_rows over multiple columns

别说谁变了你拦得住时间么 submitted on 2019-11-30 23:52:56
Question: I have a data.frame where some cells contain strings of comma-separated values:

    d <- data.frame(
      a = c(1:3),
      b = c("name1, name2, name3", "name4", "name5, name6"),
      c = c("name7", "name8, name9", "name10")
    )

I want to separate those strings so that each name is split into its own cell. This is easy with tidyr::separate_rows(d, b, sep=",") if it is done one column at a time. But I can't do this for both columns "b" and "c" at the same time, since it requires that the number of names in each string is the
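A possible workaround, sketched here under an assumption, is to call `separate_rows()` once per column instead of passing both columns in one call. The assumption is that every name in `b` should be paired with every name in `c` from the same original row. The excerpt is cut off before the desired output is stated, so that may not be what the asker wanted. The `d_long` name is illustrative.

```r
library(tidyr)

d <- data.frame(
  a = 1:3,
  b = c("name1, name2, name3", "name4", "name5, name6"),
  c = c("name7", "name8, name9", "name10"),
  stringsAsFactors = FALSE
)

# Split b first, then split c on the result. The pattern ",\\s*" also
# consumes the space after each comma, so no trimming is needed afterwards.
d_long <- separate_rows(
  separate_rows(d, b, sep = ",\\s*"),
  c,
  sep = ",\\s*"
)

print(d_long)
```

Because each column is split independently, the string lengths in `b` and `c` no longer have to match. The cost is that a row with several names in both columns expands into every combination of them.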