tidyr | 易学教程

Gather multiple columns with tidyr [duplicate]

This question already has answers here: Gather multiple sets of columns [duplicate] (5 answers); Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers).

I have shopping cart data that looks like the sample data frame below:

    sample_df <- data.frame(
      clientid = 1:10,
      ProductA = c("chair","table","plate","plate","table","chair","table","plate","chair","chair"),
      QuantityA = c(1,2,1,1,1,1,2,3,1,2),
      ProductB = c("table","doll","shoes","","door","","computer","computer","","plate"),
      QuantityB = c(3,1,2,"",2,"",1,1,"",1)
    )

    # sample data frame
    clientid ProductA
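A minimal sketch of one way to reshape this data, assuming tidyr 1.0.0 or later, where pivot_longer() supersedes gather(). The names `set` and `long_df` are illustrative and do not come from the question. Because QuantityB mixes numbers with empty strings, R stores it as character, so the sketch converts it to numeric first (empty strings become NA).

```r
library(tidyr)

# Sample data from the question; stringsAsFactors = FALSE keeps text columns as character
sample_df <- data.frame(
  clientid = 1:10,
  ProductA = c("chair", "table", "plate", "plate", "table",
               "chair", "table", "plate", "chair", "chair"),
  QuantityA = c(1, 2, 1, 1, 1, 1, 2, 3, 1, 2),
  ProductB = c("table", "doll", "shoes", "", "door",
               "", "computer", "computer", "", "plate"),
  QuantityB = c(3, 1, 2, "", 2, "", 1, 1, "", 1),
  stringsAsFactors = FALSE
)

# QuantityB is character because of the "" entries; convert so it can be
# stacked with the numeric QuantityA column ("" becomes NA)
sample_df$QuantityB <- as.numeric(sample_df$QuantityB)

# ".value" sends the "Product"/"Quantity" part of each name to output columns;
# the trailing letter (A or B) goes into the illustrative column `set`
long_df <- pivot_longer(
  sample_df,
  cols = -clientid,
  names_to = c(".value", "set"),
  names_pattern = "(Product|Quantity)(.)"
)
```

Each client then has one row per product set, with columns clientid, set, Product, and Quantity.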

Fit a different model for each row of a list-columns data frame

What is the best way to fit model formulae that vary by row of a data frame with the list-columns data structure in the tidyverse? In R for Data Science, Hadley presents a terrific example of how to use the list-columns data structure to fit many models easily (http://r4ds.had.co.nz/many-models.html#gapminder). I am trying to find a way to fit many models with slightly different formulae. In the example below, adapted from his original, what is the best way to fit a different model for each continent? library(gapminder) library(dplyr) library(tidyr) library(purrr) library

Put two value columns in spread() function in R [duplicate]

This question already has answers here: Transpose / reshape dataframe without “timevar” from long to wide format (6 answers); Convert data from long format to wide format with multiple measure columns (5 answers).

I recently posted a question asking how to reshape data from a long table to a wide table. I then found that spread() is quite a handy function for doing this. Now I need to develop my previous post further. Let's suppose we have a table like this:

    id1 | id2 | info  | action_time | action_comment |
    1   | a   | info1 | time1       | comment1       |
    1   | a   | info1 | time2       | comment2       |
    1   | a   |

Separating column using separate (tidyr) via dplyr on a first encountered digit

I'm trying to separate a rather messy column into two columns containing period and description. My data resembles the extract below:

    set.seed(1)
    dta <- data.frame(indicator = c("someindicator2001", "someindicator2011", "some text 20022008", "another indicator 2003"), values = runif(n = 4))

Desired results

The desired results should look like this:

               indicator   period    values
    1      someindicator     2001 0.2655087
    2      someindicator     2011 0.3721239
    3          some text 20022008 0.5728534
    4  another indicator     2003 0.9082078

Characteristics

Indicator descriptions are in one column
Numeric values (counting from first digit
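A minimal sketch of one possible approach, swapping in tidyr::extract() with capture groups rather than separate(), since the split point here is a pattern boundary and not a delimiter. The regular expression and the object name `res` are assumptions for illustration. It presumes every value ends in a run of digits, which holds for the four sample rows but may not hold for the full data.

```r
library(tidyr)

set.seed(1)
dta <- data.frame(
  indicator = c("someindicator2001", "someindicator2011",
                "some text 20022008", "another indicator 2003"),
  values = runif(n = 4),
  stringsAsFactors = FALSE
)

# Group 1: the shortest leading run of non-digits (the description)
# \\s*   : optional whitespace between description and period, discarded
# Group 2: the trailing run of digits (the period)
res <- tidyr::extract(
  dta,
  col = indicator,
  into = c("indicator", "period"),
  regex = "^(\\D*?)\\s*(\\d+)$"
)
```

The period column is returned as character; passing convert = TRUE to extract() would turn it into a number if that is preferred.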

Reshaping data in R with “login” “logout” times

I'm new to R and am working on a side project for my own purposes. I have this data (a reproducible dput of it is at the end of the question):

         X            datetime  user  state
    1    1 2016-02-19 19:13:26 User1 joined
    2    2 2016-02-19 19:21:18 User2 joined
    3    3 2016-02-19 19:21:33 User1 joined
    4    4 2016-02-19 19:35:38 User1 joined
    5    5 2016-02-19 19:44:15 User1 joined
    6    6 2016-02-19 19:48:55 User1 joined
    7    7 2016-02-19 19:52:40 User1 joined
    8    8 2016-02-19 19:53:15 User3 joined
    9    9 2016-02-19 20:02:34 User3 joined
    10  10 2016-02-19 20:13:48 User3 joined
    19 637 2016-02-19 19:13:32 User1   left
    20 638 2016-02-19 19:25

Sparklyr: how to explode a list column into their own columns in Spark table?

My question is similar to the one here, but I'm having problems implementing the answer, and I cannot comment in that thread. I have a big CSV file that contains nested data: 2 columns separated by whitespace (say the first column is Y and the second column is X). Column X is itself a comma-separated list of values.

    21.66 2.643227,1.2698358,2.6338573,1.8812188,3.8708665,...
    35.15 3.422151,-0.59515584,2.4994135,-0.19701914,4.0771823,...
    15.22 2.8302398,1.9080592,-0.68780196,3.1878228,4.6600842,...
    ...

I want to read this CSV into 2 different Spark tables using sparklyr. So far this

counting values after and before change in value, within groups, generating new variables for each unique shift

I am looking for a way to count, within id groups, unique occurrences of value shifts in TF in the data tbl. I want to count both forwards and backwards from the point where TF changes between 1 and 0 or 0 and 1. The counts are to be stored in new variables PM##, so that the PM## variables hold each unique shift in TF, in both plus and minus. The MWE below leads to an outcome with 7 PM variables, but my production data can have 15 or more shifts. If a TF value does not change between NAs, I want to mark it 0. This question is similar to a question I previously asked, but the last part about TF standing

tidyr use separate_rows over multiple columns

Question: I have a data.frame where some cells contain strings of comma-separated values:

    d <- data.frame(a=c(1:3), b=c("name1, name2, name3", "name4", "name5, name6"), c=c("name7","name8, name9", "name10" ))

I want to separate those strings so that each name is split into its own cell. This is easy with tidyr::separate_rows(d, b, sep=",") if it is done for one column at a time. But I can't do this for both columns "b" and "c" at the same time, since it requires that the number of names in each string is the
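A minimal sketch of one possible reading of the goal, assuming that every combination of the names in "b" and "c" within a row should get its own row. The excerpt is cut off before the desired output is shown, so this is a guess at the intent and not the confirmed answer. It calls separate_rows() once per column, so the two columns do not need the same number of names; the object name `long_d` and the separator pattern ",\\s*" (a comma plus any following whitespace) are illustrative choices.

```r
library(tidyr)

d <- data.frame(
  a = 1:3,
  b = c("name1, name2, name3", "name4", "name5, name6"),
  c = c("name7", "name8, name9", "name10"),
  stringsAsFactors = FALSE
)

# Split column b first, then column c; splitting one column at a time
# avoids the requirement that both columns hold the same number of names
long_d <- separate_rows(d, b, sep = ",\\s*")
long_d <- separate_rows(long_d, c, sep = ",\\s*")
```

With the sample data this produces 7 rows: three for a = 1, two for a = 2, and two for a = 3.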