reshape2

In tidyr, what criteria does the function `gather` use to map a dataframe from wide to long?

↘锁芯ラ posted on 2019-12-03 15:51:09
I'm trying to figure out the arguments for gather in the tidyr package. I looked at the documentation, and the syntax looks like:

    gather(data, key, value, ..., na.rm = FALSE, convert = FALSE)

There is an example in the help files:

    stocks <- data.frame(
      time = as.Date('2009-01-01') + 0:9,
      X = rnorm(10, 0, 1),
      Y = rnorm(10, 0, 2),
      Z = rnorm(10, 0, 4)
    )
    gather(stocks, stock, price, -time)

I'm curious about the last line:

    gather(stocks, stock, price, -time)

Here, stocks is clearly the data we want to modify, which is fine. So I can read that stock and price are arguments to a key value pair -- but

Select a value based on the highest value in another column

最后都变了- posted on 2019-12-03 12:55:42
I don't understand why I can't find a solution for this, since I feel that this is a pretty basic question, so I need to ask for help. I want to rearrange the airquality dataset by month, with the maximum temp value for each month. In addition, I want to find the corresponding day for each monthly maximum temperature. What is the laziest (code-wise) way to do this? I have tried the following without success:

    require(reshape2)
    names(airquality) <- tolower(names(airquality))
    mm <- melt(airquality, id.vars = c("month", "day"), meas = c("temp"))
    dcast(mm, month + day ~ variable, max)
    aggregate(formula = temp

Fill area between two lines, with high/low and dates

放肆的年华 posted on 2019-12-03 03:11:24
Foreword: I provide a reasonably satisfactory answer to my own question. I understand this is acceptable practice. Naturally my hope is to invite suggestions and improvements. My purpose is to plot two time series (stored in a dataframe with dates stored as class 'Date') and to fill the area between the data points with two different colors according to whether one is above the other. For instance, to plot an index of bonds and an index of stocks, and to fill the area in red when the stock index is above the bond index, and to fill the area in blue otherwise. I have used ggplot2 for this

R reshape2 dcast: transform data

允我心安 posted on 2019-12-02 19:13:09
Question: How can I transform data X to Y, as in

    X = data.frame(
      ID = c(1,1,1,2,2),
      NAME = c("MIKE","MIKE","MIKE","LUCY","LUCY"),
      SEX = c("MALE","MALE","MALE","FEMALE","FEMALE"),
      TEST = c(1,2,3,1,2),
      SCORE = c(70,80,90,65,75)
    )
    Y = data.frame(
      ID = c(1,2),
      NAME = c("MIKE","LUCY"),
      SEX = c("MALE","FEMALE"),
      TEST_1 = c(70,65),
      TEST_2 = c(80,75),
      TEST_3 = c(90,NA)
    )

The dcast function in reshape2 seems to work, but it cannot include other columns in the data like ID, NAME and SEX in the example above.
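
One possible approach to the question above, offered as a minimal sketch rather than a definitive answer: dcast accepts several identifier columns on the left-hand side of its formula, so ID, NAME and SEX can all be carried through. The sketch assumes that NAME and SEX are constant within each ID; the helper variable test_cols is introduced here only for illustration, and the renaming step is just one way to get the TEST_ prefix.

```r
library(reshape2)

# Input data copied from the question
X <- data.frame(
  ID = c(1, 1, 1, 2, 2),
  NAME = c("MIKE", "MIKE", "MIKE", "LUCY", "LUCY"),
  SEX = c("MALE", "MALE", "MALE", "FEMALE", "FEMALE"),
  TEST = c(1, 2, 3, 1, 2),
  SCORE = c(70, 80, 90, 65, 75)
)

# Keep ID, NAME and SEX as identifier columns; spread TEST into columns
# filled with SCORE. Combinations with no score are left as NA.
Y <- dcast(X, ID + NAME + SEX ~ TEST, value.var = "SCORE")

# The new columns are named after the TEST values ("1", "2", "3");
# test_cols (illustrative name) collects them so they can be prefixed.
test_cols <- setdiff(names(Y), c("ID", "NAME", "SEX"))
names(Y)[match(test_cols, names(Y))] <- paste0("TEST_", test_cols)

print(Y)
```

If the extra columns were not consistent within an ID, this formula would produce one output row per distinct ID/NAME/SEX combination instead of one row per ID.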

Reshape long to wide with multiple groupings

无人久伴 posted on 2019-12-02 09:42:33
Question: My data looks like this:

       Smoker PtNo Day Hour FEV1 timename
    1       0    1   1    0 3.26     d1h0
    2       0    1   1    2 3.05     d1h2
    3       0    1   1    4 3.02     d1h4
    4       0    1   1    6 3.27     d1h6
    5       0    1   2    0 3.28     d2h0
    6       0    1   2    2 3.07     d2h2
    7       0    1   2    4 3.35     d2h4
    8       0    1   2    6 3.07     d2h6
    9       0    1   3    0 3.28     d3h0
    10      0    1   3    2 3.44     d3h2

I want to reshape it into wide format like this:

    PtNo Smoker FEV1.d1h0 FEV1.d1h2 FEV1d1.h3 etc.

where PtNo and Smoker are independent variables not varying by time, and FEV1 is the measured time-varying variable. I get various error

Converting columns into rows in r [duplicate]

主宰稳场 posted on 2019-12-02 09:38:13
Question: This question already has answers here: Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers). Closed last year.

I have the below data, formed using this code:

    test <- data.frame(
      dis = c(10,20,30,40),
      dur = c(30,40,60,90),
      method = c("car","car","Bicycle","Bicycle"),
      to_lon = c(-1.980,-1.5678,-1.324,-1.456),
      to_lat = c(55.3009,55.3416,55.1123,55.2234),
      from_lon = c(-1.4565,-1.3424,-1.4566,-1.1111),
      from_lat = c(76.8888,65.8999,76.9088,25.3344)
    )

    dis dur method to

R reshape2 dcast: transform data

放肆的年华 posted on 2019-12-02 08:30:46
How can I transform data X to Y, as in

    X = data.frame(
      ID = c(1,1,1,2,2),
      NAME = c("MIKE","MIKE","MIKE","LUCY","LUCY"),
      SEX = c("MALE","MALE","MALE","FEMALE","FEMALE"),
      TEST = c(1,2,3,1,2),
      SCORE = c(70,80,90,65,75)
    )
    Y = data.frame(
      ID = c(1,2),
      NAME = c("MIKE","LUCY"),
      SEX = c("MALE","FEMALE"),
      TEST_1 = c(70,65),
      TEST_2 = c(80,75),
      TEST_3 = c(90,NA)
    )

The dcast function in reshape2 seems to work, but it cannot include other columns in the data like ID, NAME and SEX in the example above. Assuming all other columns are consistent for a given ID, e.g. Mike can only be a male with ID 1, how can we

R: Converting wide format to long format with multiple 3 time period variables [duplicate]

心不动则不痛 posted on 2019-12-02 07:24:19
Question: This question already has answers here: Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers). Closed last year.

Apologies if this is a simple question, but I haven't been able to find a simple solution after searching. I'm fairly new to R, and am having trouble converting wide format to long format using either the melt (reshape2) or gather (tidyr) functions. The dataset that I'm working with contains 22 different time variables that are

melt + strsplit, or opposite to aggregate

烈酒焚心 posted on 2019-12-02 07:16:41
I have a little question that seems so easy in concept, but I cannot find a way to do it. Say I have a data.frame df2 with a column listing car brands and another column with all the models per brand separated by ','. I obtained df2 by aggregating another data.frame named df1, whose primary key is the model. How should I proceed to do the opposite task (i.e. from df2 to df1)? My guess is something like melt(df2, id=unlist(strsplit('models',','))) ... Many thanks! Here is a MWE:

    df1 <- data.frame(model=c('a1','a2','a3','b1','b2','c1','d1','d2','d3','d4'), brand=c('a','a','a',

Reshape long to wide with multiple groupings

别等时光非礼了梦想. posted on 2019-12-02 07:04:13
My data looks like this:

       Smoker PtNo Day Hour FEV1 timename
    1       0    1   1    0 3.26     d1h0
    2       0    1   1    2 3.05     d1h2
    3       0    1   1    4 3.02     d1h4
    4       0    1   1    6 3.27     d1h6
    5       0    1   2    0 3.28     d2h0
    6       0    1   2    2 3.07     d2h2
    7       0    1   2    4 3.35     d2h4
    8       0    1   2    6 3.07     d2h6
    9       0    1   3    0 3.28     d3h0
    10      0    1   3    2 3.44     d3h2

I want to reshape it into wide format like this:

    PtNo Smoker FEV1.d1h0 FEV1.d1h2 FEV1d1.h3 etc.

where PtNo and Smoker are independent variables not varying by time, and FEV1 is the measured time-varying variable. I get various error messages using reshape and the melt / dcast functions in the reshape2 package. Any suggestions? (Please