reshape2

“unpacking” a factor list from a data.frame

寵の児 posted on 2019-11-28 00:11:34
I'm new to R / having the option to easily re-organize data, and have hunted around for a solution but can't find exactly what I'd like to do. Reshape2's melt/cast doesn't quite seem to work and I haven't mastered plyr well enough to factor it in here. Basically I have a data.frame with a structure outlined below, with a category column in which each element is a variable-length list of categories (more compact because the # columns is much larger, and I actually have multiple category_lists that I'd like to keep separate): >mydf ID category_list xval yval 1 ID1 cat1, cat2, cat3 xnum1 ynum1 2

R reshape a vector into multiple columns

喜你入骨 posted on 2019-11-27 22:50:56
Let's say I have a vector in R as follows:

d <- seq(1, 100)

I want to reshape this vector into a 10x10 matrix, so that I will have this data instead:

[,1] [,2] [,3] .. [,10]
   1    2    3 ..    10
  11   12   13 ..    20
  21   22   23 ..    30
  ..
  91   92   93 ..   100

I tried to use the reshape function, but it didn't work. Can someone help please?

You can do

dim(d) <- c(10, 10)
d <- t(d)

or

d <- matrix(d, nrow = 10, byrow = TRUE)

If you want to convert a predefined list to a matrix (e.g. a 5*4 matrix), do

yourMatrix <- matrix(unlist(yourList), nrow = 5, ncol = 4)

It is worth noting that the matrix is created by columns, which means
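The two approaches in the answer above can be checked against each other. This is a minimal base R sketch; the names m_byrow and m_dim are illustrative and do not come from the original post.

```r
# Vector 1..100 from the question
d <- seq(1, 100)

# Approach 1: fill the matrix row by row
m_byrow <- matrix(d, nrow = 10, byrow = TRUE)

# Approach 2: set dim (which fills column by column), then transpose
m_dim <- d
dim(m_dim) <- c(10, 10)
m_dim <- t(m_dim)

# Both should give the same 10x10 layout, with 1..10 in the first row
all(m_byrow == m_dim)
m_byrow[1, ]
```

Because R stores matrices in column-major order, setting dim alone would put 1..10 in the first column, which is why the transpose is needed in the second approach.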

Adding Percentages to a Grouped Barchart Columns in GGplot2

喜你入骨 posted on 2019-11-27 22:50:15
Question: Hoping someone can help me with labelling columns of a grouped barchart with percentages. I couldn't find an existing post that I could make work successfully. Below is the code for a basic example dataframe. Service<-c("AS","AS","PS","PS","RS","RS","ES","ES") Year<-c("2015","2016","2015","2016","2015","2016","2015","2016") Q1<-c("Dissatisfied","Satisfied","Satisfied","Satisfied","Dissatisfied","Dissatisfied","Satisfied","Satisfied") Q2<-c("Dissatisfied","Dissatisfied","Satisfied",

Reshape multiple categorical variables to binary response variables

生来就可爱ヽ(ⅴ<●) posted on 2019-11-27 22:21:10
I am trying to convert the following format: mydata <- data.frame(movie = c("Titanic", "Departed"), actor1 = c("Leo", "Jack"), actor2 = c("Kate", "Leo")) movie actor1 actor2 1 Titanic Leo Kate 2 Departed Jack Leo to binary response variables: movie Leo Kate Jack 1 Titanic 1 1 0 2 Departed 1 0 1 I tried the solution described in Convert row data to binary columns but I could only get it to work for two variables, not three. I would really appreciate if there is a clean way to do this. How much spice is too much? Here is a solution via tidyr : library(dplyr) library(tidyr) mydata %>% gather(actor
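The tidyr answer above is cut off, so the following is not a reconstruction of it. It is a separate base R sketch of the same idea, assuming the two actor columns are stacked into one column and then cross-tabulated with table(); the names long and indicator are illustrative.

```r
# Data from the question (strings kept as character, not factor)
mydata <- data.frame(movie  = c("Titanic", "Departed"),
                     actor1 = c("Leo", "Jack"),
                     actor2 = c("Kate", "Leo"),
                     stringsAsFactors = FALSE)

# Stack the actor columns into one long column, repeating the movie name
long <- data.frame(movie = rep(mydata$movie, times = 2),
                   actor = c(mydata$actor1, mydata$actor2),
                   stringsAsFactors = FALSE)

# Cross-tabulate movie by actor and reduce counts to 0/1 indicators
indicator <- (table(long$movie, long$actor) > 0) * 1
indicator
```

Note that table() orders rows and columns alphabetically, so the layout differs from the order shown in the question even though the 0/1 values agree.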

Transposing data frames

邮差的信 posted on 2019-11-27 21:36:45
Happy Weekends. I've been trying to replicate the results from this blog post in R. I am looking for a method of transposing the data without using t , preferably using tidyr or reshape . In the example below, metadata is obtained by transposing data . metadata <- data.frame(colnames(data), t(data[1:4, ]) ) colnames(metadata) <- t(metadata[1,]) metadata <- metadata[-1,] metadata$Multiplier <- as.numeric(metadata$Multiplier) Though it achieves what I want, I find it a little clumsy. Is there a more efficient workflow to transpose the data frame? dput of data data <- structure(list(Series.Description

reshape vs. reshape2 in R

狂风中的少年 posted on 2019-11-27 19:14:53
I am attempting to understand why development shifted from the reshape package to reshape2. They seem to be functionally the same; however, I am currently unable to upgrade to reshape2 because the server runs an older version of R. I am concerned that a major bug may have shifted development to a whole new package instead of simply continuing development of reshape . Does anyone know if there is a major flaw in the reshape package? reshape2 let Hadley make a rebooted reshape that was way, way faster, while avoiding busting up people's dependencies and habits.

How to use “cast” in reshape without aggregation

假装没事ソ posted on 2019-11-27 18:33:24
Question: In many uses of cast I've seen, an aggregation function such as mean is used. What if you simply want to reshape without information loss? For example, if I want to take this long format:

ID   condition Value
John a         2
John a         3
John b         4
John b         5
Alex a         6
Alex a         2
Alex b         1
Alex b         4

to this wide format without any aggregation:

ID   a b
John 2 4
John 3 5
Alex 6 1
Alex 2 4

I suppose that this assumes the observations are paired, and a missing value would mess this up, but any insight is
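One common way to reshape without aggregating is to add an explicit within-group observation index so that every row is uniquely identified. The sketch below uses base R's reshape() rather than cast, takes the John/Alex values from the wide table in the question, and introduces a helper column obs that is not in the original post.

```r
# Long-format data matching the wide table in the question
long <- data.frame(ID        = rep(c("John", "Alex"), each = 4),
                   condition = rep(c("a", "a", "b", "b"), times = 2),
                   Value     = c(2, 3, 4, 5, 6, 2, 1, 4),
                   stringsAsFactors = FALSE)

# Number the observations within each ID/condition pair (1, 2, ...)
long$obs <- ave(long$Value, long$ID, long$condition, FUN = seq_along)

# ID + obs now identify a row uniquely, so nothing needs aggregating
wide <- reshape(long,
                idvar     = c("ID", "obs"),
                timevar   = "condition",
                direction = "wide")
wide
```

The result has columns Value.a and Value.b; pairing is by position within each group, so this relies on the same assumption the question mentions, namely that observations are paired and none are missing.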

Comparing gather (tidyr) to melt (reshape2)

倾然丶 夕夏残阳落幕 posted on 2019-11-27 17:22:58
I love the reshape2 package because it made life so doggone easy. Typically Hadley has made improvements in his previous packages that enable streamlined, faster running code. I figured I'd give tidyr a whirl and from what I read I thought gather was very similar to melt from reshape2 . But after reading the documentation I can't get gather to do the same task that melt does. Data View Here's a view of the data (actual data in dput form at end of post): teacher yr1.baseline pd yr1.lesson1 yr1.lesson2 yr2.lesson1 yr2.lesson2 yr2.lesson3 1 3 1/13/09 2/5/09 3/6/09 4/27/09 10/7/09 11/18/09 3/4/10

Reshape Data Long to Wide - understanding reshape parameters

一曲冷凌霜 posted on 2019-11-27 15:48:09
I have a long format dataframe dogs that I'm trying to reformat to wide using the reshape() function. It currently looks like so:

dogid month year trainingtype home school timeincomp
12345     1 2014            1    1      1        340
12345     2 2014            1    1      1        360
31323    12 2015            2    7      3        440
31323     1 2014            1    7      3        500
31323     2 2014            1    7      3        520

The dogid column is a bunch of ids, one for each dog. The month column varies from 1 to 12 for the 12 months, and year from 2014 to 2015. Trainingtype varies from 1 to 2. Each dog has a timeincomp value for every month-year-trainingtype combination, so 48 entries per dog. Home and school vary from 1-8

Unlisting columns by groups

十年热恋 posted on 2019-11-27 14:51:33
I have a dataframe in the following format:

id | name               | logs
---+--------------------+-----------------------------------------
84 | "zibaroo"          | "C47931038"
12 | "fabien kelyarsky" | c("C47331040", "B19412225", "B18511449")
96 | "mitra lutsko"     | c("F19712226", "A18311450")
34 | "PaulSandoz"       | "A47431044"
65 | "BeamVision"       | "D47531045"

As you see, the column "logs" includes vectors of strings in each cell. Is there an efficient way to convert the data frame to the long format (one observation per row) without the intermediary step of separating "logs" into several columns? This is important
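A base R sketch of one way to do this, assuming logs is stored as a list column: repeat each row's id and name once per log entry, then flatten the list with unlist(). Only the first three rows of the example are used, and the names n and long are illustrative.

```r
# First three rows of the example, with logs as a list column (assumed)
df <- data.frame(id   = c(84, 12, 96),
                 name = c("zibaroo", "fabien kelyarsky", "mitra lutsko"),
                 stringsAsFactors = FALSE)
df$logs <- list("C47931038",
                c("C47331040", "B19412225", "B18511449"),
                c("F19712226", "A18311450"))

# Number of log entries in each row
n <- lengths(df$logs)

# Repeat id/name once per entry and flatten the list column
long <- data.frame(id   = rep(df$id, times = n),
                   name = rep(df$name, times = n),
                   logs = unlist(df$logs),
                   stringsAsFactors = FALSE)
long
```

This avoids splitting logs into several intermediate columns; packages such as tidyr offer similar functionality, but the sketch sticks to base R so it has no dependencies.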