reshape2

reshape alternating columns in less time and using less memory

五迷三道 submitted on 2019-12-02 05:53:12
Question: How can I do this reshape faster and with less memory? My aim is to reshape a data frame of 500,000 rows by 500 columns with 4 GB of RAM. Here's a function that will make some reproducible data:

    make_example <- function(ndoc, ntop){
      # doc numbers
      V1 = seq(1:ndoc)
      # filenames
      V2 <- list("vector", size = ndoc)
      for (i in 1:ndoc){
        V2[i] <- paste(sample(c(rep(0:9,each=5),LETTERS,letters),5,replace=TRUE),collapse='')
      }
      # topic proportions
      tvals <- data.frame(matrix(runif(1:(ndoc

Reshape messy longitudinal survey data containing multiple different variables, wide to long

血红的双手。 submitted on 2019-12-02 05:05:28
I hope that I'm not reinventing the wheel, and I do not think that the following can be answered using reshape. I have messy longitudinal survey data that I want to convert from wide to long format. By messy I mean two things: I have a mixture of variable types (numeric, factor, logical), and not all variables have been collected at every timepoint. For example:

    data <- read.table(header=TRUE, text='
     id inlove.1 inlove.2 income.2 income.3 mood.1 mood.3 random
      1     TRUE    FALSE 87717.76 82281.25  happy  happy filler
      2     TRUE     TRUE 70795.53 54995.19  so-so  happy filler
      3    FALSE    FALSE 48012.77 47650.47    sad  so-so filler
    ')

I

How can I sum values of columns with dcast()?

一世执手 submitted on 2019-12-02 04:38:11
Question: I'm stuck with the dcast function. I'm trying to create a table of summed counts of individuals per species and counting year. I have a data frame with 3 columns: (1) the year (factor), (2) the names of the species (factor), and (3) the counts (numeric).

    Year Species Counts
    2002 SP1     2
    2002 SP1     3
    2004 SP1     2
    2002 SP2     8
    2002 SP2     2
    2002 SP3     1
    2002 SP3     1
    2003 SP3     2
    2004 SP3     1

I'm trying to get this kind of table with sums:

        2002 2003 2004
    SP1    5    0    2
    SP2   10    0    0
    SP3    2    2    1

Aggregate does not do what I want. I'm
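
One way to get the summed table described above, as a minimal sketch rather than the answer from the original thread. The data frame name `counts` is an assumption introduced for illustration. The base R call `xtabs()` is shown first because it needs no extra package; the `dcast()` call with `fun.aggregate = sum` is the reshape2 counterpart and only runs if reshape2 is installed.

```r
# Example data copied from the question; the name `counts` is illustrative.
counts <- data.frame(
  Year    = factor(c(2002, 2002, 2004, 2002, 2002, 2002, 2002, 2003, 2004)),
  Species = factor(c("SP1", "SP1", "SP1", "SP2", "SP2", "SP3", "SP3", "SP3", "SP3")),
  Counts  = c(2, 3, 2, 8, 2, 1, 1, 2, 1)
)

# Base R: xtabs() sums Counts within each Species x Year cell;
# combinations that never occur come out as 0.
sum_table <- xtabs(Counts ~ Species + Year, data = counts)
print(sum_table)

# reshape2 counterpart, run only if the package is available.
if (requireNamespace("reshape2", quietly = TRUE)) {
  wide <- reshape2::dcast(counts, Species ~ Year,
                          value.var = "Counts", fun.aggregate = sum)
  print(wide)
}
```

Without `fun.aggregate`, `dcast()` falls back to counting rows when a cell holds more than one value, which is a common reason the result looks wrong.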

R: Converting wide format to long format with multiple 3 time period variables [duplicate]

那年仲夏 submitted on 2019-12-02 04:14:54
This question already has an answer here: Reshaping multiple sets of measurement columns (wide format) into single columns (long format) (7 answers). Apologies if this is a simple question, but I haven't been able to find a simple solution after searching. I'm fairly new to R, and am having trouble converting wide format to long format using either the melt (reshape2) or gather (tidyr) functions. The dataset that I'm working with contains 22 different time-varying variables, each measured at 3 time periods. The problem occurs when I try to convert all of these from wide to long format at once. I have had

Reshape data frame by row [duplicate]

假装没事ソ submitted on 2019-12-02 04:12:31
Question: This question already has answers here: Transpose / reshape dataframe without “timevar” from long to wide format (6 answers). Closed last year. I have a data frame similar to the following example:

    > df <- data.frame(imp = c("Johny", "Johny", "Lisa", "Max"), item = c(5025, 1101, 2057, 1619))
    > df
         imp     item
    [1,] "Johny" "5025"
    [2,] "Johny" "1101"
    [3,] "Lisa"  "2057"
    [4,] "Max"   "1619"

I would like to have a unique row for each user. The final result should be something like this:

    > df
         imp

reshape alternating columns in less time and using less memory

心已入冬 submitted on 2019-12-02 03:42:18
How can I do this reshape faster and with less memory? My aim is to reshape a data frame of 500,000 rows by 500 columns with 4 GB of RAM. Here's a function that will make some reproducible data:

    make_example <- function(ndoc, ntop){
      # doc numbers
      V1 = seq(1:ndoc)
      # filenames
      V2 <- list("vector", size = ndoc)
      for (i in 1:ndoc){
        V2[i] <- paste(sample(c(rep(0:9,each=5),LETTERS,letters),5,replace=TRUE),collapse='')
      }
      # topic proportions
      tvals <- data.frame(matrix(runif(1:(ndoc*ntop)), ncol = ntop))
      # topic number
      tnumvals <- data.frame(matrix(sample(1:ntop, size = ndoc*ntop,

From long to wide form without id.var?

允我心安 submitted on 2019-12-02 02:40:06
Question: I have some data in long form that looks like this:

    dat1 = data.frame(
      id = rep(LETTERS[1:2], each=4),
      value = 1:8
    )

In table form:

    id value
    A  1
    A  2
    A  3
    A  4
    B  5
    B  6
    B  7
    B  8

And I want it to be in wide form and look like this:

    dat1 = data.frame(A = 1:4, B = 5:8)

In table form:

    A B
    1 5
    2 6
    3 7
    4 8

I could solve this by looping with cbind(), but I would rather use some kind of reshape/melt function, as I think these are the best way to do this kind of thing. However, from spending >30

Reshape data frame by row [duplicate]

╄→尐↘猪︶ㄣ submitted on 2019-12-02 01:46:50
This question already has an answer here: Transpose / reshape dataframe without “timevar” from long to wide format (6 answers). I have a data frame similar to the following example:

    > df <- data.frame(imp = c("Johny", "Johny", "Lisa", "Max"), item = c(5025, 1101, 2057, 1619))
    > df
         imp     item
    [1,] "Johny" "5025"
    [2,] "Johny" "1101"
    [3,] "Lisa"  "2057"
    [4,] "Max"   "1619"

I would like to have a unique row for each user. The final result should be something like this:

    > df
         imp     item1  item2
    [1,] "Johny" "5025" "1101"
    [2,] "Lisa"  "2057" NA
    [3,] "Max"   "1619" NA

    ## Add an ID column to distinguish multiple
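
The truncated answer starts by adding an ID column. A minimal base R sketch of that idea follows; it is one possible reading, not necessarily the code from the original answer. The column name `id` and the result name `wide` are assumptions introduced here. `ave()` numbers the items within each user, and `reshape()` then uses that number as the missing "timevar".

```r
# Example data from the question.
df <- data.frame(imp  = c("Johny", "Johny", "Lisa", "Max"),
                 item = c(5025, 1101, 2057, 1619))

# Number the items within each user (1, 2, ...); `id` is an illustrative name.
df$id <- ave(seq_along(df$imp), df$imp, FUN = seq_along)

# Spread the numbered items into one column per item number.
# sep = "" makes the new columns item1, item2 instead of item.1, item.2.
wide <- reshape(df, idvar = "imp", timevar = "id", direction = "wide", sep = "")
print(wide)
```

Users with fewer items than the maximum get `NA` in the unused columns, which matches the layout asked for in the question.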

From long to wide form without id.var?

↘锁芯ラ submitted on 2019-12-02 01:20:54
I have some data in long form that looks like this:

    dat1 = data.frame(
      id = rep(LETTERS[1:2], each=4),
      value = 1:8
    )

In table form:

    id value
    A  1
    A  2
    A  3
    A  4
    B  5
    B  6
    B  7
    B  8

And I want it to be in wide form and look like this:

    dat1 = data.frame(A = 1:4, B = 5:8)

In table form:

    A B
    1 5
    2 6
    3 7
    4 8

I could solve this by looping with cbind(), but I would rather use some kind of reshape/melt function, as I think these are the best way to do this kind of thing. However, from spending >30 minutes trying to get melt() and reshape() to work, reading answers on SO, it seems that these functions
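
A minimal sketch of one way to get the wide layout above without an ID variable, under the assumption that every group has the same number of rows. The base R function `unstack()` is swapped in here because it needs no ID column; the guarded `dcast()` variant shows the reshape2 route, which does need a within-group row index (the column name `row` is an illustrative choice).

```r
# Example data from the question.
dat1 <- data.frame(id = rep(LETTERS[1:2], each = 4), value = 1:8)

# unstack() splits `value` by `id`; because both groups have 4 rows,
# the result is a data frame with one column per group.
wide <- unstack(dat1, value ~ id)
print(wide)

# reshape2 route, run only if the package is available: build a row index
# within each group, then cast on it.
if (requireNamespace("reshape2", quietly = TRUE)) {
  dat1$row <- ave(dat1$value, dat1$id, FUN = seq_along)
  print(reshape2::dcast(dat1, row ~ id, value.var = "value"))
}
```

If the groups had different sizes, `unstack()` would return a list instead of a data frame, so the equal-size assumption matters.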

reshape from base vs dcast from reshape2 with missing values

橙三吉。 submitted on 2019-12-01 20:29:15
With this data frame,

    df <- expand.grid(id="01", parameter=c("blood", "saliva"), visit=c("V1", "V2", "V3"))
    df$value <- c(1:6)
    df$sex <- rep("f", 6)
    df

    > df
      id parameter visit value sex
    1 01     blood    V1     1   f
    2 01    saliva    V1     2   f
    3 01     blood    V2     3   f
    4 01    saliva    V2     4   f
    5 01     blood    V3     5   f
    6 01    saliva    V3     6   f

when I reshape it into the "wide" format, I get identical results with both the base reshape function and the dcast function from reshape2.

    reshape(df, timevar="visit", idvar=c("id", "parameter", "sex"), direction="wide")
      id parameter sex value.V1 value.V2 value.V3
    1 01     blood   f        1        3        5
    2 01    saliva   f        2        4        6
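
The snippet shows only the base `reshape()` call. For comparison, here is a sketch of what the matching `dcast()` call could look like; the formula is an assumption based on the `idvar` and `timevar` arguments above, not code taken from the original post. The names `wide_base` and `wide_cast` are illustrative, and the reshape2 part runs only if the package is installed.

```r
# Data frame from the question.
df <- expand.grid(id = "01", parameter = c("blood", "saliva"),
                  visit = c("V1", "V2", "V3"))
df$value <- 1:6
df$sex <- rep("f", 6)

# Base R version, as in the question.
wide_base <- reshape(df, timevar = "visit",
                     idvar = c("id", "parameter", "sex"), direction = "wide")
print(wide_base)

# Assumed reshape2 counterpart: id variables on the left of the formula,
# the former timevar on the right.
if (requireNamespace("reshape2", quietly = TRUE)) {
  wide_cast <- reshape2::dcast(df, id + parameter + sex ~ visit,
                               value.var = "value")
  print(wide_cast)
}
```

The cell values agree between the two; the column names differ, since `reshape()` prefixes them with `value.` while `dcast()` uses the visit labels directly.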