reshape2

How to reshape a dataframe with “reoccurring” columns?

£可爱£侵袭症+ submitted on 2019-12-17 19:25:34
Question: I am new to data analysis with R. I recently got a pre-formatted environmental observation-model dataset, an example subset of which is shown below:
    date                 site    obs  mod    site         obs  mod
    2000-09-01 00:00:00  campus  NA   61.63  city centre  66   56.69
    2000-09-01 01:00:00  campus  52   62.55  city centre  NA   54.75
    2000-09-01 02:00:00  campus  52   63.52  city centre  56   54.65
Basically, the data include the time series of hourly observed and modelled concentrations of a pollutant at various sites in "reoccurring columns
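
The excerpt stops before any answer, so the following is only a minimal base-R sketch of one way to stack such repeated column groups into long format. It assumes the second group of columns was renamed `site.1`, `obs.1`, `mod.1` on import (as `read.csv` does by default for duplicate names); those names are an assumption, not taken from the question.

```r
# Sample rows from the question; the ".1" suffixes are assumed import names.
wide <- data.frame(
  date   = c("2000-09-01 00:00:00", "2000-09-01 01:00:00", "2000-09-01 02:00:00"),
  site   = "campus",
  obs    = c(NA, 52, 52),
  mod    = c(61.63, 62.55, 63.52),
  site.1 = "city centre",
  obs.1  = c(66, NA, 56),
  mod.1  = c(56.69, 54.75, 54.65),
  stringsAsFactors = FALSE
)

# Take each site block together with the shared date column,
# give both blocks the same column names, then stack them.
block1 <- wide[, c("date", "site", "obs", "mod")]
block2 <- setNames(wide[, c("date", "site.1", "obs.1", "mod.1")], names(block1))
long   <- rbind(block1, block2)
```

With more than two sites, the same pattern could be applied in a loop over column groups before a single `rbind`.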

How do I convert a wide dataframe to a long dataframe for a multilevel structure with 'quadruple nesting'?

被刻印的时光 ゝ submitted on 2019-12-17 17:03:15
Question: I conducted a study that, in retrospect (one lives, one learns :-)), appears to generate multilevel data. Now I'm trying to restructure the dataset from wide to long so that I can analyse it using e.g. lme4. In doing so, I encounter an, um, challenge that I've run into a few times before, but for which I've never found a good solution. I've searched again this time, but I probably use the wrong keywords - or this problem is much rarer than I thought. Basically, in this dataset, the

“unpacking” a factor list from a data.frame

纵然是瞬间 submitted on 2019-12-17 16:37:40
Question: I'm new to R / having the option to easily re-organize data, and have hunted around for a solution but can't find exactly what I'd like to do. Reshape2's melt/cast doesn't quite seem to work and I haven't mastered plyr well enough to factor it in here. Basically I have a data.frame with a structure outlined below, with a category column in which each element is a variable-length list of categories (more compact because the # columns is much larger, and I actually have multiple category_lists

Reshape multiple categorical variables to binary response variables

徘徊边缘 submitted on 2019-12-17 11:45:17
Question: I am trying to convert the following format:
    mydata <- data.frame(movie = c("Titanic", "Departed"), actor1 = c("Leo", "Jack"), actor2 = c("Kate", "Leo"))
         movie actor1 actor2
    1  Titanic    Leo   Kate
    2 Departed   Jack    Leo
to binary response variables:
         movie Leo Kate Jack
    1  Titanic   1    1    0
    2 Departed   1    0    1
I tried the solution described in "Convert row data to binary columns", but I could only get it to work for two variables, not three. I would really appreciate it if there is a clean way to do this.
Answer 1: An
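
The answer is cut off in this excerpt, so the following is not the original answer, just one possible base-R sketch: stack the actor columns into a single column, cross-tabulate movie against actor, and turn the counts into 0/1 indicators. The helper names `actor_cols`, `long`, and `indicator` are illustrative.

```r
mydata <- data.frame(
  movie  = c("Titanic", "Departed"),
  actor1 = c("Leo", "Jack"),
  actor2 = c("Kate", "Leo"),
  stringsAsFactors = FALSE
)

# Stack all actor columns into one long column, repeating the movie names.
actor_cols <- c("actor1", "actor2")
long <- data.frame(
  movie = factor(rep(mydata$movie, times = length(actor_cols)),
                 levels = mydata$movie),   # keep the original row order
  actor = unlist(mydata[actor_cols], use.names = FALSE)
)

# Cross-tabulate, then convert counts to 0/1 indicators.
indicator <- (unclass(table(long$movie, long$actor)) > 0) * 1L
binary <- data.frame(movie = rownames(indicator), indicator, row.names = NULL)
```

The actor columns come out in alphabetical order rather than the order shown in the question. Because `actor_cols` is just a character vector, adding a third actor column only means extending that vector.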

can the value.var in dcast be a list or have multiple value variables?

送分小仙女□ submitted on 2019-12-17 09:29:10
Question: In the help files for dcast.data.table, there is a note stating that a new feature has been implemented: "dcast.data.table allows value.var column to be of type list". I take this to mean that one can have multiple value variables within a list, i.e. in this format:
    dcast.data.table(dt, x1~x2, value.var=list('var1','var2','var3'))
But we get an error: 'value.var' must be a character vector of length 1. Is there such a feature, and if not, what would be other one-liner alternatives? EDIT: In
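
For context, later releases of data.table (1.9.6 and newer, as far as I know) accept several value columns when they are given as a character vector rather than a list. A minimal sketch, assuming such a version is installed; the toy table `dt` is made up for illustration:

```r
library(data.table)

# Toy data: one row per (x1, x2) combination, two value columns.
dt <- data.table(
  x1   = c("a", "a", "b", "b"),
  x2   = c("u", "v", "u", "v"),
  var1 = 1:4,
  var2 = 5:8
)

# Pass multiple value columns as a character vector, not a list.
wide <- dcast(dt, x1 ~ x2, value.var = c("var1", "var2"))
```

Each value column is spread separately, and the resulting column names combine the value column and the `x2` level (for example `var1_u`).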

Compute mean and standard deviation by group for multiple variables in a data.frame

≡放荡痞女 submitted on 2019-12-17 08:48:36
Question: Edit -- This question was originally titled << Long to wide data reshaping in R >> I'm just learning R and trying to find ways to apply it to help out others in my life. As a test case, I'm working on reshaping some data, and I'm having trouble following the examples I've found online. What I'm starting with looks like this:
    ID  Obs 1  Obs 2  Obs 3
    1   43     48     37
    1   27     29     22
    1   36     32     40
    2   33     38     36
    2   29     32     27
    2   32     31     35
    2   25     28     24
    3   45     47     42
    3   38     40     36
And what I want to end up with will look like this
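
The desired output is cut off in this excerpt, so the layout below is a guess based on the title: one row per ID with a mean and a standard deviation for each observation column. A minimal base-R sketch, with the columns renamed to `Obs_1` etc. for convenience:

```r
dat <- data.frame(
  ID    = c(1, 1, 1, 2, 2, 2, 2, 3, 3),
  Obs_1 = c(43, 27, 36, 33, 29, 32, 25, 45, 38),
  Obs_2 = c(48, 29, 32, 38, 32, 31, 28, 47, 40),
  Obs_3 = c(37, 22, 40, 36, 27, 35, 24, 42, 36)
)

# Group-wise mean and standard deviation of every non-ID column.
means <- aggregate(. ~ ID, data = dat, FUN = mean)
sds   <- aggregate(. ~ ID, data = dat, FUN = sd)

# Join the two summaries; the suffixes tell the statistics apart.
summary_wide <- merge(means, sds, by = "ID", suffixes = c("_mean", "_sd"))
```

The result has columns such as `Obs_1_mean` and `Obs_1_sd`, one row per ID.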

dcast error: ‘Aggregation function missing: defaulting to length’

老子叫甜甜 submitted on 2019-12-16 20:06:17
Question: My df looks like this:
    Id  Task  Type  Freq
    3   1     A     2
    3   1     B     3
    3   2     A     3
    3   2     B     0
    4   1     A     3
    4   1     B     3
    4   2     A     1
    4   2     B     3
I want to restructure by Id and get:
    Id  A  B  …  Z
    3   5  3
    4   4  6
I tried:
    df_wide <- dcast(df, Id + Task ~ Type, value.var="Freq")
and got the error: Aggregation function missing: defaulting to length
I can't figure out what to put in the fun.aggregate. What's the problem?
Answer 1: The reason why you are getting this warning is in the description of fun.aggregate (see ?dcast): aggregation
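
The answer is truncated here, so as a hedged illustration only: the desired output has one row per Id, which suggests dropping Task from the formula and summing Freq within each Id/Type cell. A minimal sketch with reshape2, using the sample rows from the question:

```r
library(reshape2)

df <- data.frame(
  Id   = c(3, 3, 3, 3, 4, 4, 4, 4),
  Task = c(1, 1, 2, 2, 1, 1, 2, 2),
  Type = c("A", "B", "A", "B", "A", "B", "A", "B"),
  Freq = c(2, 3, 3, 0, 3, 3, 1, 3)
)

# Several rows now fall into each Id/Type cell, so tell dcast how to
# combine them; sum reproduces the totals shown in the question.
df_wide <- dcast(df, Id ~ Type, value.var = "Freq", fun.aggregate = sum)
```

Without `fun.aggregate`, dcast falls back to `length`, which counts rows per cell instead of adding up Freq.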

melting data.table seems to crash RStudio?

喜夏-厌秋 submitted on 2019-12-13 19:17:32
Question: After a bit of googling and stalking stackoverflow, I think I may have stumbled onto a bug with the reshape2 package (or data.table, I'm not sure). Specifically, I am unable to melt a particular data.table of mine. Here's a reproducible example (you can find a copy of that particular .Rdata file here):
    library(data.table)
    library(ggplot2)
    library(reshape2)
    load("mpi_cv_vanilla_random_gov_17h55--100_class_widths.Rdata")
    melt(par.grid.results)
Before I run melt, my sessionInfo is as follows:

Reshaping EPA wind speed & direction data with dcast in R

戏子无情 submitted on 2019-12-13 16:22:16
Question: I am trying to convert long format wind data into wide format. Both wind speed and wind direction are listed within the Parameter.Name column. These values need to be cast by both the Local.Site.Name and Date.Local variables. If there are multiple observations per unique Local.Site.Name + Date.Local row, then I want the mean value of those observations. The built-in argument 'fun.aggregate = mean' works just fine for wind speed, but mean wind direction cannot be computed this way because the
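
The sentence is cut off, but a likely reason is that direction is an angle: the arithmetic mean of 350 and 10 degrees is 180, although both point roughly north. One common workaround is a circular mean computed from the sine and cosine components. The helper below is a hypothetical sketch (the name `circular_mean_deg` is not from the question), assuming directions are given in degrees:

```r
# Circular mean of directions in degrees, returned in the range [0, 360).
circular_mean_deg <- function(deg) {
  rad <- deg * pi / 180
  # Average the unit-vector components, then convert back to an angle.
  mean_rad <- atan2(mean(sin(rad)), mean(cos(rad)))
  (mean_rad * 180 / pi) %% 360
}

north_ish <- circular_mean_deg(c(350, 10))   # close to 0 (or 360), not 180
east      <- circular_mean_deg(c(80, 100))   # close to 90
```

A function like this could then be used as the aggregation function for the wind-direction rows, while `mean` is kept for wind speed.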

Update a column in df2 by matching patterns in columns in df1 & df2 using R

空扰寡人 submitted on 2019-12-13 07:35:58
Question: I have 2 data frames like this:
    TEAM <- c("PE","PE","MPI","TDT","HPT","ATD")
    CODE <- c(NA,"F","A","H","G","D")
    df1 <- data.frame(TEAM,CODE)
    CODE <- c(NA,"F100","A234","D664","H435","G123","A666","D345","G324",NA)
    TEAM <- c(NA,NA,NA,NA,NA,NA,NA,NA,NA,NA)
    df2 <- data.frame(CODE,TEAM)
I am trying to update TEAM in df2 by matching the first letter of the CODE column in df2 against the CODE column in df1. My desired output for df2:
      CODE TEAM
    1   NA   PE
    2 F100   PE
    3 A234  MPI
    4 D664  ATD
    5 H435  TDT
    6 G123  HPT
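
No answer is included in this excerpt. One way to get the desired result in base R is to take the first letter of each code in df2 and look it up in df1 with `match`, which also pairs the NA code in df2 with the NA code in df1. A minimal sketch, with `stringsAsFactors = FALSE` added as an assumption so the columns stay character:

```r
df1 <- data.frame(
  TEAM = c("PE", "PE", "MPI", "TDT", "HPT", "ATD"),
  CODE = c(NA, "F", "A", "H", "G", "D"),
  stringsAsFactors = FALSE
)
df2 <- data.frame(
  CODE = c(NA, "F100", "A234", "D664", "H435", "G123", "A666", "D345", "G324", NA),
  TEAM = NA,
  stringsAsFactors = FALSE
)

# First letter of each code; NA stays NA.
first_letter <- substr(df2$CODE, 1, 1)

# Look up each letter's row in df1 and copy the team across.
df2$TEAM <- df1$TEAM[match(first_letter, df1$CODE)]
```

Codes whose first letter does not appear in df1 would end up with TEAM set to NA.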