plyr

Demean R data frame

强颜欢笑 submitted on 2021-01-27 16:26:44
Question: I would like to demean multiple columns in an R data.frame. Using an example from this question:

    set.seed(999)
    library(plyr)
    library(plm)
    # random data.frame
    dat <- expand.grid(id=factor(1:3), cluster=factor(1:6))
    dat <- cbind(dat, x=runif(18), y=runif(18, 2, 5))
    # demean x and y
    dat.2 <- ddply(dat, .(cluster), transform, x=x-mean(x), y=y-mean(y))

My problem is that I have many more than 2 variables, and I would like to avoid hard-coding this analysis. I'm new to plyr in general; why does
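One way to avoid naming each variable is to pick the columns programmatically and subtract the per-cluster mean from each. The sketch below uses base R's ave() rather than plyr, and it assumes that every numeric column should be demeaned; the helper name num_cols is introduced here for illustration only.

```r
set.seed(999)
dat <- expand.grid(id = factor(1:3), cluster = factor(1:6))
dat <- cbind(dat, x = runif(18), y = runif(18, 2, 5))

# Assumption: every numeric column should be demeaned.
# Replace this with an explicit character vector if only some should be.
num_cols <- names(dat)[sapply(dat, is.numeric)]

# ave(v, g) returns the group mean of v repeated for each row of group g,
# so v - ave(v, g) is the within-group demeaned value.
dat.2 <- dat
dat.2[num_cols] <- lapply(dat[num_cols], function(v) v - ave(v, dat$cluster))

# Within each cluster the demeaned columns now average to (numerically) zero
round(tapply(dat.2$x, dat.2$cluster, mean), 10)
```

Because the column list is computed from the data, adding more numeric variables to dat needs no change to the demeaning step.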

counting N occurrences within a ceiling range of a matrix by-row

社会主义新天地 submitted on 2021-01-27 06:05:30
Question: I would like to tally each time a value lies within a given range in a matrix by-row, and then sum these logical outcomes to derive a "measure of consistency" for each row. Reproducible example:

    m1 <- matrix(c(1,2,1,6,3,7,4,2,6,8,11,15), ncol=4, byrow = TRUE)
    # expected outcome, given a range of +/-1 either side
    exp.outcome <- matrix(c(TRUE,TRUE,TRUE,FALSE,
                            TRUE,FALSE,TRUE,TRUE,
                            FALSE,FALSE,FALSE,FALSE), ncol=4, byrow=TRUE)

Above I've indicated the expected outcome, in the case where each
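The question text is cut off before it states the exact rule, so the sketch below rests on an assumption: a value counts as TRUE when at least one other value in the same row lies within +/-1 of it. That reading reproduces exp.outcome for the example matrix, but it may not be the rule the asker had in mind. The names tol, within_tol and consistency are introduced here for illustration.

```r
m1 <- matrix(c(1,2,1,6,3,7,4,2,6,8,11,15), ncol = 4, byrow = TRUE)
exp.outcome <- matrix(c(TRUE,TRUE,TRUE,FALSE,
                        TRUE,FALSE,TRUE,TRUE,
                        FALSE,FALSE,FALSE,FALSE), ncol = 4, byrow = TRUE)

tol <- 1  # half-width of the range

# For each row, compare every value with every other value in that row.
# apply() over rows returns one column per row, hence the t().
within_tol <- t(apply(m1, 1, function(r) {
  close <- abs(outer(r, r, "-")) <= tol  # pairwise "within tol" matrix
  diag(close) <- FALSE                   # a value does not count as its own neighbour
  rowSums(close) > 0                     # TRUE if any other value is close
}))

# Per-row "measure of consistency": number of TRUE entries in the row
consistency <- rowSums(within_tol)

all(within_tol == exp.outcome)
consistency
```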

How can I generate by-group summary statistics if my grouping variable is a factor?

老子叫甜甜 submitted on 2021-01-27 05:16:27
Question: Suppose I wanted to get some summary statistics on the dataset mtcars (part of base R version 2.12.1). Below, I group the cars according to the number of engine cylinders they have and take the per-group means of the remaining variables in mtcars.

    > str(mtcars)
    'data.frame': 32 obs. of 11 variables:
     $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
     $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
     $ disp: num 160 160 108 258 360 ...
     $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
     $ drat:
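The asker's own code is not visible in this excerpt, so the following is only a sketch of one approach that works when the grouping variable is a factor: base R's aggregate() with a formula keeps the grouping column out of the summary function. The copy mtcars2 is a name introduced here for illustration.

```r
# Copy mtcars and turn the grouping variable into a factor
mtcars2 <- transform(mtcars, cyl = factor(cyl))

# ". ~ cyl" means: summarise every other column, grouped by cyl.
# The factor is used only for grouping, so mean() never sees it.
group_means <- aggregate(. ~ cyl, data = mtcars2, FUN = mean)

group_means[, c("cyl", "mpg", "hp")]
```

The result has one row per level of cyl and one column per remaining variable.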

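Adding group ellipses to PCA, NMDS, and PCoA plots

杀马特。学长 韩版系。学妹 submitted on 2021-01-07 08:55:52
In ordination analyses such as PCA, NMDS, and PCoA, the plot shows how the samples are arranged, for example how many groups they fall into. To make the grouping easier to see, we can use the sample groups from the experimental design and draw an ellipse or other polygon around the samples that belong to the same group. Newer versions of ggplot2 provide the stat_ellipse stat, which makes this easy. Code example:

    ggplot(faithful, aes(waiting, eruptions, color = eruptions > 3)) +
      geom_point() +
      stat_ellipse(level = 0.8) +
      stat_ellipse(level = 0.9)

In the resulting plot, each group is enclosed by one ellipse per level. stat_ellipse is simple and convenient. Its level parameter sets the confidence level used when fitting the ellipse path; the larger the value, the more points the ellipse covers. The two ellipses here are purely for looks; ggplot2's layering syntax is what makes adding several ellipses this easy, and its designers deserve credit for that. Older versions of ggplot2 had no stat_ellipse; the fact that the official developers added it in a newer version shows how popular this use case is. Besides ellipses, polygons can also be used to mark groups. They look good too, although the code is not as concise as for ellipses. Code example: library(ggplot2) library(plyr) ggplot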
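The polygon example is cut off in this excerpt, so the following is not the author's code. It is a minimal sketch of one common way to draw a polygon per group: compute the convex hull of each group with base R's chull() and draw it with geom_polygon(). It reuses the faithful data and the eruptions > 3 grouping from the ellipse example; the names df, group and hulls are introduced here, and the plotting step runs only if ggplot2 is installed.

```r
# Same grouping as the ellipse example: eruptions longer than 3 minutes or not
df <- transform(faithful, group = eruptions > 3)

# chull() returns the row indices of the points on the convex hull,
# so each piece of hulls holds only the hull vertices of one group.
hulls <- do.call(rbind, lapply(split(df, df$group), function(g) {
  g[chull(g$waiting, g$eruptions), ]
}))

if (requireNamespace("ggplot2", quietly = TRUE)) {
  library(ggplot2)
  p <- ggplot(df, aes(waiting, eruptions, colour = group)) +
    geom_point() +
    geom_polygon(data = hulls, fill = NA)  # outline only, one polygon per group
  print(p)
}
```

Unlike an ellipse at a given level, a convex hull always encloses every point of its group, so it is sensitive to outliers.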

Creating a new variable based on prior history

女生的网名这么多〃 submitted on 2020-12-13 07:57:07
Question: I have data where I need to create a variable based on prior history, for example:

    created <- c(2009, 2010, 2010, 2011, 2012, 2011)
    person <- c('A', 'A', 'A', 'A', 'B', 'B')
    location <- c('London', 'Geneva', 'London', 'New York', 'London', 'London')
    df <- data.frame(created, person, location)

I want to create a variable called 'existing' that takes the prior years into account, checks whether the person has lived in that place before, and gives a value of 0 if the place is old (that is, they have lived there). Any suggestions? library
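A minimal sketch of one way to build such a variable in base R follows. The question only says that an old place should get 0; coding a new place as 1 is an assumption, as is treating an earlier row from the same year as prior history. The name seen_before is introduced here for illustration.

```r
df <- data.frame(
  created  = c(2009, 2010, 2010, 2011, 2012, 2011),
  person   = c("A", "A", "A", "A", "B", "B"),
  location = c("London", "Geneva", "London", "New York", "London", "London"),
  stringsAsFactors = FALSE
)

# Sort by person and year so that earlier history comes first
df <- df[order(df$person, df$created), ]

# duplicated() is TRUE for a (person, location) pair that already
# appeared in an earlier row, i.e. the person has lived there before.
seen_before <- duplicated(df[c("person", "location")])

# 0 = place already in the person's history, 1 = new place (assumed coding)
df$existing <- ifelse(seen_before, 0, 1)

df
```

With the example data, the second London row for person A and the 2012 London row for person B are the only rows that get 0.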
