aggregate

In R, how to make a boxplot?

笑着哭i posted on 2019-12-20 07:46:12
Question: My input table has two columns like this:

    x y
    1 187
    2 235
    3 857
    3 253
    2 955
    1 267

I want to make a boxplot of the y values for each individual x value. The x values are limited to 1, 2, 3. Here is my R code:

    data=read.table("input.txt")
    arr=array(dim=3)
    for (i in 1:3) {
      arr[i]=data[data.x==i,"y"] // This line raises warnings.
    }
    boxplot(arr)

How do I correct my code?

Answer 1:

    foo <- data.frame(x=rep(1:5,each=20),y=rnorm(100))
    with(foo,boxplot(y~x))

Source: https://stackoverflow.com/questions/30750049/in
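Applied to the asker's own six rows, the formula approach from the answer can be sketched as follows. The data frame is built inline rather than read from input.txt, and the names `dat` and `groups` are illustrative only. Two things go wrong in the loop: R selects a column with `data$x` rather than `data.x`, and assigning a multi-element vector into the single slot `arr[i]` triggers a length-mismatch warning.

```r
# The six rows from the question, built inline (reading "input.txt" is skipped here).
dat <- data.frame(
  x = c(1, 2, 3, 3, 2, 1),
  y = c(187, 235, 857, 253, 955, 267)
)

# Formula interface: one box per distinct value of x, no loop needed.
boxplot(y ~ x, data = dat)

# If an explicit per-group structure is wanted, split() returns a list
# with one numeric vector per x value; boxplot() accepts that list directly.
groups <- split(dat$y, dat$x)
boxplot(groups)
```

If the data does come from a file whose first line is the `x y` header, `read.table("input.txt", header = TRUE)` would be needed so that the columns are named; whether the asker's file has that header is an assumption.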

Aggregate solution over multiple facts

半城伤御伤魂 posted on 2019-12-20 06:29:04
Question: I am trying to create a predicate (timePeriod/2) that calculates the time period between two dates for a specific fact. I've managed to do this by myself, but face issues when 'other answers' exist in the same list (this is easier to explain with examples). I have the following knowledge-base facts:

    popStar('Jackson',1987,1991).
    popStar('Jackson',1992,1996).
    popStar('Michaels',1996,2000).
    popStar('Newcastle',2000,2007).
    popStar('Bowie',2008,2010).

And the following function, calculates the time

R aggregate data by defining grouping

不打扰是莪最后的温柔 posted on 2019-12-20 06:25:16
Question: I am having trouble grouping and summing the following data in R:

       category freq
    1  C1          9
    2  C2         39
    3  C3          3
    4  A1         38
    5  A2          2
    6  A3         29
    7  B1        377
    8  B2        214
    9  B3        790
    10 B4        724
    11 D1        551
    12 D2        985
    13 E5         19
    14 E4         28

to look like this:

      category freq
    1 A          69
    2 B        2105
    3 C          51
    4 D        1536
    5 E          47

I usually use ddply to aggregate data by an attribute, but this just adds up all rows with the same attribute value in a given column. I need to be able to specify multiple attributes that should be lumped into one category.

Answer 1:
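The answer text is not preserved in this listing. As a hedged sketch of one way to get the table the asker describes, the base-R code below assumes that the target group is simply the leading letter of `category`; the helper column `group` and the name `result` are illustrative, not from the original post.

```r
df <- data.frame(
  category = c("C1", "C2", "C3", "A1", "A2", "A3", "B1",
               "B2", "B3", "B4", "D1", "D2", "E5", "E4"),
  freq = c(9, 39, 3, 38, 2, 29, 377, 214, 790, 724, 551, 985, 19, 28),
  stringsAsFactors = FALSE
)

# Assumption: the group is the first character of the category code.
df$group <- substr(df$category, 1, 1)

# Sum freq within each group; aggregate() returns the groups in sorted order.
result <- aggregate(freq ~ group, data = df, FUN = sum)
names(result)[1] <- "category"
print(result)
```

If the lumping does not follow a simple prefix rule, the `substr()` line could be replaced by a lookup through a named vector that maps each original code to its target group; the rest stays the same.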

Group by and then add a column for ratio based on condition

和自甴很熟 posted on 2019-12-20 05:46:13
Question: Say my dataframe in R looks like the one below. Sex is male/female. FamilySize is the number of family members with the same surname. Surname is the surname.

    Sex    FamilySize Surname
    male   1          Abbing
    female 3          Abbott
    male   3          Abbott
    male   3          Abbott
    male   1          Abelseth
    female 1          Abelseth
    male   2          Abelson
    female 2          Abelson
    male   1          Abrahamsson
    female 1          Abrahim

I want to add a new column, FemaleToFamilySizeRatio, that will give me the ratio of the number of females in each family. The results would look like below:
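The expected output is cut off in this listing, so the exact definition of the ratio is not recoverable. The sketch below takes one possible reading, stated here as an assumption: count the females sharing a surname and divide by that row's FamilySize. The helper name `females` is illustrative.

```r
df <- data.frame(
  Sex = c("male", "female", "male", "male", "male",
          "female", "male", "female", "male", "female"),
  FamilySize = c(1, 3, 3, 3, 1, 1, 2, 2, 1, 1),
  Surname = c("Abbing", "Abbott", "Abbott", "Abbott", "Abelseth",
              "Abelseth", "Abelson", "Abelson", "Abrahamsson", "Abrahim"),
  stringsAsFactors = FALSE
)

# Number of females per surname, repeated on every row of that surname.
females <- ave(as.numeric(df$Sex == "female"), df$Surname, FUN = sum)

# Assumed definition: females in the family divided by the family size.
df$FemaleToFamilySizeRatio <- females / df$FamilySize
print(df)
```

Under this reading the two Abelseth rows both get a ratio of 1, because each reports a FamilySize of 1 while the surname contains one female. If those rows are really two separate families, a family identifier other than Surname would be needed.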

Elasticsearch metric aggregation: number of elements in array

微笑、不失礼 posted on 2019-12-20 05:44:23
Question: I want to do a quite involved query/aggregation. I can't see how because I've just started working with ES. The documents I have look something like this:

    {
      "keyword": "some keyword",
      "items": [
        { "name":"my first item", "item_property_1":"A", ( other properties here ) },
        { "name":"my second item", "item_property_1":"B", ( other properties here ) },
        { "name":"my third item", "item_property_1":"A", ( other properties here ) }
      ]
      ( other properties... )
    },
    {
      "keyword": "different keyword",

How to reshape dataframe and transpose recurring columns to dataframe rows?

╄→гoц情女王★ posted on 2019-12-20 05:12:08
Question: I have a dataframe that has recurring columns (the interval is 5). (Image: my dataframe at the moment.) So this is how it looks: I have 5 types of columns and they repeat over and over. The recurring columns have a suffix in their name; this can be removed or renamed as well, so that they would all match. What I would like to do is to transpose these recurring columns to rows, so that I would have only 5 columns in the end (Dates, PX_LAST, PX_HIGH, PX_VOLUME, Name). Then I would be able to group the

How to speed up cumulative sum within group?

ぐ巨炮叔叔 posted on 2019-12-20 04:30:53
Question: I have the following data frame:

    id<-c(1,1,1,1,1,3,3,3,3)
    spent<-c(10,20,30,40,50,60,70,80,90)
    date<-c("11-11-07","11-11-07","23-11-07","12-12-08","17-12-08","11-11-07","23-11-07","23-11-07","16-01-08")
    df<-data.frame(id,date,spent)
    df$date2<-as.Date(as.character(df$date), format = "%d-%m-%y")

      id     date spent      date2
    1  1 11-11-07    10 2007-11-11
    2  1 11-11-07    20 2007-11-11
    3  1 23-11-07    30 2007-11-23
    4  1 12-12-08    40 2008-12-12
    5  1 17-12-08    50 2008-12-17
    6  3 11-11-07    60 2007-11-11
    7  3 23-11-07    70
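The rest of the question is cut off, so the exact requirement (for example, whether the sum should be restricted to a date window) is unknown. As a minimal sketch of only what the title names, a running total of `spent` within each `id`, base R's `ave()` with `cumsum` avoids an explicit loop over groups. The column name `cum_spent` is hypothetical.

```r
id    <- c(1, 1, 1, 1, 1, 3, 3, 3, 3)
spent <- c(10, 20, 30, 40, 50, 60, 70, 80, 90)
df <- data.frame(id, spent)

# ave() applies cumsum separately to the rows of each id and
# returns the results in the original row order.
df$cum_spent <- ave(df$spent, df$id, FUN = cumsum)
print(df)
```

This assumes the rows are already in the order in which the running total should accumulate; if not, the data frame would need to be sorted by `id` and date first.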

Consolidating duplicate rows in a dataframe [duplicate]

怎甘沉沦 posted on 2019-12-20 03:34:10
Question: This question already has answers here: Collapse / concatenate / aggregate a column to a single comma separated string within each group (3 answers). Closed 2 years ago.

This is a continuation of a past question I asked. Basically, I have a dataframe, df:

             Beginning1 Protein2 Protein3 Protein4 Biomarker1
    Pathway3 A          G        NA       NA       F
    Pathway6 A          G        NA       NA       E
    Pathway2 A          B        H        NA       F
    Pathway5 A          B        H        NA       E
    Pathway1 A          D        K        NA       F
    Pathway7 A          B        C        D        F
    Pathway4 A          B        C        D        E

And now I want to consolidate the rows to look like

How to combine rows based on unique values in R? [duplicate]

三世轮回 posted on 2019-12-20 03:20:00
Question: This question already has answers here: Collapse text by group in data frame [duplicate] (2 answers). Closed 4 years ago.

I'm pretty much a beginner at R. I have a CSV file where the data is as follows, for example:

    ID  Values
    820 D1,D2,FE
    730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG
    730 DV,GTH,LYT
    567 EDR,TYU,EOP,OMN
    567 FGH,KIH,IOP

I want to remove the duplicates in ID and append their data into the Values column, like this:

    ID  Values
    820 D1,D2,FE
    730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG,DV,GTH,LYT
    567 EDR,TYU,EOP,OMN
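One way to collapse the rows, sketched here in base R under the assumption that the CSV has already been read into a data frame with character `Values`, is to paste the strings together within each `ID`. The name `merged` is illustrative.

```r
df <- data.frame(
  ID = c(820, 730, 730, 567, 567),
  Values = c("D1,D2,FE",
             "D1,D2,D3,PC,Io,He,Bt,Te,AR,PG",
             "DV,GTH,LYT",
             "EDR,TYU,EOP,OMN",
             "FGH,KIH,IOP"),
  stringsAsFactors = FALSE
)

# Join all Values strings that share an ID with a comma.
merged <- aggregate(Values ~ ID, data = df, FUN = paste, collapse = ",")

# aggregate() sorts by ID; restore the order of first appearance.
merged <- merged[match(unique(df$ID), merged$ID), ]
rownames(merged) <- NULL
print(merged)
```

The reordering step is optional; it is only there so that the result follows the 820, 730, 567 order shown in the question.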