aggregate | 易学教程

In R, how to make a boxplot?

Question: My input table has two columns like this:

    x y
    1 187
    2 235
    3 857
    3 253
    2 955
    1 267

I want to make a boxplot of the y values for each individual x value. The x values are limited to 1, 2, 3. Here is my R code:

    data=read.table("input.txt")
    arr=array(dim=3)
    for (i in 1:3) {
      arr[i]=data[data.x==i,"y"] // This line raises warnings.
    }
    boxplot(arr)

How do I correct my code?

Answer 1:

    foo <- data.frame(x=rep(1:5,each=20),y=rnorm(100))
    with(foo,boxplot(y~x))

Source: https://stackoverflow.com/questions/30750049/in
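As a rough sketch of how the formula interface from the answer applies to the asker's own six rows (typed in directly here rather than read from input.txt; the names dat and stats are made up for illustration): each arr[i] slot can hold only a single value while each subset has several, which is the likely source of the warning, and R refers to a column as data$x rather than data.x.

```r
# The six rows from the question, entered by hand instead of read.table("input.txt")
dat <- data.frame(x = c(1, 2, 3, 3, 2, 1),
                  y = c(187, 235, 857, 253, 955, 267))

# y ~ x draws one box per distinct x value; no loop or array is needed.
# boxplot() also returns the summary statistics it plotted.
stats <- boxplot(y ~ x, data = dat, xlab = "x", ylab = "y")

# Group labels and the number of observations behind each box
print(stats$names)
print(stats$n)
```

If the file has a header row, read.table("input.txt", header = TRUE) would be needed for the columns to be named x and y.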

Aggregate solution over multiple facts

Question: I am trying to create a predicate (timePeriod/2) that calculates the time period between two dates for a specific fact. I've managed to do this by myself, but I run into issues when other answers exist in the same list (this is easier to explain with examples). I have the following knowledge-base facts:

    popStar('Jackson',1987,1991).
    popStar('Jackson',1992,1996).
    popStar('Michaels',1996,2000).
    popStar('Newcastle',2000,2007).
    popStar('Bowie',2008,2010).

And the following function calculates the time

R aggregate data by defining grouping

Question: I am having trouble grouping and summing the following data in R:

       category freq
    1  C1          9
    2  C2         39
    3  C3          3
    4  A1         38
    5  A2          2
    6  A3         29
    7  B1        377
    8  B2        214
    9  B3        790
    10 B4        724
    11 D1        551
    12 D2        985
    13 E5         19
    14 E4         28

to look like this:

      category freq
    1 A          69
    2 B        2105
    3 C          51
    4 D        1536
    5 E          47

I usually use ddply to aggregate data by an attribute, but this just adds up the values of all rows with the same attribute in a given column. I need to be able to specify multiple attributes that should be lumped into one category.

Answer 1:
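The answer text is not included in this excerpt. As a minimal sketch, assuming the target category is simply the first letter of each code (true for the sample data, but an assumption about the asker's intent), base R's substr and aggregate are enough. The helper column group and the name result are made up for illustration.

```r
# Sample data copied from the question
df <- data.frame(
  category = c("C1", "C2", "C3", "A1", "A2", "A3", "B1",
               "B2", "B3", "B4", "D1", "D2", "E5", "E4"),
  freq = c(9, 39, 3, 38, 2, 29, 377, 214, 790, 724, 551, 985, 19, 28),
  stringsAsFactors = FALSE
)

# Assumption: the lumped category is the first character of the code
df$group <- substr(df$category, 1, 1)

# Sum freq within each lumped category
result <- aggregate(freq ~ group, data = df, FUN = sum)
names(result)[1] <- "category"
print(result)
```

For groupings that do not follow the first letter, the group column could instead be filled from a hand-written lookup vector that maps each code to its category.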

Group by and then add a column for ratio based on condition

Question: Say my dataframe in R looks like the one below. Sex is male/female. FamilySize is the number of family members with the same surname. Surname is the surname.

    Sex    FamilySize Surname
    male   1          Abbing
    female 3          Abbott
    male   3          Abbott
    male   3          Abbott
    male   1          Abelseth
    female 1          Abelseth
    male   2          Abelson
    female 2          Abelson
    male   1          Abrahamsson
    female 1          Abrahim

I want to add a new column, FemaleToFamilySizeRatio, that gives the ratio of the number of females in each family to the family size. The results would look like below:
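The expected result is cut off in this excerpt, so the following is only one possible reading: count the females per Surname with ave and divide by FamilySize. Treating Surname as the family identifier is an assumption, and the name females is made up for illustration.

```r
# Sample data copied from the question
df <- data.frame(
  Sex = c("male", "female", "male", "male", "male",
          "female", "male", "female", "male", "female"),
  FamilySize = c(1, 3, 3, 3, 1, 1, 2, 2, 1, 1),
  Surname = c("Abbing", "Abbott", "Abbott", "Abbott", "Abelseth",
              "Abelseth", "Abelson", "Abelson", "Abrahamsson", "Abrahim"),
  stringsAsFactors = FALSE
)

# Number of females sharing each surname, repeated on every row of that group
females <- ave(as.numeric(df$Sex == "female"), df$Surname, FUN = sum)

# Assumed definition: females in the family divided by the family size
df$FemaleToFamilySizeRatio <- females / df$FamilySize
print(df)
```

Under this reading, the two Abelseth rows (each with FamilySize 1) are counted as one group, so a stricter family identifier may be needed for data like that.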

Elasticsearch metric aggregation: number of elements in array

Question: I want to do a fairly involved query/aggregation. I can't see how, because I've just started working with ES. The documents I have look something like this:

    {
      "keyword": "some keyword",
      "items": [
        { "name":"my first item", "item_property_1":"A", ( other properties here ) },
        { "name":"my second item", "item_property_1":"B", ( other properties here ) },
        { "name":"my third item", "item_property_1":"A", ( other properties here ) }
      ]
      ( other properties... )
    },
    {
      "keyword": "different keyword",

How to reshape dataframe and transpose recurring columns to dataframe rows?

Question: I have a dataframe that has recurring columns (the interval is 5). [Image: my dataframe at the moment] So this is how it looks: I have 5 types of columns and they repeat over and over. The recurring columns have a suffix in their name; this can be removed or renamed as well, so that they all match. What I would like to do is to transpose these recurring columns to rows, so that I would have only 5 columns in the end (Dates, PX_LAST, PX_HIGH, PX_VOLUME, Name). Then I would be able to group the
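The excerpt does not say which language or library the asker uses, and the original table is only available as an image. As a sketch in R (chosen to match the other examples on this page), assuming the columns really do come in consecutive blocks of five, each block can be cut out, renamed, and stacked. The toy data and the names wide, pieces, and long are invented for illustration.

```r
# Toy wide table: two blocks of five columns, the second carrying a ".1" suffix
wide <- data.frame(
  Dates = c("2020-01-02", "2020-01-03"), PX_LAST = c(10, 11),
  PX_HIGH = c(12, 13), PX_VOLUME = c(100, 110), Name = c("AAA", "AAA"),
  Dates.1 = c("2020-01-02", "2020-01-03"), PX_LAST.1 = c(20, 21),
  PX_HIGH.1 = c(22, 23), PX_VOLUME.1 = c(200, 210), Name.1 = c("BBB", "BBB"),
  stringsAsFactors = FALSE
)

target <- c("Dates", "PX_LAST", "PX_HIGH", "PX_VOLUME", "Name")
block <- length(target)

# Cut the table into consecutive blocks of five columns and give each
# block the same column names so the pieces can be stacked
pieces <- lapply(seq(1, ncol(wide), by = block), function(start) {
  part <- wide[, start:(start + block - 1)]
  names(part) <- target
  part
})

# Stack the blocks on top of each other: five columns, one row per observation
long <- do.call(rbind, pieces)
print(long)
```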

How to speed up cumulative sum within group?

Question: I have the following data frame:

    id<-c(1,1,1,1,1,3,3,3,3)
    spent<-c(10,20,30,40,50,60,70,80,90)
    date<-c("11-11-07","11-11-07","23-11-07","12-12-08","17-12-08","11-11-07","23-11-07","23-11-07","16-01-08")
    df<-data.frame(id,date,spent)
    df$date2<-as.Date(as.character(df$date), format = "%d-%m-%y")

      id     date spent      date2
    1  1 11-11-07    10 2007-11-11
    2  1 11-11-07    20 2007-11-11
    3  1 23-11-07    30 2007-11-23
    4  1 12-12-08    40 2008-12-12
    5  1 17-12-08    50 2008-12-17
    6  3 11-11-07    60 2007-11-11
    7  3 23-11-07    70
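The rest of the question is cut off, so what exactly should be accumulated is not known. As a minimal sketch of a plain running total of spent within each id, assuming the rows are already in the desired order, base R's ave with cumsum avoids an explicit loop. The column name cum_spent is made up for illustration.

```r
# id and spent vectors copied from the question
id <- c(1, 1, 1, 1, 1, 3, 3, 3, 3)
spent <- c(10, 20, 30, 40, 50, 60, 70, 80, 90)
df <- data.frame(id, spent)

# ave() applies cumsum separately within each id and returns the results
# in the original row order
df$cum_spent <- ave(df$spent, df$id, FUN = cumsum)
print(df)
```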

Consolidating duplicate rows in a dataframe [duplicate]

Question: This question already has answers here: Collapse / concatenate / aggregate a column to a single comma separated string within each group (3 answers). Closed 2 years ago.

This is a continuation of a past question I asked. Basically, I have a dataframe, df:

             Beginning1 Protein2 Protein3 Protein4 Biomarker1
    Pathway3 A          G        NA       NA       F
    Pathway6 A          G        NA       NA       E
    Pathway2 A          B        H        NA       F
    Pathway5 A          B        H        NA       E
    Pathway1 A          D        K        NA       F
    Pathway7 A          B        C        D        F
    Pathway4 A          B        C        D        E

And now I want to consolidate the rows to look like

How to combine rows based on unique values in R? [duplicate]

Question: This question already has answers here: Collapse text by group in data frame [duplicate] (2 answers). Closed 4 years ago.

I'm pretty much a beginner at R. I have a CSV file where the data is as follows, for example:

    ID  Values
    820 D1,D2,FE
    730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG
    730 DV,GTH,LYT
    567 EDR,TYU,EOP,OMN
    567 FGH,KIH,IOP

I want to remove the duplicates in ID and append their data into the Values column, like this:

    ID  Values
    820 D1,D2,FE
    730 D1,D2,D3,PC,Io,He,Bt,Te,AR,PG,DV,GTH,LYT
    567 EDR,TYU,EOP,OMN
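The expected output above is cut off in this excerpt. As a minimal sketch of one common way to collapse text by group in base R, aggregate can paste the Values strings of each ID together. The data is typed in directly rather than read from the CSV file, and the name combined is made up for illustration.

```r
# Sample rows copied from the question
df <- data.frame(
  ID = c(820, 730, 730, 567, 567),
  Values = c("D1,D2,FE", "D1,D2,D3,PC,Io,He,Bt,Te,AR,PG", "DV,GTH,LYT",
             "EDR,TYU,EOP,OMN", "FGH,KIH,IOP"),
  stringsAsFactors = FALSE
)

# Join all Values strings that share an ID, separated by commas;
# collapse = "," is passed through to paste()
combined <- aggregate(Values ~ ID, data = df, FUN = paste, collapse = ",")
print(combined)
```

Note that aggregate returns the groups sorted by ID, so the rows come out in the order 567, 730, 820 rather than in the order of first appearance.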