aggregate

Ok to provide constructor + trivial operators for behaviorless aggregates?

Submitted by 烈酒焚心 on 2019-12-11 14:11:49
Question: This is a follow-up question to 2043381. Consider the following:

struct DataBundle {
    std::string name;
    int age;

    DataBundle() : age(0) {}
    DataBundle(const std::string& name, int age) : name(name), age(age) {}

    void swap(DataBundle& rhs) { name.swap(rhs.name); std::swap(age, rhs.age); }

    DataBundle& operator=(DataBundle rhs) { swap(rhs); return *this; }

    bool operator==(const DataBundle& rhs) const { return (name == rhs.name) && (age == rhs.age); }
    bool operator!=(const DataBundle& rhs) const { return !(

R: how to aggregate by a real-valued column with a given error tolerance

Submitted by 我与影子孤独终老i on 2019-12-11 14:06:20
Question: Assuming I have a data frame:

t <- data.frame(d1 = c(694, 695, 696, 2243, 2244, 2651, 2652),
                d2 = c(1.80950881, 1.80951007, 1.80951052, 1.46499982,
                       1.46500087, 1.14381419, 1.14381319))

    d1       d2
1  694 1.809509
2  695 1.809510
3  696 1.809511
4 2243 1.465000
5 2244 1.465001
6 2651 1.143814
7 2652 1.143813

I'd like to group the rows by the real values in column d2, which are very close but not exactly equal. Thus, in this example, after aggregation, I'd like to obtain the following data set:

   d1       d2
1 694 1
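The question is cut off above. A minimal sketch of one way to approach this in base R, assuming that grouping values which agree after rounding to a fixed number of decimals is close enough for the intended tolerance (the rounding precision and the choice to keep the first row of each group are assumptions, not part of the original question):

t <- data.frame(d1 = c(694, 695, 696, 2243, 2244, 2651, 2652),
                d2 = c(1.80950881, 1.80951007, 1.80951052, 1.46499982,
                       1.46500087, 1.14381419, 1.14381319))

# Rows whose d2 values agree to 3 decimal places fall into the same group;
# keep the first row of each group as its representative.
key <- round(t$d2, 3)
aggregate(t, by = list(d2_group = key), FUN = function(x) x[1])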

How can I skip groups while subsetting with key by in data.table?

Submitted by 帅比萌擦擦* on 2019-12-11 14:06:17
Question: I have this DT:

dt = data.table(ID = c(rep(letters[1:2], each = 4), 'b'), value = seq(1, 9))

   ID value
1:  a     1
2:  a     2
3:  a     3
4:  a     4
5:  b     5
6:  b     6
7:  b     7
8:  b     8
9:  b     9

I need to eliminate groups while subsetting, but only when the data fulfils some condition. Something like this does not work:

dt[, {if (.N == 4) .SD else NULL v1}, by = "ID"]

I need to remove the groups that do not meet the condition; in this example I would like to skip the groups whose length is different from 4, so that I get:

ID value
1
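A minimal sketch of the usual data.table idiom for dropping groups by size (keeping only groups with exactly 4 rows follows the example above; the stray v1 from the attempted code is left out):

library(data.table)

dt <- data.table(ID = c(rep(letters[1:2], each = 4), 'b'), value = seq(1, 9))

# .N is the number of rows in the current group; when the if has no else,
# groups that fail the condition return NULL and are dropped from the result.
dt[, if (.N == 4) .SD, by = ID]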

Terms aggregation based on unique key

Submitted by 扶醉桌前 on 2019-12-11 13:42:02
Question: I have an index full of documents. Each of them has a key "userid" with a distinct value per user, but each user may have multiple documents. Each user has additional properties (like "color", "animal"). I need to get the aggregation counts per property, which would be:

aggs: {
  colors: { terms: { field: color } },
  animals: { terms: { field: animal } }
}

But I need these counts per unique userid, maybe:

aggs: {
  group-by: { field: userid },
  sub-aggs: {
    colors: { terms: { field: color } },
    animals: {

Applying aggregate functions to multiple properties with LINQ GroupBy

Submitted by |▌冷眼眸甩不掉的悲伤 on 2019-12-11 12:28:48
Question: I have a list of Object (it's called sourceList). Object contains: Id, Num1, Num2, Num3, Name, Lname. Assume I have the following list:

1, 1, 5, 9, 'a', 'b'
1, 2, 3, 2, 'b', 'm'
2, 5, 8, 7, 'r', 'a'

How can I return another list (of object2) containing: Id, sum of Num1, sum of Num2? For the example above, it should return a list of object2 that contains:

1, 3, 8
2, 5, 8

I tried:

Dim a = sourceList.GroupBy(Function(item) item.Id).
        Select(Function(x) x.Sum(Function(y) y.Num1))

How to group time by every n minutes in R

Submitted by 為{幸葍}努か on 2019-12-11 12:26:37
Question: I have a dataframe with a lot of time series:

1  0:03 B 1
2  0:05 A 1
3  0:05 A 1
4  0:05 B 1
5  0:10 A 1
6  0:10 B 1
7  0:14 B 1
8  0:18 A 1
9  0:20 A 1
10 0:23 B 1
11 0:30 A 1

I want to group the time series into 6-minute bins and count the frequency of A and B:

1 0:06 A 2
2 0:06 B 2
3 0:12 A 1
4 0:12 B 1
5 0:18 A 1
6 0:24 A 1
7 0:24 B 1
8 0:18 A 1
9 0:30 A 1

Also, the class of the time series is character. What should I do?

Answer 1: Here's an approach to convert times to POSIXct, cut the times by 6
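The answer is cut off above; a minimal sketch along the same lines (convert the character times to POSIXct and cut them into 6-minute bins) follows. The column names time and id, and the use of base aggregate for the counts, are assumptions:

df <- data.frame(time = c("0:03", "0:05", "0:05", "0:05", "0:10", "0:10",
                          "0:14", "0:18", "0:20", "0:23", "0:30"),
                 id   = c("B", "A", "A", "B", "A", "B", "B", "A", "A", "B", "A"),
                 stringsAsFactors = FALSE)

# Parse the character times as POSIXct (the date part is irrelevant here),
# bin them into 6-minute intervals, and count how often each id occurs per bin.
tm  <- as.POSIXct(df$time, format = "%H:%M")
bin <- cut(tm, breaks = "6 min")
aggregate(list(count = df$id), by = list(interval = bin, id = df$id), FUN = length)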

Aggregating data based on unique triads in R

Submitted by 拟墨画扇 on 2019-12-11 12:10:58
Question: I was referred to "Counting existing permutations in R" for a previous related question, but I can't apply it to my problem. Here is the data I have:

One <- c(rep("X", 6), rep("Y", 3), rep("Z", 2))
Two <- c(rep("A", 4), rep("B", 6), rep("C", 1))
Three <- c(rep("J", 5), rep("K", 2), rep("L", 4))
Number <- runif(11)
df <- data.frame(One, Two, Three, Number)

   One Two Three     Number
1    X   A     J 0.10511669
2    X   A     J 0.62467760
3    X   A     J 0.24232663
4    X   A     J 0.38358854
5    X   B     J 0.04658226
6    X   B     K 0.26789844
7    Y   B     K 0.07685341
8    Y   B     L
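The question is cut off, but a minimal sketch of aggregating by the unique (One, Two, Three) triads follows; summing Number per triad is an assumption about what is wanted:

One <- c(rep("X", 6), rep("Y", 3), rep("Z", 2))
Two <- c(rep("A", 4), rep("B", 6), rep("C", 1))
Three <- c(rep("J", 5), rep("K", 2), rep("L", 4))
Number <- runif(11)
df <- data.frame(One, Two, Three, Number)

# One row per unique (One, Two, Three) triad, with Number summed over
# all rows belonging to that triad.
aggregate(Number ~ One + Two + Three, data = df, FUN = sum)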

How to correctly use pandas agg function when running groupby on a column of type timestamp/datetime/datetime64?

Submitted by  ̄綄美尐妖づ on 2019-12-11 11:56:22
Question: I'm trying to understand why calling count() directly on a group returns the correct answer (in this example, 2 rows in that group), but calling count via a lambda in the agg() function returns the beginning of epoch ("1970-01-01 00:00:00.000000002").

# Using groupby(lambda x: True) in the code below just as an illustrative example.
# It will always create a single group.
x = DataFrame({'time': [np.datetime64('2005-02-25'), np.datetime64('2006-03-30')]}).groupby(lambda x: True)
display(x

LINQ Aggregate function

Submitted by 白昼怎懂夜的黑 on 2019-12-11 11:53:33
Question: I have a List like "test", "bla", "something", "else". But when I use Aggregate on it and at the same time call a function, it seems that after 2 'iterations' the result of the first gets passed in? I am using it like:

myList.Aggregate((current, next) => someMethod(current) + ", " + someMethod(next));

and when I put a breakpoint in the someMethod function, where some transformation on the information in myList occurs, I notice that after the 3rd call I get a result from a former

Aggregate 5-minute data to hourly sums with NAs present

Submitted by 做~自己de王妃 on 2019-12-11 10:34:14
Question: My problem is as follows: I've got a time series with 5-minute precipitation data like:

                Datum mm
1 2004-04-08 00:05:00 NA
2 2004-04-08 00:10:00 NA
3 2004-04-08 00:15:00 NA
4 2004-04-08 00:20:00 NA
5 2004-04-08 00:25:00 NA
6 2004-04-08 00:30:00 NA

with this structure:

'data.frame': 1098144 obs. of 2 variables:
 $ Datum: POSIXlt, format: "2004-04-08 00:05:00" "2004-04-08 00:10:00" "2004-04-08 00:15:00" "2004-04-08 00:20:00" ...
 $ mm   : num NA NA NA NA NA NA NA NA NA NA ...

As you can see, the
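The question is cut off above; a minimal sketch of aggregating such 5-minute values to hourly sums in base R follows. Which hour a reading stamped exactly on the hour should belong to, and how hours consisting only of NAs should be reported, are left as assumptions:

df <- data.frame(
  Datum = seq(as.POSIXct("2004-04-08 00:05:00"), by = "5 min", length.out = 24),
  mm    = c(rep(NA, 12), runif(12))
)

# Truncate each timestamp to the hour it falls in and sum the 5-minute values.
# na.rm = TRUE ignores missing readings; an hour containing only NAs then sums
# to 0 and may need to be reset to NA afterwards, depending on the requirement.
df$hour <- as.POSIXct(trunc(df$Datum, units = "hours"))
hourly  <- aggregate(mm ~ hour, data = df, FUN = sum, na.rm = TRUE, na.action = na.pass)
hourly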