aggregate

How to remove inconsistencies from dataframe (time series)

别来无恙 submitted on 2019-12-11 10:33:28
Question: Let's say that we have this dataframe:
    x <- as.data.frame(cbind(c("A","A","A","B","B","B","B","B","C","C","C","C","C","D","D","D","D","D"), c(1,2,3,1,2,3,2,3,1,2,3,4,5,1,2,3,4,5), c(10,12.5,15,2,3.4,5.7,8,9.5,1,5.6,8.9,10,11,2,3.4,6,8,10.5), c(1,3,4,1,2,3,4,3,2,2,3,5,2,3,5,4,5,5)))
    colnames(x) <- c("ID", "Visit", "Time", "State")
Column ID indicates the subject ID. Column Visit indicates a series of visits. Column Time indicates the time that has passed to reach a certain "State". Column State

Find min and max value from the array in mongodb

限于喜欢 submitted on 2019-12-11 10:25:41
Question: I have the following Project collection:
    [ { Id : 1, name : p1, tasks : [
        { taskId : t1, startDate : ISODate("2018-09-24T10:02:49.403Z"), endDate : ISODate("2018-09-26T10:02:49.403Z"), },
        { taskId : t2, startDate : ISODate("2018-09-24T10:02:49.403Z"), endDate : ISODate("2018-09-29T10:02:49.403Z"), },
        { taskId : t3, startDate : ISODate("2018-09-24T10:02:49.403Z"), endDate : ISODate("2018-09-27T10:02:49.403Z"), }] } ]
How to get p1 project's startDate and EndDate depending on
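The excerpt is cut off before it states the exact rule, so the following is only a minimal Python sketch under one assumption: that the project's start is the earliest task `startDate` and its end is the latest task `endDate`. The helper names `parse_iso` and `project_date_range` are hypothetical, and the data is the single project shown above. This is the same reduction that MongoDB's `$min` and `$max` aggregation operators perform, but it is not a MongoDB query.

```python
from datetime import datetime, timezone


def parse_iso(text):
    """Parse the ISODate strings from the excerpt into UTC datetimes."""
    return datetime.strptime(text, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)


# The single project document shown in the question.
project = {
    "Id": 1,
    "name": "p1",
    "tasks": [
        {"taskId": "t1", "startDate": parse_iso("2018-09-24T10:02:49.403Z"),
         "endDate": parse_iso("2018-09-26T10:02:49.403Z")},
        {"taskId": "t2", "startDate": parse_iso("2018-09-24T10:02:49.403Z"),
         "endDate": parse_iso("2018-09-29T10:02:49.403Z")},
        {"taskId": "t3", "startDate": parse_iso("2018-09-24T10:02:49.403Z"),
         "endDate": parse_iso("2018-09-27T10:02:49.403Z")},
    ],
}


def project_date_range(doc):
    """Return (earliest task start, latest task end) for one project.

    Assumption: this min/max rule is what the truncated question asks for.
    """
    tasks = doc["tasks"]
    start = min(task["startDate"] for task in tasks)
    end = max(task["endDate"] for task in tasks)
    return start, end


start, end = project_date_range(project)
print(start.isoformat(), end.isoformat())
```

For the data above, the latest `endDate` belongs to task t2, so under this assumption the project would span 2018-09-24 to 2018-09-29.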

MySQL top-N ranking and sum the rest of same group

社会主义新天地 submitted on 2019-12-11 10:25:08
Question: I've spent a lot of time researching this topic, but I couldn't find an efficient answer for ranking the top 3 rows per group in a MySQL table and aggregating the rest of each group with sum(). The data are as follows:
    TS         | Name   | Count
    =============================
    1552286160 | Apple  | 7
    1552286160 | Orange | 8
    1552286160 | Grape  | 8
    1552286160 | Pear   | 9
    1552286160 | Kiwi   | 10
    ...
    1552286100 | Apple  | 10
    1552286100 | Orange | 12
    1552286100 | Grape  | 14
    1552286100 | Pear   | 16
    1552286100 |
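To make the intended result concrete, here is a small Python sketch of the "top 3 per group, sum the rest" logic. It is not MySQL and not the asker's solution. It uses only the complete rows visible for timestamp 1552286160; the `"Others"` label, the function name `top_n_with_rest`, and the alphabetical tie-break between equal counts are all assumptions made for illustration.

```python
from collections import defaultdict

# Rows visible in the excerpt for one timestamp: (TS, Name, Count).
rows = [
    (1552286160, "Apple", 7),
    (1552286160, "Orange", 8),
    (1552286160, "Grape", 8),
    (1552286160, "Pear", 9),
    (1552286160, "Kiwi", 10),
]


def top_n_with_rest(records, n=3, rest_label="Others"):
    """Keep the n largest counts per TS and sum the remainder into one row."""
    groups = defaultdict(list)
    for ts, name, count in records:
        groups[ts].append((name, count))

    result = {}
    for ts, items in groups.items():
        # Highest count first; ties broken alphabetically (an assumption).
        ranked = sorted(items, key=lambda item: (-item[1], item[0]))
        top, rest = ranked[:n], ranked[n:]
        if rest:
            top = top + [(rest_label, sum(count for _, count in rest))]
        result[ts] = top
    return result


summary = top_n_with_rest(rows)
for ts, items in summary.items():
    print(ts, items)
```

Grape and Orange tie on a count of 8, so which of them lands in the top 3 depends entirely on the tie-break rule; a real query would need to decide that explicitly.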

Listing a field from all the entities in a collection

…衆ロ難τιáo~ submitted on 2019-12-11 10:03:09
Question: I'm writing out an aggregated list of statuses. It works fine except when there are none. At the moment, null is rendered and the position is empty.
    item.Stuff.Where(e => Condition(e)) .Select(f => f.Status) .Aggregate("---", (a, b) => (a == "---") ? b : (a + b));
I've received a suggestion for a solution and improvement, as follows.
    [Flags] enum Status { None = 0, Active = 1, Inactive = 2, Pending = 4, Deleted = 8 }
    item.Stuff.Where(e => Condition(e)) .Aggregate(Status.None, (a, b
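The suggested code is cut off inside the lambda, so the combining step is not visible. The sketch below assumes it is a bitwise OR and shows the same idea in Python rather than C#: folding flag values with a seed of "no status" means an empty sequence yields a well-defined value instead of null. The member is spelled `NONE` because `None` is reserved in Python, and `combine_statuses` is a hypothetical helper name.

```python
from enum import Flag
from functools import reduce
import operator


class Status(Flag):
    # Same bit values as the [Flags] enum in the question.
    NONE = 0
    ACTIVE = 1
    INACTIVE = 2
    PENDING = 4
    DELETED = 8


def combine_statuses(statuses):
    """Fold statuses with bitwise OR, seeded with Status.NONE.

    Assumption: OR is the combining step hidden by the truncated lambda.
    """
    return reduce(operator.or_, statuses, Status.NONE)


empty = combine_statuses([])
mixed = combine_statuses([Status.ACTIVE, Status.PENDING, Status.ACTIVE])

print(empty)
print(mixed)
print(Status.PENDING in mixed)
```

Because the seed is returned unchanged for an empty input, the "no statuses" case can be rendered explicitly (for example as `---`) instead of producing an empty position.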

Sum columns based on multiple other factor columns

一曲冷凌霜 submitted on 2019-12-11 08:57:17
Question: I have the following dataframe:
    df<-structure(list(totprivland = c(175L, 50L, 100L, 14L, 4L, 240L, 10L, 20L, 20L, 58L), ncushr8d1 = c(0L, 0L, 0L, 0L, 0L, 30L, 5L, 0L, 0L, 50L), ncu_CENREG1 = structure(c(4L, 4L, 4L, 4L, 1L, 3L, 3L, 3L, 4L, 4L), .Label = c("Northeast", "Midwest", "South", "West"), class = "factor"), ncushr8d2 = c(75L, 50L, 100L, 14L, 2L, 30L, 5L, 20L, 20L, 8L), ncu_CENREG2 = structure(c(4L, 4L, 4L, 4L, 1L, 2L, 1L, 4L, 3L, 4L), .Label = c("Northeast", "Midwest", "South", "West")

Elasticsearch - Show index-wide count for each returned result based from a given term

和自甴很熟 submitted on 2019-12-11 08:46:01
Question: First, I apologise if my terminology is incorrect; I am learning Elasticsearch day by day and may use the wrong phrases. After spending several days trying to figure this out, I seem to be hitting brick walls every time. I am trying to get Elasticsearch to provide a document count for each returned result. I will provide an example below.
    { "suggest": { "text": "aberdeen", "city": { "completion": { "field": "city_suggest", "size": "2" } }, "street": {

Sorting files into folders based on a pattern in their name using .bat

南楼画角 submitted on 2019-12-11 08:35:25
Question: Consider a parent folder C:\Users\..\Parent. Under Parent there are 3 folders, M1, M2 and M3:
    C:\Users\..\Parent\M1
    C:\Users\..\Parent\M2
    C:\Users\..\Parent\M3
Under each of M1, M2 and M3 there are 100 subfolders: C:\Users\..\Parent\M1\MattP001M1, C:\Users\..\Parent\M1\MattP002M1, and so on up to C:\Users\..\Parent\M1\MattP100M1, and similarly for M2 and M3. Under every folder (MattP001M1..MattP100M1) there are a ton of .wav files (close to 1500 on average). These .wav files have a pattern in their naming, e.g.: There are

Google Dataflow “elementCountExact” aggregation

这一生的挚爱 submitted on 2019-12-11 08:14:57
Question: I'm trying to aggregate a PCollection<String> into a PCollection<List<String>> with ~60 elements each. They will be sent to an API which accepts 60 elements per request. Currently I'm trying to do this with windowing, but there is only elementCountAtLeast, so I have to collect the elements into a list, count again, and split the list if it is too long. This is quite cumbersome and results in a lot of lists with just a few elements:
    Repeatedly.forever(AfterFirst.of( AfterPane.elementCountAtLeast
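The batching behaviour the question is after (lists of at most 60 elements, with only the final list allowed to be short) can be illustrated outside of Dataflow. The sketch below is plain Python, not Beam or Dataflow code, and says nothing about how triggers behave in a distributed pipeline; `batched` is a hypothetical helper name and the input strings are made up for the demonstration.

```python
from itertools import islice


def batched(items, size=60):
    """Yield lists of exactly `size` items; only the last list may be shorter."""
    iterator = iter(items)
    while True:
        batch = list(islice(iterator, size))
        if not batch:
            return
        yield batch


# 130 made-up elements standing in for the strings in the PCollection.
elements = (f"element-{i}" for i in range(130))
batches = list(batched(elements, size=60))

print([len(batch) for batch in batches])
```

The point of the sketch is the contrast with an "at least N" trigger: an exact-size batcher never emits an oversized list that needs splitting, and it emits at most one undersized list per input stream.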

Aggregate the time-series data by average function with time preference of HH = (HH-1):41 - HH:40, in R

夙愿已清 submitted on 2019-12-11 08:12:43
Question: I have a temporal database and I want to obtain hourly averages for time series data. I have used this code:
    aggregate(list(ambtemp = p28$ambtemp), list(d = cut(p28$dt, "1 hour")), mean)
Sample data:
    1  -1.64 2007-09-29 00:01:09
    2  -1.76 2007-09-29 00:03:09
    3  -1.83 2007-09-29 00:05:09
    4  -1.86 2007-09-29 00:07:09
    5  -1.94 2007-09-29 00:09:09
    6  -1.87 2007-09-29 00:11:09
    7  -1.87 2007-09-29 00:13:09
    8  -1.80 2007-09-29 00:15:09
    9  -1.64 2007-09-29 00:17:09
    10 -1.60 2007-09-29 00:19:09
    11 -1.90 2007-09
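One way to get the offset windows named in the title, where hour HH covers (HH-1):41 through HH:40, is to shift every timestamp forward by 19 minutes and then truncate to the hour. The Python sketch below illustrates that idea on the first four sample rows. It assumes each window is the half-open interval from (HH-1):41:00 up to, but not including, HH:41:00; the function names are hypothetical. In R, the same shift could presumably be applied to `p28$dt` before calling `cut()`, though that is not shown in the excerpt.

```python
from collections import defaultdict
from datetime import datetime, timedelta

SHIFT = timedelta(minutes=19)  # moves (HH-1):41 onto HH:00


def shifted_hour_label(timestamp):
    """Return the hour HH whose window (HH-1):41 .. HH:40 contains timestamp."""
    shifted = timestamp + SHIFT
    return shifted.replace(minute=0, second=0, microsecond=0)


def hourly_means(readings):
    """Average (timestamp, value) pairs per shifted hourly window."""
    buckets = defaultdict(list)
    for timestamp, value in readings:
        buckets[shifted_hour_label(timestamp)].append(value)
    return {hour: sum(vals) / len(vals) for hour, vals in sorted(buckets.items())}


# First four rows of the sample data in the question.
readings = [
    (datetime(2007, 9, 29, 0, 1, 9), -1.64),
    (datetime(2007, 9, 29, 0, 3, 9), -1.76),
    (datetime(2007, 9, 29, 0, 5, 9), -1.83),
    (datetime(2007, 9, 29, 0, 7, 9), -1.86),
]

means = hourly_means(readings)
for hour, mean in means.items():
    print(hour, round(mean, 4))
```

All four readings fall between 23:41 and 00:40, so they are averaged under the 00:00 label; a reading at 00:41:00 would be assigned to the 01:00 label instead.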

SQL GROUP BY with columns which contain mirrored values

帅比萌擦擦* submitted on 2019-12-11 07:32:33
Question: Sorry for the bad title. I couldn't think of a better way to describe my issue. I have the following table:
    Category | A | B
    A        | 1 | 2
    A        | 2 | 1
    B        | 3 | 4
    B        | 4 | 3
I would like to group the data by Category, return only 1 line per category, but provide both values of columns A and B. So the result should look like this:
    category | resultA | resultB
    A        | 1       | 2
    B        | 4       | 3
How can this be achieved? I tried this statement:
    SELECT category, a, b FROM table GROUP BY category
but obviously, I get