aggregation

Elasticsearch - Show index-wide count for each returned result based on a given term

Submitted by 和自甴很熟 on 2019-12-11 08:46:01
Question: Firstly, I apologise if the terminology I use is incorrect, as I am learning Elasticsearch day by day and may use the wrong phrases. After spending several days trying to figure this out and pulling my hair out, I seem to hit a brick wall every time. I am trying to get Elasticsearch to provide a document count for each returned result. I will provide an example below:

    {
      "suggest": {
        "text": "aberdeen",
        "city": {
          "completion": { "field": "city_suggest", "size": "2" }
        },
        "street": {
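
A possible workaround (not from the original thread): the completion suggester does not return per-option document counts, so one commonly suggested approach is to pair the suggestion with a terms aggregation over a plain keyword field. A minimal sketch, assuming a hypothetical index my_index with a keyword sub-field city.keyword alongside the city_suggest completion field:

    POST /my_index/_search
    {
      "size": 0,
      "query": { "match_phrase_prefix": { "city": "aberdeen" } },
      "aggs": {
        "city_counts": {
          "terms": { "field": "city.keyword", "size": 2 }
        }
      }
    }

Each bucket in city_counts then carries a doc_count, which is the index-wide count the question asks for.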

Perform OrderBy on the results of Apply with Aggregate (OData Version 4)

Submitted by 我只是一个虾纸丫 on 2019-12-11 08:05:57
Question: Suppose I have an OData query like this:

    Sessions?$apply=filter(SomeColumn eq 1)/groupby((Application/Name), aggregate(TotalLaunchesCount with sum as Total))

The Sessions and Application entities are linked by ApplicationId. I want to apply orderby on "Total" and get the top 5 results in the OData query response. I tried adding &$top=5 at the end of the above query. It says: "The query specified in the URI is not valid. Could not find a property named 'Total' on type 'Sessions'." Can anyone
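
A hedged suggestion (not from the original thread): the OData Data Aggregation extension defines a topcount transformation that can be appended to the $apply pipeline itself, so the ordering happens over the aggregated rows rather than against the Sessions entity type. Whether it is accepted depends on the OData library version in use:

    Sessions?$apply=filter(SomeColumn eq 1)
        /groupby((Application/Name), aggregate(TotalLaunchesCount with sum as Total))
        /topcount(5, Total)

Later versions of some OData implementations also let a plain &$orderby=Total desc&$top=5 see the aggregated alias, but the error above suggests the version in use resolves those options against the entity type.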

Using multiple facets in MongoDB Spring Data

Submitted by 百般思念 on 2019-12-11 07:53:35
Question: I want to run multiple facets in one aggregation to save DB round trips. Here is my Spring Data code:

    final BalancesDTO total = this.mongoTemplate.aggregate(
        newAggregation(
            // Get all fund transactions for this user
            match(where("userId").is(userId)),
            // Summarize confirmed debits
            facet(
                match(where("entryType").is(EntryType.DEBIT)
                    .andOperator(where("currentStatus").is(TransactionStatus.CONFIRMED))),
                unwind("history"),
                match(where("history.status").is(TransactionStatus.CONFIRMED))
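
A possible shape for the multi-facet pipeline (not from the original thread): Spring Data's FacetOperation lets several named sub-pipelines hang off a single facet() via .and(...).as(...) chains. A minimal sketch, with the question's enums inlined as hypothetical string constants and a hypothetical confirmedCredits facet added for illustration:

    import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
    import static org.springframework.data.mongodb.core.query.Criteria.where;

    import org.springframework.data.mongodb.core.aggregation.Aggregation;
    import org.springframework.data.mongodb.core.aggregation.FacetOperation;

    String userId = "user-1";   // hypothetical

    // Each .and(...) is one facet sub-pipeline; .as(...) names its output field
    FacetOperation facets = facet()
        .and(
            match(where("entryType").is("DEBIT").and("currentStatus").is("CONFIRMED")),
            unwind("history"),
            match(where("history.status").is("CONFIRMED"))
        ).as("confirmedDebits")
        .and(
            match(where("entryType").is("CREDIT").and("currentStatus").is("CONFIRMED"))
        ).as("confirmedCredits");

    Aggregation agg = newAggregation(
        match(where("userId").is(userId)),
        facets
    );

Each .as(...) name becomes a field in the single result document, so both balances come back in one round trip.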

Select the most common value of a column based on matched pairs from two columns using `ddply`

Submitted by 喜你入骨 on 2019-12-11 07:33:38
Question: I'm trying to use ddply (a plyr function) to sort and identify the most frequent interaction type between any unique pair of users in social media data of the following form:

    from <- c('A', 'A', 'A', 'B', 'B', 'B', 'B', 'C', 'C', 'C', 'C', 'D', 'D', 'D', 'D')
    to <- c('B', 'B', 'D', 'A', 'C', 'C', 'D', 'A', 'D', 'B', 'A', 'B', 'B', 'A', 'C')
    interaction_type <- c('like', 'comment', 'share', 'like', 'like', 'like', 'comment', 'like', 'like', 'share', 'like', 'comment', 'like', 'share', 'like
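
One way to approach this (an assumption, since the question is cut off): normalise each from/to pair into an order-independent key, then take the modal interaction type per key with ddply. The truncated last element of interaction_type is completed as 'like' here:

    library(plyr)

    from <- c('A','A','A','B','B','B','B','C','C','C','C','D','D','D','D')
    to   <- c('B','B','D','A','C','C','D','A','D','B','A','B','B','A','C')
    interaction_type <- c('like','comment','share','like','like','like','comment',
                          'like','like','share','like','comment','like','share','like')
    df <- data.frame(from, to, interaction_type, stringsAsFactors = FALSE)

    # Normalise each pair so A-B and B-A collapse to one key
    df$pair <- apply(df[, c("from", "to")], 1,
                     function(x) paste(sort(x), collapse = "-"))

    # Modal interaction type per pair (first one wins on ties)
    ddply(df, .(pair), summarise,
          top_type = names(which.max(table(interaction_type))))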

How to create an Aggregation from a list of AggregationOperation in Spring data MongoDB?

Submitted by 生来就可爱ヽ(ⅴ<●) on 2019-12-11 07:12:28
Question: I want to create an Aggregation that can be used in MongoOperations's aggregate() function. To create the Aggregation, I used a list of AggregationOperation as follows:

    ApplicationContext ctx = new AnnotationConfigApplicationContext(MongoConfig.class);
    MongoOperations mongoOperation = (MongoOperations) ctx.getBean("mongoTemplate");
    List<AggregationOperation> aggregationOperations = new ArrayList<AggregationOperation>();
    aggregationOperations.add(new MatchOperation(Criteria.where(
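
For the list-to-Aggregation step itself, newAggregation has an overload that accepts a List<? extends AggregationOperation> directly, so no manual unpacking is needed. A minimal sketch with hypothetical match and group stages:

    import java.util.ArrayList;
    import java.util.List;

    import org.springframework.data.mongodb.core.aggregation.Aggregation;
    import org.springframework.data.mongodb.core.aggregation.AggregationOperation;

    import static org.springframework.data.mongodb.core.aggregation.Aggregation.*;
    import static org.springframework.data.mongodb.core.query.Criteria.where;

    List<AggregationOperation> ops = new ArrayList<>();
    ops.add(match(where("status").is("ACTIVE")));      // hypothetical criteria
    ops.add(group("type").count().as("total"));        // hypothetical grouping

    // newAggregation(List) builds the same pipeline as the varargs form
    Aggregation aggregation = newAggregation(ops);

The static factory match(...) is equivalent to new MatchOperation(criteria) and reads more idiomatically.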

How to add a new column and aggregate values in R

Submitted by 半世苍凉 on 2019-12-11 06:59:53
Question: I am completely new to gnuplot and am only trying this because I need to learn it. I have values in three columns, where the first represents the filename (date and time, at one-hour intervals) and the remaining two columns represent two different entities, Prop1 and Prop2:

    Datetime           Prop1  Prop2
    20110101_0000.txt  2      5
    20110101_0100.txt  2      5
    20110101_0200.txt  2      5
    ...
    20110101_2300.txt  2      5
    20110201_0000.txt  2      5
    20110101_0100.txt  2      5
    ...
    20110201_2300.txt  2      5
    ...

I need to aggregate the data by the
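
If the aggregation is done in R (as the title suggests), one sketch is to derive a day column from the first eight characters of the filename and aggregate over it. The data frame below is a hypothetical stand-in for the table above:

    # Hypothetical stand-in for the table above
    df <- data.frame(
      Datetime = c("20110101_0000.txt", "20110101_0100.txt", "20110102_0000.txt"),
      Prop1 = c(2, 2, 3),
      Prop2 = c(5, 5, 6),
      stringsAsFactors = FALSE
    )

    # The day is the first eight characters of the filename
    df$Day <- substr(df$Datetime, 1, 8)

    # Summing here; the question is cut off before naming the statistic it needs
    aggregate(cbind(Prop1, Prop2) ~ Day, data = df, FUN = sum)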

OData v4.0 aggregate queries (aggregate query syntax)

Submitted by 自古美人都是妖i on 2019-12-11 06:35:42
Question: For example, I have an object model:

    Product {
        int ProductId,
        string Name,
        List<Sale> Sales
    }

I want to use aggregate queries to get the total Amount of Sales:

    GET: Product?$apply=groupby(Name, aggregate(Sales(Amount with sum as Total)))

(following the OASIS standard) --> Got the error UriQueryExpressionParser_CloseParenOrCommaExpected: "')' or ',' expected at position {0} in '{1}'.", with the position at Amount. I changed the query to:

    GET: Product?$apply=groupby(Name, aggregate(Sales/Amount with sum
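
A hedged note (not from the original thread): groupby in the Data Aggregation extension expects its grouping properties wrapped in an inner set of parentheses, and aggregating across a collection-valued navigation property is commonly written with the slash-path form. A sketch, with support depending on the library version:

    GET Product?$apply=groupby((Name), aggregate(Sales/Amount with sum as Total))

The missing inner parentheses around Name in the queries above can themselves trigger parse errors, independent of the Sales(...) versus Sales/... aggregate syntax.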

Pandas Grouping - Values as Percent of Grouped Totals Based on Another Column

Submitted by 喜欢而已 on 2019-12-11 04:53:52
Question: This question is an extension of a question I asked yesterday, but I will rephrase. Using a data frame and pandas, I am trying to figure out the tip percentage for each category in a group-by. So, using the tips dataset, I want to see, for each sex/smoker combination, what the tip percentage is: female smoker / all female and female non-smoker / all female (and the same for men). When I do this:

    import pandas as pd
    df = pd.read_csv("https://raw.githubusercontent.com/wesm/pydata-book
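
One way to get per-cell totals as a share of each sex's total (not from the original thread) is a level-wise transform on the grouped sums. A minimal sketch with a tiny hypothetical stand-in for the tips data, since the question's URL is cut off:

    import pandas as pd

    # Tiny inline stand-in for the tips data (values are hypothetical)
    df = pd.DataFrame({
        "sex":    ["Female", "Female", "Female", "Male", "Male"],
        "smoker": ["Yes", "No", "No", "Yes", "No"],
        "tip":    [3.00, 1.50, 2.50, 4.00, 3.00],
    })

    # Tip totals per sex/smoker cell...
    cell = df.groupby(["sex", "smoker"])["tip"].sum()
    # ...divided by the totals per sex, aligned via a level-wise transform
    pct = cell / cell.groupby(level="sex").transform("sum")
    print(pct)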

Mongo 3.6 aggregation lookup with multiple conditions

Submitted by 风格不统一 on 2019-12-11 04:25:29
Question: Suppose I have a Mongo DB with only one collection, data. In this collection, I have the following documents:

    { "type": "person", "value": { "id": 1, "name": "Person 1", "age": 10 } },
    { "type": "person", "value": { "id": 2, "name": "Person 2", "age": 20 } },
    { "type": "prescription", "value": { "drug": "Bromhexine", "patient": 2 } },
    { "type": "prescription", "value": { "drug": "Aspirin", "patient": 1 } }

With those records, I'd like to make a JOIN between documents with "type": person and
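
Since MongoDB 3.6 supports the pipeline form of $lookup with let/$expr, a self-join on the same collection can match on both the document type and the patient id. A minimal sketch over the data collection from the question:

    db.data.aggregate([
      { $match: { type: "person" } },
      { $lookup: {
          from: "data",                      // self-join on the same collection
          let: { personId: "$value.id" },
          pipeline: [
            { $match: { $expr: { $and: [
                { $eq: ["$type", "prescription"] },
                { $eq: ["$value.patient", "$$personId"] }
            ] } } }
          ],
          as: "prescriptions"
      } }
    ])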

Calculate row mean, ignoring NAs in Spark Scala

Submitted by 狂风中的少年 on 2019-12-11 03:24:15
Question: I'm trying to find a way to calculate the mean of rows in a Spark DataFrame in Scala, ignoring NAs. In R, there is a very convenient function, rowMeans, where one can specify to ignore NAs:

    rowMeans(df, na.rm = TRUE)

I'm unable to find a corresponding function for Spark DataFrames, and I wonder if anyone has a suggestion on whether this would be possible. Replacing the NAs with 0 won't do, since this will affect the denominator. I found a similar question here; however, my
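
One common approach (not from the original thread): build the row mean from two expressions, summing coalesce(col, 0) across the row and dividing by the count of non-null cells, so nulls drop out of both numerator and denominator. A spark-shell-style sketch with hypothetical column names, where the spark session and its implicits already exist:

    import org.apache.spark.sql.functions._
    import spark.implicits._   // as in spark-shell

    // Hypothetical three-column frame with nulls standing in for NAs
    val df = Seq(
      (Some(1.0), Some(2.0), Option.empty[Double]),
      (Some(4.0), None, Some(6.0))
    ).toDF("a", "b", "c")

    val cols = df.columns.map(col)
    // Numerator: nulls contribute 0
    val total = cols.map(c => coalesce(c, lit(0.0))).reduce(_ + _)
    // Denominator: count only the non-null cells in the row
    val nonNull = cols.map(c => when(c.isNotNull, 1).otherwise(0)).reduce(_ + _)

    df.withColumn("rowMean", total / nonNull).show()

If every column in a row is null, the count is 0 and, in Spark's default (non-ANSI) mode, the division yields null rather than an error.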