Question
I am querying Elasticsearch using the Java API and getting a lot of duplicate values. I want only the unique (distinct) values from the query. How can I get distinct values with the query builders?
Here is my Java code, which returns duplicate values:
QueryBuilder qb2 = null;
List<Integer> link_id_array = new ArrayList<Integer>();
for (Replacement link_id : linkIDList) {
    link_id_array.add(link_id.getLink_id());
}
qb2 = QueryBuilders.boolQuery()
        .must(QueryBuilders.termsQuery("id", link_id_array));
I am using Elasticsearch 6.2.3 with RestHighLevelClient.
Answer 1:
Way 1: Use the aggregations API with a terms aggregation, which returns one bucket per distinct value of a field (up to "size" buckets).
Sample query to get the distinct email clients:
{
  "query": {
    "match_all": {}
  },
  "aggregations": {
    "label_agg": {
      "terms": {
        "field": "Email_client",
        "size": 100
      }
    }
  }
}
Java code sample:
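// Note: prepareSearch(...) belongs to the transport Client, not to RestHighLevelClient.
SearchRequestBuilder aggregationQuery =
        client.prepareSearch("emails")
                .setQuery(QueryBuilders.matchAllQuery())
                .addAggregation(AggregationBuilders.terms("label_agg")
                        .field("Email_client").size(100));
SearchResponse response = aggregationQuery.execute().get();
// Read the buckets through the Terms interface instead of casting to StringTerms,
// so the code does not depend on the concrete implementation class.
Terms terms = response.getAggregations().get("label_agg");
// Each bucket key is one distinct value of the field.
return terms.getBuckets().stream()
        .map(bucket -> bucket.getKeyAsString())
        .collect(Collectors.toList());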
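If the goal is distinct values only among the documents matched by the id filter from the question, the terms aggregation can sit next to that filter in a single request. Below is a minimal sketch that only assembles such a request body as a string in plain Java. The class name DistinctQueryBody, the aggregation name distinct_values, and the field passed in are assumptions for illustration and do not come from the original post.

```java
import java.util.List;
import java.util.stream.Collectors;

class DistinctQueryBody {
    // Hypothetical helper: builds a search body that filters on "id" (as in the
    // question) and asks for the distinct values of `field` via a terms aggregation.
    // Assumes `field` contains no characters that would need JSON escaping.
    static String build(List<Integer> ids, String field, int size) {
        String idList = ids.stream()
                .map(String::valueOf)
                .collect(Collectors.joining(","));
        return "{"
                // "size": 0 suppresses the hits; only the buckets are of interest
                + "\"size\":0,"
                + "\"query\":{\"bool\":{\"must\":[{\"terms\":{\"id\":[" + idList + "]}}]}},"
                + "\"aggs\":{\"distinct_values\":{\"terms\":{\"field\":\"" + field
                + "\",\"size\":" + size + "}}}"
                + "}";
    }
}
```

For example, DistinctQueryBody.build(Arrays.asList(1, 2, 3), "Email_client", 100) produces a body whose buckets would cover only documents with id 1, 2 or 3. Building the string by hand keeps the sketch free of client dependencies; in real code the equivalent builders (boolQuery, termsQuery, AggregationBuilders.terms) would normally be used.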
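Way 2: Use the cardinality aggregation. Note that it returns only an approximate count of the distinct values, not the values themselves.
Sample Elasticsearch query: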
{
  "size": 0,
  "aggs": {
    "distinct": {
      "cardinality": {
        "field": "Email_client"
      }
    }
  }
}
Java code sample:
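AggregationBuilder agg11 = AggregationBuilders.cardinality("distinct").field("Email_client");
// query11 is not defined in this snippet; it stands for whatever QueryBuilder filters the documents.
SearchResponse response11 = client.prepareSearch("emails") // multiple index names can be given here
        .setSearchType(SearchType.DFS_QUERY_THEN_FETCH)
        .setQuery(query11)
        .addAggregation(agg11)
        .setExplain(true)
        .setSize(0) // no hits needed, only the aggregation result
        .get();
// The result is a single number: the approximate count of distinct values.
Cardinality distinct = response11.getAggregations().get("distinct");
long distinctCount = distinct.getValue();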
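To make the difference between the two ways concrete, here is a small in-memory sketch in plain Java that does not touch Elasticsearch. The class and method names are hypothetical. The bucket ordering (highest count first, ties by key) is an assumption meant to resemble the default terms ordering, and the count here is exact, whereas the real cardinality aggregation is approximate.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

class DistinctSketch {
    // Rough analogue of a terms aggregation: one bucket per distinct value
    // with its document count, at most `size` buckets, highest count first.
    static Map<String, Long> termsBuckets(List<String> values, int size) {
        Map<String, Long> counts = new TreeMap<>(); // keys kept in ascending order
        for (String v : values) {
            counts.merge(v, 1L, Long::sum);
        }
        List<Map.Entry<String, Long>> entries = new ArrayList<>(counts.entrySet());
        // The sort is stable, so equal counts stay in ascending key order.
        entries.sort(Map.Entry.<String, Long>comparingByValue().reversed());
        Map<String, Long> buckets = new LinkedHashMap<>();
        for (Map.Entry<String, Long> e : entries) {
            if (buckets.size() >= size) {
                break; // values beyond `size` are dropped, as with "size" in the query
            }
            buckets.put(e.getKey(), e.getValue());
        }
        return buckets;
    }

    // Rough analogue of a cardinality aggregation: only the number of distinct values.
    static long cardinality(List<String> values) {
        return values.stream().distinct().count();
    }
}
```

With six values of which three are distinct, termsBuckets(values, 2) returns only the two most frequent values, while cardinality(values) still reports 3. So Way 1 is the one to use when the values themselves are needed, with "size" set high enough to cover them all.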
Source: https://stackoverflow.com/questions/51138501/elasticsearch-java-api-to-get-distinct-values-from-the-query-builders