Question
I have products with a categories field. With a terms aggregation I can get every category together with all of its subcategories, but I want to limit how many levels appear in the facet.
For example, I have facets like:
auto, tools & travel (115)
auto, tools & travel > luggage tags (90)
auto, tools & travel > luggage tags > luggage spotters (40)
auto, tools & travel > luggage tags > something else (50)
auto, tools & travel > car organizers (25)
With an aggregation like this:
"aggs": {
  "cat_groups": {
    "terms": {
      "field": "categories.keyword",
      "size": 10,
      "include": "auto, tools & travel > .*"
    }
  }
}
I get buckets like:
"buckets": [
  {
    "key": "auto, tools & travel > luggage tags",
    "doc_count": 90
  },
  {
    "key": "auto, tools & travel > luggage tags > luggage spotters",
    "doc_count": 40
  },
  {
    "key": "auto, tools & travel > luggage tags > something else",
    "doc_count": 50
  },
  {
    "key": "auto, tools & travel > car organizers",
    "doc_count": 25
  }
]
But I want to limit the level. For example, I want to get only the results for auto, tools & travel > luggage tags. How can I limit the levels?
By the way, "exclude": ".* > .* > .*" does not work for me.
I need buckets for different levels depending on the search: sometimes the first level, sometimes the second or third. When I ask for the first level, I don't want second-level entries to appear in the buckets, and likewise for the other levels.
Elasticsearch version 6.4
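To make the requirement concrete, here is a small Python sketch that takes the bucket keys from the question and keeps only those at one chosen depth. It runs client-side on hard-coded data and only illustrates the desired result; it is not an Elasticsearch feature, and the helper name `buckets_at_level` is hypothetical.

```python
from typing import Dict

DELIMITER = " > "

# Bucket keys and counts copied from the question.
buckets: Dict[str, int] = {
    "auto, tools & travel > luggage tags": 90,
    "auto, tools & travel > luggage tags > luggage spotters": 40,
    "auto, tools & travel > luggage tags > something else": 50,
    "auto, tools & travel > car organizers": 25,
}


def buckets_at_level(buckets: Dict[str, int], level: int) -> Dict[str, int]:
    """Keep buckets whose key has exactly `level` path segments (hypothetical helper)."""
    return {
        key: count
        for key, count in buckets.items()
        if len(key.split(DELIMITER)) == level
    }


second_level = buckets_at_level(buckets, 2)
third_level = buckets_at_level(buckets, 3)
```

Here `second_level` holds only the "luggage tags" and "car organizers" buckets, which is the kind of output the question is asking the aggregation itself to return.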
Answer 1:
I was finally able to work out the technique below.
I implemented a custom analyzer using the Path Hierarchy Tokenizer and made categories a multi-field, so that you can use categories.facet for aggregations/facets and do normal text search on categories.
The custom analyzer applies only to categories.facet.
Note the property "fielddata": "true" on the categories.facet field; a text field needs it enabled before it can be used in aggregations.
Mapping
PUT myindex
{
  "settings": {
    "analysis": {
      "analyzer": {
        "my_analyzer": {
          "tokenizer": "my_tokenizer"
        }
      },
      "tokenizer": {
        "my_tokenizer": {
          "type": "path_hierarchy",
          "delimiter": ">"
        }
      }
    }
  },
  "mappings": {
    "mydocs": {
      "properties": {
        "categories": {
          "type": "text",
          "fields": {
            "facet": {
              "type": "text",
              "analyzer": "my_analyzer",
              "fielddata": "true"
            }
          }
        }
      }
    }
  }
}
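As a rough illustration of what this analyzer produces, the Python sketch below approximates the tokens a path_hierarchy tokenizer with delimiter ">" emits for one category path. The function `path_hierarchy_tokens` is a hypothetical stand-in written for this example, not part of any Elasticsearch client.

```python
from typing import List


def path_hierarchy_tokens(path: str, delimiter: str = ">") -> List[str]:
    """Approximate the tokens a path_hierarchy tokenizer emits for `path`."""
    parts = path.split(delimiter)
    # Each token is the path truncated after 1, 2, ... segments.
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


tokens = path_hierarchy_tokens(
    "auto, tools & travel > luggage tags > luggage spotters"
)
for token in tokens:
    print(repr(token))
```

Because the delimiter is ">" without the surrounding spaces, the shorter tokens keep a trailing space, which is why the bucket keys in the response further down end in a space.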
Sample Documents
POST myindex/mydocs/1
{
  "categories" : "auto, tools & travel > luggage tags > luggage spotters"
}
POST myindex/mydocs/2
{
  "categories" : "auto, tools & travel > luggage tags > luggage spotters"
}
POST myindex/mydocs/3
{
  "categories" : "auto, tools & travel > luggage tags > luggage spotters"
}
POST myindex/mydocs/4
{
  "categories" : "auto, tools & travel > luggage tags > something else"
}
Query
You can try the query below. I've wrapped the Terms Aggregation in a Filter Aggregation because you only want facets for documents that match specific words.
POST myindex/_search
{
  "size": 0,
  "aggs": {
    "facets": {
      "filter": {
        "bool": {
          "must": [
            { "match": { "categories": "luggage" } }
          ]
        }
      },
      "aggs": {
        "categories": {
          "terms": {
            "field": "categories.facet"
          }
        }
      }
    }
  }
}
Response
{
  "took": 43,
  "timed_out": false,
  "_shards": {
    "total": 1,
    "successful": 1,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 11,
    "max_score": 0,
    "hits": []
  },
  "aggregations": {
    "facets": {
      "doc_count": 4,
      "categories": {
        "doc_count_error_upper_bound": 0,
        "sum_other_doc_count": 0,
        "buckets": [
          {
            "key": "auto, tools & travel ",
            "doc_count": 4
          },
          {
            "key": "auto, tools & travel > luggage tags ",
            "doc_count": 4
          },
          {
            "key": "auto, tools & travel > luggage tags > luggage spotters",
            "doc_count": 3
          },
          {
            "key": "auto, tools & travel > luggage tags > something else",
            "doc_count": 1
          }
        ]
      }
    }
  }
}
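To see where these doc counts come from, the following Python sketch replays the aggregation over the four sample documents in memory. It is only an approximation: `path_hierarchy_tokens` and `matches_word` are hypothetical helpers that roughly mimic the path_hierarchy tokenizer and the match filter, not real Elasticsearch calls.

```python
import re
from collections import Counter
from typing import List

# The four sample documents indexed above.
docs = [
    "auto, tools & travel > luggage tags > luggage spotters",
    "auto, tools & travel > luggage tags > luggage spotters",
    "auto, tools & travel > luggage tags > luggage spotters",
    "auto, tools & travel > luggage tags > something else",
]


def path_hierarchy_tokens(path: str, delimiter: str = ">") -> List[str]:
    """Approximate the tokens a path_hierarchy tokenizer emits for `path`."""
    parts = path.split(delimiter)
    return [delimiter.join(parts[:i]) for i in range(1, len(parts) + 1)]


def matches_word(text: str, word: str) -> bool:
    """Very rough stand-in for a match query: whole-word, case-insensitive."""
    return word in re.findall(r"\w+", text.lower())


counts: Counter = Counter()
for doc in docs:
    if matches_word(doc, "luggage"):
        # A terms aggregation counts each document once per distinct term.
        counts.update(set(path_hierarchy_tokens(doc)))
```

Every document contributes one count to each of its ancestor paths, so the two shorter keys reach 4 while the leaf keys split into 3 and 1, matching the buckets in the response.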
Final Answer Post Discussion On Chat
POST myindex/_search
{
  "size": 0,
  "aggs": {
    "facets": {
      "filter": {
        "bool": {
          "must": [
            { "match": { "categories": "luggage" } }
          ]
        }
      },
      "aggs": {
        "categories": {
          "terms": {
            "field": "categories.facet",
            "exclude": ".*>{1}.*>{1}.*"
          }
        }
      }
    }
  }
}
Note that I've added exclude with a regular expression so that any facet containing more than one occurrence of > is left out.
Let me know if this helps.
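The effect of that exclude pattern can be checked offline with a small Python sketch. Elasticsearch patterns are always anchored to the whole term, so `re.fullmatch` is used to imitate that; Python's regex dialect is not identical to the one Elasticsearch uses, so treat this as an approximation run against the bucket keys from the earlier response.

```python
import re

# Bucket keys from the response above (note the trailing spaces).
keys = [
    "auto, tools & travel ",
    "auto, tools & travel > luggage tags ",
    "auto, tools & travel > luggage tags > luggage spotters",
    "auto, tools & travel > luggage tags > something else",
]

EXCLUDE = ".*>{1}.*>{1}.*"  # same pattern as in the query above

# Keep a key only when the whole key does NOT match the exclude pattern.
kept = [key for key in keys if re.fullmatch(EXCLUDE, key) is None]
```

Only the keys with zero or one ">" survive, so the facet stops at the second level.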
Answer 2:
Just add an integer field named level that records the category's level in the hierarchy: count the occurrences of your delimiter '>' and store that count as the value. Then add a range query to your bool query.
Add this to your schema:
"level": {
  "type": "integer",
  "store": "true",
  "index": "true"
}
In your code you would have something like this, which counts the delimiters to determine the level in the hierarchy (no delimiter means a main category):
// level = number of delimiters in the path; 0 means a main category
public Builder(final String path) {
    this.path = path;
    this.level = StringUtils.countMatches(path, DELIMITER);
}
and then your search query could look something like this:
{
  "query": {
    "bool": {
      "filter": [
        {
          "prefix": {
            "category": {
              "value": "auto, tools & travel",
              "boost": 1
            }
          }
        },
        {
          "range": {
            "level": {
              "from": 2,
              "to": 4,
              "include_lower": true,
              "include_upper": true,
              "boost": 1
            }
          }
        }
      ],
      "adjust_pure_negative": true,
      "boost": 1
    }
  }
}
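To check which categories that combination of filters would select, here is a Python sketch that computes level the same way (by counting delimiters) and then applies a prefix test and an inclusive range in memory. The sample paths come from the question; `category_level` and `select` are hypothetical helpers, and the sketch assumes category is stored as an unanalyzed string.

```python
from typing import Dict, List, Union

DELIMITER = ">"

Doc = Dict[str, Union[str, int]]


def category_level(path: str) -> int:
    """Number of delimiters in the path; 0 means a main category."""
    return path.count(DELIMITER)


paths = [
    "auto, tools & travel",
    "auto, tools & travel > luggage tags",
    "auto, tools & travel > luggage tags > luggage spotters",
    "auto, tools & travel > luggage tags > something else",
    "auto, tools & travel > car organizers",
]
docs: List[Doc] = [{"category": p, "level": category_level(p)} for p in paths]


def select(docs: List[Doc], prefix: str, low: int, high: int) -> List[str]:
    """Mimic the prefix filter plus the inclusive range filter on level."""
    return [
        str(d["category"])
        for d in docs
        if str(d["category"]).startswith(prefix) and low <= int(d["level"]) <= high
    ]


selected = select(docs, "auto, tools & travel", 2, 4)
```

Since a main category has level 0 under this counting, a range of 2 to 4 picks out paths with two to four delimiters, which here means only the two third-level categories. Adjust the bounds to the depth you actually want.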
Source: https://stackoverflow.com/questions/52940790/elasticsearch-aggregation-with-hierarchical-category-subcategory-limit-the-lev