Question
Now I've run into another problem: how can I select only the values of a field that match a fuzzy query? Say the education field holds several names, e.g. education: [MIT, Stanford University, Michigan University], but I want to select only Stanford University. I can run an aggregation alongside each fuzzy query, but that returns the counts and names of ALL universities in the education field. What I need is an aggregation over only the exact values that match the fuzzy query. For example, if I run a fuzzy query for Stanford University and the education field holds [MIT, Stanfordddd University, Michigan University], I would like the query to return only the value 'Stanfordddd University', not all three. Thanks!
Answer 1:
For this, your education field must be of type nested, and you can then use the inner_hits feature to retrieve only the matching value.
Below is a sample mapping showing how the education field would be defined in this case:
Mapping:
PUT my_index
{
  "mappings": {
    "mydocs": {
      "properties": {
        "education": {
          "type": "nested"
        }
      }
    }
  }
}
Sample Documents:
POST my_index/mydocs/1
{
  "education": [
    {
      "value": "Stanford University"
    },
    {
      "value": "Harvard University"
    }
  ]
}
POST my_index/mydocs/2
{
  "education": [
    {
      "value": "Stanford University"
    },
    {
      "value": "Princeton University"
    }
  ]
}
Fuzzy Query on Nested Field:
POST my_index/_search
{
  "query": {
    "nested": {
      "path": "education",
      "query": {
        "bool": {
          "must": [
            {
              "span_near": {
                "clauses": [
                  {
                    "span_multi": {
                      "match": {
                        "fuzzy": {
                          "education.value": {
                            "value": "Stanford",
                            "fuzziness": 2
                          }
                        }
                      }
                    }
                  },
                  {
                    "span_multi": {
                      "match": {
                        "fuzzy": {
                          "education.value": {
                            "value": "University",
                            "fuzziness": 2
                          }
                        }
                      }
                    }
                  }
                ],
                "slop": 0,
                "in_order": false
              }
            }
          ]
        }
      },
      "inner_hits": {}
    }
  }
}
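If you need the same kind of request for arbitrary phrases, the body can be generated rather than written by hand. The following is only a minimal sketch in Python: fuzzy_phrase_query is a hypothetical helper name, it simply splits the phrase on whitespace into one fuzzy clause per word, and it only builds the request body (sending it to Elasticsearch is left to whatever client you use). The path and field names are the ones from the mapping above.

```python
import json


def fuzzy_phrase_query(path, field, phrase, fuzziness=2):
    """Build a nested span_near search body with one fuzzy clause per word.

    Hypothetical helper: it mirrors the structure of the hand-written
    request and does not talk to Elasticsearch itself.
    """
    clauses = [
        {
            "span_multi": {
                "match": {
                    "fuzzy": {field: {"value": word, "fuzziness": fuzziness}}
                }
            }
        }
        for word in phrase.split()
    ]
    span_near = {"span_near": {"clauses": clauses, "slop": 0, "in_order": False}}
    return {
        "query": {
            "nested": {
                "path": path,
                "query": {"bool": {"must": [span_near]}},
                # An empty inner_hits block asks for the matching nested objects.
                "inner_hits": {},
            }
        }
    }


body = fuzzy_phrase_query("education", "education.value", "Stanford University")
# Serialise the dict to get a JSON body for POST my_index/_search.
request_json = json.dumps(body, indent=2)
```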
Sample Response:
{
  "took": 4,
  "timed_out": false,
  "_shards": {
    "total": 5,
    "successful": 5,
    "skipped": 0,
    "failed": 0
  },
  "hits": {
    "total": 2,
    "max_score": 0.6931472,
    "hits": [
      {
        "_index": "my_index",
        "_type": "mydocs",
        "_id": "2",
        "_score": 0.6931472,
        "_source": {
          "education": [
            {
              "value": "Stanford University"
            },
            {
              "value": "Princeton University"
            }
          ]
        },
        "inner_hits": {
          "education": {
            "hits": {
              "total": 1,
              "max_score": 0.6931472,
              "hits": [
                {
                  "_index": "my_index",
                  "_type": "mydocs",
                  "_id": "2",
                  "_nested": {
                    "field": "education",
                    "offset": 0
                  },
                  "_score": 0.6931472,
                  "_source": {
                    "value": "Stanford University"
                  }
                }
              ]
            }
          }
        }
      },
      {
        "_index": "my_index",
        "_type": "mydocs",
        "_id": "1",
        "_score": 0.6931472,
        "_source": {
          "education": [
            {
              "value": "Stanford University"
            },
            {
              "value": "Harvard University"
            }
          ]
        },
        "inner_hits": {
          "education": {
            "hits": {
              "total": 1,
              "max_score": 0.6931472,
              "hits": [
                {
                  "_index": "my_index",
                  "_type": "mydocs",
                  "_id": "1",
                  "_nested": {
                    "field": "education",
                    "offset": 0
                  },
                  "_score": 0.6931472,
                  "_source": {
                    "value": "Stanford University"
                  }
                }
              ]
            }
          }
        }
      }
    ]
  }
}
Notice the inner_hits section of each hit: it contains only the matching nested object, the one holding Stanford University, rather than every entry of the education field.
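On the client side, the matched values can be pulled out of that section with a few lines of code. This is a rough sketch in Python, assuming the response has already been parsed into a dict; matched_values is a hypothetical helper, and the sample dict is a trimmed-down stand-in for a real response, not actual server output. It walks every group under inner_hits, so it does not depend on what the group is called.

```python
def matched_values(response, field="value"):
    """Map each document _id to the nested values that matched the query.

    Hypothetical helper: reads only the inner_hits section of each hit
    and ignores the full _source of the parent document.
    """
    result = {}
    for hit in response["hits"]["hits"]:
        values = []
        # Each inner_hits group has the same hits.hits layout as a search response.
        for group in hit.get("inner_hits", {}).values():
            for nested_hit in group["hits"]["hits"]:
                values.append(nested_hit["_source"][field])
        result[hit["_id"]] = values
    return result


# Trimmed-down stand-in for a parsed search response (illustrative only).
sample = {
    "hits": {
        "hits": [
            {
                "_id": "2",
                "inner_hits": {
                    "education": {
                        "hits": {"hits": [{"_source": {"value": "Stanford University"}}]}
                    }
                },
            },
            {
                "_id": "1",
                "inner_hits": {
                    "education": {
                        "hits": {"hits": [{"_source": {"value": "Stanford University"}}]}
                    }
                },
            },
        ]
    }
}

matches = matched_values(sample)
```

Here matches maps each document id to a list containing only "Stanford University"; the Harvard and Princeton entries never appear because they are not part of inner_hits.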
By default, Elasticsearch returns the entire document in the response. To some extent you can filter by field using _source, but it does not let you filter individual values within a field.
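If the full document is just noise, one commonly used option is to set the top-level _source to false so that each hit carries only its inner_hits. The sketch below, again in Python, shows the idea on a simplified single-term request body; only_inner_hits is a hypothetical helper, and you should verify on your Elasticsearch version that inner_hits still carries the nested _source when the top-level one is disabled.

```python
def only_inner_hits(search_body):
    """Return a copy of search_body with the top-level _source disabled.

    Hypothetical helper: the copy is shallow, so the query itself is
    shared with the original body and is not modified.
    """
    trimmed = dict(search_body)
    trimmed["_source"] = False
    return trimmed


# Simplified nested fuzzy query on a single term (illustrative only).
body = {
    "query": {
        "nested": {
            "path": "education",
            "query": {
                "fuzzy": {"education.value": {"value": "Stanford", "fuzziness": 2}}
            },
            "inner_hits": {},
        }
    }
}

request = only_inner_hits(body)
```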
Hope this helps!
Source: https://stackoverflow.com/questions/53720080/elastic-search-exact-field-value-retrieval