Question
I have used parent/child mapping to normalize my data, but as far as I understand there is no way to get any fields from the _parent document.
Here is the mapping of my index:
{
  "mappings": {
    "building": {
      "properties": {
        "name": { "type": "string" }
      }
    },
    "flat": {
      "_parent": { "type": "building" },
      "properties": {
        "name": { "type": "string" }
      }
    },
    "room": {
      "_parent": { "type": "flat" },
      "properties": {
        "name":  { "type": "string" },
        "floor": { "type": "long" }
      }
    }
  }
}
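For context, this is roughly how documents would be indexed under such a mapping: each child carries its parent's ID in the `parent` URL parameter, and a grandchild (room) additionally needs explicit `routing` to the grandparent so the whole family lands on one shard. A minimal sketch, with the index name `my_index` and all IDs assumed for illustration:

```
PUT /my_index/building/1
{ "name": "Building A" }

PUT /my_index/flat/10?parent=1
{ "name": "Flat 10" }

PUT /my_index/room/100?parent=10&routing=1
{ "name": "Bedroom", "floor": 3 }
```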
Now I'm trying to find the best way of storing flat_name and building_name in the room type. I won't query these fields, but I should be able to get them back when I query other fields like floor.
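One possible workaround (a sketch under assumptions, not something from the original post) is to turn the lookup around: instead of reading parent fields from a room hit, query the flat type with a has_child query, which returns the parent documents (and thus their names) whose child rooms match. The index name `my_index` and the floor value are assumptions:

```
POST /my_index/flat/_search
{
  "query": {
    "has_child": {
      "type": "room",
      "query": { "term": { "floor": 3 } }
    }
  }
}
```

This gives the matching flats directly; getting the building names as well would need another has_child level or a second query, so it trades the storage duplication for extra query-time work.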
There will be millions of rooms and I don't have much memory, so I suspect these duplicated values may cause an out-of-memory error. For now, the flat_name and building_name fields have the "index": "no" property set, and I have turned on compression for the _source field.
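What the post describes would look roughly like this in the room mapping (a sketch; the `_source` compress option existed in pre-1.0 Elasticsearch releases, which this question's era matches):

```
{
  "room": {
    "_parent": { "type": "flat" },
    "_source": { "compress": true },
    "properties": {
      "name":          { "type": "string" },
      "floor":         { "type": "long" },
      "flat_name":     { "type": "string", "index": "no" },
      "building_name": { "type": "string", "index": "no" }
    }
  }
}
```

With `"index": "no"` the duplicated names are stored only in `_source` and never enter the inverted index, so they cost disk space but not memory-resident index structures.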
Do you have any efficient suggestion for avoiding these duplicate values, such as running multiple queries or some hacky way to get fields from the _parent document, or is denormalizing the data the only way to handle this kind of problem?
Source: https://stackoverflow.com/questions/14060845/how-can-i-handle-duplicate-data-in-elasticsearch