Question
Assume this is what a sample document looks like in MongoDB:
[
{
"_id": "1",
"attrib_1": "value_1",
"attrib_2": "value_2",
"months": {
"2": {
"month": "2",
"year": "2008",
"transactions": [
{
"field_1": "val_1",
"field_2": "val_2"
},
{
"field_1": "val_4",
"field_2": "val_5",
"field_3": "val_6"
}
]
},
"3": {
"month": "3",
"year": "2018",
"transactions": [
{
"field_1": "val_7",
"field_3": "val_9"
},
{
"field_1": "val_10",
"field_2": "val_11"
}
]
}
}
}
]
The desired output is something like this (shown only for months 2 and 3):
id | months | year | field_1 | field_2 | field_3 |
---|---|---|---|---|---|
1 | 2 | 2008 | val_1 | val_2 | |
1 | 2 | 2008 | val_4 | val_5 | val_6 |
1 | 3 | 2018 | val_7 | | val_9 |
1 | 3 | 2018 | val_10 | val_11 | |
My attempt:
I tried something like this in PyMongo:
pipeline = [
{
# placeholder: replace with the filter logic that narrows the data first
"$match": {}
},
{
"$addFields": {
"latest": {
"$map": {
"input": {
"$objectToArray": "$months",
},
"as": "obj",
"in": {
"all_field_1" : {"$ifNull" : ["$$obj.v.transactions.field_1", [""]]},
"all_field_2": {"$ifNull" : ["$$obj.v.transactions.field_2", [""]]},
"all_field_3": {"$ifNull" : ["$$obj.v.transactions.field_3", [""]]},
"all_months" : {"$ifNull" : ["$$obj.v.month", ""]},
"all_years" : {"$ifNull" : ["$$obj.v.year", ""]},
}
}
}
}
},
{
"$project": {
"_id": 1,
"months": "$latest.all_months",
"year": "$latest.all_years",
"field_1": "$latest.all_field_1",
"field_2": "$latest.all_field_2",
"field_3": "$latest.all_field_3",
}
}
]
# and I executed it as
my_db.collection.aggregate(pipeline, allowDiskUse=True)
The above does bring back the data, but it returns it in lists. Is there an easy way to bring it back one transaction per row in MongoDB itself?
The above brings the data back like this:
id | months | year | field_1 | field_2 | field_3 |
---|---|---|---|---|---|
1 | ["2", "3"] | ["2008", "2018"] | [["val_1", "val_4"], ["val_7", "val_10"]] | [["val_2", "val_5"], ["", "val_11"]] | [["", "val_6"], ["val_9", ""]] |
I would highly appreciate any input on this, including a better way to do it.
Thanks for your time.
My MongoDB version is 3.4.6 and I am using PyMongo as my driver. You can see the query in action at mongo-db-playground.
Answer 1:
It might be a bad idea to do all of the processing in an aggregation query; you could do this on the client side instead.
I have created a query, but it is lengthy and may cause performance issues on large data sets:
- $objectToArray: convert the months object to an array
- $unwind: deconstruct the months array
- $unwind: deconstruct the transactions array and provide the index field index
- $group: group by _id, year, month and index, and get the first object from transactions in fields
- $project: optional; you can use it to design your response if you want. I have added it in the playground link.
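my_db.collection.aggregate([
    # placeholder: put your filter logic here to narrow the data first
    {"$match": {}},
    # turn the months object into an array of {k, v} pairs
    {"$project": {"months": {"$objectToArray": "$months"}}},
    # one document per month
    {"$unwind": "$months"},
    # one document per transaction, keeping its array position in "index"
    {
        "$unwind": {
            "path": "$months.v.transactions",
            "includeArrayIndex": "index"
        }
    },
    # group per (_id, year, month, index) and keep the transaction object
    {
        "$group": {
            "_id": {
                "_id": "$_id",
                "year": "$months.v.year",
                "month": "$months.v.month",
                "index": "$index"
            },
            "fields": {"$first": "$months.v.transactions"}
        }
    }
], allowDiskUse=True)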
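The optional $project step is described above but not shown in the query. Below is a minimal sketch of what such a stage might look like in PyMongo. The output names (id, months, year, field_1 to field_3) are assumptions taken from the desired table in the question, not from the playground link, and FIELD_NAMES is a hypothetical helper list.

```python
# Assumed list of transaction fields to expose as columns (hypothetical name).
FIELD_NAMES = ["field_1", "field_2", "field_3"]

# Reshape each grouped document into one flat row.
# Missing transaction fields fall back to "" via $ifNull.
project_stage = {
    "$project": {
        "_id": 0,
        "id": "$_id._id",
        "months": "$_id.month",
        "year": "$_id.year",
        **{name: {"$ifNull": ["$fields." + name, ""]} for name in FIELD_NAMES},
    }
}

print(project_stage)
```

To try it, append project_stage as the last element of the pipeline list before calling aggregate. Note that $group does not guarantee output order, so a $sort stage may also be needed if row order matters.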
Playground
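For the client-side alternative mentioned at the start of this answer, a rough sketch in plain Python could look like the following. It assumes documents shaped like the sample in the question; flatten_months and its fields parameter are hypothetical names introduced here for illustration, and in practice the documents would come from a find() or aggregate() cursor rather than a literal.

```python
def flatten_months(doc, fields=("field_1", "field_2", "field_3")):
    """Yield one flat row per transaction of a document shaped like the sample."""
    for month in doc.get("months", {}).values():
        for txn in month.get("transactions", []):
            row = {
                "id": doc["_id"],
                "months": month.get("month", ""),
                "year": month.get("year", ""),
            }
            # Fill absent transaction fields with an empty string.
            for name in fields:
                row[name] = txn.get(name, "")
            yield row


# Sample document copied from the question.
sample = {
    "_id": "1",
    "attrib_1": "value_1",
    "attrib_2": "value_2",
    "months": {
        "2": {
            "month": "2",
            "year": "2008",
            "transactions": [
                {"field_1": "val_1", "field_2": "val_2"},
                {"field_1": "val_4", "field_2": "val_5", "field_3": "val_6"},
            ],
        },
        "3": {
            "month": "3",
            "year": "2018",
            "transactions": [
                {"field_1": "val_7", "field_3": "val_9"},
                {"field_1": "val_10", "field_2": "val_11"},
            ],
        },
    },
}

rows = list(flatten_months(sample))
for row in rows:
    print(row)
```

This keeps the aggregation pipeline down to the filter step and moves the reshaping into the application, which may be easier to maintain if the set of transaction fields changes.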
Source: https://stackoverflow.com/questions/65857866/how-to-retrieve-each-single-array-element-from-mongo-pipeline