I have a set of highly nested MongoDB documents, and I want to count, within each document, the number of subdocuments that match a given condition. For example:
{"_id":{"chr":"20","pos":"14371","ref":"A","alt":"G"},
 "studies":[
   {"study_id":"Study1",
    "samples":[
      {"sample_id":"NA00001", "formatdata":[ {"GT":"1|0","GQ":48,"DP":8,"HQ":[51,51]} ]},
      {"sample_id":"NA00002", "formatdata":[ {"GT":"0|0","GQ":48,"DP":8,"HQ":[51,51]} ]}
    ]}
 ]}
{"_id":{"chr":"20","pos":"14372","ref":"T","alt":"AA"},
 "studies":[
   {"study_id":"Study3",
    "samples":[
      {"sample_id":"SAMPLE1", "formatdata":[ {"GT":"1|0","GQ":48,"DP":8,"HQ":[51,51]} ]},
      {"sample_id":"SAMPLE2", "formatdata":[ {"GT":"1|0","GQ":48,"DP":8,"HQ":[51,51]} ]}
    ]}
 ]}
{"_id":{"chr":"20","pos":"14373","ref":"C","alt":"A"},
 "studies":[
   {"study_id":"Study3",
    "samples":[
      {"sample_id":"SAMPLE3", "formatdata":[ {"GT":"0|0","GQ":48,"DP":8,"HQ":[51,51]} ]},
      {"sample_id":"SAMPLE7", "formatdata":[ {"GT":"0|0","GQ":48,"DP":8,"HQ":[51,51]} ]}
    ]}
 ]}
I want to know how many subdocuments contain GT:"1|0". In this case that would be 1 in the first document, 2 in the second, and 0 in the third. I've tried $unwind and aggregate(), but I'm clearly doing something wrong. When I try to count the subdocuments by the "GT" field, Mongo complains:
db.collection.aggregate([{$group: {"$studies.samples.formatdata.GT":1,_id:0}}])
because field names in my $group cannot contain ".". Yet if I leave the path out:
db.collection.aggregate([{$group: {"$GT":1,_id:0}}])
it complains because "$GT cannot be an operator name"
Any ideas?
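For reference, here is the per-document count I am after, sketched in plain JavaScript over documents held in memory. This is not a MongoDB query, only a statement of the expected result; `countGT` is a hypothetical helper name, and the documents are trimmed to the fields that matter.

```javascript
// Plain JavaScript sketch (NOT a MongoDB query): count the formatdata
// entries in one document whose GT equals the given value.
// `countGT` is a hypothetical name used only for illustration.
function countGT(doc, gt) {
  let count = 0;
  for (const study of doc.studies || []) {
    for (const sample of study.samples || []) {
      for (const fd of sample.formatdata || []) {
        if (fd.GT === gt) count += 1;
      }
    }
  }
  return count;
}

// Trimmed copies of the three example documents above.
const docs = [
  { _id: { pos: "14371" }, studies: [{ study_id: "Study1", samples: [
    { sample_id: "NA00001", formatdata: [{ GT: "1|0" }] },
    { sample_id: "NA00002", formatdata: [{ GT: "0|0" }] },
  ] }] },
  { _id: { pos: "14372" }, studies: [{ study_id: "Study3", samples: [
    { sample_id: "SAMPLE1", formatdata: [{ GT: "1|0" }] },
    { sample_id: "SAMPLE2", formatdata: [{ GT: "1|0" }] },
  ] }] },
  { _id: { pos: "14373" }, studies: [{ study_id: "Study3", samples: [
    { sample_id: "SAMPLE3", formatdata: [{ GT: "0|0" }] },
    { sample_id: "SAMPLE7", formatdata: [{ GT: "0|0" }] },
  ] }] },
];

// One result per document: the position from _id plus the matching count.
const counts = docs.map((d) => ({ pos: d._id.pos, count: countGT(d, "1|0") }));
console.log(counts);
```

What I am looking for is an aggregation that produces the same per-document counts on the server side.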