MongoDB doesn't handle aggregation with allowDiskUse: True

The data structure looks like this:

way: {
    node: ['253333910', '3304026514']
}

I'm trying to count how often each node appears across ways. Here is my code using pymongo:

node = db.way.aggregate([
    {'$unwind': '$node'},
    {
        '$group': {
            '_id': '$node',
            'appear_count': {'$sum': 1}
        }
    },
    {'$sort': {'appear_count': -1}},
    {'$limit': 10}
],
    {'allowDiskUse': True}
)

It reports an error:

Traceback (most recent call last):
  File "<input>", line 1, in <module>
  File ".../OSM Wrangling/", line 78, in most_passed_node
    {'allowDiskUse': True}
  File ".../pymongo/", line 2181, in aggregate
  File ".../pymongo/", line 2088, in _aggregate
  File ".../pymongo/", line 464, in command
    self.validate_session(client, session)
  File ".../pymongo/", line 609, in validate_session
    if session._client is not client:
AttributeError: 'dict' object has no attribute '_client'

However, if I remove {'allowDiskUse': True} and test it on a smaller set of data, it works well. It seems that the allowDiskUse argument is what causes the problem, and I can't find any information about this error in the MongoDB docs.

How should I solve this problem and get the answer I want?


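This happens because the method signature of collection.aggregate() changed in PyMongo v3.6: an optional session parameter was added. The dict {'allowDiskUse': True} that you pass as the second positional argument is therefore treated as the session, which is why validate_session fails with the AttributeError in your traceback. The method signature is now: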

aggregate(pipeline, session=None, **kwargs)
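To see why the positional dict ends up as the session, here is a minimal sketch in plain Python. The function aggregate_like is a hypothetical stand-in that only copies the parameter layout of the signature above; it is not PyMongo code and does not talk to a server.

```python
# Hypothetical stand-in with the same parameter layout as the signature above.
# It only reports how Python binds the arguments; it runs no aggregation.
def aggregate_like(pipeline, session=None, **kwargs):
    return {"pipeline": pipeline, "session": session, "kwargs": kwargs}


stages = [{"$limit": 10}]

# A second positional argument is bound to `session`.
positional = aggregate_like(stages, {"allowDiskUse": True})

# A keyword argument is collected in **kwargs; `session` stays None.
keyword = aggregate_like(stages, allowDiskUse=True)

print(positional["session"])  # {'allowDiskUse': True}
print(keyword["session"])     # None
print(keyword["kwargs"])      # {'allowDiskUse': True}
```

In the positional call the options dict sits where a session object is expected, which matches the 'dict' object has no attribute '_client' message in the traceback.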

Applying this to your code example, you can specify allowDiskUse as below:

node = db.way.aggregate(
    pipeline=[
        {'$unwind': '$node'},
        {'$group': {
            '_id': '$node',
            'appear_count': {'$sum': 1}
        }},
        {'$sort': {'appear_count': -1}},
        {'$limit': 10}
    ],
    # Passed as a keyword argument, so it is not mistaken for session
    allowDiskUse=True
)

See also pymongo.client_session if you would like to know more about sessions.
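If you want to sanity-check the result on a small sample without a server, the unwind, group, sort and limit stages can be approximated in plain Python. This is only a rough sketch: most_common_nodes and sample_ways are hypothetical names, the second sample document is made up for illustration, and the ordering of ties may differ from what MongoDB returns.

```python
from collections import Counter


def most_common_nodes(ways, limit=10):
    """Count how often each node id appears across ways, most frequent first."""
    counts = Counter()
    for way in ways:
        # Roughly what $unwind + $group do: one count per array element.
        counts.update(way.get("node", []))
    # Roughly what $sort (descending) + $limit do.
    return counts.most_common(limit)


# Hypothetical sample in the shape shown in the question.
sample_ways = [
    {"node": ["253333910", "3304026514"]},
    {"node": ["253333910"]},
]

top = most_common_nodes(sample_ways)
print(top)  # [('253333910', 2), ('3304026514', 1)]
```

This holds every count in memory, so it is only meant for comparing against the aggregation output on a small subset, not as a replacement for allowDiskUse on the full collection.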


