PyMongo raises [errno 49] can't assign requested address after a large number of queries

Submitted anonymously (unverified) on 2019-12-03 03:03:02

Question:

I have a MongoDB collection with > 1,000,000 documents. I am performing an initial .find({ my_query }) to return a subset of those documents (~25,000 documents), which I then put into a list object.

I then loop over each of those objects, parse some values from the returned document, and perform an additional query using those parsed values via this code:

def _perform_queries(query):
    conn = pymongo.MongoClient('mongodb://localhost:27017')
    try:
        coll = conn.databases['race_results']
        races = coll.find(query).sort("date", -1)
    except BaseException, err:
        print('An error occured in runner query: %s\n' % err)
    finally:
        conn.close()
        return races

In this case, my query dictionary is:

{"$and": [{"opponents":     {"$elemMatch": {"$and": [         {"runner.name": name},         {"runner.jockey": jockey}     ]}}},     {"summary.dist": "1"} ]} 

Here is my issue. I have created an index on opponents.runner.name and opponents.runner.jockey. This makes the queries really, really fast. However, after about 10,000 queries in a row, PyMongo raises an exception:

pymongo.errors.AutoReconnect: [Errno 49] Can't assign requested address 

When I remove the index, I don't see this error. But it takes about 0.5 seconds per query, which is unusable in my case.
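
For reference, an index covering those two fields can be created with PyMongo along these lines (a sketch; my actual key order and index options may differ slightly):

import pymongo

conn = pymongo.MongoClient('mongodb://localhost:27017')
coll = conn.databases['race_results']

# Compound multikey index over the embedded runner fields used in the
# $elemMatch query above; the key order shown here is illustrative.
coll.create_index([
    ("opponents.runner.name", pymongo.ASCENDING),
    ("opponents.runner.jockey", pymongo.ASCENDING),
])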

Does anyone know why the [Errno 49] can't assign requested address error could be occurring? I've seen a few other SO questions related to can't assign requested address, but not in relation to pymongo, and their answers don't lead me anywhere.

UPDATE:

Following Serge's advice below, here is the output of ulimit -a:

core file size          (blocks, -c) unlimited
data seg size           (kbytes, -d) unlimited
file size               (blocks, -f) unlimited
max locked memory       (kbytes, -l) unlimited
max memory size         (kbytes, -m) unlimited
open files                      (-n) 2560
pipe size            (512 bytes, -p) 1
stack size              (kbytes, -s) 8192
cpu time               (seconds, -t) unlimited
max user processes              (-u) 709
virtual memory          (kbytes, -v) unlimited

My MongoDB is running on OS X Yosemite.

Answer 1:

This is because you are using PyMongo incorrectly. You are creating a new MongoClient for each query, which requires you to open a new socket for each new query. This defeats PyMongo's connection pooling, and besides being extremely slow, it also means you open and close sockets faster than your TCP stack can keep up: you leave too many sockets in TIME_WAIT state so you eventually run out of ports.

Luckily, the fix is simple. Create one MongoClient and use it throughout:

conn = pymongo.MongoClient('mongodb://localhost:27017')
coll = conn.databases['race_results']

def _perform_queries(query):
    return coll.find(query).sort("date", -1)
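
With a single shared client, the ~25,000 follow-up queries all reuse PyMongo's connection pool instead of opening a fresh socket each time. A rough sketch of how the surrounding loop might look (the initial query, the limit, and the field paths below are placeholders, not your actual code):

import pymongo

conn = pymongo.MongoClient('mongodb://localhost:27017')  # one client for the whole run
coll = conn.databases['race_results']

def _perform_queries(query):
    return coll.find(query).sort("date", -1)

# Placeholder outer loop: every follow-up query reuses the same pooled client.
for doc in coll.find({"summary.dist": "1"}).limit(25000):   # placeholder initial query
    name = doc.get("runner", {}).get("name")                # field paths are assumptions
    jockey = doc.get("runner", {}).get("jockey")
    races = _perform_queries({"$and": [
        {"opponents": {"$elemMatch": {"$and": [
            {"runner.name": name},
            {"runner.jockey": jockey}
        ]}}},
        {"summary.dist": "1"}
    ]})
    for race in races:
        pass  # process each race here

MongoClient maintains its own connection pool, so there is no need to close it between queries; create it once at startup and let it live for the life of the process.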

