CURSOR_NOT_FOUND - my cron jobs started dying in the middle

Question

A cron job that had been running successfully for years suddenly started dying at about 80% completion. I am not sure whether this is because the results collection has been growing steadily and reached some critical size (it does not seem all that big to me) or for some other reason. I am not sure how to debug this. I found the user at which the job died and ran the job for that user alone; it failed with a CURSOR_NOT_FOUND message after 2 hours. Yesterday it died after 3 hours of running for all users. I am still using an old Mongoid (2.0.0.beta) because of multiple dependencies and lack of time to upgrade, but MongoDB itself is up to date (I know about the bug in versions before 1.1.2).

I found two similar questions, but neither of them applies. In one, they used Moped, which was not production ready. In the other, the problem was in pagination.

I am getting this error message:

MONGODB cursor.refresh() for cursor xxxxxxxxx
rake aborted!
Query response returned CURSOR_NOT_FOUND. Either an invalid cursor was specified, or the cursor may have timed out on the server.

Any suggestions?

Answer 1:

A "cursor not found" error from MongoDB typically indicates that the cursor timed out on the server (after 10 minutes of inactivity). It could also indicate that the client code has become confused and is using a stale or closed cursor, or has corrupted the cursor somehow. If the 3-hour runtime included long stretches of client-side work between calls to MongoDB, that could give the server enough time to time out the cursor.

You can specify a no-timeout option on the cursor to check whether a server-side timeout of your cursor is what is causing your problem.
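As a rough sketch of what that could look like, assuming the job goes through the legacy 1.x Ruby driver (which Mongoid 2.0.0.beta sits on): that driver's `Collection#find` accepts `:timeout => false`, but only when called with a block, so that the cursor is closed when the block finishes. The helper name `each_document_without_timeout` and the collection and method names in the usage comment are made up for illustration and are not from the question.

```ruby
# Hypothetical helper: iterate over every matching document using a cursor
# that the server will not expire after 10 minutes of inactivity.
#
# Assumption: `collection` behaves like a Mongo::Collection from the legacy
# 1.x Ruby driver, where find(selector, :timeout => false) must be given a
# block and yields the cursor to it.
def each_document_without_timeout(collection, selector = {})
  collection.find(selector, :timeout => false) do |cursor|
    # The driver closes the cursor once this block returns.
    cursor.each { |doc| yield doc }
  end
end

# Possible usage inside the rake task (names are placeholders):
#
#   users = Mongo::Connection.new.db('mydb')['users']
#   each_document_without_timeout(users) do |user|
#     process(user) # long-running per-user work
#   end
```

If the job completes with the timeout disabled, the failure was most likely the server expiring an idle cursor; if it still fails, the stale or corrupted cursor explanation becomes more plausible.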

Source: https://stackoverflow.com/questions/10887797/cursor-not-found-my-cron-jobs-started-dying-in-the-middle

Tags

ruby

mongodb

rake