Is it possible to mongodump the last “x” records from a collection?

跟風遠走 submitted on 2021-02-05 13:37:17

Question


Can you use mongodump to dump the latest "x" documents from a collection? For example, in the mongo shell you can execute:

db.stats.find().sort({$natural:-1}).limit(10);

Is the same capability available in mongodump?

I guess the workaround would be to copy the above documents into a new temporary collection and mongodump the entire temp collection, but it would be great to be able to do this directly via mongodump.

Thanks in advance,

Michael


Answer 1:


mongodump does not fully expose the cursor interface, but you can work around that with the --query parameter. First, get the total number of documents in the collection:

db.collection.count()

Let's say there are 10000 documents and you want the last 1000. To do so, get the _id of the first document you want to dump:

db.collection.find().sort({_id:1}).skip(10000 - 1000).limit(1)

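In this example the _id was "50ad7bce1a3e927d690385ec". Now you can pass this value to mongodump to dump all documents with an _id greater than or equal to it. This relies on the default ObjectId values, which begin with a creation timestamp, so sorting by _id roughly follows insertion order.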

$ mongodump -d 'your_database' -c 'your_collection' -q '{_id: {$gte: ObjectId("50ad7bce1a3e927d690385ec")}}'
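The steps above lend themselves to a small script. Here is a minimal sketch of the arithmetic and of the --query string, not a finished tool. The helper names build_id_query and skip_count are made up for illustration, and the query uses the Extended JSON {"$oid": ...} spelling of an ObjectId instead of the ObjectId(...) shell syntax. The script only prints the mongodump command; it does not run it.

```shell
#!/bin/sh
# Hypothetical helper: print a --query argument that selects every document
# whose _id is greater than or equal to the given ObjectId hex string.
build_id_query() {
  # Refuse anything that is not exactly 24 hexadecimal characters.
  if ! printf '%s' "$1" | grep -Eq '^[0-9a-fA-F]{24}$'; then
    echo "not a 24-character hex ObjectId: $1" >&2
    return 1
  fi
  # Single quotes keep $gte and $oid literal; only %s is substituted.
  printf '{"_id": {"$gte": {"$oid": "%s"}}}\n' "$1"
}

# Hypothetical helper: how many documents to skip so that "keep" remain.
# Usage: skip_count TOTAL KEEP
skip_count() {
  if [ "$2" -ge "$1" ]; then
    echo 0
  else
    echo $(( $1 - $2 ))
  fi
}

query=$(build_id_query "50ad7bce1a3e927d690385ec")
echo "documents to skip: $(skip_count 10000 1000)"
echo "mongodump -d 'your_database' -c 'your_collection' -q '$query'"
```

The skip value goes into the .skip(...) call of the find() shown earlier, and the _id that find() returns is what build_id_query expects.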

UPDATE: The new parameters --limit and --skip were added to mongoexport and will probably be available in the next version of the tool: https://github.com/mongodb/mongo/pull/307




Answer 2:


Building off of Mic92's answer, to get the most recent 1000 items from a collection:

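Find the _id of the item just older than the 1000 most recent ones (skipping 1000 in descending order returns the 1001st newest):

db.collection.find({}, {'_id':1}).sort({_id:-1}).skip(1000).limit(1)

It will be something like 50ad7bce1a3e927d690385ec.

Then pass this _id in a $gt query to mongodump, which dumps everything newer than that item: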

$ mongodump -d 'your_database' -c 'your_collection' -q '{"_id": {"$gt": {"$oid": "50ad7bce1a3e927d690385ec"}}}'




Answer 3:


mongodump supports the --query option. If you can express your selection as a JSON query, you should be able to do just that.

If not, then your trick of copying the records into a temporary collection and then dumping that will work just fine. In this case, you could automate the dump with a shell script that calls the mongo shell with a JavaScript command to fill the temporary collection and then calls mongodump.
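A rough sketch of what such a script might look like follows. All names in it (stats, stats_tmp, your_database) are placeholders, and it assumes the legacy mongo shell with its --eval option. It prints the two commands instead of executing them, so you can inspect them first.

```shell
#!/bin/sh
# Placeholder names: source collection, temporary collection, document count.
SRC="stats"
TMP="stats_tmp"
LIMIT=10

# JavaScript for the mongo shell: empty the temporary collection, then copy
# the newest $LIMIT documents (reverse natural order) into it.
# The format string is single-quoted so $natural stays literal.
js=$(printf 'db.%s.drop(); db.%s.find().sort({$natural: -1}).limit(%s).forEach(function (d) { db.%s.insert(d); });' \
  "$TMP" "$SRC" "$LIMIT" "$TMP")

# Step 1: fill the temporary collection.
echo "mongo your_database --quiet --eval '$js'"
# Step 2: dump only the temporary collection.
echo "mongodump -d your_database -c $TMP"
```

Keep in mind that the dump will carry the temporary collection's name, so a restore would need to target the right collection.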




Answer 4:


I was playing with a similar requirement (using mongodump), where I wanted to do sequential backup and restore: each run would dump everything since the last stored timestamp. I couldn't get this to work:

--query '{ TIMESTAMP : { $gte : $stime, $lt : $etime } }'

Some points to note:

1) Use single quotes instead of double quotes.
2) Do not escape $ or anything else.
3) Replacing $stime/$etime with real numbers makes the query work.
4) The problem I had was getting $stime/$etime resolved before mongodump executes. Under -x it showed as:

+ eval mongodump --query '{TIMESTAMP:{\$gte:$utc_stime,\$lt:$utc_etime}}'
++ mongodump --query '{TIMESTAMP:$gte:1366700243}' '{TIMESTAMP:$lt:1366700253}'

Hell, the problem was evident: the query gets converted into two conditionals, because the shell's brace expansion splits the inner {...,...} at the comma when eval re-parses the line.

The solution is tricky and I only got it after repeated trials: escape { and }, i.e. use \{ ... \}. This fixes the problem.
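An alternative that sidesteps eval altogether is to build the query in a variable first and pass it as one quoted argument. This is only a sketch under assumptions: the two timestamps are the sample values from the trace, count_args is a made-up helper that shows how many arguments a command would receive, and the field name is quoted as in strict JSON.

```shell
#!/bin/sh
# Sample values standing in for $stime and $etime.
stime=1366700243
etime=1366700253

# The format string is single-quoted, so $gte and $lt stay literal and only
# the %s placeholders are replaced by the shell variables.
query=$(printf '{"TIMESTAMP": {"$gte": %s, "$lt": %s}}' "$stime" "$etime")

# Hypothetical helper: print the number of arguments it was called with.
count_args() { echo "$#"; }

# With the variable quoted, the whole query stays a single argument.
count_args --query "$query"    # prints 2
echo "$query"

# The real call would then be (plus your -d/-c options):
#   mongodump --query "$query"
```

Because the variable is expanded inside double quotes and no eval is involved, the braces and the comma are never re-parsed by the shell.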




Answer 5:


Try this:

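NUM=10000
doc=selected_doc   # name of the collection to dump
# _id of the document NUM places behind the newest; grep keeps the quoted 24-character hex string
taskid=$(mongo 127.0.0.1/selected_db -u username -p password --quiet --eval "db.${doc}.find({}, {_id: 1}).sort({_id: -1}).skip($NUM).limit(1)" | grep -E -o '"[0-9a-f]{24}"')
# Wrap the id in {"$oid": ...} so it is compared as an ObjectId rather than as a plain string
mongodump --collection "$doc" --db selected_db --host 127.0.0.1 -u username -p password -q "{\"_id\": {\"\$gte\": {\"\$oid\": $taskid}}}" --out "${doc}.dump"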



Answer 6:


_id-based approaches may not work if you use a custom _id for your collection (such as one returned by a third-party API). In that case, you should rely on a createdAt or equivalent field:

COL="collectionName"
HOW_MANY=10000

DATE_CUTOFF=$(mongo <host, user, pass...> dbname --quiet \
--eval "db.$COL.find({}, { createdAt: 1 }).sort({ createdAt: -1 }).skip($HOW_MANY).limit(1)" \
| grep -E -o 'ISODate\("[^"]*"\)')

echo "Copying $HOW_MANY items after $DATE_CUTOFF..."

mongodump <host, user, pass...> -d dbname -c "${COL}" \
-q "{ createdAt: { \$gte: $DATE_CUTOFF } }" --gzip
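To see what the grep step hands over to mongodump, you can try it on a single line of shell output. The sample line below is invented for illustration and assumes the legacy mongo shell's output format; the bracket expression [^"]* is used because grep -E has no portable non-greedy quantifier.

```shell
#!/bin/sh
# Invented sample of what the mongo shell might print for one document.
sample='{ "_id" : "abc-123", "createdAt" : ISODate("2021-02-05T13:37:17Z") }'

# Keep only the ISODate("...") expression; [^"]* stops at the closing quote.
cutoff=$(printf '%s\n' "$sample" | grep -E -o 'ISODate\("[^"]*"\)')

echo "$cutoff"    # prints ISODate("2021-02-05T13:37:17Z")
```

The extracted text is then dropped unchanged into the createdAt condition of the query.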


Source: https://stackoverflow.com/questions/7828817/is-it-possible-to-mongodump-the-last-x-records-from-a-collection
