问题
Can you use mongodump to dump the latest "x" documents from a collection? For example, in the mongo shell you can execute:
db.stats.find().sort({$natural:-1}).limit(10);
Is this same capability available to mongodump?
I guess the workaround would be to dump the above documents into a new temporary collection and mongodump the entire temp collection, but would be great to just be able to do this via mongodump.
Thanks in advance,
Michael
回答1:
mongodump does not fully expose the cursor interfaces.
But you can work around it, using the --query parameter.
First get the total number of documents of the collection
db.collection.count()
Let's say there are 10000 documents and you want the last 1000. To do so get the id of first document you want to dump.
db.collection.find().sort({_id:1}).skip(10000 - 1000).limit(1)
In this example the id was "50ad7bce1a3e927d690385ec".
Now you can feed mongodump with this information, to dump all documents a with higher or equal id.
$ mongodump -d 'your_database' -c 'your_collection' -q '{_id: {$gte: ObjectId("50ad7bce1a3e927d690385ec")}}'
UPDATE
The new parameters --limit and --skip were added to mongoexport will be probably available in the next version of the tool: https://github.com/mongodb/mongo/pull/307
回答2:
Building off of Mic92's answer, to get the most recent 1000 items from a collection:
Find the _id of the 1000th most recent item:
db.collection.find('', {'_id':1}).sort({_id:-1}).skip(1000).limit(1)
It will be something like 50ad7bce1a3e927d690385ec.
Then pass this _id in a query to mongodump:
$ mongodump -d 'your_database' -c 'your_collection' -q '{"_id": {"$gt": {"$oid": "50ad7bce1a3e927d690385ec"}}}'
回答3:
mongodump supports the --query operator. If you can specify your query as a json query, you should be able to do just that.
If not, then your trick of running a query to dump the records into a temporary collection and then dumping that will work just fine. In this case, you could automate the dump using a shell script that calls a mongo with a javascript command to do what you want and then calling mongodump.
回答4:
I was playing with a similar requirement (using mongodump) where I wanted to do sequential backup and restore. I would take dump from last stored timestamp. I couldn't get through --query '{ TIMESTAMP : { $gte : $stime, $lt : $etime } }'
Some points to note: 1) use single quote instead of double 2) do not escape $ or anything 3) replacing $stime/$etime with real numbers will make the query work 4) problem I had was with getting $stime/$etime resolved before mongodump executes itself under -x it showed as + eval mongodump --query '{TIMESTAMP:{\$gte:$utc_stime,\$lt:$utc_etime}}' ++ mongodump --query '{TIMESTAMP:$gte:1366700243}' '{TIMESTAMP:$lt:1366700253}'
Hell, the problem was evident. query gets converted into two conditionals.
The solution is tricky and I got it after repeated trials.... escape { and } ie use { ..} . This fixes the problem.
回答5:
try this:
NUM=10000
doc=selected_doc
taskid=$(mongo 127.0.0.1/selected_db -u username -p password --eval "db.${doc}.find({}, {_id: 1}).sort({_id: -1}).skip($NUM).limit(1)" | grep -E -o '"[0-9a-f]+"')
mongodump --collection $doc --db selected_db --host 127.0.0.1 -u username -p password -q "{_id: {\$gte: $taskid}}" --out ${doc}.dump
回答6:
_id-based approaches may not work if you use a custom _id for your collection (such as returned by a 3rd party API). In that case, you should depend on a createdAt or equivalent field:
COL="collectionName"
HOW_MANY=10000
DATE_CUTOFF=$(mongo <host, user, pass...> dbname --quiet \
--eval "db.$COL.find({}, { createdAt: 1 }).sort({ createdAt: -1 }).skip($HOW_MANY).limit(1)"\
| grep -E -o '(ISODate\(.*?\))')
echo "Copying $HOW_MANY items after $DATE_CUTOFF..."
mongodump <host, user, pass...> -d dbname -c ${COL}\
-q "{ createdAt: { \$gte: $DATE_CUTOFF} }" --gzip
来源:https://stackoverflow.com/questions/7828817/is-it-possible-to-mongodump-the-last-x-records-from-a-collection