RS102 MongoDB on ReplicaSet

Submitted by 那年仲夏 on 2020-01-01 17:26:20

Question


I have set up a replica set with 4 servers.

For testing purposes, I wrote a script to fill my database with ~150 million rows of photos using GridFS. My photos are around ~15 KB each. (It shouldn't be a problem to use GridFS for small files, should it?!)

After a few hours, there were around 50 million rows, but I found this message in the logs:

replSet error RS102 too stale to catch up, at least from 192.168.0.1:27017

And here is the replSet status:

 rs.status();
{
"set" : "rsdb",
"date" : ISODate("2012-07-18T09:00:48Z"),
"myState" : 1,
"members" : [
    {
        "_id" : 0,
        "name" : "192.168.0.1:27017",
        "health" : 1,
        "state" : 1,
        "stateStr" : "PRIMARY",
        "optime" : {
            "t" : 1342601552000,
            "i" : 245
        },
        "optimeDate" : ISODate("2012-07-18T08:52:32Z"),
        "self" : true
    },
    {
        "_id" : 1,
        "name" : "192.168.0.2:27018",
        "health" : 1,
        "state" : 3,
        "stateStr" : "RECOVERING",
        "uptime" : 64770,
        "optime" : {
            "t" : 1342539026000,
            "i" : 5188
        },
        "optimeDate" : ISODate("2012-07-17T15:30:26Z"),
        "lastHeartbeat" : ISODate("2012-07-18T09:00:47Z"),
        "pingMs" : 0,
        "errmsg" : "error RS102 too stale to catch up"
    },
    {
        "_id" : 2,
        "name" : "192.168.0.3:27019",
        "health" : 1,
        "state" : 3,
        "stateStr" : "RECOVERING",
        "uptime" : 64735,
        "optime" : {
            "t" : 1342539026000,
            "i" : 5188
        },
        "optimeDate" : ISODate("2012-07-17T15:30:26Z"),
        "lastHeartbeat" : ISODate("2012-07-18T09:00:47Z"),
        "pingMs" : 0,
        "errmsg" : "error RS102 too stale to catch up"
    },
    {
        "_id" : 3,
        "name" : "192.168.0.4:27020",
        "health" : 1,
        "state" : 3,
        "stateStr" : "RECOVERING",
        "uptime" : 65075,
        "optime" : {
            "t" : 1342539085000,
            "i" : 3838
        },
        "optimeDate" : ISODate("2012-07-17T15:31:25Z"),
        "lastHeartbeat" : ISODate("2012-07-18T09:00:46Z"),
        "pingMs" : 0,
        "errmsg" : "error RS102 too stale to catch up"
    }
],
"ok" : 1

The set is still accepting data, but since my 3 secondaries are "DOWN", how should I proceed to repair this (something nicer than deleting the data and re-syncing, which will take ages but will work)?

And especially: is this happening because my script was too aggressive? In other words, would it almost never happen in production?


Answer 1:


You don't need to repair; simply perform a full resync.

On the secondary, you can:

  1. stop the failed mongod
  2. delete all data in the dbpath (including subdirectories)
  3. restart it and it will automatically resynchronize itself

Follow the instructions here.
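A minimal sketch of those steps from the mongo shell, assuming you can connect to the stale member directly; the data-directory removal itself happens outside the shell, at the OS level:

    // Connect to the stale secondary (e.g. 192.168.0.2:27018) and shut it down cleanly:
    db.getSiblingDB("admin").shutdownServer()

    // At the OS level, delete everything under that member's dbpath (including
    // subdirectories), then start mongod again with the same --replSet options.
    // On restart it performs an automatic initial sync from the primary.

    // From the primary, watch the member move from STARTUP2/RECOVERING back to SECONDARY:
    rs.status()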

What's happened in your case is that your secondaries have become stale, i.e. there is no common point in their oplog and that of the oplog on the primary. Look at this document, which details the various statuses. The writes to the primary member have to be replicated to the secondaries and your secondaries couldn't keep up until they eventually went stale. You will need to consider resizing your oplog.

Regarding oplog size, it depends on how much data you insert/update over time. I would choose a size that allows you many hours or even days of oplog.
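To see how much oplog headroom you actually have, something like the following shell helpers gives a quick read:

    // Run against the member whose oplog you want to inspect. Prints the
    // configured oplog size and the time window it currently spans
    // ("log length start to end") -- i.e. how long a member can lag before going stale.
    db.printReplicationInfo()

    // Run on the primary. Prints how far behind each secondary's optime is.
    db.printSlaveReplicationInfo()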

Additionally, I'm not sure which O/S you are running. However, for 64-bit Linux, Solaris, and FreeBSD systems, MongoDB will allocate 5% of the available free disk space to the oplog. If this amount is smaller than a gigabyte, then MongoDB will allocate 1 gigabyte of space. For 64-bit OS X systems, MongoDB allocates 183 megabytes of space to the oplog and for 32-bit systems, MongoDB allocates about 48 megabytes of space to the oplog.

How big are your records, and how many do you need to store? It depends on whether this rate of insertion is typical for your workload or something abnormal that you were merely testing.

For example, at 2000 documents per second for documents of 1KB, that would net you 120MB per minute and your 5GB oplog would last about 40 minutes. This means if the secondary ever goes offline for 40 minutes or falls behind by more than that, then you are stale and have to do a full resync.
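The same back-of-the-envelope calculation as a throwaway shell snippet, using the hypothetical 2000 docs/sec at ~1 KB figures from above:

    // Hypothetical load from the example above: 2000 docs/sec at ~1 KB each.
    var bytesPerSec = 2000 * 1024;                // ~2 MB/s of oplog traffic
    var oplogBytes  = 5 * 1024 * 1024 * 1024;     // a 5 GB oplog
    var windowMins  = oplogBytes / bytesPerSec / 60;
    print(windowMins.toFixed(0) + " minutes");    // ~44 minutes, i.e. roughly the 40 above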

I recommend reading the Replica Set Internals document here. You have 4 members in your replica set, which is not recommended. You should have an odd number of voters for the primary election process, so you should either add an arbiter or another secondary, or remove one of your secondaries.
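For example, adding an arbiter from the primary might look like this (the host and port below are placeholders; an arbiter votes but holds no data):

    // Run on the primary. rs.addArb() adds a voting, non-data-bearing member,
    // taking the set from 4 voters to an odd 5.
    rs.addArb("192.168.0.5:27021")   // placeholder host:port for the arbiter

    // Confirm the new member shows up as ARBITER:
    rs.status()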

Finally, here's a detailed document on RS administration.



Source: https://stackoverflow.com/questions/11537977/rs102-mongodb-on-replicaset
