How do elections happen in MongoDB 3.0 replication when a secondary goes down?

Situation: I have a MongoDB replica set spread over two computers.

  • One computer is a server that holds the primary node and the arbiter. This server is a live server and is always on. Its local IP used in replication is 192.168.0.4.
  • The second is a PC that the secondary node resides on; it is on only for a few hours a day. Its local IP used in replication is 192.168.0.5.
  • My expectation: I want the live server to be the main point of data interaction for my application, regardless of the state of the PC (whether it is reachable or not, since the PC is the secondary), so I want to make sure that the server's node is always primary.

    The following is the result of rs.config():

    liveSet:PRIMARY> rs.config()
    {
        "_id" : "liveSet",
        "version" : 2,
        "members" : [
            {
                "_id" : 0,
                "host" : "192.168.0.4:27017",
                "arbiterOnly" : false,
                "buildIndexes" : true,
                "hidden" : false,
                "priority" : 10,
                "tags" : {
    
                },
                "slaveDelay" : 0,
                "votes" : 1
            },
            {
                "_id" : 1,
                "host" : "192.168.0.5:5051",
                "arbiterOnly" : false,
                "buildIndexes" : true,
                "hidden" : false,
                "priority" : 1,
                "tags" : {
    
                },
                "slaveDelay" : 0,
                "votes" : 1
            },
            {
                "_id" : 2,
                "host" : "192.168.0.4:5052",
                "arbiterOnly" : true,
                "buildIndexes" : true,
                "hidden" : false,
                "priority" : 1,
                "tags" : {
    
                },
                "slaveDelay" : 0,
                "votes" : 1
            }
        ],
        "settings" : {
            "chainingAllowed" : true,
            "heartbeatTimeoutSecs" : 10,
            "getLastErrorModes" : {
    
            },
            "getLastErrorDefaults" : {
                "w" : 1,
                "wtimeout" : 0
            }
        }
    }
    

    Also, I have set the storage engine to WiredTiger, in case that matters.
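
    For reference, these priority values can be changed later from the primary's shell with rs.reconfig(). A minimal sketch, assuming the member order shown in the config above:

    cfg = rs.conf()
    cfg.members[0].priority = 10  // 192.168.0.4:27017 - the server, preferred primary
    cfg.members[1].priority = 1   // 192.168.0.5:5051  - the secondary on the PC
    rs.reconfig(cfg)              // must be run on the current primary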

    What I actually get, and the problem: when I turn off the PC, or kill its mongod process, the node on the server becomes a secondary.

    The following is the output on the server when I killed the PC's mongod process, while connected to the primary node's shell:

    liveSet:PRIMARY>
    2015-11-29T10:46:29.471+0430 I NETWORK  Socket recv() errno:10053 An established connection was aborted by the software in your host machine. 127.0.0.1:27017
    2015-11-29T10:46:29.473+0430 I NETWORK  SocketException: remote: 127.0.0.1:27017 error: 9001 socket exception [RECV_ERROR] server [127.0.0.1:27017]
    2015-11-29T10:46:29.475+0430 I NETWORK  DBClientCursor::init call() failed
    2015-11-29T10:46:29.479+0430 I NETWORK  trying reconnect to 127.0.0.1:27017 (127.0.0.1) failed
    2015-11-29T10:46:29.481+0430 I NETWORK  reconnect 127.0.0.1:27017 (127.0.0.1) ok
    liveSet:SECONDARY>
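
    To confirm the role change from the shell, rather than relying only on the prompt, the node can be queried directly; a quick sketch:

    db.isMaster().ismaster   // false once the node has stepped down
    db.isMaster().secondary  // true while it is a secondary
    rs.status().myState      // 1 == PRIMARY, 2 == SECONDARY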
    


    I have two doubts:

  • Considering this part of the MongoDB documentation:

    "Replica sets use elections to determine which set member will become primary. Elections occur after initiating a replica set, and also any time the primary becomes unavailable."

    An election occurs when the primary is not available (or when the set is initiated, but that part does not concern our case). However, the primary was always available, so why does an election happen?

  • Considering this part of the same documentation:

    "If a majority of the replica set is inaccessible or unavailable, the replica set cannot accept writes and all remaining members become read-only."

    Considering the part about 'members become read-only': I have two nodes up (the server's data node plus the arbiter, i.e. two of the three votes, which is still a majority) versus one down, so this should not affect the replica set either.

    Now my question: how can I keep the node on the server as primary when the node on the PC is not reachable?

    Update: This is the output of rs.status().

    Thanks to Wan Bachtiar, this now makes the behavior obvious: the arbiter was not reachable.

    liveSet:PRIMARY> rs.status()
    {
        "set" : "liveSet",
        "date" : ISODate("2015-11-30T04:33:03.864Z"),
        "myState" : 1,
        "members" : [
            {
                "_id" : 0,
                "name" : "192.168.0.4:27017",
                "health" : 1,
                "state" : 1,
                "stateStr" : "PRIMARY",
                "uptime" : 1807553,
                "optime" : Timestamp(1448796026, 1),
                "optimeDate" : ISODate("2015-11-29T11:20:26Z"),
                "electionTime" : Timestamp(1448857488, 1),
                "electionDate" : ISODate("2015-11-30T04:24:48Z"),
                "configVersion" : 2,
                "self" : true
            },
            {
                "_id" : 1,
                "name" : "192.168.0.5:5051",
                "health" : 1,
                "state" : 2,
                "stateStr" : "SECONDARY",
                "uptime" : 496,
                "optime" : Timestamp(1448796026, 1),
                "optimeDate" : ISODate("2015-11-29T11:20:26Z"),
                "lastHeartbeat" : ISODate("2015-11-30T04:33:03.708Z"),
                "lastHeartbeatRecv" : ISODate("2015-11-30T04:33:02.451Z"),
                "pingMs" : 1,
                "configVersion" : 2
            },
            {
                "_id" : 2,
                "name" : "192.168.0.4:5052",
                "health" : 0,
                "state" : 8,
                "stateStr" : "(not reachable/healthy)",
                "uptime" : 0,
                "lastHeartbeat" : ISODate("2015-11-30T04:33:00.008Z"),
                "lastHeartbeatRecv" : ISODate("1970-01-01T00:00:00Z"),
                "configVersion" : -1
            }
        ],
        "ok" : 1
    }
    liveSet:PRIMARY>
    

    As stated in the documentation, if a majority of the replica set is inaccessible or unavailable, the replica set cannot accept writes and all remaining members become read-only.

    In this case the primary has to step down if both the arbiter and the secondary are unreachable, because it can no longer see a majority of the set's voting members. rs.status() should be able to determine the health of the replica set members.
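
    As an illustration, each member's reported health and state can be pulled straight out of rs.status() in the shell (a small sketch using the fields shown in the output above):

    // print one line per member: name, health flag and state
    rs.status().members.forEach(function (m) {
        print(m.name + "  health=" + m.health + "  state=" + m.stateStr);
    });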

    One thing you should also watch is the primary's oplog size. The size of the oplog determines how long a replica set member can be down and still be able to catch up when it comes back online. The bigger the oplog, the longer you can tolerate a member being down, as the oplog can hold more operations. If a member falls too far behind, you must resynchronise it by removing its data files and performing an initial sync.

    See Check the size of the Oplog for more info.
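
    For a quick look at the oplog from the shell, these built-in helpers can be used (a sketch; the exact output depends on the deployment):

    rs.printReplicationInfo()       // configured oplog size and the time window it covers
    db.printSlaveReplicationInfo()  // how far each secondary is behind the primary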

    Regards,

    Wan.
