Restoring a MongoDB config server replica set stuck in RECOVERING
I have a MongoDB 3.2 sharded cluster. Recently I had an issue with a WiredTigerLAS.wt file growing out of control in the data folder of one of the secondaries of the config server replica set. This config server replica set has 3 servers.
Since the file belonged to a replica set member, I simply shut down all mongod instances for the config server replica set, as well as the shards and the mongos instance. Then I removed WiredTigerLAS.wt from the data folder of the affected secondary. I expected that starting the config server mongod instances again would restore everything through an initial sync, but it did not. Instead, my mongod instances would not stay up for more than a few seconds.
Following https://docs.mongodb.com/manual/tutorial/resync-replica-set-member/ I decided to manually copy the config replica set data files from the primary to the affected secondary and then start over. The problem is that after doing so, all my servers went into the RECOVERING state, which has now lasted for 4 days, and they do not seem to be synchronizing. I include the rs.status() output below (every server is in the RECOVERING state, and the supposed primary reports "could not find member to sync from"):
confreplSet:RECOVERING> rs.status()
{
"set" : "confreplSet",
"date" : ISODate("2017-09-21T16:03:41.471Z"),
"myState" : 3,
"term" : NumberLong(28),
"configsvr" : true,
"heartbeatIntervalMillis" : NumberLong(2000),
"members" : [
{
"_id" : 0,
"name" : "vm01170-htskernelmongo01v:27100",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 1279,
"optime" : {
"ts" : Timestamp(1503957698, 3),
"t" : NumberLong(28)
},
"optimeDate" : ISODate("2017-08-28T22:01:38Z"),
"infoMessage" : "could not find member to sync from",
"configVersion" : 1,
"self" : true
},
{
"_id" : 1,
"name" : "vm01171-htskernelmongo02v:27100",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 1278,
"optime" : {
"ts" : Timestamp(1503957698, 3),
"t" : NumberLong(28)
},
"optimeDate" : ISODate("2017-08-28T22:01:38Z"),
"lastHeartbeat" : ISODate("2017-09-21T16:03:39.055Z"),
"lastHeartbeatRecv" : ISODate("2017-09-21T16:03:40.913Z"),
"pingMs" : NumberLong(0),
"configVersion" : 1
},
{
"_id" : 2,
"name" : "vm01172-htskernelmongo03v:27100",
"health" : 1,
"state" : 3,
"stateStr" : "RECOVERING",
"uptime" : 1278,
"optime" : {
"ts" : Timestamp(1503957698, 3),
"t" : NumberLong(28)
},
"optimeDate" : ISODate("2017-08-28T22:01:38Z"),
"lastHeartbeat" : ISODate("2017-09-21T16:03:39.054Z"),
"lastHeartbeatRecv" : ISODate("2017-09-21T16:03:41.106Z"),
"pingMs" : NumberLong(0),
"configVersion" : 1
}
],
"ok" : 1
}
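To make the situation in this output easier to see at a glance, here is a minimal Python sketch that summarizes a status document of this shape. The helper name `summarize_status` is hypothetical, the `status` dict is a hand-trimmed copy of the output above with timestamps written as plain tuples, and nothing here talks to a live server.

```python
# Hypothetical helper: summarize a replSetGetStatus-style document.
# The dict below is a trimmed, hand-copied version of the rs.status() output;
# optime timestamps are represented as (seconds, increment) tuples.

def summarize_status(status):
    members = status["members"]
    # Map each member name to its reported state string.
    states = {m["name"]: m["stateStr"] for m in members}
    # A healthy replica set has exactly one PRIMARY.
    has_primary = any(m["stateStr"] == "PRIMARY" for m in members)
    # If every member reports the same optime, nobody is ahead of anybody else.
    optimes = {m["optime"]["ts"] for m in members}
    return {
        "states": states,
        "has_primary": has_primary,
        "same_optime": len(optimes) == 1,
    }

status = {
    "set": "confreplSet",
    "members": [
        {"name": "vm01170-htskernelmongo01v:27100", "stateStr": "RECOVERING",
         "optime": {"ts": (1503957698, 3), "t": 28}},
        {"name": "vm01171-htskernelmongo02v:27100", "stateStr": "RECOVERING",
         "optime": {"ts": (1503957698, 3), "t": 28}},
        {"name": "vm01172-htskernelmongo03v:27100", "stateStr": "RECOVERING",
         "optime": {"ts": (1503957698, 3), "t": 28}},
    ],
}

summary = summarize_status(status)
for name, state in summary["states"].items():
    print(name, state)
print("has primary:", summary["has_primary"])
print("all optimes equal:", summary["same_optime"])
```

In other words, all three members sit at the same optime from 2017-08-28 and none of them is PRIMARY, so there is no member any of them could sync from.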
I have also considered following https://docs.mongodb.com/manual/tutorial/restore-replica-set-from-backup/ with the data files of my config server replica set. The problem is that I cannot reinitiate the replica set: when I start the mongod instance with my data files, the replica set is already initiated with all the servers. I also cannot remove servers, because every member is in the RECOVERING state and there is no primary.
Any help? Thank you very much in advance