Persistent restart of a host in a replica set

Tricky_N_A · January 25, 2022, 12:01pm

Good day! We use a sharded cluster of 4 nodes (Primary, Secondary, Arbiter). After a power outage the servers rebooted, and since then the Primary has been restarting constantly.
Please tell me how to solve the problem. Operation log below.

2022-01-25T09:07:43.518+0000 I SHARDING [conn353] remotely refreshing metadata for private-messages.conversation-mutuality with requested shard version 41|73||5848c561e51f80cc3fce100d, current shard version is 0|0||000000000000000000000000, current metadata version is 0|0||000000000000000000000000
2022-01-25T09:07:43.524+0000 I SHARDING [conn353] collection rbh-private-messages.conversation-mutuality was previously unsharded, new metadata loaded with shard version 41|73||5848c561e51f80cc3fce100d
2022-01-25T09:07:43.524+0000 I SHARDING [conn353] collection version was loaded at version 41|76||5848c561e51f80cc3fce100d, took 5ms
2022-01-25T09:07:43.533+0000 I SHARDING [conn393] request split points lookup for chunk contacts.contacts { : 4087333471265970141 } -->> { : 5113103397945695205 }
2022-01-25T09:07:43.535+0000 I ASIO     [NetworkInterfaceASIO-ShardRegistry-0] Successfully connected to conf-2:49018
2022-01-25T09:07:43.635+0000 I NETWORK  [conn457] end connection 10.200.202.15:38262 (432 connections now open)
2022-01-25T09:07:43.635+0000 I NETWORK  [conn478] end connection 10.200.202.15:38308 (432 connections now open)
2022-01-25T09:07:43.639+0000 I SHARDING [conn393] request split points lookup for chunk contacts.contacts { : -21698294565895821 } -->> { : 1006777624207005004 }
2022-01-25T09:07:43.639+0000 F STORAGE  [conn364] Unique index cursor seeing multiple records for key { : 79459563, : 76959901 }
2022-01-25T09:07:43.640+0000 I -        [conn364] Fatal Assertion 28608
2022-01-25T09:07:43.640+0000 I -        [conn364] 

***aborting after fassert() failure


2022-01-25T09:07:43.649+0000 I NETWORK  [conn297] end connection 10.200.202.15:37948 (430 connections now open)
2022-01-25T09:07:43.649+0000 I NETWORK  [conn415] end connection 10.200.202.15:38186 (429 connections now open)
2022-01-25T09:07:43.657+0000 F -        [conn364] Got signal: 6 (Aborted).

kevinadi · January 27, 2022, 5:41am

Hi @Tricky_N_A welcome to the community!

The standout message I see in the logs you posted is "Unique index cursor seeing multiple records for key".
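If the log is long, a small filter can help surface lines like this one. The following is only a sketch: it assumes the plain-text log layout seen in the excerpt above (timestamp, one-letter severity, component, bracketed context, message), and the regular expression and the function name `fatal_entries` are my own, not part of any MongoDB tool.

```python
import re

# Assumed layout, taken from the excerpt in this thread:
# <timestamp> <severity letter> <component> [<context>] <message>
LOG_LINE = re.compile(
    r"^(?P<ts>\S+)\s+(?P<severity>[DIWEF])\s+(?P<component>\S+)\s+"
    r"\[(?P<context>[^\]]+)\]\s*(?P<message>.*)$"
)


def fatal_entries(lines):
    """Return (timestamp, component, context, message) for severity-F lines."""
    found = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group("severity") == "F":
            found.append(
                (
                    match.group("ts"),
                    match.group("component"),
                    match.group("context"),
                    match.group("message"),
                )
            )
    return found


# Lines copied from the log posted above.
sample = [
    "2022-01-25T09:07:43.635+0000 I NETWORK  [conn457] end connection 10.200.202.15:38262 (432 connections now open)",
    "2022-01-25T09:07:43.639+0000 F STORAGE  [conn364] Unique index cursor seeing multiple records for key { : 79459563, : 76959901 }",
    "2022-01-25T09:07:43.640+0000 I -        [conn364] Fatal Assertion 28608",
    "2022-01-25T09:07:43.657+0000 F -        [conn364] Got signal: 6 (Aborted).",
]

for entry in fatal_entries(sample):
    print(entry)
```

Note that the "Fatal Assertion 28608" line is logged at severity I in this excerpt, so filtering on F alone would miss it; searching for the connection name (here conn364) around the first F line gives the fuller picture.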

There was an issue in some MongoDB versions where unique index constraints were not always enforced. It affects MongoDB 4.4.7, 5.0.0 and 5.0.1, and is resolved in 4.4.8 and 5.0.2 (see SERVER-58936).

If you’re running one of the affected versions, please upgrade the cluster to a newer version that is not affected by the issue.
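As a quick way to check a node against that list, here is a minimal sketch. It assumes you already have the version string, for example from `db.version()` in the shell or `mongod --version`; the helper names `parse_version` and `is_affected` are hypothetical, and the version set only covers the releases named in this thread.

```python
# Releases named in this thread as affected by SERVER-58936.
AFFECTED_VERSIONS = {(4, 4, 7), (5, 0, 0), (5, 0, 1)}


def parse_version(version):
    """Turn '4.4.7' or '5.0.1-rc0' into a tuple of ints, ignoring any suffix."""
    core = version.strip().split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


def is_affected(version):
    """True if the version is one of the releases listed above."""
    return parse_version(version) in AFFECTED_VERSIONS


for candidate in ["4.4.7", "4.4.8", "5.0.1", "5.0.2"]:
    print(candidate, is_affected(candidate))
```

Every member of the cluster should be checked, since nodes are not guaranteed to run the same release.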

To remediate the duplicate key issue, please follow the steps outlined in the DIAGNOSIS AND REMEDIATION section in the above JIRA ticket.
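Separately from the ticket's steps, it can be useful to see which documents share a key that is supposed to be unique. The sketch below is not taken from the ticket; it only builds a standard `$group` / `$match` aggregation pipeline. The field names `user_id` and `contact_id` are hypothetical placeholders, because the log prints the key values without field names, so substitute the fields of your actual unique index.

```python
def duplicate_key_pipeline(index_fields):
    """Build a pipeline that groups documents by the index fields and keeps
    only groups with more than one document (i.e. duplicate keys)."""
    # Keys inside the _id sub-document may not contain dots, so replace them.
    group_id = {name.replace(".", "_"): "$" + name for name in index_fields}
    return [
        {
            "$group": {
                "_id": group_id,
                "count": {"$sum": 1},
                "ids": {"$push": "$_id"},  # _id of every document in the group
            }
        },
        {"$match": {"count": {"$gt": 1}}},
    ]


# Hypothetical field names; replace with the fields of the unique index.
pipeline = duplicate_key_pipeline(["user_id", "contact_id"])
print(pipeline)

# Possible use with PyMongo (assumes `collection` is a
# pymongo.collection.Collection you have already connected to):
#
#   for group in collection.aggregate(pipeline, allowDiskUse=True):
#       print(group["_id"], group["count"], group["ids"])
```

Grouping reads every document in the collection, so on a large collection it is best run at a quiet time. Deciding which of the duplicate documents to keep is still a manual step.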

Best regards,
Kevin

Tricky_N_A · February 7, 2022, 2:30am

Hi @kevinadi Thanks for the answer! I’ll try to upgrade the cluster, and after the upgrade I’ll let you know the result.