Shard queries timing out - Sharding status not available

Hello,

Queries to one of our shards are regularly timing out with the following error:

“sharding status of collection db.collection is not currently available for description and needs to be recovered from the config server”

I’ve spent a sizeable amount of time scouring the internet but the only references I find are to test bug reports in Jira.

Has anyone experienced similar issues?

Hi @Joao_Galrito and welcome to the MongoDB community forums!

To understand the issue better and help with a possible solution, could you share a few details about the sharded deployment?

  1. The architecture for the sharded cluster.

  2. Is there a way to reproduce this error in a local environment? For instance, is there a specific command that results in this error message?

  3. The MongoDB versions being used.

  4. Finally, can you confirm whether the sharded cluster has been upgraded from one version to another?

SERVER-74195 (Transaction failed after version upgrade) reports a similar error message; however, that case was related to the use of transactions.
Can you also confirm whether transactions are being used in the sharded cluster?

Regards
Aasawari

Hello Aasawari, and thank you!

  1. The cluster consists of 5 data shards + 1 config shard. Each shard has 3 data members (except for one which is a PSA). We have one sharded collection by range, and we have 5 zones.
  2. The query that triggers the error message is a simple find command, using the shard key and other fields. It only occurs on queries that target one particular shard, and the error is visible on the logs of all the replicas in the shard.
  3. All instances are running 5.0.5, except for the problematic shard which is running 5.0.9 (just noticed this), and a newer shard running 6.0.5 (set on 5.0 compatibility mode)
  4. See above
  5. As far as I know we don’t use transactions, unless they are used internally by the clients we use (PHP, Java, Node.js). For the query in question, I would say no.

Thank you for any assistance you might be able to provide!

Hi @Joao_Galrito

Mixed versions in a sharded cluster are only meant as a temporary state, for example during a rolling upgrade between adjacent major MongoDB server releases. In normal operation, a sharded cluster should run consistent versions.
Hence, I would suggest keeping all shards on the same version to avoid any inconsistencies.

Along with the shard servers, I would also suggest upgrading mongos and the config servers to the same version and the same feature compatibility version.
Once you have performed the necessary upgrades, try running the same query again and let us know if the issue persists.
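
Before and after upgrading, it can help to compare the versions reported by each member of the cluster. The following is only a minimal sketch in Python: the member names (`shard1`, `config`, and so on) are hypothetical, the version values mirror the ones described earlier in this thread, and gathering them from the cluster is left out (with PyMongo, `MongoClient.server_info()["version"]` is one possible source). The helper functions are illustrative and not part of any MongoDB tooling.

```python
from collections import Counter


def parse_version(version: str) -> tuple[int, int, int]:
    """Split a 'major.minor.patch' string into a tuple of integers."""
    major, minor, patch = (int(part) for part in version.split(".")[:3])
    return major, minor, patch


def find_version_mismatches(versions: dict[str, str]) -> dict[str, str]:
    """Return the members whose version differs from the most common one."""
    if not versions:
        return {}
    most_common, _ = Counter(versions.values()).most_common(1)[0]
    return {name: v for name, v in versions.items() if v != most_common}


def release_series(versions: dict[str, str]) -> set[tuple[int, int]]:
    """Return the distinct (major, minor) release series in use."""
    return {parse_version(v)[:2] for v in versions.values()}


# Hypothetical member names; versions as described in this thread.
cluster_versions = {
    "config": "5.0.5",
    "shard1": "5.0.5",
    "shard2": "5.0.5",
    "shard3": "5.0.9",  # the shard where the queries time out
    "shard4": "5.0.5",
    "shard5": "6.0.5",  # the newer shard
}

mismatches = find_version_mismatches(cluster_versions)
for name, version in sorted(mismatches.items()):
    print(f"{name} runs {version}, which differs from the rest of the cluster")
```

An empty result from `find_version_mismatches` would suggest that every listed member reports the same version. Remember to include the mongos instances and config servers in the input as well.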

Please follow the "Upgrade a Sharded Cluster to 6.0" documentation for the steps needed for a smooth upgrade.

Regards
Aasawari

Hello Aasawari,

We moved the entire cluster to 6.0.4 and it seems to have solved the issue.

Thank you!
