Index Build: draining writes received during build

Hello,
we have a Docker setup for the app, and on each container start we try to create all indexes again, just in case, because we cannot tell whether it is the first start or not. That part is not important, though.

The point is, sometimes index builds get stuck with:
“Index Build: draining writes received during build”

And they never finish.

It happened with 4.4.4, and it still seems to happen after upgrading to 4.4.9.
It seems to happen only on replica sets.
It also happens on collections with low write rates, and even on empty collections that receive no writes.

Could it be related to the bugs fixed in 4.4.9, so that we simply need to fix/clean all data?
Or is this something else?
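
For reference, the startup routine described above could be made to skip indexes that already exist, so that an index build is only requested when something is actually missing. This is a minimal sketch, not a confirmed fix for the hang: it assumes the app uses Python with PyMongo, and the names `missing_indexes`, `ensure_indexes` and the `keys`/`options` spec layout are hypothetical. It compares key patterns only and ignores option differences such as `unique`.

```python
from typing import Any, Dict, List


def missing_indexes(existing: Dict[str, Dict[str, Any]],
                    desired: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Return the desired index specs whose key pattern is not present yet.

    `existing` has the shape returned by PyMongo's
    Collection.index_information():
        {index_name: {"key": [(field, direction), ...], ...}}
    `desired` is a hypothetical list of {"keys": [...], "options": {...}}.
    """
    existing_keys = {
        tuple((field, direction) for field, direction in info["key"])
        for info in existing.values()
    }
    return [
        spec for spec in desired
        if tuple((f, d) for f, d in spec["keys"]) not in existing_keys
    ]


def ensure_indexes(collection, desired: List[Dict[str, Any]]) -> List[str]:
    """Create only the missing indexes and return the names that were created.

    `collection` is expected to behave like a pymongo Collection
    (index_information() and create_index() are the only calls used).
    """
    created = []
    for spec in missing_indexes(collection.index_information(), desired):
        name = collection.create_index(spec["keys"], **spec.get("options", {}))
        created.append(name)
    return created
```

With a spec list such as `[{"keys": [("email", 1)], "options": {"unique": True}}]`, calling `ensure_indexes(db.users, specs)` on every container start would issue `create_index` only for key patterns that `index_information()` does not already report.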

Hi @Arturs_Sosins ,

This was a known issue previously (see SERVER-48516), where the main problem was that, with auth enabled, the node was not able to communicate with itself. A workaround was to create a keyfile as mentioned in this comment.

Having said that, this is supposed to be fixed in 4.4.4. If you have added the keyfile and it doesn’t seem to fix the issue, could you please post some more details:

  • What is the MongoDB version you’re running?
  • What’s the output of rs.status() and rs.conf() from the replica set?
  • What do you mean exactly by “sometimes it’s stuck”? Is this happening intermittently? Is there any pattern you can discern?

Best regards
Kevin


True, I forgot to mention that this is not the case of the node being unable to communicate with itself. We specifically checked that.

This thing is more random, and it happens on collections that already have the index. AFAIK it has never happened on collections that do not have the index.

That’s why my next idea is that the bug with index uniqueness might be causing it. We have hundreds of client servers, and it has happened on only a couple of them. But once it happens for a client, it tends to recur there in the future (though not always); only the affected collections sometimes change.

Unfortunately, we could not establish any patterns. Versions vary from 4.4.2 to now 4.4.9.

The only ways we can get around it are to either kill the index build process or run mongoexport and then mongoimport with --drop.

Is there any specific diagnostic data we could collect the next time it happens?
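
As a starting point, a snapshot like the one below could be captured while a build is stuck. It is only a sketch under assumptions: it assumes Python with PyMongo, that the message quoted in this thread shows up in the `msg` field of the `currentOp` output, and the helper names `stuck_index_builds` and `collect_diagnostics` as well as the 300-second threshold are hypothetical. It gathers the server version, the equivalents of rs.status() and rs.conf() (`replSetGetStatus`, `replSetGetConfig`), and the matching operations.

```python
from typing import Any, Dict, List

# Message quoted in this thread; progress details may follow it, so we
# match on containment rather than equality.
STUCK_MSG = "Index Build: draining writes received during build"


def stuck_index_builds(inprog: List[Dict[str, Any]],
                       min_secs: int = 300) -> List[Dict[str, Any]]:
    """Filter currentOp entries down to long-running index builds
    that report the drain-phase message."""
    hits = []
    for op in inprog:
        msg = str(op.get("msg") or "")
        if STUCK_MSG in msg and op.get("secs_running", 0) >= min_secs:
            hits.append({k: op.get(k)
                         for k in ("opid", "ns", "secs_running", "msg", "command")})
    return hits


def collect_diagnostics(client, min_secs: int = 300) -> Dict[str, Any]:
    """Collect one snapshot from a pymongo MongoClient-like object."""
    admin = client.admin
    current = admin.command("currentOp")
    return {
        "version": client.server_info()["version"],
        "rs_status": admin.command("replSetGetStatus"),
        "rs_conf": admin.command("replSetGetConfig"),
        "stuck_builds": stuck_index_builds(current.get("inprog", []), min_secs),
    }
```

Running `collect_diagnostics(MongoClient(uri))` against the primary when the hang is observed, and saving the result together with the mongod log from the same time window, would give one snapshot per occurrence to compare across the affected servers.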