When a chunk is migrating, do its documents exist on both shards?

I had a problem with the lab “Sharding a collection” in chapter 3, and I haven’t been the only one. There are a few posts in the forum about this issue, for example:

https://www.mongodb.com/community/forums/t/lab-shard-a-collection-script-validate-lab-shard-collection-fails-allway/10998

In a nutshell, the problem is this: we mongoimport a file into our primary shard, and the resulting collection contains 516784 documents, as it should. We then enable sharding for the database which contains that collection (having already added a second shard to the cluster), and shard the collection. And then we run the validation script validate_lab_shard_collection, and it tells us that we’ve got the wrong number of documents:

Incorrect number of documents imported - make sure you import the entire dataset.
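
For reference, the rough sequence of steps was as follows. This is just a sketch; the ports, credentials, file path and shard key here are what I recall from the M103 lab environment, so treat them as assumptions rather than gospel:

mongoimport --port 26000 -u m103-admin -p m103-pass --authenticationDatabase admin \
  --db m103 --collection products --file /dataset/products.json

// then, from a mongo shell connected to the mongos on port 26000:
sh.enableSharding("m103")
use m103
db.products.createIndex({ "sku": 1 })
sh.shardCollection("m103.products", { "sku": 1 })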

Having repeated the process several times (dropping the collection, importing it again, creating the index and then sharding the collection), and having monitored the number of documents in the collection from 3 different mongo shells (one connected to the mongos on port 26000 and the others connected to whichever node is the current primary of each of the two shards), I have reached the following conclusions:

  • As soon as we start sharding the collection, the balancer has a lot of work to do, migrating chunks from the primary shard to the other shard(s) (although there’s only one other shard in this lab)
  • When the balancer starts working, all the chunks are on the primary shard
  • The balancer can take a while to distribute the chunks evenly amongst the shards
  • Whilst chunk migration is in progress, the total number of documents in a collection can appear to be larger than it should be
  • Whilst chunk migration is in progress, the number of documents in the collection on each individual shard, added together, can appear to be larger than it should be

And, critical to my question:

  • Whilst a chunk migration is in progress, the documents in that chunk appear to be in both the shard that the chunk is being moved from, and the shard that it’s being moved to

The workaround appears to be to monitor the number of documents the mongos reports for the collection, and to wait until that number matches the number of documents we originally imported (at which point the validation script will succeed).
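
For example, a crude way to wait it out from a mongo shell connected to the mongos (the collection name and expected count here are the ones from this lab, so adjust as needed):

// keep polling until the mongos-reported count settles at the imported total
while (db.products.count() != 516784) {
  print("current count: " + db.products.count());
  sleep(2000);  // mongo shell helper: pause for 2 seconds between checks
}
print("count has settled at " + db.products.count())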

This is all very well, but my day job is to provide IT services to an organisation which requires “exactly once” semantics. They’re not going to like the same document being returned more than once just because it happens to be in a chunk which is being migrated :frowning:

So… is there a way to query the cluster for a document in a chunk which is currently being migrated, such that the document is returned exactly once, from either the shard it’s being migrated from or the shard it’s being migrated to, but not from both?


Hi @Simon_39939,

Please correct me if I misunderstood.

Consider a chunk being migrated from Shard1 to Shard2. Until the migration of that chunk is complete, all the queries for its documents will be handled by Shard1, i.e. the donor shard.

So, while the chunk is in migration, all the queries which come to the mongos will automatically be routed to the donor shard. The mongos connects to the right shard, so it won’t return the same document twice.

Each shard updates its config data from the Config Server when the chunk moves.

The same is mentioned in the Documentation here: Connecting to a Sharded Cluster

Up to 3.4, if you were reading from secondary shard members, there was a possibility of getting duplicate documents (orphans), which sounds like your scenario.
When you read from a primary (or, starting from MongoDB 3.6, from a secondary with read concern “local”), the node applies the shard filter before returning documents, so the same document won’t be returned twice. If a shard doesn’t officially own a document, it simply won’t return it, even if it holds a copy locally.
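
As a small illustration of that (the collection name is from the lab and the sku value is purely hypothetical), a query sent through the mongos with a primary read preference is checked against the shard’s ownership metadata before results are returned:

// the owning shard applies the shard filter, so the document comes back exactly once,
// even if a copy also exists on the other shard because of an in-flight migration
db.products.find({ "sku": 21572585 }).readPref("primary")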

Let me know if you still have any doubts.

Kanika


I did try to catch a document being on both shards at once, and didn’t succeed. At the time I thought it was just because I wasn’t quick enough and the chunk had finished migrating by the time I tried to fetch it from the donor shard. But it sounds like the mongos logic you describe was catching this so that the document would only be returned from one of the shards.

What I was doing in the lab was using the db.collection.count() method (without specifying a read concern or read preference, so it should have defaulted to reading only from primaries) to see how many documents there were, both in the entire collection and on each of the shards individually. It’s the counts returned by this method which seemed to imply that the chunk being migrated was present on both shards, because the count went up and then back down again even though there were no inserts or deletes happening (other than the migration between the shards).

I wonder if I’d have got a constant document count if I’d used db.collection.find().count() instead? It’d probably be less efficient than db.collection.count() but it may prevent the same documents being counted twice, which is what db.collection.count() appears to be doing.

Hello Simon_39939,

Thanks for the note. The db.collection.count() method may only return an approximate count when run against a sharded collection - this behavior is documented in the count() documentation.

In order to get a more accurate document count, we recommend running the following command instead:

db.collection.countDocuments()

This MongoDB Shell command uses the Aggregation Pipeline to get a more accurate count of documents in a collection. You can read more about it on the countDocuments() command documentation.

I hope this helps! Please post here with any other questions you have.

Matt


Thanks @Kanika @mattjavaly, this does explain quite a lot. Slight problem with the countDocuments method in this context though…

MongoDB Enterprise mongos> db.products.countDocuments()
2019-07-02T16:02:29.348+0000 E QUERY    [thread1] TypeError: db.products.countDocuments is not a function :
@(shell):1:1
MongoDB Enterprise mongos> version()
3.6.12

… it was added in version 4.0.3, and the Ubuntu VM used for the M103 course has MongoDB 3.6.12 installed. To get equivalent behaviour in version 3.6 I need to use the aggregation pipeline that it wraps.
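
In case it helps anyone else on 3.6, this is roughly the pipeline that countDocuments() wraps (no predicate, so the $match stage is empty; run it from the m103 database against the products collection used in this lab):

// counts the documents by actually reading them, so orphans are filtered out
db.products.aggregate([
  { "$match": {} },
  { "$group": { "_id": null, "n": { "$sum": 1 } } }
])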

But I get the point: using db.collection.count() without supplying a predicate doesn’t actually look at the collection to see how many documents it contains; instead it goes to the metadata (on the config server replica set?) to see how many documents we think the collection contains. A bit of experimenting shows that this is very quick, but the results may not be accurate, as per the count() documentation:

Avoid using the db.collection.count() method without a query predicate since without the query predicate, the method returns results based on the collection’s metadata, which may result in an approximate count. In particular, …

Meanwhile, using an aggregation pipeline to count the documents actually does count the documents, and it filters out orphaned documents. That makes it slower (between 2 and 15 seconds on my machine with this products collection), but the count stays constant throughout a chunk migration, so it’s a more reliable measure of how many documents there actually are.

I understand this now, thank you for the explanations :slight_smile:

However, this does suggest that there’s a bug in the validate_lab_shard_collection script used in this particular lab. It appears to be using db.collection.count(), or something very similar, in a context where we’ve just sharded a collection, so chunk migration is likely to be in progress and we’re likely to have orphaned documents, which, as the documentation points out, db.collection.count() doesn’t filter out. I think the validation script in this case ought to use an aggregation pipeline to validate the number of documents in the collection. Yes, the validation script will take longer to run, but I think students would prefer that to a misleading validation failure and potentially having to repeat the lab multiple times trying to work out where all those extra documents are coming from.

Would you agree?

Initially we have the primary shard with 500K documents and the second shard with 0 documents (before sharding the collection).
After sharding we have 500K documents in shard1 and 270K documents in shard2, and the script fails.
Once shard2 also shows 500K documents, the validation succeeds.
My question is: is it normal that if I import N documents via the mongos, every shard ends up containing N documents? What’s the advantage of this behaviour for horizontal scaling?

Hi @granata78, @Simon_39939,

Please allow me some time to check this issue and I will get back to you with an update on this as soon as possible.

Thanks,
Sonali

OK, thank you. After the lab finished I would have expected an even distribution between shard1 and shard2, i.e. total documents in shard1 + total documents in shard2 = 500K, not 1M.
But probably I am getting something wrong…

@granata78 are you asking why the documents aren’t evenly distributed across the shards? If so then the reason is that it’s the chunks which are distributed, not the documents. In this lab we only have enough data for 3 chunks so we have 1 chunk on one shard and 2 chunks on the other shard. If we had more data then we’d have more chunks and so the documents would be more evenly distributed.
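
If you want to see that for yourself, something like the following (run from the m103 database in a mongo shell connected to the mongos; the namespace is the one used in this lab) shows how the chunks are spread across the shards:

sh.status()                          // summarises how many chunks each shard holds
db.products.getShardDistribution()   // per-shard document and chunk estimates for this collection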

Hope that helps

@Kanika @mattjavaly Just wondering if you’d had a chance to think about my point regarding the validate_lab_shard_collection script. It counts the collection’s documents in a way which appears not to filter out orphaned documents, and that gives an incorrect document count while chunk migration is in progress, which is exactly the state the cluster is likely to be in right after the previous step of the lab (sharding the collection).

I’d use the “report an issue” facility, but I’m not convinced that I can explain this one in less than 500 characters.

Thank you for your continued support :slight_smile: