At 1:38, in an example query against nycFacilities with a sample size of 200, the lecturer says that 200 is greater than 5% of the documents in the collection. This is incorrect. The collection has 36,112 documents, so 5% is about 1,806, and 200 is well below that. It matters that it is below, because otherwise the pseudo-random cursor would not be the sampling method used, and the lecturer correctly states that it is the method used for the given query.
You are right!
A sample size of 200 is less than 5% of the total number of documents in the collection.
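
For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It is not MongoDB code and does not talk to a server. The function name `sample_method` is hypothetical, and the three conditions are my reading of the `$sample` rule as I understand it from the documentation: the pseudo-random cursor is used only when `$sample` is the first pipeline stage, the sample size is under 5% of the documents, and the collection has more than 100 documents.

```python
def sample_method(total_docs: int, sample_size: int, first_stage: bool = True) -> str:
    """Guess which sampling method $sample would use (illustrative only)."""
    # "sample_size < 5% of total_docs" written with integers to avoid
    # floating-point rounding: n < total / 20  <=>  n * 20 < total.
    under_five_percent = sample_size * 20 < total_docs
    if first_stage and total_docs > 100 and under_five_percent:
        return "pseudo-random cursor"
    # Otherwise the documents are read and randomly sorted.
    return "random sort"


if __name__ == "__main__":
    # Numbers from the lecture example: 36,112 documents, sample size 200.
    print(sample_method(36_112, 200))
    # A sample size above the 5% threshold (about 1,806) for comparison.
    print(sample_method(36_112, 2_000))
```

With the lecture's numbers the first call falls under the 5% threshold, so it reports the pseudo-random cursor, while the larger sample size in the second call does not.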