Thank you for the information. The behaviour itself is as expected. Compass uses the $sample aggregation stage to collect some random documents. $sample selects them efficiently (with a pseudo-random cursor) only if all of the following conditions are met:
- $sample is the first stage of the pipeline
- N is less than 5% of the total documents in the collection
- The collection contains more than 100 documents
The schema analysis starts with a sample size of N = 1000, but the collection has only 284 documents, so the 5% condition is not met.
284 * 0.05 = 14.2, so sampling 14 documents should work and 15 should not. This is indeed the case; please see the screenshots.
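The arithmetic above can be checked with a small sketch. It only mirrors the three conditions as listed in this post; the function name `sample_uses_fast_path` is made up for illustration and is not part of MongoDB or Compass.

```python
def sample_uses_fast_path(n, total_docs, first_stage=True):
    """Return True when all three conditions listed above hold.

    Hypothetical helper: it restates the rules from this post and
    does not query a server.
    """
    # "N is less than 5% of the total documents" is written as
    # n * 20 < total_docs to avoid floating-point rounding.
    return first_stage and total_docs > 100 and n * 20 < total_docs


total = 284  # document count of the collection discussed here
for n in (14, 15, 1000):
    print(n, sample_uses_fast_path(n, total))
```

With 284 documents, only N = 14 passes the check; N = 15 and the schema-analysis default of 1000 both fall through to the collection scan.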
Since one of the conditions is not met, the fallback behaviour applies:
If any of the above conditions are NOT met, $sample performs a collection scan followed by a random sort to select N documents. In this case, the $sample stage is subject to the sort memory restrictions.
So $sample does a collection scan followed by a sort that is restricted to 100 megabytes of RAM for in-memory sorts. This fails.
The error message suggests the standard “solution”: allow the sort to write to disk by setting allowDiskUse:true.
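For comparison, outside of Compass the option is passed alongside the pipeline rather than inside a stage. Below is a minimal sketch, assuming a PyMongo-style collection object whose `aggregate` method accepts `allowDiskUse` as a keyword argument. The function name, database name, and collection name are hypothetical, and this does not show where Compass itself would set the option.

```python
def sample_with_disk_use(collection, size=1000):
    """Sample `size` random documents and allow the sort to spill to disk.

    `collection` is assumed to behave like a pymongo Collection.
    """
    pipeline = [{"$sample": {"size": size}}]
    # allowDiskUse is an option of the aggregate call, not of $sample.
    return list(collection.aggregate(pipeline, allowDiskUse=True))


# Usage against a real server (hypothetical names, requires pymongo):
# from pymongo import MongoClient
# coll = MongoClient("mongodb://localhost:27017")["mydb"]["mycoll"]
# docs = sample_with_disk_use(coll, size=1000)
```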
So after this round trip, the question remains:
Where should allowDiskUse:true be set so that up to 1000 documents can be sampled?