In a collection I have about 10 million documents.
I use this code to pick 20 of them at random:
db.mycoll.aggregate([{ $sample: { size: 20 } }])
How many RPUs does MongoDB Atlas need to do this?
Hi @tri_be - Welcome to the community
How many RPUs does MongoDB Atlas need to do this?
I would recommend going over the Serverless - Usage Cost Summary documentation. With regard to RPUs specifically (as of the time of this message):
You are charged one RPU for each document read (up to 4KB) or for each index read (up to 256 bytes).
So in terms of RPU for your question, one of the factors you will need to consider is document and index read size(s).
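To make the size factor concrete, here is a rough sketch of the arithmetic. The helper name estimateDocumentRpus is hypothetical, and the code assumes that a document larger than 4KB is billed in additional 4KB chunks, so please check the current pricing documentation before relying on it. It only counts the documents it is given; it does not account for index reads or for any extra documents that $sample may read internally.

```javascript
// Hypothetical helper, not part of any MongoDB API.
// Assumption: one RPU per document read up to 4KB, and larger
// documents are charged one additional RPU per extra 4KB chunk.
const DOC_RPU_BYTES = 4 * 1024;

function estimateDocumentRpus(docSizesInBytes) {
  return docSizesInBytes.reduce(
    // Every document read costs at least one RPU, even if it is tiny.
    (total, size) => total + Math.max(1, Math.ceil(size / DOC_RPU_BYTES)),
    0
  );
}

// 20 sampled documents of 1KB each: one RPU per document.
console.log(estimateDocumentRpus(Array(20).fill(1024))); // 20

// 20 sampled documents of 10KB each: three 4KB chunks per document.
console.log(estimateDocumentRpus(Array(20).fill(10 * 1024))); // 60
```

Under these assumptions, the same 20-document sample can cost noticeably more when the documents are large, which is why document size matters for the estimate.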
In a collection I have about 10 million documents.
db.mycoll.aggregate([{ $sample: { size: 20 } }])
Depending on several conditions, the $sample stage will either use a pseudo-random cursor, or do a COLLSCAN / use all documents from the preceding aggregation stage. As per the documentation linked:
If all of the following conditions are true, $sample uses a pseudo-random cursor to select the N documents:
- $sample is the first stage of the pipeline.
- N is less than 5% of the total documents in the collection.
- The collection contains more than 100 documents.
If any of the previous conditions are false, $sample:
- Reads all documents that are output from a preceding aggregation stage or a collection scan.
- Performs a random sort to select N documents.
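The documented conditions above can be written out as a small check. This is only a sketch of the rules as quoted, not how the server implements them, and the function name usesPseudoRandomCursor and its parameters are hypothetical.

```javascript
// Hypothetical helper that mirrors the three documented conditions.
// It does not query MongoDB; you supply the numbers yourself.
function usesPseudoRandomCursor({ isFirstStage, sampleSize, totalDocs }) {
  // "N is less than 5% of the total documents" is checked with
  // integer arithmetic: N < totalDocs / 20  <=>  N * 20 < totalDocs.
  const underFivePercent = sampleSize * 20 < totalDocs;
  const moreThan100Docs = totalDocs > 100;
  return isFirstStage && underFivePercent && moreThan100Docs;
}

// The pipeline from the question: $sample is the only (first) stage,
// N = 20, and the collection holds about 10 million documents.
console.log(
  usesPseudoRandomCursor({
    isFirstStage: true,
    sampleSize: 20,
    totalDocs: 10000000,
  })
); // true
```

For the pipeline in the question, all three conditions hold according to this check, so the pseudo-random cursor path would be the expected one. Running the aggregation with explain is a way to confirm what the server actually does.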
Whether RPU usage is higher with the pseudo-random cursor than without it will differ on a case-by-case basis.
As serverless costs may be a concern to you, you may wish to set up a billing alert.
Regards,
Jason