I talked with the Atlas & Serverless team and they explained in a bit more detail how it works.
So, yes, Atlas bills Serverless based on the uncompressed size of your BSON docs + the indexes. The idea behind billing on the uncompressed data rather than the compressed data is that the final price doesn’t depend on the performance of the compression algorithm WiredTiger is using. So it’s always fair and wouldn’t change if we update the compression algorithm in the future.
If Atlas Serverless billed on the compressed size instead, the per-GB price would have to be 4 or 5 times higher to cover the same underlying storage, which would make it look less competitive. It would also be less predictable, because compression efficiency varies with your schema design, field types, etc. So it would be harder to predict your serverless costs in advance and plan your spending.
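If you want to see the gap between those two numbers on one of your own collections, a `$collStats` aggregation in mongosh shows both. Here is a minimal sketch, with `mycollection` as a placeholder name (and note I can't guarantee the billing meter maps one-to-one to these exact fields):

```js
// Run in mongosh; $collStats must be the first stage of the pipeline.
db.mycollection.aggregate([
  { $collStats: { storageStats: {} } },
  { $project: {
      uncompressedBytes: "$storageStats.size",          // raw BSON data size
      compressedBytes:   "$storageStats.storageSize",   // on-disk size after WiredTiger compression
      indexBytes:        "$storageStats.totalIndexSize" // on-disk index size
  } }
])
```

The ratio between `size` and `storageSize` is roughly the x4-x5 factor I mentioned above, and you'll see it vary from collection to collection depending on your schema.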
Finally, the 1 TB storage limit is based on the compressed size of the collections + indexes. Users are expected to migrate to Atlas Dedicated clusters as they approach that limit. Eventually, the team wants to raise the limit or remove it entirely.
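For a rough idea of where you stand against that limit, `db.stats()` reports the on-disk (compressed) sizes. A quick sketch, run per database in mongosh; if you have several databases you'd sum the numbers across them, and the exact accounting on the Atlas side may differ slightly:

```js
// db.stats() returns sizes in bytes by default (scale = 1).
const s = db.stats();
const compressed = s.storageSize + s.indexSize; // collections + indexes, on disk
print(`Data (compressed on disk): ${(s.storageSize / 1e9).toFixed(2)} GB`);
print(`Indexes (on disk):         ${(s.indexSize / 1e9).toFixed(2)} GB`);
print(`Used of the 1 TB limit:    ${(compressed / 1e12 * 100).toFixed(1)}%`);
```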
I hope this makes sense and is helpful!
Let me know if you have more questions, of course!
Cheers,
Maxime.