After reviewing our architecture, I believe we should evaluate MongoDB as a potential alternative to our current SQL Server implementation.
We use Azure, kubernetes
Join Elimination: How significant was the performance improvement when moving from a joined SQL structure to a denormalized document model in MongoDB?
MongoDB vs Cosmos DB: We initially considered Cosmos DB but were concerned about its pricing model and optimization being heavily based on data colocation. How does MongoDB’s approach to data distribution compare? Does it offer more flexibility?
Query Performance: For workloads that frequently filter on multiple categorical attributes, how does MongoDB query performance compare to SQL Server, especially when there are many potential filter combinations?
Sharding Considerations: What sharding key strategies work best for analytical data that needs to be distributed across multiple servers?
Storage Efficiency: How does MongoDB’s storage efficiency compare to SQL Server when storing structured data with consistent schema but variable-length attribute values?
Index Optimization: What are the best indexing strategies when users need to filter on different combinations of attributes?
Concurrency: How well does MongoDB handle concurrent read operations during heavy write periods?
I need to convince team that Mongo is worth looking into and will perform better than our Azure Sql Server option but not sure how to prove it
Those are great questions. Here are some quick thoughts:
Join Elimination: The most benefits are seen when there are filters on multiple tables. With the document model, one index can be efficient, whereas in SQL, you need an index per table, and then some filtering happens after the join. I published an example recently: A case where SQL joins struggle but MongoDB documents shine.
MongoDB vs. Cosmos DB: The document model helps to have a clear sharding key. I don’t know the internals of CosmosDB, but it is an emulation of the MongoDB API on top of PostgreSQL, so there are limitations due to 8k block storage and TOAST compression of JSONB, whereas MongoDB storage is optimized for documents.
Query Performance: The link above is a good example. I’m not an expert in SQL Server, but I think it’s the same problem as with other SQL databases—there are no efficient multi-table indexes.
Storage Efficiency: One big advantage of MongoDB storage (WiredTiger) compared to other DBMS, including SQL Server, is that the pages in memory do not need to be a one-to-one copy of the on-disk block. This allows pages to grow in memory and be split only at the checkpoint, when written to disk in background.
Concurrency: One advantage of MongoDB is that the developer has more control by choosing the read and write concern. Additionally, the in-memory index structure is lock-free.
I’ll ping some colleagues who may have more ideas and feedback about users who have migrated from SQL Server to MongoDB, which may help you. The document model shines with modern applications where the database schema is optimized for one domain access patterns. Relational model is more for one central database designed before knowing the access patterns
Yes, hard to say, but MongoDB has an amazing compression factor to save disk (and memory as it is compressed in the OS cache). Pre-sales colleagues may have numbers to share based on customer cases who did the migration, ping me (here or LinkedIn) if you want to get in touch - may depend on geographical location and business domain.