Modernizing Core Insurance Systems: Breaking the Batch Bottleneck
Modernizing your legacy database to Java +
MongoDB Atlas
doesn’t have to mean sacrificing batch performance. By leveraging bulk operations, intelligent prefetching, and parallel execution, we built an optimization framework that not only bridges the performance gap but, in many cases, surpasses legacy systems.
In workloads where jobs had been running 25–30x slower after migration, the framework brought execution times back on par with the legacy system and, in some cases, delivered 10–15x better performance. For global insurance platforms, this improved batch performance is an added technical benefit that can potentially support newer functionality.
The modernization dilemma
For organizations modernizing core platforms that serve significant user workloads and revenue-generating applications, moving from a legacy RDBMS to a modern application stack with Java + MongoDB unlocks several benefits:
Flexible document model:
PL/SQL code tightly couples business logic with the database, making even small changes risky and time-consuming.
MongoDB Atlas
, with its flexible document model and application-driven logic, enables teams to evolve schemas and processes quickly, a huge advantage for industries like insurance, where regulations, products, and customer expectations change rapidly.
Scalability and resilience:
Legacy RDBMS platforms were never designed for today’s scale of digital engagement. MongoDB’s distributed architecture supports
horizontal scale-out
, ensuring that core insurance workloads can handle growing customer bases, high-volume claims, and peak-time spikes without major redesigns.
Cloud-native by design:
MongoDB is built to thrive in the cloud. Features like global clusters, built-in replication, and high availability reduce infrastructure complexity while enabling deployment flexibility across hybrid and multi-cloud environments.
Modern developer ecosystem:
Decouples database and business logic dependencies, accelerating feature delivery.
Unified operational + analytical workloads:
Modern insurance platforms demand more than transactional processing; they require real-time insights. MongoDB’s ability to support both operational workloads and
analytics
on live data reduces the gap between claims processing and decision-making.
However, alongside these advantages, one of the first hurdles teams encounter is batch job performance: the jobs meant to run daily, weekly, or monthly, such as ETL processes.
PL/SQL thrives on set-based operations within the database engine. But when the same workloads are reimplemented with a separate application layer and MongoDB, they can suddenly become unpredictable and slow, or even time out. In some cases, processes that ran smoothly for years started running 25–30x slower after a like-for-like migration. Most of the issues fall into the following broad categories:
High network round-trips between the application and the database.
Inefficient per-record operations replacing set-based logic (illustrated in the sketch below).
Under-utilization of database bulk capabilities.
Application-layer computation overhead when transforming large datasets.
For teams migrating complex ETL-like processes, this wasn’t just a technical nuisance—it became a blocker for modernization at scale.
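To make the per-record anti-pattern concrete, here is a hypothetical sketch (the claims collection and field names are assumptions, not from the original system). A like-for-like port of row-by-row PL/SQL logic issues one update per record, so a job touching millions of documents pays millions of network round trips:

```java
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.Updates;
import org.bson.Document;

import java.util.List;

public class PerRecordAntiPattern {

    // One updateOne() per record means one network round trip per record.
    // At scale, round-trip latency dominates total runtime.
    public static void markProcessed(MongoCollection<Document> claims, List<Document> processed) {
        for (Document claim : processed) {
            claims.updateOne(
                    Filters.eq("_id", claim.get("_id")),
                    Updates.set("status", "PROCESSED"));
        }
    }
}
```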
The breakthrough: A batch job optimization framework
We designed an extensible, multi-purpose, and resilient batch optimization framework purpose-built for high-volume, multi-collection operations in MongoDB. The framework focuses on minimizing application-database friction while retaining the flexibility of Java services.
Key principles include:
Bulk operations at scale:
Leveraging MongoDB’s native `bulkWrite` (including multi-collection bulk transactions in MongoDB 8) to process thousands of operations in a single round trip (see the first sketch after this list).
Intelligent prefetching:
Reducing repeated lookups by pre-loading and caching reference data in memory-friendly structures (see the cache sketch after this list).
Parallel processing:
Partitioning workloads across threads or event processors (e.g., the Disruptor pattern) for CPU-bound and I/O-bound steps (see the partitioning sketch after this list).
Configurable batch sizes:
Dynamically tuning batch chunk sizes to balance memory usage, network payload size, and commit frequency.
Pluggable transformation modules:
Modularized data transformation logic that can be reused across multiple processes (see the interface sketch after this list).
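A minimal sketch of the bulk-write idea using the MongoDB Java sync driver; the claims collection and field names are illustrative. An unordered `bulkWrite` sends a whole chunk of updates in one round trip instead of one per record:

```java
import com.mongodb.bulk.BulkWriteResult;
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.BulkWriteOptions;
import com.mongodb.client.model.Filters;
import com.mongodb.client.model.UpdateOneModel;
import com.mongodb.client.model.Updates;
import com.mongodb.client.model.WriteModel;
import org.bson.Document;

import java.util.ArrayList;
import java.util.List;

public class ClaimBulkWriter {

    // Flush one chunk of processed claims in a single round trip.
    // Unordered bulk writes let the server apply the operations without
    // waiting on each other, which usually improves throughput.
    public static BulkWriteResult flushStatusUpdates(MongoCollection<Document> claims,
                                                     List<Document> processedClaims) {
        List<WriteModel<Document>> ops = new ArrayList<>(processedClaims.size());
        for (Document claim : processedClaims) {
            ops.add(new UpdateOneModel<>(
                    Filters.eq("_id", claim.get("_id")),
                    Updates.combine(
                            Updates.set("status", claim.getString("status")),
                            Updates.set("processedAt", claim.getDate("processedAt")))));
        }
        return claims.bulkWrite(ops, new BulkWriteOptions().ordered(false));
    }
}
```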
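A minimal prefetching sketch, assuming a comparatively small reference collection; the productCatalog collection and productCode field are illustrative:

```java
import com.mongodb.client.MongoCollection;
import org.bson.Document;

import java.util.HashMap;
import java.util.Map;

public class ReferenceDataCache {

    private final Map<String, Document> productsByCode = new HashMap<>();

    // Load the reference collection once, before the batch starts, so per-record
    // transformations resolve lookups in memory instead of issuing one find() per record.
    public ReferenceDataCache(MongoCollection<Document> productCatalog) {
        for (Document product : productCatalog.find()) {
            productsByCode.put(product.getString("productCode"), product);
        }
    }

    public Document productFor(String productCode) {
        return productsByCode.get(productCode);
    }
}
```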
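A minimal sketch of the partitioning and parallel-execution idea, with chunk size and thread count as the configurable knobs; a production version would add retries and back-pressure, and the Disruptor variant is omitted for brevity:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

public class BatchPartitioner {

    // Split the input into chunks of a configurable size and process each chunk
    // on its own worker thread. Chunk size and pool size are the main knobs for
    // balancing memory usage, network payload size, and database load.
    public static <T> void processInParallel(List<T> records,
                                             int chunkSize,
                                             int threads,
                                             Consumer<List<T>> chunkProcessor) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> futures = new ArrayList<>();
            for (int start = 0; start < records.size(); start += chunkSize) {
                List<T> chunk = records.subList(start, Math.min(start + chunkSize, records.size()));
                futures.add(pool.submit(() -> chunkProcessor.accept(chunk)));
            }
            for (Future<?> f : futures) {
                f.get(); // propagate any failure from the workers
            }
        } finally {
            pool.shutdown();
        }
    }
}
```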
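And a hypothetical contract for the pluggable transformation modules; the interface and method names are illustrative:

```java
import org.bson.Document;

import java.util.List;

// Concrete modules (premium recalculation, claims enrichment, etc.) implement
// this interface and plug into the same batch pipeline.
public interface TransformationModule {

    // Whether this module applies to the given batch job type.
    boolean supports(String jobType);

    // Transform one chunk of input documents into the documents to write back.
    List<Document> transform(List<Document> chunk);
}
```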
Technical architecture
The framework adopts a layered and orchestrated approach to batch job processing, where each component has a distinct responsibility in the end-to-end workflow. The diagram illustrates the flow of a batch execution (a simplified code sketch follows Figure 1):
Trigger (user / cron job):
The batch process begins when a user action or a scheduled cron job triggers the Spring Boot controller.
Spring Boot controller:
The controller initiates the process by fetching the relevant records from the database. Once retrieved, it splits the records into batches for parallel execution.
Database:
Acts as the source of truth for input data and the destination for processed results. It supports both reads (to fetch records) and writes (to persist batch outcomes).
Executor framework:
This layer is responsible for parallelizing workloads. It distributes batched records, manages concurrency, and invokes ETL tasks efficiently.
ETL process:
The ETL (Extract, Transform, Load) logic is applied to each batch. Data is pre-fetched, transformed according to business rules, and then loaded back into the database.
Completion & write-back:
Once ETL operations are complete, the executor framework coordinates database write operations and signals the completion of the batch.
Figure 1.
The architecture for the layered approach.
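To make the flow concrete, here is a hypothetical, simplified sketch of the orchestration in Figure 1, reusing the helper sketches from the previous section. The endpoint, collection, filters, and fields are assumptions; error handling, chunk-level retries, and configuration wiring are omitted:

```java
import com.mongodb.client.MongoCollection;
import com.mongodb.client.model.Filters;
import org.bson.Document;
import org.springframework.web.bind.annotation.PostMapping;
import org.springframework.web.bind.annotation.RestController;

import java.util.ArrayList;
import java.util.Date;
import java.util.List;

// Trigger -> fetch -> partition -> parallel ETL -> bulk write-back.
@RestController
public class PolicyBatchController {

    private final MongoCollection<Document> policies;
    private final ReferenceDataCache referenceData;  // prefetched lookups (see earlier sketch)
    private final int chunkSize = 1_000;              // would normally come from configuration
    private final int threads = 8;

    public PolicyBatchController(MongoCollection<Document> policies,
                                 ReferenceDataCache referenceData) {
        this.policies = policies;
        this.referenceData = referenceData;
    }

    // Triggered by a user action or a scheduler calling this endpoint.
    @PostMapping("/batch/policy-renewal")
    public String runPolicyRenewal() throws Exception {
        // 1. Fetch the records due for processing.
        List<Document> due = policies.find(Filters.eq("status", "DUE")).into(new ArrayList<>());

        // 2. Split into chunks and run the ETL step for each chunk in parallel,
        //    writing each chunk back with a single bulkWrite.
        BatchPartitioner.processInParallel(due, chunkSize, threads, chunk -> {
            List<Document> transformed = transform(chunk);
            ClaimBulkWriter.flushStatusUpdates(policies, transformed);
        });

        return "Processed " + due.size() + " policies";
    }

    private List<Document> transform(List<Document> chunk) {
        // Business rules would go here, resolving lookups via the prefetched cache;
        // this placeholder just stamps the outcome fields used by the bulk writer.
        for (Document policy : chunk) {
            policy.put("status", "RENEWED");
            policy.put("processedAt", new Date());
        }
        return chunk;
    }
}
```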
From bottleneck to advantage
The results were striking. Batch jobs that previously timed out now complete predictably within defined SLAs, and workloads that initially ran 25–30x slower after migration were optimized to perform on par with legacy RDBMSs and, in several cases, to deliver 10–15x better performance. What was once a bottleneck became a competitive advantage, proving that batch processing on MongoDB can significantly outperform legacy PL/SQL when implemented with the right optimization framework.
Caveats and tuning tips
While the framework is adaptable, its performance depends on workload characteristics and infrastructure limits (a sample configuration sketch follows this list):
Batch size tuning:
Too large can cause memory pressure; too small increases round-trips.
Transaction boundaries:
MongoDB transactions have limits (e.g., document size and total operations); plan batching accordingly.
Thread pool sizing:
Over-parallelization can overload the database or network.
Index strategy:
Even with bulk writes, poor indexing can cause slowdowns.
Prefetch scope:
Balance memory usage against lookup frequency.
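As a starting point for externalizing these knobs, here is a hypothetical Spring Boot configuration sketch; property names and defaults are illustrative, and the class still needs to be registered (e.g., via @EnableConfigurationProperties):

```java
import org.springframework.boot.context.properties.ConfigurationProperties;

// Example entries in application.properties:
//   batch.chunk-size=1000
//   batch.thread-pool-size=8
//   batch.prefetch-enabled=true
@ConfigurationProperties(prefix = "batch")
public class BatchTuningProperties {

    private int chunkSize = 1000;          // too large risks memory pressure, too small adds round trips
    private int threadPoolSize = 8;        // keep below what the cluster and network can absorb
    private boolean prefetchEnabled = true;

    public int getChunkSize() { return chunkSize; }
    public void setChunkSize(int chunkSize) { this.chunkSize = chunkSize; }

    public int getThreadPoolSize() { return threadPoolSize; }
    public void setThreadPoolSize(int threadPoolSize) { this.threadPoolSize = threadPoolSize; }

    public boolean isPrefetchEnabled() { return prefetchEnabled; }
    public void setPrefetchEnabled(boolean prefetchEnabled) { this.prefetchEnabled = prefetchEnabled; }
}
```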
In short, it’s not one size fits all. Every workload is different: the data you process, the rules you apply, and the scale you run at all shape how things perform. What we’ve seen, though, is that with the right tuning, this framework can handle scale reliably and take batch processing from being a pain point to something that actually gives you an edge.
If you’re exploring how to modernize your own workloads, this approach is a solid starting point. You can pick and choose the parts that make sense for your setup, and adapt as you go.
Ready to modernize your applications? Visit the
modernization page
to learn about the MongoDB Application Platform.
September 18, 2025