Transforming Financial Services with MongoDB and IBM Watsonx.ai

Vasanth Sanna Mariyappa and Shashank Pandey
July 21, 2025 | Updated: October 1, 2025

Announcement: As of 9/26/2025, we are refocusing the MongoDB AI Applications Program (MAAP) to instead focus on fostering and developing strategic partnerships. Please visit the MongoDB Partner ecosystem page to learn how MongoDB and our partners are helping you build modern applications.

Financial institutions around the world are increasingly adopting AI-driven solutions to enhance user experiences, streamline operations, and deliver personalized financial insights. As a part of the MongoDB AI Applications Program (MAAP), IBM’s Watsonx.ai and MongoDB Atlas unite to deliver scalable, enterprise-grade AI development. By integrating MongoDB Atlas and IBM Watsonx.ai, we’ve built an intelligent finance assistant that combines cutting-edge database management and generative AI (gen AI) capabilities.

Modern financial institutions face challenges in delivering personalized, real-time assistance to their customers. Generic chatbots or static systems often fail to address nuanced queries, limiting their utility and customer satisfaction. By using MongoDB Atlas Vector Search and IBM Watsonx.ai’s gen AI models, we can create a finance assistant capable of handling complex queries, retrieving relevant financial data, and providing actionable insights.

This blog post will walk you through:

The core architecture behind the finance assistant.
The ways that MongoDB Atlas and IBM Watsonx.ai complement each other in building AI-driven financial solutions.
The method of building an intelligent finance assistant with MongoDB Atlas and IBM Watsonx.ai.

Architecture overview

The architecture of the finance assistant integrates advanced Vector Search capabilities with IBM Watsonx.ai’s reasoning and language generation models. The system provides an end-to-end pipeline for handling user queries, from natural language understanding to intelligent data retrieval and response generation.

Figure 1. Finance assistant architecture.

Diagram showing the finance assistant architecture.

Key components

Each component plays a specific role in enabling natural language understanding, data retrieval, and intelligent response generation.

User input: Users interact with the finance assistant using natural language queries like:

“What are my last three transactions?”
“How can I improve my savings?”

IBM Watsonx.ai

Granite embedding models: Convert user queries into high-dimensional vector embeddings that represent semantic meaning.
Granite language models: Generate intelligent and context-aware responses by reasoning over retrieved data.
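To make "vector embeddings that represent semantic meaning" concrete, here is a minimal sketch of cosine similarity, a common measure for comparing two embeddings. The three-dimensional vectors are toy values chosen for illustration, not output from a Granite model.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Toy vectors; real embeddings have hundreds of dimensions.
query = [1.0, 0.0, 1.0]
similar = [2.0, 0.0, 2.0]    # points in the same direction as the query
unrelated = [0.0, 1.0, 0.0]  # orthogonal to the query

print(cosine_similarity(query, similar))
print(cosine_similarity(query, unrelated))
```

Vectors pointing in the same direction score close to 1 and orthogonal vectors score 0, which is why texts with similar meaning end up ranked near each other.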

MongoDB Atlas

Vector search index: Stores vector embeddings of transactional and financial data for fast, accurate similarity-based retrieval.
Hybrid search: Combines keyword search with vector similarity for holistic data retrieval.
Operational data store: Maintains structured and unstructured financial data in a scalable and secure database.
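As a rough illustration of the retrieval side, the sketch below builds an aggregation pipeline around the `$vectorSearch` stage. The index name `vector_index` and the field name `embedding` are assumptions made for this example and may differ from what the demo repository uses.

```python
from typing import Any


def build_vector_search_pipeline(
    query_vector: list[float],
    index_name: str = "vector_index",  # hypothetical index name
    path: str = "embedding",           # hypothetical vector field name
    limit: int = 5,
    num_candidates: int = 100,
) -> list[dict[str, Any]]:
    """Build an aggregation pipeline that runs an approximate vector search."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be at least as large as limit")
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": query_vector,
                "numCandidates": num_candidates,
                "limit": limit,
            }
        },
        # Expose the similarity score on each returned document.
        {"$addFields": {"score": {"$meta": "vectorSearchScore"}}},
        # Drop the raw vector from the results to keep them small.
        {"$project": {path: 0}},
    ]


pipeline = build_vector_search_pipeline([0.1, 0.2, 0.3], limit=3)
```

With a driver such as PyMongo, a pipeline like this would typically be passed to the collection's `aggregate` method. Building it as plain data first makes it easy to inspect and test without a live cluster.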

LangChain

Orchestrates the flow between MongoDB Atlas and IBM Watsonx.ai.
Implements retrieval-augmented generation (RAG) for real-time query handling and response generation.

The flow of the architecture

The flow below shows how user queries are transformed into insights through embedding, retrieval, and AI-driven response generation.

Preprocessing

Financial data, such as customer transactions or private knowledge bases, is vectorized using IBM Watsonx.ai embedding models.
Vector embeddings are stored in MongoDB Atlas alongside metadata.

Query execution

User input is processed into embeddings and matched against the Vector Search index.
Relevant data is retrieved and passed to IBM Watsonx.ai for contextual reasoning.

Response generation

Watsonx.ai generates intelligent, explainable recommendations based on the retrieved data.
The response is delivered to the user in natural language.
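The query execution and response generation steps above can be sketched as a small function that wires together three pluggable pieces. This is an illustrative outline, not the code in processing.py: `fake_embed`, `fake_retrieve`, and `fake_generate` are stand-ins (with made-up sample passages) for the Watsonx.ai embedding model, the Atlas vector search, and the Granite language model, so that the sketch runs offline.

```python
from typing import Callable, Sequence

Embedder = Callable[[str], list[float]]
Retriever = Callable[[list[float], int], list[str]]
Generator = Callable[[str], str]


def build_prompt(question: str, passages: Sequence[str]) -> str:
    """Combine retrieved passages and the user question into one prompt."""
    context = "\n".join(f"- {p}" for p in passages)
    return (
        "Answer the question using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )


def answer_query(
    question: str,
    embed: Embedder,
    retrieve: Retriever,
    generate: Generator,
    top_k: int = 3,
) -> str:
    """Run the embed -> retrieve -> generate steps for one question."""
    query_vector = embed(question)
    passages = retrieve(query_vector, top_k)
    return generate(build_prompt(question, passages))


# Stand-ins so the sketch runs without any external service.
def fake_embed(text: str) -> list[float]:
    return [float(len(text)), 1.0]


def fake_retrieve(vector: list[float], k: int) -> list[str]:
    sample = ["grocery purchase", "rent payment", "salary deposit"]  # made-up data
    return sample[:k]


def fake_generate(prompt: str) -> str:
    count = sum(1 for line in prompt.splitlines() if line.startswith("- "))
    return f"(stub answer grounded in {count} passages)"


print(answer_query("What are my last three transactions?",
                   fake_embed, fake_retrieve, fake_generate, top_k=2))
```

Passing the three steps in as callables keeps the orchestration logic testable on its own. In the demo, LangChain plays this orchestration role.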

Why MongoDB Atlas?

MongoDB Atlas provides a powerful platform for managing and querying large-scale data using its vector search capabilities. When building a RAG pipeline, it simplifies the process by enabling the storage of vectorized embeddings alongside metadata in a flexible schema. Its hybrid search capabilities—combining traditional keyword searches with vector similarity searches—make it ideal for efficiently retrieving relevant documents or financial data based on user input. MongoDB Atlas also provides scalability and real-time data updates, making it a robust operational data layer for dynamic RAG workflows. By seamlessly integrating vector search with existing data, MongoDB Atlas minimizes latency and complexity so that your gen AI applications can retrieve the right context every time.
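One common way to combine a keyword ranking with a vector ranking is reciprocal rank fusion (RRF). The sketch below is a generic, client-side illustration of that idea under the usual constant `k = 60`. It is not a description of how Atlas or the demo application merges results, and the document IDs are made up.

```python
from collections import defaultdict
from typing import Sequence


def reciprocal_rank_fusion(
    rankings: Sequence[Sequence[str]], k: int = 60
) -> list[str]:
    """Merge several ranked ID lists; each hit contributes 1 / (k + rank)."""
    scores: dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: scores[doc_id], reverse=True)


keyword_hits = ["t1", "t2", "t3"]  # hypothetical keyword-search ranking
vector_hits = ["t3", "t1", "t4"]   # hypothetical vector-search ranking

print(reciprocal_rank_fusion([keyword_hits, vector_hits]))
```

Documents that appear near the top of both lists rise to the top of the fused ranking, while documents found by only one method are still retained further down.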

Why IBM Watsonx.ai?

IBM Watsonx.ai brings enterprise-grade foundation models to power the reasoning and generative components of a RAG pipeline. Watsonx.ai’s foundation models, such as the Granite series, offer robust embeddings and advanced reasoning capabilities, enabling the system to process retrieved documents and generate natural language responses tailored to the user’s query. With its focus on transparency, security, and customization, Watsonx.ai is particularly suited for regulated industries like finance. Its integration with tools like LangChain facilitates seamless orchestration between retrieval and generation, enabling RAG systems to go beyond static responses by delivering personalized, insightful, and context-rich outputs.

Method for building an intelligent finance assistant with MongoDB Atlas and IBM Watsonx.ai

For this tutorial, we will be using a financial dataset containing customer details, transactions, spending insights, and metadata. These records represent real-world information such as payments, savings, and expenses, making the dataset highly relevant for building an intelligent finance assistant. To generate the vector embeddings for storing and retrieving this data, we will use the Granite embedding models from IBM Watsonx.ai. These embeddings capture the semantic meaning of financial data, enabling efficient similarity searches and contextual data retrieval.

To follow along, you will need an integrated development environment, a MongoDB Atlas account for data storage and indexing, and an IBM Watsonx.ai account for generating embeddings. By the end of this tutorial, you’ll have a functional system ready to support real-time financial assistance and personalized recommendations.

Prerequisites

Before starting the implementation, ensure you have the following set up:

MongoDB Atlas: Cluster with transaction and customer data collections. MongoDB Atlas will be the primary database for storing and querying transaction and customer data.

Steps:

Create a MongoDB Atlas account

Visit MongoDB Atlas and click “Get Started.”
Sign up using your email or log in with Google, GitHub, or Microsoft.

Set up a cluster

Click “Build a Cluster” after logging in.
Choose a free tier cluster or upgrade for more features.
Select your cloud provider (AWS, Google Cloud, or Azure) and region.
Click “Create Cluster” to deploy (this may take a few minutes).

Configure your cluster

Go to “Database Access” and create a user with a username, password, and role (e.g., “Read and Write to Any Database”).
In “Network Access,” add your IP address or allow all IPs (0.0.0.0/0) for unrestricted development access.

IBM Watsonx.ai: API key for accessing large language models (LLMs). IBM Watsonx.ai will handle the reasoning and generative tasks.

Steps:

Create an IBM Cloud account

Visit IBM Cloud and sign up for a free account.

Set up Watsonx.ai

Log in and search for “Watsonx.ai” in the catalog.
Create an instance; a sandbox environment will be set up automatically.

Generate an API key

Go to “Manage,” then “Access (IAM)” in the IBM Cloud dashboard.
Click “Create API Key,” name it (e.g., “watsonx_key”), and save it securely.

Retrieve service URL

Find the service URL (e.g., https://us-south.ml.cloud.ibm.com) in the Watsonx.ai instance dashboard.

You’re ready to start building your finance assistant!

Implementation steps

To set up and run your finance assistant, follow the steps below to clone, configure, and execute the code. Ensure that your MongoDB Atlas cluster and IBM Watsonx.ai configurations are ready before proceeding.

Step 1: Clone the code repository.

The demo code is available on GitHub.
Clone the project repository from the provided GitHub link, using this command:

git clone <repository_url>
cd <repository_directory>

# Install dependencies (requires Python 3.11 or higher)
pip install -r requirements.txt

This repository contains all the necessary files, including preprocessing.py, processing.py, and the HTML templates.

Step 2: Configure the preprocessing script.

Open the preprocessing.py file. This script is responsible for ingesting and vectorizing the financial data into MongoDB Atlas.
Locate the MONGO_CONN variable and replace it with your MongoDB Atlas connection string:

MONGO_CONN = "<your_mongodb_connection_string>"

Save the file.
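If you would rather not hard-code credentials in the script, one option is to read the connection string from an environment variable. The helper below is a suggested variation rather than part of the repository, and reusing the name `MONGO_CONN` for the environment variable is an assumption made for this example.

```python
import os


def get_mongo_conn(env_var: str = "MONGO_CONN") -> str:
    """Read the Atlas connection string from the environment and sanity-check it."""
    value = os.environ.get(env_var, "").strip()
    if not value:
        raise RuntimeError(f"environment variable {env_var} is not set")
    if not value.startswith(("mongodb://", "mongodb+srv://")):
        raise ValueError("connection string must start with mongodb:// or mongodb+srv://")
    return value
```

You could then set `MONGO_CONN = get_mongo_conn()` in preprocessing.py and export the variable in your shell before running the script.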

Step 3: Run the preprocessing script.

Execute the preprocessing.py script to preprocess and ingest the financial data into MongoDB Atlas:

python preprocessing.py

If the script runs successfully:

A new database named banking_quickstart will be created in your MongoDB Atlas cluster.
The following collections will appear:
faqs
customers_details
transactions_details
spending_insight_details

Screen grab of the collections dashboard in MongoDB Atlas.

The script will also generate vector embeddings for textual data, enabling efficient similarity searches in MongoDB Atlas.
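Conceptually, the embedding step attaches a vector to each record before it is written to the database. The sketch below shows that shape with a placeholder embedder. The field names `description` and `embedding` and the `toy_embed` function are assumptions for illustration, while the real script calls the Granite embedding models.

```python
from typing import Any, Callable, Iterable


def attach_embeddings(
    records: Iterable[dict[str, Any]],
    embed: Callable[[str], list[float]],
    text_field: str = "description",  # hypothetical source text field
    vector_field: str = "embedding",  # hypothetical vector field
) -> list[dict[str, Any]]:
    """Return copies of the records with an embedding added where text exists."""
    enriched = []
    for record in records:
        doc = dict(record)  # copy so the caller's data is not mutated
        text = doc.get(text_field)
        if text:
            doc[vector_field] = embed(text)
        enriched.append(doc)  # records without text are kept unchanged
    return enriched


def toy_embed(text: str) -> list[float]:
    """Placeholder embedder: NOT a real model, just enough to run the sketch."""
    return [float(len(text))]


docs = attach_embeddings([{"description": "coffee"}, {"amount": 5}], toy_embed)
```

The resulting documents can then be inserted into a collection, where the vector field is what the vector search index is built on.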

Screen grab of the banking_quickstart details. Shown via a dashboard in MongoDB Atlas.

Create a vector search index on each of the four collections as shown below, changing the field name to match the embedding field in each collection:

Screen grab of the Index Overview dashboard.
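For reference, an Atlas Vector Search index definition has roughly the JSON shape produced by the helper below. The field name `embedding` and the dimension count of 384 are placeholders. Use the field name from each collection, and a dimension count that matches the output size of the embedding model you chose.

```python
import json
from typing import Any


def vector_index_definition(
    path: str, num_dimensions: int, similarity: str = "cosine"
) -> dict[str, Any]:
    """Build the JSON body for a vector search index on one field."""
    allowed = {"cosine", "euclidean", "dotProduct"}
    if similarity not in allowed:
        raise ValueError(f"similarity must be one of {sorted(allowed)}")
    if num_dimensions <= 0:
        raise ValueError("num_dimensions must be positive")
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": num_dimensions,
                "similarity": similarity,
            }
        ]
    }


# "embedding" and 384 are assumed values; adjust them to your data and model.
print(json.dumps(vector_index_definition("embedding", 384), indent=2))
```

The printed JSON can be pasted into the JSON editor when creating the index in the Atlas UI.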

Step 4: Configure the processing script

Open the processing.py file. This script integrates IBM Watsonx.ai for reasoning and query handling.
Update the following variables:

MONGO_CONN: Your MongoDB Atlas connection string
Watsonx.ai Configuration:

These configurations enable secure access to both MongoDB Atlas and Watsonx.ai for data retrieval and AI-powered query handling.
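One possible way to supply the Watsonx.ai API key and service URL from the prerequisites without editing them into the source file is sketched below. The environment variable names `WATSONX_API_KEY` and `WATSONX_URL` are hypothetical, and the variables that processing.py actually expects may be named differently or include additional settings.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional


@dataclass(frozen=True)
class WatsonxSettings:
    api_key: str
    url: str


def load_watsonx_settings(
    environ: Optional[Mapping[str, str]] = None,
) -> WatsonxSettings:
    """Load the API key and service URL, failing fast if either is missing."""
    env = os.environ if environ is None else environ
    required = ("WATSONX_API_KEY", "WATSONX_URL")  # hypothetical variable names
    missing = [name for name in required if not env.get(name)]
    if missing:
        raise RuntimeError("missing environment variables: " + ", ".join(missing))
    return WatsonxSettings(
        api_key=env["WATSONX_API_KEY"],
        url=env["WATSONX_URL"].rstrip("/"),  # normalize a trailing slash
    )
```

Failing at startup with a clear message is usually easier to debug than an authentication error surfacing later in the middle of a request.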

Step 5: Run the processing script

Execute the processing.py file to start the backend Flask server: python processing.py.
If the server starts successfully, the application will be hosted locally at http://127.0.0.1:5000.

Step 6: Access the application

Open your browser and navigate to the following URL: http://127.0.0.1:5000/login
You will see the finance assistant login page. Use the provided credentials (or modify the preprocessing.py script to create custom login data).
Use any customer ID between 1 and 1000, e.g., CUST0571.
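Judging from the single example above, customer IDs appear to be the prefix CUST followed by the number zero-padded to four digits. The helper below encodes that assumption and is not taken from the repository.

```python
def format_customer_id(number: int) -> str:
    """Format a customer number as an ID, assuming a CUST + 4-digit pattern."""
    if not 1 <= number <= 1000:
        raise ValueError("customer number must be between 1 and 1000")
    return f"CUST{number:04d}"


print(format_customer_id(571))
```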

Figure 2. Customer login portal.

Screen grab of the customer login portal

Figure 3. Finance assistant dashboard.

Screen grab of the finance assistant dashboard with a chatbot function.

Additional technical details

Preprocessing with Watsonx.ai: During the execution of preprocessing.py, the Granite embedding models from Watsonx.ai are used to vectorize textual data (e.g., transaction descriptions). The generated embeddings are stored in MongoDB Atlas for similarity-based queries.
API configuration: The processing.py script integrates with IBM Watsonx.ai’s Granite language models to process natural language queries and generate meaningful responses.
Server logs: Check the terminal logs for any errors or status updates during the execution of the Flask server. Logs provide insights into API calls, database interactions, and AI responses.

The power of advanced vector search and enterprise AI

Building a finance assistant using MongoDB Atlas and IBM Watsonx.ai demonstrates the power of combining advanced vector search capabilities with enterprise-grade AI models. This architecture not only provides real-time, accurate, and personalized financial insights but also highlights the scalability and flexibility needed for modern financial applications.

In this tutorial, you’ve learned how to:

Preprocess financial data using Watsonx.ai’s Granite embedding models to create vector embeddings.
Store and query data efficiently in MongoDB Atlas using its Vector Search index and hybrid search capabilities.
Integrate IBM Watsonx.ai’s foundation models for intelligent reasoning and natural language understanding.
Build a seamless user interface to enable customers to access their financial information intuitively.

And with this system, you can deliver:

Personalized financial insights: Deliver tailored responses for individual users based on their financial data.
Scalable performance: Effortlessly handle large datasets and complex queries.
Enhanced user experiences: Provide customers with real-time, explainable, and context-aware recommendations.

As the financial services sector continues to evolve, combining tools like MongoDB Atlas and IBM Watsonx.ai will become essential for delivering smarter AI-driven solutions. You can easily extend this architecture to include advanced analytics, fraud detection, or even investment forecasting, making it a robust foundation for future innovation.

Ready to take your finance assistant to the next level? Start experimenting with more data, refining AI prompts, or exploring MongoDB Atlas and Watsonx.ai’s advanced features to unlock even greater potential!

To fast-track your AI journey, explore the MongoDB AI Applications Program (MAAP). It brings together cutting-edge technologies and expert services from top AI and tech leaders including IBM to help your organization move seamlessly from concept to road map, prototype, and full-scale production.
