Learn how generative AI can generate detailed risk assessments and how MongoDB enables comprehensive loan risk analysis.
Use cases: Gen AI, Lending and Leasing
Industries: Financial Services
Products and tools: MongoDB Atlas, Geospatial Data
Partners: Google Maps API, Fireworks.ai
Solution Overview
Business loans are important for banking operations and provide significant benefits to financial institutions and economies. For example, in 2023 the value of commercial and industrial loans in the United States reached nearly $2.8 trillion. A business loan involves a business plan, which details the borrower's plans and financial projections and helps lenders evaluate the business's goals and profitability. However, reading through borrower credit information is challenging for loan officers due to time constraints and the complexity of the material. Additionally, loans themselves present risks for banks, such as credit risk, where the borrower defaults, and the risk that an economic downturn impairs borrowers' ability to repay.
This solution uses MongoDB and Generative AI (Gen AI) to analyze business plans and generate detailed risk assessments for business loans. MongoDB stores the contextual data that powers an AI chatbot, which you can query about specific risk assessments.
Interactive Risk Analysis with Generative AI-powered Chatbots
Figure 1 below shows an example of how ChatGPT-4o responds when you ask it to assess the risk of a business loan. Although the input of the loan purpose and business description is simple, Generative AI offers a detailed analysis.

Figure 1. Example of a ChatGPT-4o response for business loan risk assessment
By applying Generative AI to risk assessments, lenders can explore additional risk factors that Generative AI can evaluate, such as the risk of natural disasters or broader climate risk. In Figure 2, the user specifically adds flood risk as a factor to the previous question.

Figure 2. Example of a ChatGPT-4o response to flood risk as a factor
The response indicates a low risk of flooding. However, ChatGPT-4o suggested reviewing FEMA flood maps and local flood history, which indicates that it might not have the latest information. To validate the information, you can ask ChatGPT-4o the same question phrased in a different way, focusing on its knowledge of flood data. See Figure 3 for an example of this question and response.

Figure 3. Example of location-specific flood question
In the query shown, ChatGPT-4o now indicates that there has been “significant flooding” nearby and provides references to evidence, after performing an internet search across four sites, which it did not perform previously.
When ChatGPT-4o does not have the relevant data, it starts to make false claims, or hallucinations, such as when it indicated a low flood risk in the first two queries due to lack of information. However, it can also recognize and intelligently seek additional data sources to fill its knowledge gaps.
A similar test was performed on Llama 3, which is hosted by MongoDB's MAAP partner Fireworks.ai. The experiment tested Llama 3's knowledge of flood data and showed a knowledge gap similar to that of ChatGPT-4o. However, rather than presenting misleading answers as fact, Llama 3 provided a hallucinated list of flood data but highlighted that “this data is fictional and for demonstration purposes only.”

Figure 4. LLM response to fictional flood locations
Retrieval-Augmented Generation (RAG) Risk Analysis
While Gen AI can augment business loan analysis, interacting with a chatbot requires loan officers to repeatedly prompt the bot and augment their questions with relevant information. This can be time-consuming and impractical due to a lack of prompt-engineering skills or necessary data.
This solution uses Gen AI to augment the risk analysis process and fill the LLM's knowledge gap. It uses MongoDB to store data and uses geospatial queries to discover floods within five kilometers of the proposed business location.
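To make the five-kilometer search concrete, the following minimal sketch computes great-circle distances in plain Python and keeps only the flood records inside the radius. It illustrates the idea only; in this solution, MongoDB performs the equivalent filtering with a geospatial query. The function names, the `lon` and `lat` field names, and the sample coordinates are hypothetical and are not taken from the demo.

```python
import math

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius; a spherical approximation


def haversine_km(lon1, lat1, lon2, lat2):
    """Great-circle distance in kilometers between two (longitude, latitude) points."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    d_phi = phi2 - phi1
    d_lambda = math.radians(lon2 - lon1)
    a = (
        math.sin(d_phi / 2) ** 2
        + math.cos(phi1) * math.cos(phi2) * math.sin(d_lambda / 2) ** 2
    )
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def floods_within(floods, lon, lat, radius_km=5.0):
    """Return the flood records located within radius_km of (lon, lat)."""
    return [
        f for f in floods
        if haversine_km(lon, lat, f["lon"], f["lat"]) <= radius_km
    ]


# Hypothetical sample data: two flood records near a proposed business location.
sample_floods = [
    {"id": "A", "lon": -90.00, "lat": 30.02},  # roughly 2 km north of the location
    {"id": "B", "lon": -90.00, "lat": 30.10},  # roughly 11 km north of the location
]
nearby = floods_within(sample_floods, lon=-90.00, lat=30.00)
```

With these sample values, only record A falls inside the five-kilometer radius. A linear scan like this does not scale to large datasets, which is one reason to delegate the search to an indexed geospatial query.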
In this demo, you provide a business location, a business purpose, and a description of a business plan. The demo also includes an "Example" button that generates a brief business description.

Figure 5. User input for the loan risk assessment demonstration
When you submit your input, the demo provides a risk analysis using RAG. It uses prompt engineering to provide a simplified analysis of the business while considering the location and the flood risk data that has been downloaded from external flood data sources.

Figure 6. Loan risk response with a RAG architecture
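The prompt-engineering step described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the demo's actual prompt: the function name `build_risk_prompt`, the prompt wording, and the sample inputs are hypothetical. It assumes each retrieved flood record carries a `year` field and a `DISTANCE` field in meters, as returned by a `$geoNear` query against a GeoJSON point.

```python
def build_risk_prompt(location, purpose, description, floods, radius_km=5):
    """Assemble a prompt that grounds the risk question in retrieved flood records.

    `floods` is assumed to be a list of dicts with `year` and `DISTANCE` (meters).
    """
    if floods:
        flood_context = "\n".join(
            f"- {f['year']}: flood recorded {f['DISTANCE'] / 1000:.1f} km from the location"
            for f in floods
        )
    else:
        flood_context = f"- No flood records found within {radius_km} km."

    return (
        "You are assisting a loan officer with a business loan risk assessment.\n"
        "Base any statement about flood risk only on the flood records listed below.\n\n"
        f"Business location: {location}\n"
        f"Loan purpose: {purpose}\n"
        f"Business description: {description}\n\n"
        f"Flood records within {radius_km} km:\n{flood_context}\n\n"
        "Provide a concise risk analysis."
    )


# Hypothetical inputs; the flood record mimics one retrieved document.
prompt = build_risk_prompt(
    location="New Orleans, LA",
    purpose="Working capital",
    description="A small bakery expanding to a second storefront.",
    floods=[{"year": 2019, "DISTANCE": 1500.0}],
)
```

Placing the retrieved records directly in the prompt, and stating explicitly when none were found, gives the model concrete data to cite instead of leaving it to guess about local flood history.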
You can reveal all the sample flood locations by clicking on the "Pin" icon in the demo. In the image, geolocation pins represent the flood location and the blue circle indicates the five-kilometer radius in which flood data is queried.

Figure 7. Flood locations displayed in the demo
Reference Architectures
The following diagram provides an overview of this solution's architecture:

Figure 8. RAG data flow architecture diagram
With MongoDB, developers can enhance the RAG process by using features such as network graphs, time series collections, and vector search. These features enrich the context available to the Gen AI agent. For example, geospatial data identifies flood risk locations, which reduces hallucination.
The iterative nature of the RAG process allows the solution to incorporate new data and feedback continuously, which can lead to increasingly accurate risk assessments and fewer hallucinations.
Data Model Approach
The code snippet below is an example of a geospatial query. This example uses the $geoNear aggregation stage, which allows the user to fetch all locations within a given distance of a point specified by longitude and latitude. You can use an aggregation pipeline to include other data processing operations, such as selecting specific fields by using $project, or filtering based on certain conditions by using $match.
The data used in this demo comes from the United States Flood Database, which contains data from multiple sources. The most recent dataset is from 2020.
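pipeline = [
    # Find flood locations within `radius` kilometers of the given point.
    {
        "$geoNear": {
            "near": {"type": "Point", "coordinates": [longitude, latitude]},
            "distanceField": "DISTANCE",
            "spherical": True,
            "maxDistance": radius * 1000,  # meters, because `near` is a GeoJSON point
        }
    },
    # Keep only the fields needed for the risk analysis.
    {"$project": {"year": 1, "COORD": 1, "DISTANCE": 1}},
    # Keep only floods recorded in 2016 or later.
    {"$match": {"year": {"$gte": 2016}}},
]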
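The pipeline above can be wrapped in a small helper so that the location, radius, and year cutoff become parameters. The sketch below is one possible variation rather than the demo's implementation: it moves the year filter into the `query` option of `$geoNear` instead of using a separate `$match` stage. The function name `build_flood_pipeline`, its default values, and the database and collection names in the comments are hypothetical. It assumes the `COORD` field holds a GeoJSON point; `$geoNear` requires a geospatial index on the queried field.

```python
def build_flood_pipeline(longitude, latitude, radius_km=5, min_year=2016):
    """Build a $geoNear pipeline for floods within radius_km of a point."""
    return [
        {
            "$geoNear": {
                "near": {"type": "Point", "coordinates": [longitude, latitude]},
                "distanceField": "DISTANCE",
                "spherical": True,
                "maxDistance": radius_km * 1000,  # kilometers converted to meters
                "query": {"year": {"$gte": min_year}},  # filter applied within $geoNear
            }
        },
        {"$project": {"year": 1, "COORD": 1, "DISTANCE": 1}},
    ]


pipeline = build_flood_pipeline(-90.07, 29.95)

# Usage against a live deployment (requires pymongo; names below are hypothetical):
#   from pymongo import MongoClient, GEOSPHERE
#   floods = MongoClient(uri)["demo_db"]["floods"]
#   floods.create_index([("COORD", GEOSPHERE)])  # 2dsphere index needed by $geoNear
#   results = list(floods.aggregate(pipeline))
```

Building the pipeline in a pure function keeps it easy to inspect and test without a database connection. Each returned document includes a `DISTANCE` value in meters, which can be passed to the LLM as context.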
Build the Solution
The code to demonstrate all the features of MongoDB for building this solution is available in the following GitHub repo.
Authors
Wei You Pan, Global Director, Financial Industry Solutions, MongoDB