Use cases: Gen AI, Fraud Prevention
Industries: Financial Services, Insurance
Products and tools: MongoDB Atlas, MongoDB Atlas Charts, MongoDB Data Federation
Partners: Amazon S3, Amazon SageMaker Canvas
Solutions Overview
Financial institutions face growing risks from cybercriminals, including high-profile hacks and fraudulent transactions. Cyber incidents undermine customer trust and can result in significant financial losses for companies. Companies struggle to implement secure systems because of the limitations of legacy fraud systems, which include:
Incomplete data visibility: Lack of access to relevant data sources for pattern detection.
Latency within fraud systems: Lack of real-time processing capabilities that causes fraud detection delays.
Weak security protocols: Outdated security that exposes vulnerabilities to cyber attacks.
Technical sprawl: Diverse technologies that complicate maintenance and updates.
Poor team collaboration: Siloed approaches that lead to delayed responses.
To overcome these challenges, financial companies can use real-time analytics solutions powered by MongoDB Atlas and Amazon SageMaker Canvas. These tools deliver strong fraud detection systems that use the most accurate data available for their operations.
In this system, MongoDB Atlas stores the operational data and processes high-volume transactions, while Amazon SageMaker Canvas uses sophisticated AI and machine learning (ML) tools to power advanced analytics for fraud detection.
Reference Architectures
Below is the architecture used to build this fraud detection solution. The architecture includes an end-to-end solution for detecting different types of fraud in the banking sector, including card fraud detection, identity theft detection, and consumer fraud detection.
The architecture diagram illustrates model training and near real-time inference. The operational data stored in MongoDB Atlas is written to an Amazon S3 bucket using MongoDB Atlas Triggers. Once stored in S3, the data is used to create and train the model in Amazon SageMaker Canvas. SageMaker Canvas stores the metadata for the model in the S3 bucket and exposes the model endpoint for inference.
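The inference half of this flow can be sketched roughly as follows. This is a minimal illustration, not the code from the solution repository: the feature order in FEATURE_COLUMNS, the helper names to_csv_payload and predict, and the assumption that the deployed endpoint accepts a single text/csv row are all hypothetical. The field names are borrowed from the Data Model Approach section. The boto3 call is imported lazily so that the payload helper can be tried without AWS credentials.

```python
import csv
import io
from typing import Any, Dict, Sequence

# Assumed feature order (hypothetical). It must match the column order
# the model was trained on in SageMaker Canvas.
FEATURE_COLUMNS = [
    "Card_no", "Card_type", "Email_domain", "IpAddress",
    "PhoneNo", "DeviceID", "ProductCD", "TransactionAmt",
]


def to_csv_payload(record: Dict[str, Any],
                   columns: Sequence[str] = FEATURE_COLUMNS) -> str:
    """Serialize one transaction record as a single CSV row.

    Missing or None fields become empty cells, since not every
    transaction has identity details.
    """
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(["" if record.get(c) is None else record[c] for c in columns])
    return buffer.getvalue().rstrip("\n")


def predict(record: Dict[str, Any], endpoint_name: str) -> str:
    """Send one record to a SageMaker endpoint and return the raw response text.

    endpoint_name is whatever name the model was deployed under (assumption).
    """
    import boto3  # lazy import: only needed when actually calling AWS

    client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=to_csv_payload(record),
    )
    return response["Body"].read().decode("utf-8")
```

The raw response is returned undecoded on purpose, because the exact response layout depends on how the model was built and deployed.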

Figure 1. Fraud detection architecture
Data Model Approach
The data is divided into two separate files:
Transaction
Identity
These files are connected through the TransactionID.
However, not every transaction includes associated identity details.
Based on these two datasets, prepare a test dataset by joining them on TransactionID and adding the target column, isFraud.
Data courtesy of Kaggle.
Source Table 1: Transaction
TransactionID, TransactionDT, Card_no, Card_type, Email_domain, ProductCD, TransactionAmt, Transaction_ID

Source Table 2: Identity
TransactionID, IpAddress, PhoneNo, DeviceID, Location, Name, Address

Test Data:
TransactionID, Card_no, card_type, Email_domain, IpAddress, PhoneNo, DeviceID, ProductCD, TransactionAmt, isFraud
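Because not every transaction has identity details, the join on TransactionID behaves like a left join: every transaction is kept, and the identity fields are left empty when no match exists. The sketch below illustrates that idea in plain Python. The helper name build_test_rows, the sample values, and the separate labels mapping that supplies isFraud are assumptions made for illustration, and Card_type keeps its source-table spelling rather than the lowercase form in the test data listing.

```python
from typing import Any, Dict, List

# Columns copied from each source table into the joined row.
TRANSACTION_FIELDS = ["Card_no", "Card_type", "Email_domain", "ProductCD", "TransactionAmt"]
IDENTITY_FIELDS = ["IpAddress", "PhoneNo", "DeviceID"]


def build_test_rows(transactions: List[Dict[str, Any]],
                    identities: List[Dict[str, Any]],
                    labels: Dict[int, int]) -> List[Dict[str, Any]]:
    """Left-join transactions to identities on TransactionID and attach isFraud."""
    identity_by_id = {row["TransactionID"]: row for row in identities}
    joined = []
    for txn in transactions:
        txn_id = txn["TransactionID"]
        # An empty dict stands in for a missing identity record.
        identity = identity_by_id.get(txn_id, {})
        row = {"TransactionID": txn_id}
        row.update({f: txn.get(f) for f in TRANSACTION_FIELDS})
        row.update({f: identity.get(f) for f in IDENTITY_FIELDS})
        row["isFraud"] = labels.get(txn_id)
        joined.append(row)
    return joined


# Invented sample data, for illustration only.
transactions = [
    {"TransactionID": 1, "Card_no": "4111", "Card_type": "credit",
     "Email_domain": "example.com", "ProductCD": "W", "TransactionAmt": 59.0},
    {"TransactionID": 2, "Card_no": "5500", "Card_type": "debit",
     "Email_domain": "example.org", "ProductCD": "C", "TransactionAmt": 1250.0},
]
identities = [
    {"TransactionID": 1, "IpAddress": "203.0.113.7", "PhoneNo": "555-0100", "DeviceID": "dev-1"},
]
labels = {1: 0, 2: 1}  # hypothetical fraud labels keyed by TransactionID

rows = build_test_rows(transactions, identities, labels)
for row in rows:
    print(row)
```

Transaction 2 has no identity record, so its IpAddress, PhoneNo, and DeviceID come out as None rather than the row being dropped.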
Build the Solution
The detailed step-by-step guide to build this solution is available in this GitHub repo. Below is an overview of the steps:
Set up the S3 bucket to which the MongoDB Atlas data needs to be exported.
Set up a MongoDB Atlas cluster.
Set up the Amazon SageMaker domain.
Key Learnings
Develop real-time fraud detection solutions: MongoDB Atlas handles large amounts of data in a flexible schema, empowering financial institutions to capture, store, and process high-volume transactional data in real time.
Update fraud detection models: Real-time processing with MongoDB's aggregation pipeline ensures that models are continuously trained with the most current and relevant information available. This capability gives financial institutions a powerful tool to create a robust fraud detection system.
Integrate sophisticated AI and ML tools: MongoDB integrates with external services, such as Amazon SageMaker, which offers AI and ML solutions in a no-code platform. This user-friendly interface makes models accessible to analysts, enabling them to easily generate accurate ML predictions for classification, regression, forecasting, natural language processing (NLP), and computer vision (CV).
Authors
Babu Srinivasan, Partner Solutions Architect at MongoDB
Igor Alekseev, Partner Solutions Architect at AWS