Fraud detection accelerator using AWS SageMaker

Revolutionize fraud detection in finance with MongoDB Atlas and Amazon SageMaker Canvas. Leverage real-time data and AI for stronger defenses against cybercrime.
Solutions overview

Financial services organizations face growing risks from cybercriminals. High-profile hacks and fraudulent transactions undermine faith in the industry. As technology evolves, so do the techniques these perpetrators employ, making the fight against fraud a perpetual challenge. Existing fraud detection systems often share a critical limitation: they rely on stale data, while the newest tactics are often visible only in the most recent data. That's where operational data comes into play.

Real-time data lets fraud detection models train on the most accurate and relevant signals available. MongoDB Atlas, a highly scalable and flexible developer data platform, paired with Amazon SageMaker Canvas, a no-code machine learning tool, gives financial institutions a way to put that operational data to work against cybercriminals who exploit vulnerabilities for illicit gain. MongoDB Atlas serves as the operational data store, handling high-volume transactional data with strong performance and flexibility. Amazon SageMaker Canvas, meanwhile, lets business analysts build AI/ML solutions without writing code.

Challenges with legacy fraud systems:
  • Incomplete data visibility from legacy systems: Lack of access to relevant data sources hampers fraud pattern detection.
  • Latency issues in fraud prevention systems: Legacy systems lack real-time processing, causing delays in fraud detection.
  • Difficulty in adapting legacy systems: Inflexibility hinders the adoption of advanced fraud prevention technologies.
  • Weak security protocols in legacy systems: Outdated security exposes vulnerabilities to cyber attacks.
  • Operational challenges due to technical sprawl: Diverse technologies complicate maintenance and updates.
  • Lack of collaboration between teams: Siloed approach leads to delayed solutions and higher overhead.
Reference architectures

The architecture below was used to build this fraud solution. It covers end-to-end detection of several types of fraud in the banking sector, including card fraud, identity theft, account takeover, money laundering, consumer fraud, insider fraud, and mobile banking fraud.

The architecture diagram illustrates model training and near real-time inference. Operational data stored in MongoDB Atlas is written to an Amazon S3 bucket using the Triggers feature in Atlas Application Services. The data stored there is then used to create and train the model in Amazon SageMaker Canvas. SageMaker Canvas stores the model's metadata in the S3 bucket and exposes a model endpoint for inference.
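The export step of this flow can be sketched roughly as follows. This is an illustrative Python sketch only: Atlas functions themselves are written in JavaScript, and the solution's actual trigger code lives in its repository. The function names, the `transactions` key prefix, and the date-partitioned key layout are assumptions made for this example; `fullDocument` is the standard change event field holding the inserted document.

```python
import json
from datetime import datetime, timezone


def build_s3_object(change_event, exported_at, prefix="transactions"):
    """Turn one insert change event into an (object key, JSON body) pair."""
    doc = change_event["fullDocument"]
    # Hypothetical layout: one object per transaction, grouped by export date.
    day = exported_at.astimezone(timezone.utc).strftime("%Y-%m-%d")
    key = f"{prefix}/{day}/{doc['TransactionID']}.json"
    # default=str keeps non-JSON values (dates, ObjectIds) serializable.
    body = json.dumps(doc, default=str, sort_keys=True)
    return key, body


def upload(change_event, bucket):
    """Write the document to S3. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so build_s3_object has no AWS dependency

    key, body = build_s3_object(change_event, datetime.now(timezone.utc))
    boto3.client("s3").put_object(Bucket=bucket, Key=key, Body=body.encode("utf-8"))
    return key
```

Keeping the key and body construction separate from the upload call makes the serialization easy to check without touching AWS.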

Reference architecture diagram.
Data model approach

The data is divided into two separate files: one containing identity information and the other containing transaction data. These files are connected through the TransactionID. It's important to note that not every transaction includes associated identity details.

Based on these two datasets, we prepare a test join on TransactionID and add Fraud as the target column.
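A minimal sketch of such a join is shown below, in plain Python. It assumes a left join, so that transactions without identity details are kept, and it assumes the fraud label arrives on each transaction record under a hypothetical `isFraud` field that is renamed to `Fraud`. The `TransactionAmt` and `DeviceType` field names in the usage example are placeholders, not confirmed column names.

```python
def join_on_transaction_id(transactions, identities, label_field="isFraud"):
    """Left-join identity rows onto transactions and add a Fraud target column."""
    identity_by_id = {row["TransactionID"]: row for row in identities}
    joined = []
    for txn in transactions:
        # Copy the transaction without the raw label field.
        row = {k: v for k, v in txn.items() if k != label_field}
        # Not every transaction has identity details; those rows stay as-is.
        row.update(identity_by_id.get(txn["TransactionID"], {}))
        row["Fraud"] = int(txn[label_field])
        joined.append(row)
    return joined


transactions = [
    {"TransactionID": 1, "TransactionAmt": 50.0, "isFraud": 0},
    {"TransactionID": 2, "TransactionAmt": 900.0, "isFraud": 1},
]
identities = [{"TransactionID": 2, "DeviceType": "mobile"}]
rows = join_on_transaction_id(transactions, identities)
```

Here only the second row gains a `DeviceType` value, because the first transaction has no matching identity record.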

Data courtesy of Kaggle.

Building the solution

The detailed step-by-step guide to building this solution can be found in this GitHub repo. Below is an overview of the steps:

  1. Set up the S3 bucket to which the MongoDB Atlas data will be exported
  2. Set up a MongoDB Atlas Cluster
  3. Set up Atlas Application Services
  4. Set up the AWS SageMaker domain
MongoDB Atlas as the operational data store

The MongoDB Atlas developer data platform is an integrated suite of data services centered on a cloud database designed to accelerate and simplify how developers build with data. Its flexible schema and ability to handle large data volumes let financial institutions capture, store, and process high-volume transactional data in real time. Every transaction, interaction, and piece of operational data can feed the fraud detection pipeline, so models are continuously trained on the most current and relevant information available. With MongoDB Atlas, financial institutions can use operational data to build a robust, proactive defense against fraud.

Amazon SageMaker Canvas as an AI/ML solution

Amazon SageMaker Canvas gives business analysts a no-code platform for AI/ML. Traditionally, implementing AI/ML models required specialized technical expertise, which put them out of reach for many business analysts. SageMaker Canvas removes this barrier with a visual point-and-click interface for generating ML predictions for classification, regression, forecasting, natural language processing (NLP), and computer vision (CV). Analysts can draw insights from their data and make data-driven decisions without dealing with the underlying technical complexity. It also supports collaboration between business analysts and data scientists, who can share, review, and update ML models across tools.
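Once a model has been deployed to an endpoint, an application could request a prediction along the following lines. This is a hedged sketch, not the solution's own code: it assumes the endpoint accepts a single `text/csv` row whose columns are in the same order as the training data, and the helper names, endpoint name, and feature names are hypothetical. The `invoke_endpoint` call is the boto3 SageMaker Runtime API.

```python
import csv
import io


def to_csv_payload(features, columns):
    """Serialize one transaction as a single CSV row in a fixed column order."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="")
    # Missing features become empty fields rather than shifting the columns.
    writer.writerow([features.get(col, "") for col in columns])
    return buffer.getvalue()


def predict(endpoint_name, features, columns):
    """Call the model endpoint. Requires boto3 and AWS credentials."""
    import boto3  # imported lazily so to_csv_payload has no AWS dependency

    runtime = boto3.client("sagemaker-runtime")
    response = runtime.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=to_csv_payload(features, columns),
    )
    # The response body is a stream; its format depends on the deployed model.
    return response["Body"].read().decode("utf-8")
```

The column order matters more than the field names here, so the payload builder takes the order explicitly instead of relying on dictionary ordering.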

Key learnings
  • Understand how Atlas Application Services and Atlas Charts are used to build products at scale.
  • Understand how MongoDB integrates natively with external services (such as AWS SageMaker and AWS S3) to provide even more powerful applications.
Technologies and products used
MongoDB developer data platform:
Partner technologies:
  • AWS S3
  • AWS SageMaker Canvas
Authors
  • Babu Srinivasan, Partner Solutions Architect at MongoDB
  • Igor Alekseev, Partner Solutions Architect at AWS
Related resources

GitHub Repository: Fraud Detection

Create this demo for yourself by following the instructions and associated models in this solution’s repository.


Fraud Detection Immersion Day

Work your way through this e-workshop designed to show you the steps required for building this solution.


Fraud Prevention with MongoDB

Analyze and detect fraud in real time with MongoDB.
