
Building a Dynamic Pricing Microservice with Vertex AI and MongoDB Atlas

Francesco Baldissera, Sebastian Rojas Arbulu · 17 min read · Published Jun 21, 2024 · Updated Jun 21, 2024
In the hyper-competitive world of e-commerce, crafting a winning pricing strategy is essential for growth. Thankfully, big data and machine learning have revolutionized pricing. Businesses can now leverage real-time customer behavior and competitor data to dynamically adjust prices.
This tutorial dives into building a responsive dynamic pricing microservice that enables prices to be adjusted in real time for maximum effectiveness. We'll explore using MongoDB Atlas for its efficient data storage and management while leveraging Google Cloud Platform's (GCP) power for complex calculations and hosting. By the end, you'll be equipped to implement this approach and unlock the potential of data-driven pricing.
The following animation illustrates what we aim to achieve:
[Animation: a fashion e-commerce storefront showing dynamically predicted prices alongside actual prices]
As seen, this e-commerce store displays a dynamically predicted price alongside the actual product price. The predicted price is calculated in real time using machine learning algorithms that analyze market trends, demand, competitor prices, customer behavior, and sales data to optimize sales and profit.

Data model overview

Before we begin, let's establish context with an overview of our data model. Our microservice leverages MongoDB Atlas, a developer data platform, to power real-time AI in our e-commerce app. Atlas stores our ML features in two key collections, acting as a feature store. This streamlines data management, automates decision-making, and isolates workloads. With change streams and triggers, updates flow seamlessly to our AI models, minimizing operational overhead for both business and MLOps.

Products collection

The Products collection in MongoDB Atlas is organized using the polymorphic pattern. The polymorphic pattern is useful when we want to query documents of varying shapes from a single collection. Grouping documents together based on the queries we want to run, instead of separating the objects across tables or collections, helps improve performance. By centralizing various product types, this pattern streamlines data management and improves query efficiency. The following table outlines the key fields of the Products collection (the dataset itself is not public); use it to create a matching collection in MongoDB Atlas. The table includes the data type and a brief description of each field:
| Field Name | Data Type | Description |
| --- | --- | --- |
| _id | Object ID | Unique identifier for the document; used by MongoDB internally |
| name | String | Name of the product |
| code | String | Unique code identifying the product |
| autoreplenishment | Boolean | Indicates if the product is set up for auto-replenishment |
| id | Integer | Numeric identifier for the product, used for external reference |
| gender | String | Gender category the product is intended for |
| masterCategory | String | Broad category for the product |
| subCategory | String | More specific category under the master category |
| articleType | String | Type of article, e.g., Vest |
| baseColour | String | Primary color of the product |
| season | String | Season the product is intended for |
| year | Integer | Year of the product release |
| usage | String | Intended use of the product, e.g., Casual |
| image | Object | Contains the URL of the product image |
| price | Object | Nested structure containing the amount and currency of the product price |
| description | String | Detailed description of the product |
| brand | String | Brand of the product |
| items | Array of Objects | Variants of the product, including size, stock information, and delivery time |
| total_stock_sum | Array of Objects | Aggregated stock information across different locations |
| pred_price | Double | Predicted price of the product from the machine learning model; a double for precision in pricing predictions |
The JSON objects within the collection should appear as follows:
[Image: sample product documents in the Products collection]
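Since the screenshot is not reproduced here, a representative product document following the schema above might look like this (all values are invented examples):

```python
# A sample Products document matching the schema table above.
# Every value here is illustrative, not taken from the real dataset.
product = {
    "name": "Quilted Puffer Vest",
    "code": "PV-2024-001",
    "autoreplenishment": False,
    "id": 15970,
    "gender": "Men",
    "masterCategory": "Apparel",
    "subCategory": "Topwear",
    "articleType": "Vest",
    "baseColour": "Navy Blue",
    "season": "Fall",
    "year": 2024,
    "usage": "Casual",
    "image": {"url": "https://example.com/images/15970.jpg"},
    "price": {"amount": 44.99, "currency": "USD"},  # nested price structure
    "description": "Lightweight quilted vest for cool weather.",
    "brand": "ExampleBrand",
    "items": [
        {"size": "M", "stock": {"amount": 12}, "delivery_time_days": 2},
    ],
    "total_stock_sum": [{"location": "NYC-warehouse", "amount": 12}],
    "pred_price": 46.10,  # written back by the pricing model
}
```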

Events collection (feature store)

Utilizing MongoDB's Events collection as an ML feature store (a centralized repository designed to streamline the management and delivery of features used by machine learning models) offers numerous advantages. Your feature store is always accessible, reducing downtime and improving the efficiency of machine learning operations. In addition, cross-region deployments enhance performance by bringing features closer to the models that use them, reducing latency for faster model training and serving.
This collection (not public) stored in MongoDB Atlas serves as a repository for user behavior events crucial for training our pricing prediction model. The key fields you will use to create the collection, according to this schema, are as follows:
| Field | Data Type | Description | Example Values |
| --- | --- | --- | --- |
| product_name | String | The name of the product | "MongoDB Notebook" |
| product_id | Integer | Unique identifier for the product | 98803 |
| action | String | Type of action performed on the product (user interaction) | "view", "add_to_cart", "purchase" |
| price | Float | Price of the product | 18.99 |
| timestamp | String | ISO-format timestamp of when the event occurred | "2024-03-25T12:36:25.428461" |
| encoded_name | Integer | An encoded version of the product name for machine learning models | 23363195 |
| tensor | Array | A numerical representation of the product extracted through machine learning techniques; its size can vary with the model's requirements | [0.0005624396083488047, -0.9579731008383453] |
An Events object, with its associated tensor, should look like this: [Image: sample Events document]
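Assembled from the example values in the table above, one such event document would be:

```python
# An Events document built from the example values in the schema table.
event = {
    "product_name": "MongoDB Notebook",
    "product_id": 98803,
    "action": "view",
    "price": 18.99,
    "timestamp": "2024-03-25T12:36:25.428461",
    "encoded_name": 23363195,
    # The tensor produced for this event; its length depends on the model.
    "tensor": [0.0005624396083488047, -0.9579731008383453],
}
```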

Demo components

Our solution incorporates several components to facilitate dynamic pricing:
  • Data ingestion: Pub/Sub acts as a high-speed pipeline, efficiently bringing in large amounts of customer behavior data formatted as JSON.
  • Data processing: Vertex AI Workbench provides a clean environment for data cleaning and training TensorFlow models. These models analyze customer events, product names, and existing prices to predict the optimal price for each item.
  • Feature storage: MongoDB Atlas serves as a central hub for all the features used by the model. This ensures consistency between the data used for training and the data used for real-time predictions, as well as the operational data for your applications, thereby reducing the overhead of “in-app analytics.” It also simplifies the overall process by keeping everything in one place.
  • Model orchestration: Cloud Functions act like a conductor, directing the flow of customer event data. They take the data from Pub/Sub, transform it into a format usable by the model (tensors), and store it in MongoDB Atlas. This allows the model to easily access the information it needs.

Architecture overview

The architecture is designed to enhance pricing strategies through deep learning and continuous model optimization.

Blue data flow: Real-time pricing adjustment

  • Event ingestion: Customer event data is ingested into a Google Cloud Pub/Sub topic, serving as the entry point for real-time data.
  • Data processing: A Cloud function is triggered via a push subscription from the Pub/Sub topic. This function transforms raw event data into a structured tensor format.
  • Model invocation and price update: The same Cloud function calls a deployed model endpoint (e.g., in Vertex AI) with the tensor data to predict pricing. It then updates the predicted price in the MongoDB product catalog collection.
Figure 1. Dynamic pricing architecture integrating different Google Cloud components and MongoDB Atlas as a feature store

Green data flow: Feature store building

  • Feature store update: Concurrently, the Cloud function pushes the tensor data into the MongoDB Events collection, which acts as a feature store. Every event has its own tensor.
  • Versioning and accessibility: The data within the feature store is not currently versioned. The versioning pattern is useful for a feature store because it keeps older revisions of data in MongoDB, avoiding the need for a separate version-management system.
Important: Make sure to check out the versioning pattern guide. Versioning in a feature store enhances reproducibility, traceability, collaboration, and compliance in MLOps workflows, making it an essential component for managing machine learning pipelines effectively.


Let's get started building your AI pricing microservice! To seamlessly integrate AI pricing into your application, you'll need to set up the following components:
  1. MongoDB Atlas account: Set up a cluster, configure security settings, and connect your application.
  2. Google Cloud Platform account: Create a project, enable necessary APIs (e.g., Cloud Storage, Cloud Function, Pub/Sub, Vertex AI), and configure the CLI.
  3. Install Node.js and Express: Clone the repository containing the microservice code for this tutorial, configure environment variables, and develop the pricing logic.

Initial configuration

Step 1: Setting up MongoDB Atlas

Tip: Make sure to follow the how-to guide.
  • Create a cluster: Sign in to your MongoDB Atlas account and create a new cluster. Choose a region that is closest to your user base for optimal performance.
  • Configure security: Set up your cluster's security settings. Create database users with specific roles and enable IP whitelisting to secure your database connection.
  • Connect to your cluster: Use the connection string provided by Atlas to connect your application to your MongoDB database. You'll need this in the microservice configuration.

Step 2: Setting up GCP

Tip: View our guide to configure your GCP project.
  • Create a GCP project: Log into your Google Cloud console and create a new project for your microservice.
  • Enable APIs: Ensure that the necessary APIs are enabled for your project. In this microservice, we are using:
| Service API | Purpose |
| --- | --- |
| Cloud Storage | Saving data scalers as .joblib files |
| Cloud Functions | Orchestrating the data flow |
| Pub/Sub | Ingesting live streams of e-commerce events to loosely couple subscription-based microservices |
| Vertex AI | Training notebook and model endpoint |
  • Configure GCP CLI: Install and initialize the Google Cloud CLI. Authenticate with your GCP account and set your project as the default.

Step 3: Develop the microservice and model

  • Clone the repository: Start by cloning the repository with the microservice code. Open your terminal, run `git clone <repository-url>`, and then navigate to the directory containing the dynamic pricing microservice.
  • Configure Python packages: Once you're in the correct directory, install the dependencies, typically with `pip install -r requirements.txt`.
  • Configure environment variables: Set up the necessary environment variables, including your MongoDB Atlas connection string and any other service-specific configurations. Environment variables are essential for managing configuration settings, especially those that contain sensitive information.
Here's a template illustrating how to set up environment variables for a microservice that connects to MongoDB Atlas:
Create a `.env` file in the project root to hold these settings.
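A minimal `.env` template is sketched below; the variable names are assumptions (your code may use different ones), and the values are the placeholders described in the tip that follows:

```shell
# MongoDB Atlas connection string (substitute username, password, clusterName).
MONGODB_URI="mongodb+srv://username:password@clusterName.mongodb.net/?retryWrites=true&w=majority"
# Path to your Google application credentials JSON file.
GOOGLE_APPLICATION_CREDENTIALS="/path/to/my-google-credentials.json"
# Your Google Cloud project ID.
GOOGLE_CLOUD_PROJECT="my-google-cloud"
# The ID of your Google Cloud Pub/Sub topic.
PUBSUB_TOPIC_ID="my-topic-id"
```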
Tip: Replace username and password with your MongoDB username and password, clusterName with the name of your MongoDB cluster, my-google-credentials with the path to your Google application credentials file, my-google-cloud with your Google Cloud project ID, and my-topic-id with the ID of your Google Cloud Pub/Sub topic.
Configure the Pub/Sub topic:
  1. Navigate to Pub/Sub in your Google Cloud Platform project.
  2. Provide a unique name for your topic in the "Topic ID" field.
  3. Adjust the topic settings.
  4. Select your preferred encryption method.
  5. Finalize the process by clicking the Create button.
[Image: Google Cloud create topic dialog]
Develop the pricing logic: Modify the dynamicPricing service to implement your pricing algorithm. This could involve analyzing historical data, considering competitor pricing, and integrating real-time supply-demand signals.
  1. In Vertex AI, navigate to Colab Enterprise.
  2. Click on +Create to make a new notebook.
[Images: Vertex AI console and Colab Enterprise notebook creation]
Connect directly to your MongoDB cluster for live data analysis and algorithm training: Once you create a new notebook, you can use it to connect to your MongoDB cluster directly using the following snippets:
Tip: Replace "your_mongodb_connection_string_here" with your actual MongoDB connection string, which typically follows this format: mongodb+srv://<username>:<password>@<cluster-address>/<database-name>?retryWrites=true&w=majority
Be sure to substitute:
  • <username> and <password> with your MongoDB credentials
  • <cluster-address> with the address of your MongoDB cluster
  • <database-name> (optional) with the specific database you want to connect to within your cluster
Additionally, replace "your_database_name" and "your_collection_name" with the actual names used in your MongoDB setup.
This will allow you to pull data from your clusters live and train a pricing algorithm. In this case, we used TensorFlow to capture how prices change based on user behavior.
Train a TensorFlow neural network model: Now that we're connected to MongoDB, we'll show you a Jupyter Notebook designed for an e-commerce store, similar to the one in the introduction. Feel free to modify it for your specific needs. This notebook demonstrates how to train a TensorFlow neural network model to predict optimal prices based on e-commerce events stored in a MongoDB Atlas feature store. Let's get started.
We’ve decided that the e-commerce store has the following data model for capturing user behavior events:
| Field | Data Type | Description | Example Values |
| --- | --- | --- | --- |
| product_name | String | The name of the product | "MongoDB Notebook" |
| product_id | Integer | Unique identifier for the product | 98803 |
| action | String | Type of action performed on the product (user interaction) | "view", "add_to_cart", "purchase" |
| price | Float | Price of the product | 18.99 |
| timestamp | String | ISO-format timestamp of when the event occurred | "2024-03-25T12:36:25.428461" |
| encoded_name | Integer | An encoded version of the product name for machine learning models | 23363195 |
This table assumes the product["price"] field is a float representing the price of the product in a single currency (e.g., USD). The encoded_name field is considered an integer, which could represent a hash or an encoding used to transform the product name into a numerical format suitable for machine learning models. The timestamp field is a string formatted as an ISO timestamp, which provides the exact date and time when the action was recorded. The example values are placeholders and should be replaced with actual data from your application.
Setting up a MongoDB connection with Python: First, we need to install the necessary Python packages and establish a connection to our MongoDB database.
Data cleaning: Once connected, we'll fetch the data and perform some basic cleaning operations to prepare it for model training.
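A cleaning sketch is shown below; the function name and the exact cleaning rules are assumptions for the demo (your notebook may filter differently):

```python
import pandas as pd


def clean_events(docs: list) -> pd.DataFrame:
    """Basic cleaning: drop Mongo's _id, remove incomplete rows, enforce types."""
    df = pd.DataFrame(docs).drop(columns=["_id"], errors="ignore")
    # Rows missing the fields the model needs cannot be used for training.
    df = df.dropna(subset=["price", "action", "encoded_name"])
    # Keep only plausible prices.
    df = df[df["price"] > 0]
    df["price"] = df["price"].astype(float)
    df["encoded_name"] = df["encoded_name"].astype(int)
    return df
```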
Building the dynamic pricing model: Next, we import the necessary TensorFlow and scikit-learn libraries, encode categorical variables, and normalize our data.
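The encoding and normalization step can be sketched as follows; the sample values stand in for the documents fetched from Atlas:

```python
import numpy as np
from sklearn.preprocessing import LabelEncoder, StandardScaler

# Sample events (stand-ins for the cleaned data from the previous step).
actions = ["view", "add_to_cart", "purchase", "view"]
prices = np.array([[18.99], [18.99], [17.50], [19.25]])

# Encode the categorical `action` field as integers (sorted alphabetically).
action_encoder = LabelEncoder()
action_codes = action_encoder.fit_transform(actions)

# Normalize prices to zero mean / unit variance for stable training.
price_scaler = StandardScaler()
scaled_prices = price_scaler.fit_transform(prices)
```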
Saving the encoder for event data pre-processing: We'll save the encoder objects to Google Cloud Storage for later use in preprocessing new data for predictions. This code will generate joblib files to save the encoding and standardizing criteria from the above preprocessing and upcoming training.
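A sketch of this step is below: the scaler is persisted locally with joblib, and a helper uploads it to Cloud Storage. The bucket and file names are hypothetical, and the GCS client is imported lazily so the snippet runs without cloud credentials:

```python
import joblib  # bundled with most scikit-learn installs
from sklearn.preprocessing import StandardScaler

# Fit a scaler on sample prices (stand-in for the real training data).
price_scaler = StandardScaler().fit([[10.0], [20.0], [30.0]])

# Persist the fitted scaler as a .joblib file for later preprocessing.
joblib.dump(price_scaler, "price_scaler.joblib")


def upload_to_gcs(local_path: str, bucket_name: str, blob_name: str) -> None:
    """Upload a saved artifact to Cloud Storage (bucket name is hypothetical)."""
    from google.cloud import storage  # lazy import: needs google-cloud-storage

    storage.Client().bucket(bucket_name).blob(blob_name).upload_from_filename(local_path)
```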
Training the model: With our data prepared, we'll split it into training and testing sets, define our neural network architecture, and train our model. Please remember this is a model meant for a simple demo.
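A minimal training sketch, using synthetic stand-in features rather than the real event data; the layer sizes and epoch count are demo assumptions:

```python
import numpy as np
import tensorflow as tf
from sklearn.model_selection import train_test_split

# Synthetic stand-in features (e.g., scaled encoded_name + action code) -> price.
rng = np.random.default_rng(42)
X = rng.random((200, 2)).astype("float32")
y = rng.random((200, 1)).astype("float32")
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# A small fully connected regression network for the demo.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(32, activation="relu"),
    tf.keras.layers.Dense(16, activation="relu"),
    tf.keras.layers.Dense(1),  # single output: the predicted price
])
model.compile(optimizer="adam", loss="mse")
model.fit(X_train, y_train, validation_data=(X_test, y_test), epochs=5, verbose=0)
```

After training, calling `model.predict` on the held-out rows gives a quick sanity check, and `model.save(...)` writes artifacts you can copy to Cloud Storage for the next steps.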
Test prediction: After training, we make a test prediction to verify the model's performance.
Saving the model: Finally, we'll save our trained model to Google Cloud Storage.
Registering the model in Vertex AI: Next, we'll register our trained model in the VertexAI model registry:
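A registration sketch using the Vertex AI Python SDK is below. The display name, artifact location, and serving container are assumptions; pick a prebuilt container matching your TensorFlow version. The SDK is imported lazily so the snippet loads without cloud credentials:

```python
def register_model(project: str, region: str, artifact_uri: str):
    """Upload a SavedModel directory from GCS to the Vertex AI Model Registry."""
    from google.cloud import aiplatform  # lazy import: needs google-cloud-aiplatform

    aiplatform.init(project=project, location=region)
    return aiplatform.Model.upload(
        display_name="dynamic-pricing-model",  # hypothetical name
        artifact_uri=artifact_uri,  # e.g. "gs://your-bucket/pricing-model/"
        serving_container_image_uri=(
            # A prebuilt TensorFlow serving container; match your TF version.
            "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-12:latest"
        ),
    )
```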
Congratulations! You should now see your model listed in the Vertex AI Model Registry, where you'll manage and deploy your models for various applications.
[Image: Vertex AI Model Registry]

Step 4: Deploy a model to an endpoint

You must deploy a model to an endpoint before that model can be used to serve online predictions. Deploying a model associates physical resources with the model so it can serve online predictions with low latency. Here are the steps needed:
  1. In the Google Cloud console, in the Vertex AI section, go to the Models page.
  2. Click the name and version ID of the model you want to deploy to open its details page (model from last step).
  3. Select the Deploy & Test tab.
  4. Click Deploy to endpoint.
  5. Fill out the rest of the parameters (Model Settings, Model Monitoring).
  6. Click on Deploy.
[Image: Deploy to endpoint dialog]
Next, in the Vertex AI panel:
  1. Click on Endpoints and then select your deployed model.
  2. Get the Endpoint ID from the details page; you will need it to configure the Cloud Function that sends prediction requests to this endpoint.
[Image: online prediction endpoint details]
Cloud Function configuration: The Cloud Function orchestrates the flow: it converts event data into tensors, inserts them into the feature store collection, and invokes the Vertex AI endpoint model. Follow these steps:
Browse to Cloud Functions in your GCP project and click on CREATE FUNCTION.
[Image: Cloud Functions console]
Make sure the trigger for your Cloud function is the previously created Pub/Sub topic.
Add the Python handler code to the function's source file.
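A sketch of such a handler is below. The action encoding, environment-variable names, and collection names are assumptions; in your deployment, register `handle_event` as the entry point (e.g., with functions-framework's `@functions_framework.cloud_event` decorator). Heavy clients are imported lazily so the module loads without cloud credentials:

```python
import base64
import json
import os

# Hypothetical mapping from event actions to numeric model-input codes.
ACTION_CODES = {"view": 0, "add_to_cart": 1, "purchase": 2}


def decode_event(message_data) -> dict:
    """Decode the base64-encoded Pub/Sub message payload into an event dict."""
    return json.loads(base64.b64decode(message_data))


def event_to_tensor(event: dict) -> list:
    """Turn a raw event into the numeric feature vector the model expects."""
    return [
        float(event["encoded_name"]),
        float(ACTION_CODES[event["action"]]),
        float(event["price"]),
    ]


def handle_event(cloud_event):
    """Pub/Sub-triggered entry point: tensorize, store, predict, upsert."""
    from google.cloud import aiplatform  # lazy imports: cloud deps
    from pymongo import MongoClient

    event = decode_event(cloud_event.data["message"]["data"])
    tensor = event_to_tensor(event)

    client = MongoClient(os.environ["MONGODB_URI"])
    db = client["your_database_name"]  # placeholder database name

    # 1) Push the tensor into the feature store (Events collection).
    db["events"].insert_one({**event, "tensor": tensor})

    # 2) Call the Vertex AI endpoint for a price prediction.
    endpoint = aiplatform.Endpoint(os.environ["VERTEX_ENDPOINT_ID"])
    pred_price = endpoint.predict(instances=[tensor]).predictions[0][0]

    # 3) Upsert the predicted price into the product catalog.
    db["products"].update_one(
        {"id": event["product_id"]},
        {"$set": {"pred_price": float(pred_price)}},
        upsert=True,
    )
```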
Make sure you also add a requirements.txt file in the Cloud Function folder structure on GCP.
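A plausible requirements.txt for a function that talks to Pub/Sub-triggered events, Vertex AI, and MongoDB might look like this (pin versions to match your environment):

```
functions-framework==3.*
google-cloud-aiplatform
pymongo
```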
Simulating customer events: If you're looking to mimic customer behavior, you can use the event-simulation Python script included in the repository. It generates fake customer events based on the data model described above and pushes them to your Pub/Sub topic and your Atlas feature store collection. You can adjust the number of events and their cadence directly in the code, then run the script from your terminal with the Python interpreter.
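A simulation sketch is below; the function names, the toy `encoded_name` hashing, and the 25-events-every-three-seconds cadence are assumptions for the demo, and the Pub/Sub client is imported lazily:

```python
import json
import random
import time
from datetime import datetime, timezone

ACTIONS = ["view", "add_to_cart", "purchase"]


def make_fake_event(product_id: int, product_name: str, price: float) -> dict:
    """Build one fake customer event matching the events data model above."""
    return {
        "product_name": product_name,
        "product_id": product_id,
        "action": random.choice(ACTIONS),
        "price": price,
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "encoded_name": abs(hash(product_name)) % 10**8,  # toy encoding
    }


def publish_fake_events(topic_path: str, n: int = 25, delay_s: float = 3.0) -> None:
    """Publish n fake events to a Pub/Sub topic (topic path is a placeholder)."""
    from google.cloud import pubsub_v1  # lazy import: needs google-cloud-pubsub

    publisher = pubsub_v1.PublisherClient()
    for _ in range(n):
        event = make_fake_event(98803, "MongoDB Notebook", round(random.uniform(10, 30), 2))
        publisher.publish(topic_path, json.dumps(event).encode("utf-8"))
        time.sleep(delay_s)
```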
After running this script, you should be able to see fake customer events being pushed into your MongoDB Atlas cluster and your Pub/Sub topic, effectively triggering the microservice to respond to those events and calculate optimal pricing points for the different products.

Key takeaways

Have you mastered building a reactive dynamic pricing microservice? Great job! Here's what you learned:
  • Centralized feature store: MongoDB serves as a feature store, acting as a centralized repository specifically designed for storing, managing, and serving features for machine learning (ML) models. Its polymorphic capabilities enable the utilization of a single interface to represent various types of data. This implies that as new features are introduced or pricing models evolve, MongoDB can adeptly manage diverse data types within the same system. In the context of dynamic pricing, this capability facilitates the seamless incorporation of new pricing factors or variables without causing disruptions to existing data structures or operations.
  • Scalability and efficiency: Google Cloud Pub/Sub can handle massive volumes of customer data efficiently, ensuring scalability for real-world applications. While this microservice simulates only 25 customer events every three seconds, Pub/Sub is capable of processing much larger data streams.
  • Real-time price updates: Cloud functions trigger TensorFlow models to generate dynamic prices based on customer behavior. These generated prices are then inserted or updated (upserted) back into the product catalog collection in MongoDB. This enables real-time adjustments in the e-commerce application because the application's front end retrieves data directly from the same collection.
Curious how MongoDB is changing the retail landscape? Dive deeper into MongoDB's capabilities and discover how it's revolutionizing the industry:
MongoDB helps retailers innovate and gain a competitive edge. Apply for an innovation workshop to explore the possibilities with our experts. If you’d like to connect with other people using MongoDB to build their next big project, head to the MongoDB Developer Community.
