Introducing MongoDB’s Multimodal Search Library For Python
July 16, 2025 | Updated: August 18, 2025
AI applications increasingly rely on a variety of data types—text, images, charts, and complex documents—to drive rich user experiences. For developers building these applications, effectively searching and retrieving information that spans these data types is a real challenge: they have to weigh different chunking strategies, figure out how to incorporate figures and tables, and manage context that can bleed across chunks.
To simplify this, we're excited to announce the public preview of MongoDB's Multimodal Search Python Library. This new library makes it easy to build sophisticated applications using multimodal data, providing a single interface for integrating MongoDB Atlas Vector Search, AWS S3, and Voyage AI's multimodal embedding model, voyage-multimodal-3.
The library handles:
- Processing and storage: It stores PDFs in S3, fetching them from a URL or referencing PDFs already stored in S3. Each PDF is then split into single-page images, which are also stored in S3.
- Generating embeddings: Each page image is embedded with voyage-multimodal-3 to produce a high-quality embedding.
- Vector indexing: Finally, the embeddings are indexed with Atlas Vector Search, with a reference back to the source objects in S3.
The power of multimodal
Traditional search methods often struggle when dealing with documents that contain text alongside visual elements like charts and graphs, which are common in research papers, financial reports, and more. Developers typically need to build complex, custom pipelines to handle image storage, embedding generation, and vector indexing.
Our Multimodal Search Library abstracts this complexity away, using the best-in-class voyage-multimodal-3 embedding model. It empowers developers to build applications that can understand and search the content of images just as easily as text, enabling accurate, efficient information retrieval and richer user experiences when working with multimodal data or visually rich PDFs.

Imagine you're a financial analyst sifting through hundreds of annual reports—dense PDFs filled with text, tables, and charts—to find a specific trend. With our Multimodal Search Library, you can simply ask a question in natural language, like: "Show me all the charts illustrating revenue growth over the past three years." The library will process the query and retrieve pages containing the relevant charts from your corpus of knowledge.
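Once those reports have been ingested with the library, that question is just a text query against the indexed pages. A minimal sketch, assuming a client configured as shown in the setup section below:

# Natural-language query over the indexed report pages.
results = client.similarity_search(
    query="Show me all the charts illustrating revenue growth over the past three years",
    k=10,
)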
Likewise, consider an e-commerce platform with a large product catalog. A shopper might be looking for a specific style of shoes but may not know the right keywords to describe exactly what they are looking for. By leveraging multimodal search, the user could upload an image of the shoes they like, and the application finds visually similar in-stock items, creating a seamless product discovery journey.
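The examples below query with text; for an image query like this one, a reasonable approach is to embed the shopper's photo with Voyage AI's own Python client and run an Atlas $vectorSearch aggregation directly with PyMongo. This is a sketch, not the library's documented API: the index name (vector_index) and embedding field (embedding) are assumptions and must match how your collection is actually configured.

import os

import voyageai
from PIL import Image
from pymongo import MongoClient

# Embed the query image with the same model the library uses for documents.
vo = voyageai.Client(api_key=os.environ["VOYAGEAI_API_KEY"])
query_embedding = vo.multimodal_embed(
    inputs=[[Image.open("shoes.jpg")]],
    model="voyage-multimodal-3",
    input_type="query",
).embeddings[0]

# Search the collection the library populated, via Atlas Vector Search.
collection = MongoClient(os.environ["MONGODB_ATLAS_CONNECTION_STRING"])[
    "db_name"
]["collection_name"]
for doc in collection.aggregate([
    {
        "$vectorSearch": {
            "index": "vector_index",  # assumed index name
            "path": "embedding",  # assumed embedding field
            "queryVector": query_embedding,
            "numCandidates": 100,
            "limit": 5,
        }
    },
    {"$project": {"score": {"$meta": "vectorSearchScore"}}},
]):
    print(doc["_id"], doc["score"])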
Learn how to get started
To get started, you'll need:
- A MongoDB Atlas cluster (sign up for the free tier)
- A MongoDB collection in that cluster
- A MongoDB Atlas Vector Search index (see the sketch after this list)
- A Voyage AI API key (sign up)
- An S3 bucket (sign up)
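If you haven't created the Vector Search index yet, one way to do it is programmatically with PyMongo. This is a rough sketch: voyage-multimodal-3 produces 1024-dimensional embeddings, but the field name (embedding) and index name (vector_index) below are assumptions; match them to how the library actually stores your documents.

import os

from pymongo import MongoClient
from pymongo.operations import SearchIndexModel

collection = MongoClient(os.environ["MONGODB_ATLAS_CONNECTION_STRING"])[
    "db_name"
]["collection_name"]

# voyage-multimodal-3 returns 1024-dimensional vectors.
index_model = SearchIndexModel(
    definition={
        "fields": [
            {
                "type": "vector",
                "path": "embedding",  # assumed embedding field
                "numDimensions": 1024,
                "similarity": "cosine",
            }
        ]
    },
    name="vector_index",
    type="vectorSearch",
)
collection.create_search_index(model=index_model)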
Installation and setup
First, we’ll ensure that we can connect to MongoDB Atlas, AWS S3, and Voyage AI.
pip install pymongo-voyageai-multimodal

import os

from pymongo_voyageai_multimodal import PyMongoVoyageAI

# One client wires together MongoDB Atlas, AWS S3, and Voyage AI.
client = PyMongoVoyageAI.from_connection_string(
    connection_string=os.environ["MONGODB_ATLAS_CONNECTION_STRING"],
    database_name="db_name",
    collection_name="collection_name",
    s3_bucket_name=os.environ["S3_BUCKET_NAME"],
    voyageai_api_key=os.environ["VOYAGEAI_API_KEY"],
)

Adding documents
Next, we’ll add relevant documents for embedding generation.
from pymongo_voyageai_multimodal import TextDocument, ImageDocument

# A plain text document with optional metadata.
text = TextDocument(text="foo", metadata={"baz": "bar"})

# Download a PDF and convert each page into an ImageDocument.
images = client.url_to_images(
    "https://www.fdrlibrary.org/documents/356632/390886/readingcopy.pdf"
)

# Embed and index a mix of text and page images under explicit ids.
documents = [text, images[0], images[1]]
ids = ["1", "2", "3"]
client.add_documents(documents=documents, ids=ids)
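Embeddings are indexed asynchronously, so a search issued immediately after adding documents may come back empty. As in the S3 example later in this post, you can block until indexing completes:

# Wait until Atlas has finished indexing the new embeddings.
client.wait_for_indexing()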

Performing search
Finally, we’ll search for content most semantically similar to our query.
results = client.similarity_search(query="example", k=1)
for doc in results:
    print(f"* {doc['id']} [{doc['inputs']}]")

Loading data already stored in S3
Developers can also query against documents already stored in S3. See more information in the documentation.
import os
from pymongo_voyageai_multimodal import PyMongoVoyageAI

client = PyMongoVoyageAI(
    voyageai_api_key=os.environ["VOYAGEAI_API_KEY"],
    s3_bucket_name=os.environ["S3_BUCKET_NAME"],
    mongo_connection_string=os.environ["MONGODB_URI"],
    collection_name="test",
    database_name="test_db",
)

query = "The consequences of a dictator's peace"
url = "s3://my-bucket-name/readingcopy.pdf"

# Convert the PDF already stored in S3 into page images, then embed and index them.
images = client.url_to_images(url)
resp = client.add_documents(images)
client.wait_for_indexing()

# Retrieve the most relevant pages, returning the extracted page images as well.
data = client.similarity_search(query, extract_images=True)

print(f"Found {len(data)} relevant pages")
client.close()

A few important notes:
- Automatic updates to source data are not supported. Changes to indexed data must be made from application code via the client's add_documents and delete functions (see the sketch after this list).
- The library is primarily meant to support integrating multimodal embeddings and MongoDB Atlas over relatively static datasets. It is not intended for sophisticated aggregation pipelines that combine multiple stages, or for data that updates frequently.
- voyage-multimodal-3 is the only embedding model supported directly, and AWS is the only cloud provider supported directly.
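Since updates don't propagate automatically, refreshing an indexed document means removing the stale entries and re-adding the new version through the client. A minimal sketch, assuming delete accepts the ids used at insert time (check the documentation for its exact signature) and using a hypothetical URL:

# Replace an indexed page: remove the stale entry, then re-embed and re-index.
client.delete(ids=["2"])  # assumed signature; check the documentation
new_images = client.url_to_images("https://example.com/updated-report.pdf")  # hypothetical URL
client.add_documents(documents=[new_images[0]], ids=["2"])
client.wait_for_indexing()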
Ready to try it yourself?
Check out the GitHub project today to get started.
Learn more in our documentation, and please share feedback.
We can't wait to see what you build!