Building AI with MongoDB: Announcing the First Qualifiers for the Innovators Program

Mat Keep

#genAI #Vector Search

Artificial intelligence is igniting brilliant ideas for new products and services. But turning those ideas into reality is a path that even the brightest minds struggle to navigate without some help along the way. That’s why we launched the MongoDB AI Innovators Program back in June this year. Access to expert technical advice, free MongoDB Atlas credits, co-marketing opportunities, and (for eligible startups) introductions to potential venture investors come together to help you “build the next big thing” in AI.

Since opening the program, we’ve received applications from around the world addressing every industry and spanning the spectrum of generative to analytical AI use cases. They range from enterprise chat and video bots that improve customer service and unlock insights from vast internal information repositories, to conversational intelligence for sales reps, AI agents for workflow orchestration, and tools for talent recruitment, retention, and identifying workplace burnout, through to news classifiers and summarizers, personal wellbeing assistants, and bedtime-story generators that make science and technology more accessible to children. We’ve been amazed by the breadth of ideas and the pace at which innovators are building AI on top of MongoDB Atlas.

In this blog post, I want to share an overview of the first three startups to qualify for our AI Innovators Program.

Check out our AI resource page to learn more about building AI-powered apps with MongoDB.

Elevating the edge experience: Deploy AI anywhere with Cloneable and MongoDB

Cloneable provides the application layer that brings AI to any device at the edge of the network. The Cloneable platform lets developers build dynamic applications with intuitive low/no-code tools and deploy them instantly to a spectrum of devices: mobiles, IoT devices, robots, and beyond.

By harnessing machine learning models, a business can apply complex technologies across its operations. Models are pushed down to the device, where they are converted to a native embedded format such as CoreML. From here, they are executed by the device’s neural engine to provide low-latency inference, computer vision, and augmented reality.

Cloneable uses MongoDB Atlas Device Sync to persist data locally on the device and sync it to the Atlas database backend in the cloud. This unique synergy creates an ecosystem where enterprise apps become real-time gateways to track, measure, inspect, and respond to events across an operation.

The company is also exploring creating vector embeddings from images and data collected on devices and storing them in Atlas Vector Search. With this expanded functionality, users can better search and analyze events collected from the field.
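To make the idea concrete, here is a minimal sketch of what a similarity query over stored image embeddings could look like using the `$vectorSearch` aggregation stage. It is an illustration only, not Cloneable's implementation: the index name, the field names (`embedding`, `device_id`, `captured_at`), and the three-dimensional query vector are all assumptions chosen for the example.

```python
from typing import Any


def build_image_search_pipeline(
    query_vector: list[float],
    limit: int = 5,
    num_candidates: int = 100,
) -> list[dict[str, Any]]:
    """Build an aggregation pipeline for an approximate nearest-neighbor search."""
    if num_candidates < limit:
        raise ValueError("num_candidates must be >= limit")
    return [
        {
            "$vectorSearch": {
                "index": "image_embedding_index",  # hypothetical index name
                "path": "embedding",  # hypothetical field that stores the vector
                "queryVector": query_vector,
                "numCandidates": num_candidates,  # candidates considered before ranking
                "limit": limit,  # documents returned
            }
        },
        {
            "$project": {
                "_id": 0,
                "device_id": 1,  # hypothetical metadata fields
                "captured_at": 1,
                "score": {"$meta": "vectorSearchScore"},
            }
        },
    ]


# Toy query vector; real embeddings have hundreds or thousands of dimensions.
pipeline = build_image_search_pipeline([0.12, -0.03, 0.88], limit=3)

# With a live Atlas cluster and a PyMongo collection handle, the pipeline
# would be run as: results = list(collection.aggregate(pipeline))
```

Keeping the pipeline as plain data makes it easy to inspect and unit test before it is ever sent to a cluster.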

Predicting risks to public safety

ExTrac draws on thousands of data sources identified by domain experts, using AI-powered analytics to locate, track and forecast both digital and physical risks to public safety in real-time. Initially serving Western governments to predict risks of emerging or escalating conflicts overseas, ExTrac is expanding into enterprise use cases for reputational management, operational risk, and content moderation.

“Data is at the core of what we do. Our domain experts find and curate relevant streams of data, and then we use AI to anonymize and make sense of it at scale”, said Matt King, CEO at ExTrac. “We take a base model, such as RoBERTa or an LLM, and fine-tune it with our own labeled data to create domain-specific models capable of identifying and classifying threats in real-time.”

Asked why ExTrac built on MongoDB Atlas, King said, “The flexibility of the document data model allows us to land, index, and analyze data of any shape and structure – no matter how complex. This helps us unlock instant insights for our customers.”

King went on to say, “Atlas Vector Search is also proving to be incredibly powerful across a range of tasks where we use the results of the search to augment our LLMs and reduce hallucinations. We can store vector embeddings right alongside the source data in a single system, enabling our developers to build new features way faster than if they had to bolt on a standalone vector database - many of which limit the amount of data that can be returned if it has metadata attached to it. We are also moving beyond text to vectorize images and videos from our archives dating back over a decade. Being able to query and analyze data in any modality will help us to better model trends, track evolving narratives, and predict risk for our customers.”
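The pattern King describes, retrieving relevant documents and feeding them to an LLM as context, can be sketched in a few lines. The example below is a toy, in-memory stand-in rather than ExTrac's pipeline: the documents, the three-dimensional embeddings, and the prompt wording are invented for illustration, and a simple cosine-similarity ranking takes the place of Atlas Vector Search. What it does show is the benefit of keeping each embedding in the same document as its source text and metadata: the retrieval step returns everything the prompt needs in one pass.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


# Each document keeps source text, metadata, and embedding together.
# All values here are made up for the example.
documents = [
    {"text": "Protest planned near the port on Friday.",
     "source": "forum", "embedding": [0.9, 0.1, 0.0]},
    {"text": "New phishing campaign targets logistics firms.",
     "source": "news", "embedding": [0.1, 0.9, 0.1]},
    {"text": "Road closures expected around the harbour.",
     "source": "news", "embedding": [0.8, 0.2, 0.1]},
]


def retrieve(query_embedding: list[float], docs: list[dict], k: int = 2) -> list[dict]:
    """Return the k documents most similar to the query embedding."""
    ranked = sorted(
        docs,
        key=lambda d: cosine_similarity(query_embedding, d["embedding"]),
        reverse=True,
    )
    return ranked[:k]


def build_prompt(question: str, context_docs: list[dict]) -> str:
    """Ground the model's answer in the retrieved documents."""
    context = "\n".join(f"- [{d['source']}] {d['text']}" for d in context_docs)
    return (
        "Answer using only the context below. "
        "If the context is insufficient, say so.\n"
        f"Context:\n{context}\n"
        f"Question: {question}"
    )


top = retrieve([1.0, 0.0, 0.0], documents, k=2)
prompt = build_prompt("What is happening near the port?", top)
```

Because the prompt only contains retrieved text, the model has less room to invent details, which is the hallucination reduction the quote refers to.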

Access to technical expertise provided by the AI Innovators Program will help ExTrac manage the ever-growing size of its data sets – keeping performance high and costs low as the business scales.

Using AI to cut maritime emissions and risk

CetoAI provides predictive analytics for the maritime industry. By combining high-frequency data, engineering expertise, and artificial intelligence, the company reduces machinery breakdowns, cuts carbon emissions, and manages operational risk.

Sensors installed onto each vessel generate real-time data feeds of engine and vessel performance. The data is used by the company’s AI models for predictive maintenance, optimizing fuel consumption, and carbon intensity forecasting, with the outputs consumed by the vessel’s crew, owners, and insurers.

The data feeds generated by CetoAI are highly complex. Sensors on each vessel emit around 90,000 JSON documents daily. Each document stores around 100 unique time-series measurements, all requiring heavy-duty analytics processing before feeding machine learning models. It was these demands that led CetoAI’s engineering team to select MongoDB, migrating from a standalone time-series database that couldn’t keep pace with business growth.
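A quick back-of-the-envelope calculation shows what those approximate figures add up to for a single vessel. The per-vessel numbers come from the paragraph above; the fleet size in the last line is purely hypothetical.

```python
# Approximate figures quoted above, per vessel.
DOCS_PER_VESSEL_PER_DAY = 90_000
MEASUREMENTS_PER_DOC = 100
SECONDS_PER_DAY = 24 * 60 * 60

# Individual measurements produced by one vessel each day.
measurements_per_day = DOCS_PER_VESSEL_PER_DAY * MEASUREMENTS_PER_DOC

# Average document arrival rate for one vessel.
docs_per_second = DOCS_PER_VESSEL_PER_DAY / SECONDS_PER_DAY


def fleet_measurements_per_day(vessels: int) -> int:
    """Scale the per-vessel volume to a fleet (fleet size is an assumption)."""
    return vessels * measurements_per_day


print(f"{measurements_per_day:,} measurements per vessel per day")
print(f"{docs_per_second:.2f} documents per second per vessel")
print(f"{fleet_measurements_per_day(10):,} measurements per day for 10 vessels")
```

That works out to roughly nine million measurements per vessel per day, arriving at about one document per second, before any fleet-level multiplication.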

Sensor measurements from each data feed are streamed through Microsoft Azure IoT Hub and ingested into MongoDB Atlas’ purpose-built time series collections. Here, MongoDB window functions process and transform the data before serving it to CetoAI’s machine learning models, built with PyTorch and scikit-learn.
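As a rough illustration of that flow, the sketch below defines options for a time series collection and a `$setWindowFields` stage that computes a rolling average per vessel. It is not CetoAI's schema or pipeline: the field names (`ts`, `vessel`, `engine_temp_c`), the granularity, and the ten-minute window are assumptions picked for the example.

```python
from typing import Any

# Options for a time series collection. With PyMongo these could be passed as
# db.create_collection("sensor_readings", **TIMESERIES_OPTIONS); the collection
# name and field names are hypothetical.
TIMESERIES_OPTIONS: dict[str, Any] = {
    "timeseries": {
        "timeField": "ts",  # timestamp of each reading
        "metaField": "vessel",  # per-series metadata, e.g. {"id": "V-001"}
        "granularity": "seconds",
    }
}


def rolling_average_pipeline(field: str, minutes: int) -> list[dict[str, Any]]:
    """Build a pipeline that adds a trailing time-window average of `field`."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return [
        {
            "$setWindowFields": {
                "partitionBy": "$vessel.id",  # one window per vessel
                "sortBy": {"ts": 1},  # range windows need a single ascending sort key
                "output": {
                    f"{field}_avg": {
                        "$avg": f"${field}",
                        # From `minutes` before the current reading up to it.
                        "window": {"range": [-minutes, 0], "unit": "minute"},
                    }
                },
            }
        }
    ]


pipeline = rolling_average_pipeline("engine_temp_c", 10)

# With a live cluster: smoothed = db.sensor_readings.aggregate(pipeline)
```

Computing features like this inside the database means the models receive smoothed values directly, without a separate pre-processing job.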

CetoAI is now exploring additional capabilities available with MongoDB to expand its offerings. Atlas Device Sync can persist data locally and sync it between vessels and the cloud, withstanding the loss of network connectivity. Atlas Vector Search can be used for retrieval-augmented generation with the company’s LLMs, which are being developed to help crews diagnose and remediate equipment failures using natural language queries. Access to Atlas credits and expert support provided as part of the AI Innovators Program enables CetoAI to accelerate and de-risk the delivery of these new services.

Get started

Today we’ve focused on just three startups – there are many more that are already enjoying the benefits of the AI Innovators Program. We have more places left, but they are filling up fast, so sign up directly on the program’s web page.

Also, check out our MongoDB for Artificial Intelligence resources page for all of the latest best practices to get you started on building the “next big thing” with AI.