Fireworks AI and MongoDB: The Fastest AI Apps with the Best Models, Powered By Your Data

Mat Keep and Angela Lee


We’re happy to announce that Fireworks AI and MongoDB are now partnering to make innovating with generative AI faster, more efficient, and more secure. Fireworks AI was founded in late 2022 by industry veterans from Meta’s PyTorch team, where they focused on performance optimization, improving the developer experience, and running AI apps at scale.

This post is also available in: Deutsch, Français, Español, Português, Italiano, 한국어, 简体中文.

It’s this expertise that Fireworks AI brings to its production AI platform, curating and optimizing the industry's leading open models. Benchmarking by the company shows gen AI models running on Fireworks AI deliver up to 4x faster inference speeds than alternative platforms, with up to 8x higher throughput and scale.

Models are one part of the application stack. But for developers to unlock the power of gen AI, they also need to bring enterprise data to those models. That’s why Fireworks AI has partnered with MongoDB, addressing one of the toughest challenges to adopting AI. With MongoDB Atlas, developers can securely unify operational data, unstructured data, and vector embeddings to safely build consistent, correct, and differentiated AI applications and experiences.

Jointly, Fireworks AI and MongoDB provide a solution for developers who want to leverage highly curated and optimized open-source models, and combine these with their organization’s own proprietary data — and to do it all with unparalleled speed and security.

Lightning-fast models from Fireworks AI: Enabling speed, efficiency, and value

Developers can choose from many different models to build their gen AI-powered apps. Navigating the AI landscape to identify the most suitable models for specific tasks — and tuning them to achieve the best levels of price and performance — is complex and creates friction in building and running gen AI apps. This is one of the key pain points that Fireworks AI alleviates.

With its lightning-fast inference platform, Fireworks AI curates, optimizes, and deploys 40+ different AI models. These optimizations can simultaneously result in significant cost savings, reduced latency, and improved throughput. Their platform delivers this via:

  • Off-the-shelf models, optimized models, and add-ons: Fireworks AI provides a collection of top-quality text, embedding, and image foundation models. Developers can leverage these models or fine-tune and deploy their own, pairing them with their own proprietary data using MongoDB Atlas.

  • Fine-tuning capabilities: To further improve model accuracy and speed, Fireworks AI also offers a fine-tuning service using its CLI to ingest JSON-formatted objects from databases such as MongoDB Atlas.

  • Simple interfaces and APIs for development and production: The Fireworks AI playground allows developers to interact with models right in a browser. It can also be accessed programmatically via a convenient REST API. This is OpenAI API-compatible and thus interoperates with the broader LLM ecosystem.

  • Cookbook: A simple and easy-to-use cookbook provides a comprehensive set of ready-to-use recipes that can be adapted for various use cases, including fine-tuning, generation, and evaluation.

Fireworks AI and MongoDB: Setting the standard for AI with curated, optimized, and fast models

With Fireworks AI and MongoDB Atlas, apps run in isolated environments ensuring uptime and privacy, protected by sophisticated security controls that meet the toughest regulatory standards:

  • As one of the top open-source model API providers, Fireworks AI serves 66 billion tokens per day (and growing).

  • With Atlas, you run your apps on a proven platform that serves tens of thousands of customers, from high-growth startups to the largest enterprises and governments.

Together, the Fireworks AI and MongoDB joint solution enables:

  • Retrieval-augmented generation (RAG) or Q&A from a vast pool of documents: Ingest a large number of documents to produce summaries and structured data that can then power conversational AI.

  • Classification through semantic/similarity search: Classify and analyze concepts and emotions from sales calls, video conferences, and more to provide better intelligence and strategies. Or, organize and classify a product catalog using product images and text.

  • Images to structured data extraction: Extract meaning from images to produce structured data that can be processed and searched in a range of vision apps — from stock photos, to fashion, to object detection, to medical diagnostics.

  • Alert intelligence: Process large amounts of data in real-time to automatically detect and alert on instances of fraud, cybersecurity threats, and more.

Figure 1: The Fireworks tutorial showcases how to bring your own data to LLMs with retrieval-augmented generation (RAG) and MongoDB Atlas

Getting started with Fireworks AI and MongoDB Atlas

To help you get started, review the Optimizing RAG with MongoDB Atlas and Fireworks AI tutorial, which shows you how to build a movie recommendation app and involves:

  • MongoDB Atlas Database that indexes movies using embeddings. (Vector Store)

  • A system for document embedding generation. We'll use the Fireworks embedding API to create embeddings from text data. (Vectorisation)

  • MongoDB Atlas Vector Search responds to user queries by converting the query to an embedding, fetching the corresponding movies. (Retrieval Engine)

  • The Mixtral model uses the Fireworks inference API to generate the recommendations. You can also use Llama, Gemma, and other great OSS models if you like. (LLM)

  • Loading MongoDB Atlas Sample Mflix Dataset to generate embeddings (Dataset)

We can also help you design the best architecture for your organization’s needs. Feel free to connect with your account team or contact us here to schedule a collaborative session and explore how Fireworks AI and MongoDB can optimize your AI development process.