
Build an Agentic RAG Application with CrewAI and MongoDB

In this tutorial, you build a crew that includes an AI agent that can analyze a PDF document by using the MongoDB Vector Search tool.

To learn more about the MongoDB CrewAI integration, see Integrate MongoDB with CrewAI.

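To complete this tutorial, you must have the following:

  • CrewAI installed. To learn more, see Installation.

  • One of the following:

    • An Atlas cluster running MongoDB version 6.0.11, 7.0.2, or later. Ensure that your IP address is included in your Atlas project's access list.

    • A local Atlas deployment created using the Atlas CLI. To learn more, see Deploy a Local Atlas Cluster.

  • An OpenAI API key. You must have an OpenAI account with credits available for API requests. To learn more about registering an OpenAI account, see the OpenAI API website.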

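Note

Python version compatibility might differ from the official CrewAI documentation. At the time of writing, the crewai-tools package depends on embedchain, which requires a Python version between 3.9 and 3.13.2 (inclusive).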
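Before installing anything, it can help to confirm that your interpreter falls inside that range. The following is a minimal sketch, assuming the inclusive bounds quoted in the note (3.9 through 3.13.2) still apply. The helper name is_supported is hypothetical and not part of CrewAI.

```python
import sys

# Bounds taken from the note above; adjust them if crewai-tools changes its requirements.
MIN_VERSION = (3, 9, 0)
MAX_VERSION = (3, 13, 2)


def is_supported(version=None):
    """Return True if a (major, minor, micro) tuple lies within the inclusive range."""
    current = tuple(version or sys.version_info[:3])
    return MIN_VERSION <= current <= MAX_VERSION


if __name__ == "__main__":
    found = ".".join(map(str, sys.version_info[:3]))
    status = "within" if is_supported() else "outside"
    print(f"Python {found} is {status} the range this tutorial was written for")
```

Tuple comparison handles the micro version, so 3.13.2 passes and 3.13.3 does not.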

Complete the following steps to build and run the crew:

1
  1. In your terminal, run the following commands to create a new directory named crewai-mongodb-project and install the required dependencies:

    mkdir crewai-mongodb-project
    cd crewai-mongodb-project
    pip install 'crewai-tools[mongodb]' python-dotenv langchain-community
  2. In your project, create a .env file and add the following lines:

    OPENAI_API_KEY="<openai-api-key>"
    MONGODB_URI="<connection-string>"

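    Note

    Replace <connection-string> with the connection string for your Atlas cluster or local Atlas deployment.

    For an Atlas cluster, the connection string should use the following format:

    mongodb+srv://<db_username>:<db_password>@<clusterName>.<hostname>.mongodb.net

    To learn more, see Connect to a Cluster via Drivers.

    For a local Atlas deployment, the connection string should use the following format:

    mongodb://localhost:<port-number>/?directConnection=true

    To learn more, see Connection Strings.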
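Usernames and passwords that contain characters such as @, : or / must be percent-encoded before they go into a connection string. The sketch below shows one way to assemble both formats with the standard library. The helper names (atlas_uri, local_uri, looks_like_mongodb_uri) and the sample host and credentials are hypothetical placeholders, not values from this tutorial.

```python
from urllib.parse import quote_plus


def atlas_uri(username, password, host):
    """Build an SRV connection string with percent-encoded credentials."""
    return f"mongodb+srv://{quote_plus(username)}:{quote_plus(password)}@{host}"


def local_uri(port):
    """Build a connection string for a local deployment listening on the given port."""
    return f"mongodb://localhost:{port}/?directConnection=true"


def looks_like_mongodb_uri(uri):
    """Cheap sanity check on the scheme only; it does not contact the server."""
    return uri.startswith(("mongodb://", "mongodb+srv://"))


if __name__ == "__main__":
    # Hypothetical values for illustration only.
    print(atlas_uri("app_user", "p@ss:w/rd", "cluster0.example.mongodb.net"))
    print(local_uri(27017))
```

You could paste the printed value into MONGODB_URI in your .env file.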

2

Create a file named main.py in your project and paste the following code:

from crewai import Agent, Task, Crew, Process, LLM
from crewai_tools import MongoDBVectorSearchTool, MongoDBVectorSearchConfig
from langchain_community.document_loaders import PyPDFLoader
from dotenv import load_dotenv
import os, time

load_dotenv()


def rag_agent():
    """
    An agent that uses RAG to analyze recent MongoDB announcements.
    """
    # Configure the vector search tool
    tool = MongoDBVectorSearchTool(
        connection_string = os.environ.get("MONGODB_URI"),
        database_name = "crewai_db",
        collection_name = "test"
    )

    # Connect to MongoDB collection and delete all documents
    coll = tool._coll
    coll.delete_many({})

    # Load PDF from URL and insert documents into MongoDB
    print("Loading MongoDB AI announcements PDF...")
    loader = PyPDFLoader("https://investors.mongodb.com/node/13556/pdf")
    tool.add_texts([i.page_content for i in loader.load()])

    # Create the vector search index
    print("Creating vector search index...")
    if not any([ix["name"] == "vector_index" for ix in coll.list_search_indexes()]):
        tool.create_vector_search_index(dimensions=1536, auto_index_timeout=60)

    # Wait for index initial sync to complete
    n_docs = coll.count_documents({})
    start = time.monotonic()
    while time.monotonic() - start <= 60:
        if len(tool._run("test query")) == n_docs:
            print("Index is ready for queries")
            break
        else:
            time.sleep(1)

    # Specify a custom vector search query (optional)
    tool.query_config = MongoDBVectorSearchConfig(
        limit=3, score_threshold=0.75
    )

    # Test the tool
    print("Testing the tool...")
    print(tool.run(query="AI announcements"))

    # Assemble a crew by specifying an agent and its task
    researcher = Agent(
        role="MongoDB Announcement Researcher",
        goal="Find and extract key information about MongoDB's recent announcements and developments",
        backstory="You're specialized in analyzing business and technology announcements",
        verbose=False,
        tools=[tool],
        llm=LLM(model="gpt-4o"), # Customize to your LLM of choice
    )
    research_task = Task(
        description="Research MongoDB's recent AI announcements and developments",
        expected_output="A summary of MongoDB's latest AI initiatives, partnerships, and features",
        agent=researcher,
    )
    crew = Crew(
        agents=[researcher],
        tasks=[research_task],
        process=Process.sequential,
        verbose=False
    )

    # Get the results and print the analysis
    print("Running the crew...")
    result = crew.kickoff()
    print("\n" + "="*50 + "\nMONGODB AI ANNOUNCEMENTS ANALYSIS:\n" + "="*50)
    print(result.raw)
    return result


if __name__ == "__main__":
    rag_agent()
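The script waits for the index by polling in a loop for up to 60 seconds, and it continues silently if that time runs out. As a sketch of the same pattern, the loop could be factored into a small helper that reports whether the condition was met. The name wait_until and its parameters are hypothetical and are not part of crewai-tools.

```python
import time


def wait_until(condition, timeout=60.0, interval=1.0):
    """Poll condition() until it returns a truthy value or the timeout elapses.

    Returns True if the condition was met and False if the timeout expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        if condition():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


if __name__ == "__main__":
    # Stand-in condition for illustration: becomes true on the third check.
    checks = []

    def ready():
        checks.append(1)
        return len(checks) >= 3

    print(wait_until(ready, timeout=5, interval=0.1))
```

In main.py, the condition would be the same readiness check the script already performs, and a False return value gives you a place to raise an error instead of querying an index that has not finished syncing.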

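This script does the following:

  • Loads the MongoDB AI announcements PDF, ingests the text from each page into the test collection in the crewai_db database, and creates an Atlas Vector Search index on that collection.

  • Defines additional vector search query parameters and runs a quick test query.

  • Defines a CrewAI agent that uses the vector search tool, and describes its role, goal, and backstory.

  • Defines a task for the agent to research and summarize MongoDB's recent AI announcements.

  • Assembles the crew by specifying the agent and task. Then, it runs the crew and prints the results.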
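To make the index and query settings more concrete, here is a rough sketch of the kind of Atlas Vector Search index definition and $vectorSearch aggregation pipeline that correspond to the values in main.py (1536 dimensions, the index name vector_index, limit=3, score_threshold=0.75). The field name embedding, the cosine similarity setting, the numCandidates multiplier, and both helper functions are assumptions for illustration. The tool builds its own definitions internally, and they may differ.

```python
def vector_index_definition(path="embedding", dimensions=1536, similarity="cosine"):
    """Return an Atlas Vector Search index definition for one vector field.

    The default path and similarity are assumptions, not values read from the tool.
    """
    return {
        "fields": [
            {
                "type": "vector",
                "path": path,
                "numDimensions": dimensions,
                "similarity": similarity,
            }
        ]
    }


def vector_search_pipeline(query_vector, index_name="vector_index", path="embedding",
                           limit=3, score_threshold=0.75, oversampling=10):
    """Return an aggregation pipeline that keeps the top matches above a score threshold."""
    return [
        {
            "$vectorSearch": {
                "index": index_name,
                "path": path,
                "queryVector": query_vector,
                # Consider more candidates than are returned to improve recall.
                "numCandidates": limit * oversampling,
                "limit": limit,
            }
        },
        # Expose the similarity score so it can be filtered on.
        {"$set": {"score": {"$meta": "vectorSearchScore"}}},
        {"$match": {"score": {"$gte": score_threshold}}},
        # Drop the raw vector from the results to keep them readable.
        {"$project": {path: 0}},
    ]


if __name__ == "__main__":
    print(vector_index_definition())
    print(vector_search_pipeline([0.0] * 1536)[0]["$vectorSearch"]["numCandidates"])
```

A pipeline like this could be passed to a PyMongo collection's aggregate method, with a real query embedding in place of the placeholder vector.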

3

Run the following command to execute the script:

uv run main.py
Loading MongoDB AI announcements PDF...
Creating vector search index...
Testing the tool...
Using Tool: MongoDBVectorSearchTool
[{"_id": {"$oid": "689baa5e6907244d329d0586"}, "text": "MongoDB Strengthens Foundation for AI Applications with Product Innovations and Expanded\nPartner Ecosystem\nAugust 11, 2025\nNew Voyage AI models introduce context awareness and set new accuracy benchmarks\u2014at industry-leading price-performance\nMongoDB's AI ecosystem expands AI framework, agentic evaluation, and agentic workflow orchestration capabilities\nApproximately 8,000 startups, including Laurel and Mercor, have chosen MongoDB to help build their AI projects ... (truncated)
Running the crew...
==================================================
MONGODB AI ANNOUNCEMENTS ANALYSIS:
==================================================
**MongoDB Strengthens Foundation for AI Applications with Product Innovations and Expanded Partner Ecosystem**
**August 11, 2025**
MongoDB announced a range of product innovations and AI partner ecosystem expansions at Ai4 2025 to make it faster and easier for customers to build accurate, trustworthy, and reliable AI applications at scale. The company is providing industry-leading embedding models and a fully integrated, AI-ready data platform, alongside assembling a world-class ecosystem of AI partners to deliver reliable and cost-effective AI solutions.
**Key Highlights from AI Initiatives:**
### AI Innovations:
- **Voyage AI Models**:
- MongoDB introduced context-aware embedding models, achieving better retrieval accuracy without requiring metadata hacks or pipeline gymnastics.
- New model variants, such as **voyage-context-3**, **voyage-3.5**, and **voyage-3.5-lite**, deliver groundbreaking retrieval accuracy at competitive price-performance metrics.
- **Instruction-following reranking models** like `rerank-2.5` and `rerank-2.5-lite` enable developers to improve retrieval accuracy further.
- **MongoDB MCP Server**:
- Launched as a public preview, the MongoDB Model Context Protocol (MCP) Server enables direct integration with popular tools like GitHub CoPilot in Visual Studio Code, Anthropic's Claude, Cursor, and Windsurf.
- Thousands of users have been actively building applications leveraging this new protocol.
### Partnerships:
MongoDB expanded its AI partner ecosystem to provide customers with streamlined workflows and reliable AI applications:
- **Galileo**:
- A reliability and observability platform for AI applications that offers continuous evaluations and monitoring for MongoDB-based projects.
- **Temporal**:
- A Durable Execution platform that empowers developers to orchestrate scalable, resilient AI use cases like retrieval-augmented generation (RAG) systems and context engineering pipelines. Temporal ensures that AI solutions can operate smoothly across distinct failures and interactions.
- **LangChain**:
- MongoDB's partnership with LangChain has facilitated advancements like natural language querying, agent-based system creation, and **GraphRAG** for enhanced LLM transparency. Developers can build sophisticated AI systems deploying real-time, proprietary MongoDB data.
### Developer Engagement:
- MongoDB has seen substantial adoption among both startups and enterprises:
- Approximately 8,000 startups selected MongoDB for AI projects, including Laurel (timekeeping startup) and Mercor (AI-based talent matching).
- Large enterprises like Vonage, LGU+, and The Financial Times also rely on MongoDB for scalable AI infrastructure.
### Thought Leadership:
Andrew Davidson, SVP of Products at MongoDB, emphasized the importance of robust database systems in the era of AI:
- "Modern AI applications require a database combining advanced capabilities such as integrated vector search and embedding models. By consolidating the AI stack, MongoDB is empowering developers to deliver innovative AI solutions faster than ever."
Fred Roma, SVP of Engineering, further highlighted the challenge of scaling AI due to complexity in fine-tuning models, high expenses, and integration barriers:
- "MongoDB's focus remains on designing models that achieve better functionality, reliability, and affordability for developers leveraging AI applications."
### About MongoDB:
MongoDB, headquartered in New York, provides a unified database platform powering next-gen applications across industries. Its comprehensive platform integrates operational data, search, real-time analytics, and AI-powered retrieval, supporting millions of developers globally. MongoDB boasts over 50,000 customers and supports a growing AI application ecosystem.
For more information on MongoDB's AI endeavors, visit [mongodb.com](https://www.mongodb.com).
### Sources:
Original announcement and additional multimedia available at [PRNewswire](https://www.prnewswire.com/news-releases/mongodb-strengthens-foundation-for-ai-applications-with-product-innovations-and-expanded-partner-ecosystem-302526003.html).
Contact: **press@mongodb.com** for press inquiries.
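The test query in the output above prints what appears to be a JSON array of documents, each with an _id in Extended JSON form and a text field. Assuming that shape holds, a small helper like the following could turn the raw string into short previews for logging. The function name summarize_results and the sample string are hypothetical.

```python
import json


def summarize_results(raw, preview_chars=80):
    """Parse a JSON array of documents and return (id, first-line preview) pairs."""
    rows = []
    for doc in json.loads(raw):
        doc_id = doc.get("_id")
        # Extended JSON represents an ObjectId as {"$oid": "<hex string>"}.
        if isinstance(doc_id, dict) and "$oid" in doc_id:
            doc_id = doc_id["$oid"]
        first_line = doc.get("text", "").split("\n", 1)[0]
        rows.append((doc_id, first_line[:preview_chars]))
    return rows


if __name__ == "__main__":
    # Hypothetical sample in the same shape as the tool output shown above.
    sample = '[{"_id": {"$oid": "0123456789abcdef01234567"}, "text": "First line\\nSecond line"}]'
    for doc_id, preview in summarize_results(sample):
        print(doc_id, preview)
```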
