Why does executing the Python program independently with MongoDBAtlasVectorSearch.from_documents work without issues, but when placed inside a Sanic API function it raises AttributeError: '_asyncio.Future' object has no attribute 'inserted_ids'?
I think it’s happening because of the way Sanic manages the event loop. Sanic runs handlers asynchronously on its own event loop, so if MongoDBAtlasVectorSearch.from_documents ends up calling async MongoDB operations (like insert_many), the result is an asyncio.Future object, which needs to be awaited. If it isn’t awaited, the code reads 'inserted_ids' from the Future itself rather than from the insert result, and that’s when you see the error.
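To make that concrete, here is a minimal sketch that reproduces the same kind of error without MongoDB or Sanic. Everything in it is a hypothetical stand-in (FakeAsyncCollection, InsertManyResult, sync_insert, async_insert); it only assumes that an async driver's insert_many hands back a Future, as described above.

```python
import asyncio


class InsertManyResult:
    """Hypothetical stand-in for a driver's insert result."""

    def __init__(self, inserted_ids):
        self.inserted_ids = inserted_ids


class FakeAsyncCollection:
    """Hypothetical stand-in for an async collection: insert_many returns a Future."""

    def insert_many(self, docs):
        future = asyncio.get_running_loop().create_future()
        future.set_result(InsertManyResult(list(range(len(docs)))))
        return future


def sync_insert(collection, docs):
    # Synchronous code path: reads .inserted_ids straight off the Future.
    return collection.insert_many(docs).inserted_ids


async def async_insert(collection, docs):
    # Async code path: awaits the Future first, then reads the real result.
    result = await collection.insert_many(docs)
    return result.inserted_ids


async def demo():
    collection = FakeAsyncCollection()
    docs = [{"text": "a"}, {"text": "b"}]
    try:
        sync_insert(collection, docs)
        error = None
    except AttributeError as exc:
        error = str(exc)
    ids = await async_insert(collection, docs)
    return error, ids


error, ids = asyncio.run(demo())
print(error)
print(ids)
```

The synchronous path fails because the attribute lookup happens on the Future, while the awaited path gets the actual result object. If from_documents is taking the synchronous path internally, adding an await around the outer call would not be enough on its own.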
My first suggestion is to make sure any async MongoDB operations inside that method are awaited, especially calls like insert_many. Here’s a quick example of what it could look like:
@app.route("/vector_search", methods=["POST"])
async def vector_search(request):
    try:
        # Make sure to await this if it involves async operations
        result = await MongoDBAtlasVectorSearch.from_documents(request.json)
        return json({"success": True, "inserted_ids": result.inserted_ids})
    except AttributeError as e:
        return json({"success": False, "error": str(e)}, status=500)
Also, I’m not sure about the rest of your app, but are you using Motor, or have you considered it?
Motor is a full-featured, non-blocking MongoDB driver for Python asyncio and Tornado applications. Motor presents a coroutine-based API for non-blocking access to MongoDB.
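Motor fits best when you call the driver yourself and can await each operation. If a helper calls the driver synchronously instead, one possible alternative is to give it a blocking collection and run the whole call in a worker thread so the event loop stays free. This is only a sketch of that pattern, not something confirmed for your setup: blocking_ingest is a hypothetical stand-in for a synchronous call such as from_documents, handler stands in for the route function, and asyncio.to_thread needs Python 3.9 or newer.

```python
import asyncio
import threading
import time


def blocking_ingest(docs):
    """Hypothetical stand-in for a synchronous, blocking ingest call."""
    time.sleep(0.05)  # simulate blocking network I/O
    return {"count": len(docs), "thread": threading.current_thread().name}


async def handler(docs):
    # Run the blocking call in a worker thread instead of on the event loop.
    return await asyncio.to_thread(blocking_ingest, docs)


async def main():
    main_thread = threading.current_thread().name
    result = await handler(["chunk-1", "chunk-2", "chunk-3"])
    return main_thread, result


main_thread, result = asyncio.run(main())
print(result["count"], result["thread"] != main_thread)
```

The handler still awaits something, but what it awaits is the thread's completion, so the synchronous code inside never sees a Future.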
Hope this helps… let us know how you make out.
I am using Motor, and the code is similar to yours.
@bp.route('/knowledge/add', methods=['POST'])
async def knowledge_add(request):
    data = request.json
    user_code = data.get('user_code')
    file_path = data.get("file_path")
    loader = PyPDFLoader(file_path)
    data = loader.load()
    # Call the vectorization method
    text_splitter = RecursiveCharacterTextSplitter(chunk_size=400, chunk_overlap=20)
    docs = text_splitter.split_documents(data)
    embed_model = OpenAIEmbeddings()
    vector_store = MongoDBAtlasVectorSearch.from_documents(
        documents=docs,
        embedding=embed_model,
        collection=collection,
        index_name="vector_index"
    )
    return create_response(msg=get_message(request, "knowledge_success"), data=vector_store)