$out from Data Lake to Atlas Cluster with pymongo

Hello Everyone,

Today I ran into a roadblock in my development, and I can't find anyone else reporting this issue online.

I'm trying to use MongoDB Data Lake to import a CSV file into my cluster every week, about 1MM records.

For that I have set up an aggregation pipeline with some transformations and a $out stage at the end, like this:

{
    "$out": {
        "atlas": {
            "clusterName": "Cluster",
            "db": "source",
            "coll": "collection"
        }
    }
}

This follows the documentation here:

https://docs.mongodb.com/datalake/reference/pipeline/out

The $out stage is the last step of the pipeline. The pipeline works fine from a mongo client, but when I run it from Python:

def extract_new_movers_population():
    # extract_transform is the list of pipeline stages, ending with $out
    pipeline = extract_transform
    res = collection.aggregate(pipeline=pipeline, allowDiskUse=True)
    print(res)

I get this output:

pymongo.errors.OperationFailure: If an object is passed to $out it must have exactly 2 fields: 
'db' and 'coll',     full error: {'operationTime': Timestamp(1612190585, 1), 'ok': 0.0, 
'errmsg': "If an object is passed to $out it must have exactly 2 fields: 
'db' and 'coll'", 'code': 16994, 'codeName': 'Location16994', '$clusterTime': {'clusterTime': 
Timestamp(1612190585, 1), 'signature': {'hash': b')\xb6\x15\xfa\x03\x1cv\x0b\xa1\xef\xa5\x0c\x0c\x0c^\xe7\x9d/\xa2\x1f', 'keyId': 6922836924319137794}}}

With the Motor driver it gets even worse, because the driver swallows the output and never reports the error.

Hope I can find a captain

Hi @luis_carvajal.

From the error message it appears that you are not connected to an Atlas Data Lake instance via the Python driver, but instead connected directly to a MongoDB cluster (mongod). How are you obtaining the connection string for connecting to the Data Lake? I believe you need to follow the instructions detailed here https://docs.mongodb.com/datalake/tutorial/connect and choose the ‘Connect Your Application’ option to obtain the correct string. After that, please retry your command!
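To make the point concrete, here is a minimal sketch of how the export could be structured so that the only thing that changes is which client you pass in. The helper names (`build_out_stage`, `run_export`) and the `DATA_LAKE_URI` placeholder are hypothetical and not part of pymongo; the sketch assumes `client` is a `pymongo.MongoClient` created from the Data Lake connection string, not from the cluster's own connection string.

```python
def build_out_stage(cluster_name, db, coll):
    # Data Lake form of $out, as used in the question.
    # A plain mongod rejects this shape with error code 16994.
    return {"$out": {"atlas": {"clusterName": cluster_name, "db": db, "coll": coll}}}


def run_export(client, source_db, source_coll, stages, out_stage):
    # `client` is assumed to be connected to the Data Lake, e.g.
    #   client = pymongo.MongoClient(DATA_LAKE_URI)  # placeholder URI
    collection = client[source_db][source_coll]
    cursor = collection.aggregate(list(stages) + [out_stage], allowDiskUse=True)
    # A pipeline ending in $out returns no documents, so this is normally empty.
    return list(cursor)
```

If the same call still fails with "must have exactly 2 fields: 'db' and 'coll'", the client is most likely still pointed at the cluster rather than the Data Lake.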

On the Motor front - it is possible that you are not seeing an error because you are not iterating the cursor returned by aggregate(). Motor's cursors are instantiated lazily, i.e. the 'aggregate' command is only sent when the application attempts to iterate the cursor. Try iterating the Motor cursor in an async for loop and you should see the error.
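A minimal sketch of what iterating the cursor could look like. The function name `run_aggregate` is hypothetical, and `collection` is assumed to be a Motor collection (any object whose `aggregate()` returns an async-iterable cursor would behave the same way here).

```python
import asyncio


async def run_aggregate(collection, pipeline):
    docs = []
    # aggregate() only builds the cursor; the command is sent to the server
    # when iteration starts, so a server-side error is raised inside this loop.
    async for doc in collection.aggregate(pipeline, allowDiskUse=True):
        docs.append(doc)
    return docs
```

From synchronous code this could be driven with `asyncio.run(run_aggregate(collection, pipeline))`. Any failure from the server should then propagate as an exception instead of being silently dropped.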

Please let me know if this works!
-Prashant