Good day. I’m taking my first steps in development, so MongoDB is still somewhat complex to me.
Nevertheless, I got to a point where my C# console application, in its simplest form, takes data from a local SQL DB and pushes the documents to MongoDB. I believe I’m using an M2 basic elastic environment while I practice, so maybe that’s my problem?
In one example I have ~190K documents to publish to MongoDB. It can be more or less, but this is the case I’m using. At first I tried to push it all, which is several years of data, but that didn’t work. I figured maybe I had surpassed the 16MB doc size limit, so I broke it down into batches by year. In this case it’s 4 years of data. I feel batching is how I’ll do most of my writes at this point.
This is an initial load, so it has a larger dataset, which allows me to flesh out these limits. In production the data sets are relatively tiny, i.e. 200-300 records a day.
Essentially, I’m generating a list of WriteModels as such:
var listWrites = new List<WriteModel>();
And publishing one year at a time to throttle it as such:
var result = await _salesCollection.BulkWriteAsync(listWrites);
A year is roughly 40-55K records (documents).
What I noticed first was that 2019, the largest year with 56K records, failed, but the subsequent years (2020, 2021, 2022) succeeded. They were all smaller (~40K records).
Then I changed from yearly batches to a fixed throttle size (100K documents) and published in chunks.
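For reference, the chunking step can be sketched roughly as below. This is a minimal illustration rather than my exact code: `WriteBatcher`, `SplitIntoChunks` and `chunkSize` are made-up names, and the helper is deliberately generic so it does not depend on the driver or on my document type.

```csharp
using System;
using System.Collections.Generic;

public static class WriteBatcher
{
    // Hypothetical helper: splits a list into consecutive slices of at most
    // chunkSize items, preserving order. The last slice may be smaller.
    public static List<List<T>> SplitIntoChunks<T>(List<T> items, int chunkSize)
    {
        if (items == null) throw new ArgumentNullException(nameof(items));
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException(nameof(chunkSize));

        var chunks = new List<List<T>>();
        for (int offset = 0; offset < items.Count; offset += chunkSize)
        {
            // Never read past the end of the list on the final slice.
            int count = Math.Min(chunkSize, items.Count - offset);
            chunks.Add(items.GetRange(offset, count));
        }
        return chunks;
    }
}
```

Each chunk is then passed to `BulkWriteAsync` in turn, one call per chunk.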
In this case, with a total of 193,519 documents, the first 100K got in as expected, albeit slowly, but the remaining 93,519 didn’t make it and failed with the following exception.
Exception:
An exception occurred while receiving a message from the server
Exception Inner Message:
Attempted to read past the end of the stream.
Inner Exception.Stack:
at MongoDB.Driver.Core.Misc.StreamExtensionMethods.ReadBytesAsync(Stream stream, Byte[] buffer, Int32 offset, Int32 count, TimeSpan timeout, CancellationToken cancellationToken)
at MongoDB.Driver.Core.Connections.BinaryConnection.ReceiveBufferAsync(CancellationToken cancellationToken)
I haven’t managed to find an answer that helps me understand what’s happening. It refers to an error “receiving a message”, but I don’t know exactly what the message is or how to trap and resolve it. The stack trace mentions CancellationTokens etc., but it’s still not clear to me.
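To at least see everything the exception carries, my understanding is that I can walk the `InnerException` chain when I catch it. The sketch below uses only standard .NET; `ExceptionDump` and `Flatten` are made-up names, and nothing here is specific to the MongoDB driver.

```csharp
using System;
using System.Collections.Generic;

public static class ExceptionDump
{
    // Hypothetical helper: returns one "Type: Message" line per exception,
    // starting with the outermost and following InnerException to the end.
    public static List<string> Flatten(Exception ex)
    {
        var lines = new List<string>();
        for (Exception current = ex; current != null; current = current.InnerException)
        {
            lines.Add($"{current.GetType().FullName}: {current.Message}");
        }
        return lines;
    }
}

// Intended use, wrapped around the bulk write call:
//   try { var result = await _salesCollection.BulkWriteAsync(listWrites); }
//   catch (Exception ex)
//   {
//       foreach (var line in ExceptionDump.Flatten(ex)) Console.WriteLine(line);
//   }
```

This only reports what failed; it does not explain why the server stopped responding.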
It feels like maybe I’m hitting a limit, as 100K sounds familiar from the documentation, but I’d like to understand what sort of throttling I need to implement and against which limits.
At this point I’m OK with slow performance, but for it to just fail? I don’t know.
I feel this is “basic” for the pros out here, and over time I’ll get more exposure, but MAN, I’m in trouble if I can’t even do a write without hitting a wall.
Thanks in advance
CPT