Data got corrupted

Hi,

I have a very weird exception:

An error occurred while deserializing the Data property of class Squidex.Domain.Apps.Entities.MongoDb.Contents.MongoContentEntity: Unable to translate bytes [BD] at index 1136 from specified code page to Unicode.
at MongoDB.Bson.Serialization.BsonClassMapSerializer`1.DeserializeMemberValue(BsonDeserializationContext context, BsonMemberMap memberMap)
   at MongoDB.Bson.Serialization.BsonClassMapSerializer`1.DeserializeClass(BsonDeserializationContext context)
   at MongoDB.Bson.Serialization.BsonClassMapSerializer`1.Deserialize(BsonDeserializationContext context, BsonDeserializationArgs args)
   at MongoDB.Bson.Serialization.IBsonSerializerExtensions.Deserialize[TValue](IBsonSerializer`1 serializer, BsonDeserializationContext context)
   at MongoDB.Driver.Core.Operations.CursorBatchDeserializationHelper.DeserializeBatch[TDocument](RawBsonArray batch, IBsonSerializer`1 documentSerializer, MessageEncoderSettings messageEncoderSettings)
   at MongoDB.Driver.Core.Operations.FindOperation`1.CreateFirstCursorBatch(BsonDocument cursorDocument)
   at MongoDB.Driver.Core.Operations.FindOperation`1.CreateCursor(IChannelSourceHandle channelSource, IChannelHandle channel, BsonDocument commandResult)
   at MongoDB.Driver.Core.Operations.FindOperation`1.ExecuteAsync(RetryableReadContext context, CancellationToken cancellationToken)
   at MongoDB.Driver.Core.Operations.FindOperation`1.ExecuteAsync(IReadBinding binding, CancellationToken cancellationToken)
   at MongoDB.Driver.OperationExecutor.ExecuteReadOperationAsync[TResult](IReadBinding binding, IReadOperation`1 operation, CancellationToken cancellationToken)
   at MongoDB.Driver.MongoCollectionImpl`1.ExecuteReadOperationAsync[TResult](IClientSessionHandle session, IReadOperation`1 operation, ReadPreference readPreference, CancellationToken cancellationToken)
   at MongoDB.Driver.MongoCollectionImpl`1.UsingImplicitSessionAsync[TResult](Func`2 funcAsync, CancellationToken cancellationToken)
   at MongoDB.Driver.IAsyncCursorSourceExtensions.ToListAsync[TDocument](IAsyncCursorSource`1 source, CancellationToken cancellationToken)
   at Squidex.Infrastructure.MongoDb.MongoExtensions.ToListRandomAsync[T](IFindFluent`2 find, IMongoCollection`1 collection, Int64 take, CancellationToken ct) in C:\src\src\Squidex.Infrastructure.MongoDb\MongoDb\MongoExtensions.cs:line 237
   at Squidex.Domain.Apps.Entities.MongoDb.Contents.Operations.Extensions.QueryContentsAsync(IMongoCollection`1 collection, FilterDefinition`1 filter, ClrQuery query, CancellationToken ct) in C:\src\src\Squidex.Domain.Apps.Entities.MongoDb\Contents\Operations\Extensions.cs:line 114

It seems that a string was written that the driver can no longer read. The same error also appears in Compass when the user navigates to the page.
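To illustrate what the message is saying: BSON strings are UTF-8, and the byte 0xBD on its own is a continuation byte, so a strict decoder rejects it when it appears where a character should start. Below is a minimal Python sketch of that behaviour. It is not the driver's code; `first_invalid_utf8_offset` is a hypothetical helper and the sample bytes are made up. It reports the offset of the first undecodable byte, analogous to the "index 1136" in the exception.

```python
def first_invalid_utf8_offset(data: bytes):
    """Return the offset of the first byte that is not valid UTF-8, or None."""
    try:
        data.decode("utf-8", errors="strict")
    except UnicodeDecodeError as exc:
        # exc.start is the position where strict decoding gave up.
        return exc.start
    return None


# Made-up sample: valid ASCII text followed by a stray 0xBD byte.
sample = "price: ".encode("utf-8") + b"\xbd" + b" EUR"

offset = first_invalid_utf8_offset(sample)
print(offset)  # offset of the stray byte within the sample

# For comparison: in Latin-1 the single byte 0xBD is the character U+00BD,
# while valid UTF-8 encodes that character as the two bytes C2 BD.
print(sample[offset:offset + 1].decode("latin-1"))
```

A lone 0xBD could therefore come from text written in a legacy single-byte encoding or from a multi-byte sequence that lost its first byte, but that is only a guess; the sketch cannot tell which, and only inspecting the raw document can.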

This needs to go into a Support Ticket.

Provide them with:
MongoDB Version
Compass Version
If on Atlas, provide the cluster information and links.

They will guide you from there.

I will ask my customer about that. The data is stored on Atlas, but I don’t know which MongoDB version.

Definitely, with zero hesitation, get that to the MongoDB Support team. Their backend engineers may be able to revert or make the necessary changes to recover and correct whatever happened.

Of course. But there are several perspectives:

  1. I maintain an Open Source project for which I offer paid support, so I want to understand how this happens and whether there is anything I can do to prevent it. Is this a bug in the client? A bug on the server? A problem on my side? Or is it really data corruption, e.g. a broken disk or something like that?

  2. The customer wants to have the actual problem fixed.

So the correction is not the most important thing. I would like to understand if this can happen again.

@Sebastian_Stehle1

The only people who can actually investigate this are MongoDB Engineering, just as they are the only people who can determine why it happened.

So what you’d do is open a support ticket with MongoDB Support and request a Root Cause Analysis, which will cover not only what the fix was but also why this occurred. Your Technical Services Engineer will provide the steps, if any, that you need to take to prevent this from occurring again.

You are always welcome to share here what was stated, for the benefit of others; that is your choice. Otherwise there’s nothing you can do yourself with Atlas or Device Sync to troubleshoot things like this, because it is all on the backend engineering side.

The longer you take to raise a ticket with support, the longer your customer may have an outage.

When you create the ticket, include:

  • Sample data
  • User information
  • The approximate time this was first seen, including the timezone
  • Any logs on your end, from the application and so on
  • Whether this connects to a Realm/Device Sync Application
  • The URL to the Atlas cluster
  • The URL to the Device Sync App
  • A Device Sync App dump, if a Device Sync App exists

To get the Device Sync App dump:

mongodump --uri="mongodb+srv://<admin user>:<admin pwd>@cluster0.bleonard.mongodb.net/realm_sync" --gzip -o /path/to/folder 

The <admin user> will come from this page and will be whoever you set as admin…

The <admin pwd> will be the password for the user above.

You will need to use realm_sync at the end of the URL to get the _realm_sync database. The dump should include a file called history.bson, as well as some other files that look like the ones above.

If you have a Device Sync App, attach the history.bson file to the support ticket. All of this information will make it much faster to figure out what happened, correct it, and explain it to you.
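Before attaching the dump, it may help to confirm that the history file is actually in the output folder. The following is a rough Python sketch under a few assumptions: `find_history_files` is a hypothetical helper, the folder layout in the demo is made up, and because the command above uses --gzip, the file is expected to be named history.bson.gz rather than history.bson.

```python
from pathlib import Path
import tempfile


def find_history_files(dump_dir):
    """Return dump files named history.bson or history.bson.gz, searched recursively."""
    root = Path(dump_dir)
    return sorted(
        p for p in root.rglob("history.bson*")
        if p.name in ("history.bson", "history.bson.gz")
    )


# Demo on a throwaway folder that imitates a dump; the sub-folder name and
# the empty files are illustrative only, not real mongodump output.
with tempfile.TemporaryDirectory() as tmp:
    db_dir = Path(tmp) / "realm_sync"
    db_dir.mkdir()
    (db_dir / "history.bson.gz").touch()
    (db_dir / "history.metadata.json.gz").touch()

    found_names = [p.name for p in find_history_files(tmp)]
    print(found_names)
```

In real use you would call `find_history_files("/path/to/folder")` with the folder passed to -o. An empty result suggests the dump did not pick up the sync database.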

Thanks for your very detailed answer. I do not have access to the data, and most of my customer's engineers do not have access to Atlas either. So I can only forward your information and see what happens.

The more information you can provide in the support ticket, the better.