How to change all Binary types to String in MongoDB using Java?

Hi.

I am writing a connector from on-prem MongoDB to BigQuery, and I am stuck on one point. I can convert a single binary field with the decodeBinaryToUuid method in Java, but the data contains lots of Binary fields, and the connector is not specific to one team. So I need help with this situation.

I also set the UuidRepresentation in MongoClientSettings, but it doesn’t work.

An example document looks like this:

{
  "_id": {
    "$binary": {
      "base64": "h7MRZycaOk6vLUlnqPx/HA==",
      "subType": "03"
    }
  },
  "ParentId": null,
  "Version": xx,
  "Name": "aaa",
  "HashedPassword": "dfdfd",
  "IsActive": false,
  "SapId": "",
  "InvoiceSapId": null,
  "LegalName": "sdsds",
  "BillingAddress": {
    "_id": {
      "$binary": {
        "base64": "LRSwK05HfEWDAr8Y1q1GcQ==",
        "subType": "03"
      }
    },
    "Line1": "dfdfd",
    "DistrictName": null,
    "CityCode": dfd,
    "TownCode": dfdf,
    "DistrictCode": dfd,
    "Label": null
  },
  "ShippingAddresses": [
    {
      "_id": {
        "$binary": {
          "base64": "eMK1k+wG6k2XTafzEBYQng==",
          "subType": "03"
        }
      },
      "Line1": "fdfdfds",
      "DistrictName": null,
      "CityCode": sdfds,
      "TownCode": sdffs,
      "DistrictCode":dfdf,
      "Label": "dfdsfd"
    }
  ]
}

This is my example payload. I want to convert all binaries to strings without specifying each field. Can you help me, please?

Best Regards,
Emin Can

Hi @Emin_Can_OGUZ and welcome to the MongoDB Community :muscle:!

What’s in your MongoDB collection initially? Can you share one document, maybe from mongosh? Are these _id fields really binary data, or are they actually ObjectIds?

I have a Java quick start repo here: GitHub - mongodb-developer/java-quick-start: This repository contains code samples for the Java Quick Start blog post series

Maybe this can help you get up and running.

Cheers,
Maxime.


Hi @MaBeuLux88

Thanks for the warm welcome, and sorry for the late response. The data looks like this in mongosh (I am sharing only the _id fields):

{
	"_id" : BinData(3,"AGUfrNBHhEK90SoDxgCIDA==")
    ...
	"BillingAddress" : {
		"_id" : BinData(3,"EuMKzcDDBkiDrZxsgwJgyw==")
        ...
	},
	"ShippingAddresses" : [
		{
			"_id" : BinData(3,"HwVvtOoXEEGc5ZF8wNe2cQ==")
            ...
		}
	],
	"FinancialInformation" : {
		"TaxOfficeId" : BinData(3,"AAAAAAAAAAAAAAAAAAAAAA==")
        ...
	},
}

I also tried a CodecRegistry, but it doesn’t work :frowning: I want to convert all Binary data to String.

  • I also tried the decodeBinaryToUuid() function in Java. It works, but it only converts one _id field at a time. I want to convert all Binary data without specifying the fields.

Can you help me again?

Sincerely,
Emin Can


Well that’s very unusual.
It’s the first time I have actually seen BinData used as an _id. Usually we have ObjectIds, which can be represented as a hexadecimal number.
What makes you think that this binary data can be represented as a string?
I have no idea how you could solve this problem really.

Cheers,
Maxime.

Because I have a connector from on-prem MongoDB to BQ, which I wrote in Java for my company. The connector does not support Binary data, so I am in real trouble.

Thanks for responding.

Best regards,
Emin Can

@Jeffrey_Yemin

Hi Jeffrey,

We are having trouble with this and have tried several ways to convert Binary data to String (a UUID is enough for us, but the type must be string). Can you help us?

Sorry for the mention, but we are really stuck on this problem.

Best regards,
Emin Can

For a generic tool like this, consider using a MongoCollection with a generic type of BsonDocument, e.g.

MongoCollection<BsonDocument> collection = database.getCollection("<name>", BsonDocument.class); 

Then all documents that you query for will be of type BsonDocument, and all binary values will be of type BsonBinary. From there you can get the raw byte array and convert it to string using any encoding you want, e.g. java.util.Base64. If you need to also encode the binary subtype, you’ll have to figure out a way to include the subtype as well.
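
As a rough illustration of that idea, the sketch below walks a nested structure and replaces every binary value with a Base64 string, without naming any field. To stay free of driver dependencies it uses plain Map, List, and byte[] as stand-ins; those stand-ins and the class name BinaryToBase64 are assumptions, not driver API. With the driver, the same recursion would run over BsonDocument and BsonArray and test for BsonBinary instead of byte[].

```java
import java.util.ArrayList;
import java.util.Base64;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper: plain Map/List/byte[] stand in for BsonDocument/BsonArray/BsonBinary.
final class BinaryToBase64 {
    private BinaryToBase64() {}

    // Returns a copy of the value in which every byte[] is replaced by its Base64 text.
    static Object convert(Object value) {
        if (value instanceof byte[]) {
            return Base64.getEncoder().encodeToString((byte[]) value);
        }
        if (value instanceof Map) {
            // Sub-documents: convert each field value, keeping field order.
            Map<String, Object> out = new LinkedHashMap<>();
            for (Map.Entry<?, ?> entry : ((Map<?, ?>) value).entrySet()) {
                out.put(String.valueOf(entry.getKey()), convert(entry.getValue()));
            }
            return out;
        }
        if (value instanceof List) {
            // Arrays: convert each element in place order.
            List<Object> out = new ArrayList<>();
            for (Object item : (List<?>) value) {
                out.add(convert(item));
            }
            return out;
        }
        // Strings, numbers, booleans and null pass through unchanged.
        return value;
    }
}
```

The subtype is dropped in this sketch; if it matters downstream, it has to be carried alongside the Base64 text in some form.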

UUIDs can be tricky though, so be careful with how you treat those bytes once you store them. See specifications/uuid.rst at master · mongodb/specifications · GitHub for the gory details. If you are able to start with fresh data, use STANDARD UuidRepresentation for all UUIDs. If you can’t do that, but you can assume JAVA_LEGACY, you might want to convert those to the standard representation before then converting to a String. I can help you with that as well if you need it.
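
To make the byte-order issue concrete, here is a minimal sketch using only the JDK. It assumes the 16 bytes were written in the JAVA_LEGACY layout, where each 8-byte half of the UUID is stored in reversed (little-endian) order, and contrasts that with the standard layout. The class and method names are hypothetical. Data written by other legacy drivers may use a different layout, so check which representation applies before relying on this.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.util.UUID;

// Hypothetical helper, not part of the driver.
final class LegacyUuid {
    private LegacyUuid() {}

    // Decodes 16 bytes stored in the JAVA_LEGACY layout (subtype 3 written by the Java driver).
    static UUID fromJavaLegacy(byte[] data) {
        return decode(data, ByteOrder.LITTLE_ENDIAN);
    }

    // Decodes 16 bytes stored in the standard layout (subtype 4).
    static UUID fromStandard(byte[] data) {
        return decode(data, ByteOrder.BIG_ENDIAN);
    }

    private static UUID decode(byte[] data, ByteOrder order) {
        if (data.length != 16) {
            throw new IllegalArgumentException("Expected 16 bytes, got " + data.length);
        }
        // Reading each half as a long in the given order undoes (or preserves) the per-half byte order.
        ByteBuffer buffer = ByteBuffer.wrap(data).order(order);
        long mostSignificant = buffer.getLong();
        long leastSignificant = buffer.getLong();
        return new UUID(mostSignificant, leastSignificant);
    }
}
```

Calling LegacyUuid.fromJavaLegacy(bytes).toString() then gives the usual 36-character UUID text that a string column can hold.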

Regards,
Jeff

Hi Jeffrey,

We solved the problem on the JSON side (JSONObject). We can’t change the existing Mongo data, so we wrote a new BinaryConverter for JsonWriterSettings. The core of it looks like this:

writer.writeString(UuidHelper.decodeBinaryToUuid(value.getData(),value.getType(),UuidRepresentation.JAVA_LEGACY).toString());

And we use it like this:

cursor.next().toJson(JsonWriterSettings.builder().binaryConverter(new JsonBinaryConverter()).dateTimeConverter(new JsonDateTimeConverter()).build())

Thanks for helping, Jeff. I am very honoured to talk with you.

Best Regards,
Emin Can

Hi @Jeffrey_Yemin again,

We solved this problem. Thanks for responding. We are now discussing contributing this feature to the driver as a PR, but we don’t know the PR rules for MongoDB. Would you advise us to open a PR for this feature, or is it enough that the problem is solved here in the forum? Which do you prefer?

Best Regards,
Emin Can

Have a look at mongo-java-driver/CONTRIBUTING.md at master · mongodb/mongo-java-driver · GitHub.

Regards,
Jeff

Hi @Emin_Can_OGUZ, I am stuck in the same situation.

I have the same goal: decoding binary fields in my JSON response.

Here is my code:

// Assumes collection is a MongoCollection<Document>; key, value and expData are defined elsewhere.
FindIterable<Document> findQuery = collection.find(Filters.eq(key, value));
JSONParser parser = new JSONParser();
// try-with-resources closes the cursor even if parsing fails
try (MongoCursor<Document> cursor = findQuery.iterator()) {
    while (cursor.hasNext()) {
        String response = cursor.next().toJson();
        Object myData = parser.parse(response);
        expData.add(myData);
    }
}


My JSON response:

{
    "SourceEndpoint": "file:///routes/appDataRoot/Source/",
    "Payload": {
        "$binary": "ew0KICAgICJwb3N0SWQiOiAxLA0KICAgICJpZCI6IDEsDQogICAgIm5hbWUiOiAiaWQgbGFib3JlIGV4IGV0IHF1YW0",
        "$type": "00"
    },
    "_id": {
        "$oid": "628659c1d03f0e204cdddd79"
    }
}


I need to decode the Payload binary field in the JSON response above.
I tried the following code with JSON writer settings:

cursor.next().toJson(JsonWriterSettings.builder().binaryConverter(new JsonBinaryConverter()).dateTimeConverter(new JsonDateTimeConverter()).build());

But I am getting the error "Binary converter cannot be resolved to a type".

Could you please suggest how to fix this?

Hi,

You need to write the BinaryConverter class yourself in Java, because JsonBinaryConverter is not part of the Mongo driver. Write the class, then instantiate it with new, as in the toJson call above.

I wanted to open a PR, but I haven’t had any free time, so I am sharing my code here.

import org.bson.BsonBinary;
import org.bson.UuidRepresentation;
import org.bson.internal.UuidHelper;
import org.bson.json.Converter;
import org.bson.json.StrictJsonWriter;
import org.slf4j.Logger;
import org.slf4j.LoggerFactory;

public class JsonBinaryConverter implements Converter<BsonBinary> {
    private static final Logger LOGGER = LoggerFactory.getLogger(JsonBinaryConverter.class);

    @Override
    public void convert(BsonBinary value, StrictJsonWriter writer) {
        try {
            // Decode the raw bytes as a UUID in the JAVA_LEGACY byte order and write it as a string.
            writer.writeString(UuidHelper.decodeBinaryToUuid(value.getData(), value.getType(), UuidRepresentation.JAVA_LEGACY).toString());
        } catch (Exception e) {
            // Binary values that are not UUIDs cannot be decoded this way and end up here.
            LOGGER.info("Failed to convert binary value {} to a UUID string", value, e);
        }
    }
}

Also, if you have date-time values, you can convert them with a JsonDateTimeConverter like this:

import org.bson.json.Converter;
import org.bson.json.StrictJsonWriter;

import java.time.Instant;
import java.time.ZoneId;
import java.time.format.DateTimeFormatter;
import java.util.Date;


public class JsonDateTimeConverter implements Converter<Long> {


    static final DateTimeFormatter DATE_TIME_FORMATTER = DateTimeFormatter.ISO_INSTANT
            .withZone(ZoneId.of("UTC"));

    @Override
    public void convert(Long value, StrictJsonWriter writer) {
        try {
            Instant instant = new Date(value).toInstant();
            String s = DATE_TIME_FORMATTER.format(instant);
            writer.writeString(s);
        } catch (Exception e) {
            System.out.println(e.getMessage());
        }
    }
}

You then instantiate these classes from your main code. If you have a problem, I will help you.

Best Regards,
Emin Can

@Emin_Can_OGUZ

I have implemented your JsonBinaryConverter and JsonDateTimeConverter code snippets.
I am getting this error:

org.bson.BsonInvalidOperationException: Invalid state VALUE

Can you switch from mongo-java-driver to mongodb-driver-sync? I used version 4.2.0.

I changed to the sync driver but I still get the same error: Invalid state VALUE.

Can you put your code on GitHub, if it is not private? I don’t know what is happening in your code. Maybe try debugging it.

Or can you change the binary type? Your payload has type 00.

Hi @Emin_Can_OGUZ

After a lot of debugging, I am now getting this error:

org.bson.BsonSerializationException: Expected length to be 16, not 1523.

My payload is no larger than 1 KB, and it is inserted into MongoDB in binary format.

Hi @Ravikishore_Bodha

What is your Java version? If it is below 16, that is probably why you get this error. I am on Java 16.

Best regards,
Emin Can

I think I fixed the issue by converting the BSON byte[] to a String and passing it to the JSON writer.

Here is the code snippet:

@Override
public void convert(BsonBinary value, StrictJsonWriter writer) {
    byte[] data = value.getData();
    System.out.println("SUBTYPE " + value.getType());
    System.out.println("LENGTH " + data.length);
    // Decode the raw bytes as text. UTF-8 is an assumption here; an explicit charset
    // avoids depending on the platform default.
    String s = new String(data, java.nio.charset.StandardCharsets.UTF_8);
    System.out.println("DATA--> " + s);
    try {
        writer.writeString(s);
    } catch (Exception e) {
        System.out.println("EXCEPTION " + e);
    }
}

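The change above works for this Payload because it has subtype 00 and holds text rather than a 16-byte UUID, which is consistent with the earlier "Expected length to be 16, not 1523" error. A converter shared across collections probably has to handle both cases. Below is a minimal JDK-only sketch of that decision; the class name, the method names, and the choice of UTF-8 and Base64 are assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper: decides how a binary value should be turned into a string.
final class BinaryKind {
    private BinaryKind() {}

    // UUIDs use binary subtype 3 (legacy) or 4 (standard) and are 16 bytes long.
    static boolean looksLikeUuid(int subtype, byte[] data) {
        return (subtype == 3 || subtype == 4) && data.length == 16;
    }

    // Generic binary (subtype 0) is decoded as UTF-8 text, assuming it holds text
    // as in this thread; any other subtype falls back to Base64 so no bytes are lost.
    static String toText(int subtype, byte[] data) {
        if (subtype == 0) {
            return new String(data, StandardCharsets.UTF_8);
        }
        return Base64.getEncoder().encodeToString(data);
    }
}
```

Inside convert(), a check such as BinaryKind.looksLikeUuid(value.getType(), value.getData()) could then choose between the UUID decoding shown earlier in the thread and BinaryKind.toText(...).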
JSON response with decoded binary:

"Payload": "{\r\n "postId": 1,\r\n "id": 1,\r\n "name": "id labore ex et quam laborum",\r\n "email": "Eliseo@gardner.biz",\r\n "body": "laudantium enim quasi est quidem magnam voluptate ipsam eos\r\ntempora quo necessitatibus\r\ndolor quam autem quasi\r\nreiciendis et nam sapiente accusantium"\r\n"path" :"\game\forum\files\index.php$!@#%^&*()"\r\n }",

@Emin_Can_OGUZ Could you please help me remove the \r\n and \ special characters from the JSON response?

In my IDE console, the payload is printed without characters like \r\n, as shown below.
[image: IDE console output]
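
One possible explanation: the decoded payload contains real carriage-return and line-feed characters. The console prints them as line breaks, but inside a JSON string they have to be written as the escapes \r\n, so the writer is behaving as expected. If the payload is itself valid JSON text, raw line breaks can only occur between tokens, so they can be stripped before calling writer.writeString. A hedged sketch of that, with a hypothetical class name:

```java
// Hypothetical helper: removes line breaks and the indentation that follows them.
final class PayloadCompactor {
    private PayloadCompactor() {}

    // Assumes the input is valid JSON text, where raw line breaks never occur inside string values.
    static String stripLineBreaks(String jsonText) {
        // The pattern matches any line break (CR LF counts as one) plus the whitespace after it.
        return jsonText.replaceAll("\\R\\s*", "");
    }
}
```

The remaining \" escapes would still appear, because the payload is embedded as a string value rather than as a nested JSON object.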