Hi, I want to import data from my Mongo collection to the BigQuery table using google dataflow.
There is a problem with type Decimal128, which is not supported by BQ. For example value -4 in Mongo is stored as {“high”:-5746593124524752896,“low”:4} in BQ. How can I convert this pair into the real value?
So far I’ve tried:
UDF in dataflow, but the runtime error occurs (without UDF it works),
I understand that you’re trying to import data from a Mongo collection to a BigQuery table using Google Dataflow and are facing issues with the Decimal128 type, which is not supported by BigQuery.
I managed to circumvent this issue by researching in forums and also looking at how the conversion is done in the bson library of Python. Based on this, I created a temporary function in BigQuery that takes the ‘high’ and ‘low’ values and returns the real value.
Here is the function I created:
CREATE TEMP FUNCTION decimal128(highString STRING, lowString STRING)
RETURNS STRING
LANGUAGE js
AS r"""
const _SIGN = BigInt("0x8000000000000000");
const high = BigInt(highString);
const low = BigInt(lowString);
const sign = (high & _SIGN) !== 0n ? 1 : 0;
const exponent = Number((high & BigInt("0x7FFF800000000000")) >> BigInt('49')) - 6176;
const digits = Array.from(low.toString()).map(Number);
const value = [sign, digits, exponent];
const _sign = value[0];
var digitsArray = [];
value[1].forEach(digit => {
if (Number.isInteger(digit) && digit >= 0 && digit <= 9) {
if (digitsArray.length || digit !== 0) {
digitsArray.push(digit);
}
} else {
throw new Error("The second value in the tuple must be composed of integers in the range 0 through 9.");
}
});
const _int = digitsArray.length ? digitsArray.join('') : '0';
const _exp = value[2];
const signStr = ['', '-'][_sign];
const leftDigits = _exp + _int.length;
let intPart, fracPart, dotPlace;
if (_exp <= 0 && leftDigits > -6){
dotPlace = leftDigits
}
else{
dotPlace = 1
}
if (dotPlace <= 0) {
intPart = '0';
fracPart = '.' + '0'.repeat(-dotPlace) + _int;
} else if (dotPlace >= _int.length) {
intPart = _int + '0'.repeat(dotPlace - _int.length);
fracPart = '';
} else {
intPart = _int.substring(0, dotPlace);
fracPart = '.' + _int.substring(dotPlace);
}
const exp = '';
return signStr + intPart + fracPart + exp;
"""
I hope this helps solve your problem. If you have any more questions, please let me know.