Exporting from Serverless Instance to GCP BigQuery

I’m trying to get my data from my Serverless instance (v7.1.0) into GCP’s BigQuery via the Dataflow service, and I’m getting this error:

Error message from worker: org.apache.beam.sdk.util.UserCodeException: com.mongodb.MongoConfigurationException: A TXT record is only permitted to contain the keys [authsource, replicaset], but the TXT record for 'starcade-prod-db.ljpzocf.mongodb.net' contains the keys [loadbalanced, authsource]
org.apache.beam.sdk.util.UserCodeException.wrap(UserCodeException.java:39)
org.apache.beam.sdk.io.Read$BoundedSourceAsSDFWrapperFn$DoFnInvoker.invokeSplitRestriction(Unknown Source)
org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForSplitRestriction(FnApiDoFnRunner.java:887)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:348)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:275)
org.apache.beam.fn.harness.FnApiDoFnRunner.outputTo(FnApiDoFnRunner.java:1788)
org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForPairWithRestriction(FnApiDoFnRunner.java:824)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:348)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:275)
org.apache.beam.fn.harness.FnApiDoFnRunner.outputTo(FnApiDoFnRunner.java:1788)
org.apache.beam.fn.harness.FnApiDoFnRunner.access$3000(FnApiDoFnRunner.java:142)
org.apache.beam.fn.harness.FnApiDoFnRunner$NonWindowObservingProcessBundleContext.output(FnApiDoFnRunner.java:2506)
org.apache.beam.sdk.io.Read$OutputSingleSource.processElement(Read.java:1053)
org.apache.beam.sdk.io.Read$OutputSingleSource$DoFnInvoker.invokeProcessElement(Unknown Source)
org.apache.beam.fn.harness.FnApiDoFnRunner.processElementForParDo(FnApiDoFnRunner.java:799)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:348)
org.apache.beam.fn.harness.data.PCollectionConsumerRegistry$MetricTrackingFnDataReceiver.accept(PCollectionConsumerRegistry.java:275)
org.apache.beam.fn.harness.BeamFnDataReadRunner.forwardElementToConsumer(BeamFnDataReadRunner.java:213)
org.apache.beam.sdk.fn.data.BeamFnDataInboundObserver.multiplexElements(BeamFnDataInboundObserver.java:158)
org.apache.beam.fn.harness.control.ProcessBundleHandler.processBundle(ProcessBundleHandler.java:537)
org.apache.beam.fn.harness.control.BeamFnControlClient.delegateOnInstructionRequestType(BeamFnControlClient.java:150)
org.apache.beam.fn.harness.control.BeamFnControlClient$InboundObserver.lambda$onNext$0(BeamFnControlClient.java:115)
java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
org.apache.beam.sdk.util.UnboundedScheduledExecutorService$ScheduledFutureTask.run(UnboundedScheduledExecutorService.java:163)
java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
java.base/java.lang.Thread.run(Thread.java:829)
Caused by: com.mongodb.MongoConfigurationException: A TXT record is only permitted to contain the keys [authsource, replicaset], but the TXT record for 'starcade-prod-db.ljpzocf.mongodb.net' contains the keys [loadbalanced, authsource]
com.mongodb.ConnectionString.<init>(ConnectionString.java:388)
com.mongodb.MongoClientURI.<init>(MongoClientURI.java:257)
org.apache.beam.sdk.io.mongodb.MongoDbIO$BoundedMongoDbSource.getEstimatedSizeBytes(MongoDbIO.java:440)
org.apache.beam.sdk.io.Read$BoundedSourceAsSDFWrapperFn.splitRestriction(Read.java:289)
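
If I’m reading the trace right, the failure happens before any data is read: Beam’s MongoDbIO hands the connection string to the legacy com.mongodb.MongoClientURI, whose SRV/TXT validation only permits the authsource and replicaset keys, while Serverless instances publish loadBalanced=true in their TXT record. Here’s a minimal sketch that reproduces the check outside Dataflow (assuming the MongoDB Java sync driver is on the classpath; the URI is a placeholder for my real connection string):

import com.mongodb.ConnectionString;
import com.mongodb.MongoConfigurationException;

public class SrvTxtCheck {
    public static void main(String[] args) {
        // Placeholder URI -- substitute your own Serverless connection string.
        String uri = "mongodb+srv://user:pass@starcade-prod-db.ljpzocf.mongodb.net/test";
        try {
            // The SRV and TXT lookups, including the TXT key validation that
            // fails above, happen inside this constructor.
            ConnectionString cs = new ConnectionString(uri);
            System.out.println("Parsed OK; hosts: " + cs.getHosts());
        } catch (MongoConfigurationException e) {
            // Driver versions that predate load balancer support reject the
            // loadBalanced key that Serverless instances publish via TXT.
            System.out.println("Rejected: " + e.getMessage());
        }
    }
}

As far as I can tell, recent Java driver versions (4.3+, which added load balancer support) parse this URI without complaint, so the problem seems to be the older driver bundled with the Beam connector.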

Here’s a picture of my Dataflow job’s graph: https://drive.google.com/file/d/1-tQxnG0lldi_Yq5a8loE-wpJmCPS3kea/view?usp=sharing

It would be nice not to have to switch to a dedicated cluster; we’re still a small startup, and the Serverless instance serves our purposes satisfactorily for now.

I have tried following https://cloud.google.com/blog/products/data-analytics/mongodb-atlas-and-bigquery-dataflow-templates, but GCP tells me I need a dedicated cluster in Atlas.
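
For reference, the template’s read stage boils down to something like this minimal sketch (assuming Beam’s MongoDbIO; database and collection names are placeholders). The URI passed to withUri is parsed during size estimation (getEstimatedSizeBytes in the trace above), which is where the job dies before reading a single document:

import org.apache.beam.sdk.Pipeline;
import org.apache.beam.sdk.io.mongodb.MongoDbIO;
import org.apache.beam.sdk.options.PipelineOptionsFactory;
import org.apache.beam.sdk.values.PCollection;
import org.bson.Document;

public class MongoReadSketch {
    public static void main(String[] args) {
        Pipeline p = Pipeline.create(PipelineOptionsFactory.fromArgs(args).create());
        PCollection<Document> docs = p.apply("ReadFromMongo",
            MongoDbIO.read()
                // Parsing this SRV URI is what throws; placeholder credentials.
                .withUri("mongodb+srv://user:pass@starcade-prod-db.ljpzocf.mongodb.net")
                .withDatabase("mydb")        // placeholder
                .withCollection("mycoll"));  // placeholder
        // ... transform Documents to TableRows and write with BigQueryIO ...
        p.run().waitUntilFinish();
    }
}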


Hi, we are taking a look and will get back to you soon.

Any update here? I’m getting the same error message.

GCP’s Dataflow is not currently supported with Serverless instances. We will let you know when that changes.