Query plan different for the same aggregation pipeline executed from Java vs in Compass

Simon_Burgess · September 22, 2023, 10:16pm

I use Java to create an aggregation pipeline. If I copy/paste that pipeline and run it in Compass, it executes quickly and the explain plan show that it’s using all the indexes I expect. When I run it in Java, it executes extremely slowly, and the query plan shows that it is NOT using the indexes that were used when in Compass. It’s definitely exactly the same pipeline in terms of its text representation, because I just copy/paste that from Java into Compass.

Why might this happen, and what can I do to ‘fix’ it in Java?

I’m running a local community edition v6.06, Compass 1.39.4, and Java sync version 4.10.2.

Here’s a snippet of the pipeline that seems to be behaving differently (forgive the field names, it’s all handled by the code and honestly there’s a good reason for it!)

{
    $lookup: {
      from: "dataInstances",
      localField: "fxLookup",
      foreignField:
        "compoundIndexValues.gbpFXLookup",
      as: "exchangeRates",
    },
  }

The explain on Compass shows me this for that bit:

{
      "$lookup": {
        "from": "dataInstances",
        "as": "exchangeRates",
        "localField": "fxLookup",
        "foreignField": "compoundIndexValues.gbpFXLookup"
      },
      "totalDocsExamined": 3631,
      "totalKeysExamined": 3631,
      "collectionScans": 0,
      "indexesUsed": [
        "schemaId_6f527503-ca52-4121-93ef-aa8b40537f1b_concatenatedFieldNames_gbpFXLookup"
      ],
      "nReturned": 4170,
      "executionTimeMillisEstimate": 194
    }

And the same pipeline executed via Java gives this in its explain:

documentAsMap = {LinkedHashMap@8342}  size = 7
 "$lookup" -> {Document@8252}  size = 4
 "totalDocsExamined" -> {Long@8254} 440902440
 "totalKeysExamined" -> {Long@8153} 0
 "collectionScans" -> {Long@8257} 4170
 "indexesUsed" -> {ArrayList@8259}  size = 0
 "nReturned" -> {Long@8261} 4170
 "executionTimeMillisEstimate" -> {Long@8263} 160575

(sorry, it’s just the copy of the debug stuff from IntelliJ, but hopefully you get the drift that it isn’t selecting to use the index)

Note that the ‘from’ collection is the same as the one the pipeline is running against. It has ~100k rows in it, so pretty tiny really. The pipeline IS quite large, with a few unions and projections up top, and 3 $lookups in total. One of those DOES use the index in both scenarios, but the other two don’t.

I can post more stuff but the pipeline is quite big and the field names are so complicated that it makes reading it difficult, so I’m hoping that somebody out there can help me to pin it down.

Simon_Burgess · September 24, 2023, 5:41pm

For what it’s worth, the issue turned out to be a ‘rogue’ comma at the end of a list definition in the JSON describing the pipeline. Compass is quite happy with this. Parsing the string to Bson docs and passing them to the MongoDB driver is not fine with it. Nothing complains, but behind the scenes there’s a null element of the array, which means the thing you think you’re matching is not what’s being matched.

Sergey_Petushkov · September 26, 2023, 6:33am

Hey @Simon_Burgess, Compass developer here! Trailing commas are valid in JavaScript and we can’t see any null values added to the pipeline in Compass when the pipeline is parsed from user input. When using “export to language” feature, it also seems to produce valid Java code with no null-s at the end:

We’d like to figure out what exactly might be going wrong here to decide how to address this. To help us with that can you share the whole pipeline string you got from Compass that was then parsed incorrectly? Can you also clarify a bit at what stage the null value is added to the pipeline? What are you using to convert pipeline from Compass pipeline editor to Java code?

Simon_Burgess · September 26, 2023, 2:42pm

Hi @Sergey_Petushkov,

Thanks for responding. I can confirm that Compass itself was doing everything I expected to do, and doing it well!

I’ve got some Kotlin code that parses a string into a list of Bson docs to be used as the pipeline:

gson.fromJson(pipelineText, Array<Document>::class.java).toList()

Pasting the contents of pipelineText into Compass works fine. However, the gson conversion doesn’t handle trailing commas well (depending on how you look at it), so my arrays get an extra null value in the Bson which completely throws off the MongoDB engine. That was the problem, not Compass.

Actually, out of interest, how do you handle converting to/from Bson/Text in Compass?

-Simon.

system · October 1, 2023, 2:42pm

This topic was automatically closed 5 days after the last reply. New replies are no longer allowed.