A query that compares timestamp, would run slower with the timestamp created as Index?

Hi all,

I am new to MongoDB. I’ve been trying to optimise my database schema, and queries. Reading on how to what to create index on.

In this collection, I would have large number of documents like this (currently at 1.5m+):

{
    "_id": "QQoMlgFhfoAGuVYsVcGCGhWBZMpNBfjOTbp",
    "cost": 0.0103,
    "url": "https://content-delivery.pro/in.php?tcid=f0e2590aae1cc442458eb30a7495554eb91ab2b6",
    "traffic": "XXAds",
    "campaigns": "123456 FFF",
    "browser": "Safari",
    "c1": 52029,
    "geo": "CA",
    "log_time": {
        "$date": {
            "$numberLong": "1654319236000"
        }
    }
}

A typical query would be an aggregate, that would match one of the attributes or compare timestamp, and create sums of records/payout/cost etc, like this:

[
  {
    '$match': {
      'log_time': {
        '$lt': new Date('Sat, 04 Jun 2022 04:27:59 GMT'), 
        '$gte': new Date('Sat, 28 May 2022 04:27:59 GMT')
      }
    }
  }, {
    '$group': {
      '_id': {
        '$dateToString': {
          'format': '%Y-%m-%d', 
          'date': '$log_time'
        }
      }, 
      'totalPayout': {
        '$sum': '$payout'
      }, 
      'totalClicks': {
        '$sum': '$click'
      }, 
      'totalConversions': {
        '$sum': '$conversions'
      }, 
      'totalCost': {
        '$sum': '$cost'
      }, 
      'totalVisitor': {
        '$sum': 1
      }
    }
  }, {
    '$sort': {
      '_id': 1
    }
  }
]

I have created indexes on “campaign”, and when I ran a query that compared campaign first before grouping, it didn’t make a huge difference. Then I created an index on log_time(timestamp), and the query would actually take much longer (from 11s → 21s). I’m trying to get the query time down, as 11s is a bit long for my application, and the number of records are just going to grow from here.

Running explain on this query without index:

{
  "explainVersion": "1",
  "stages": [
    {
      "$cursor": {
        "queryPlanner": {
          "namespace": "click_data.clicks",
          "indexFilterSet": false,
          "parsedQuery": {
            "$and": [
              {
                "log_time": {
                  "$lt": {
                    "$date": {
                      "$numberLong": "1654317033667"
                    }
                  }
                }
              },
              {
                "log_time": {
                  "$gte": {
                    "$date": {
                      "$numberLong": "1653712233667"
                    }
                  }
                }
              }
            ]
          },
          "queryHash": "D218AE91",
          "planCacheKey": "6A370F76",
          "maxIndexedOrSolutionsReached": false,
          "maxIndexedAndSolutionsReached": false,
          "maxScansToExplodeReached": false,
          "winningPlan": {
            "stage": "PROJECTION_SIMPLE",
            "transformBy": {
              "click": 1,
              "conversions": 1,
              "cost": 1,
              "log_time": 1,
              "payout": 1,
              "_id": 0
            },
            "inputStage": {
              "stage": "COLLSCAN",
              "filter": {
                "$and": [
                  {
                    "log_time": {
                      "$lt": {
                        "$date": {
                          "$numberLong": "1654317033667"
                        }
                      }
                    }
                  },
                  {
                    "log_time": {
                      "$gte": {
                        "$date": {
                          "$numberLong": "1653712233667"
                        }
                      }
                    }
                  }
                ]
              },
              "direction": "forward"
            }
          },
          "rejectedPlans": []
        },
        "executionStats": {
          "executionSuccess": true,
          "nReturned": 1448441,
          "executionTimeMillis": 4634,
          "totalKeysExamined": 0,
          "totalDocsExamined": 1512268,
          "executionStages": {
            "stage": "PROJECTION_SIMPLE",
            "nReturned": 1448441,
            "executionTimeMillisEstimate": 292,
            "works": 1512270,
            "advanced": 1448441,
            "needTime": 63828,
            "needYield": 0,
            "saveState": 1599,
            "restoreState": 1599,
            "isEOF": 1,
            "transformBy": {
              "click": 1,
              "conversions": 1,
              "cost": 1,
              "log_time": 1,
              "payout": 1,
              "_id": 0
            },
            "inputStage": {
              "stage": "COLLSCAN",
              "filter": {
                "$and": [
                  {
                    "log_time": {
                      "$lt": {
                        "$date": {
                          "$numberLong": "1654317033667"
                        }
                      }
                    }
                  },
                  {
                    "log_time": {
                      "$gte": {
                        "$date": {
                          "$numberLong": "1653712233667"
                        }
                      }
                    }
                  }
                ]
              },
              "nReturned": 1448441,
              "executionTimeMillisEstimate": 197,
              "works": 1512270,
              "advanced": 1448441,
              "needTime": 63828,
              "needYield": 0,
              "saveState": 1599,
              "restoreState": 1599,
              "isEOF": 1,
              "direction": "forward",
              "docsExamined": 1512268
            }
          },
          "allPlansExecution": []
        }
      },
      "nReturned": 1448441,
      "executionTimeMillisEstimate": 2636
    },
    {
      "$group": {
        "_id": {
          "$dateToString": {
            "date": "$log_time",
            "format": {
              "$const": "%Y-%m-%d"
            }
          }
        },
        "totalPayout": {
          "$sum": "$payout"
        },
        "totalClicks": {
          "$sum": "$click"
        },
        "totalConversions": {
          "$sum": "$conversions"
        },
        "totalCost": {
          "$sum": "$cost"
        },
        "totalVisitor": {
          "$sum": {
            "$const": 1
          }
        }
      },
      "maxAccumulatorMemoryUsageBytes": {
        "totalPayout": 504,
        "totalClicks": 504,
        "totalConversions": 504,
        "totalCost": 504,
        "totalVisitor": 504
      },
      "totalOutputDataSizeBytes": 4291,
      "usedDisk": false,
      "nReturned": 7,
      "executionTimeMillisEstimate": 4630
    },
    {
      "$sort": {
        "sortKey": {
          "_id": 1
        }
      },
      "totalDataSizeSortedBytesEstimate": 4403,
      "usedDisk": false,
      "nReturned": 7,
      "executionTimeMillisEstimate": 4630
    }
  ],
  "serverInfo": {
    "host": "ubuntu-s-2vcpu-2gb-sgp1-01",
    "port": 27017,
    "version": "5.0.9",
    "gitVersion": "6f7dae919422dcd7f4892c10ff20cdc721ad00e6"
  },
  "serverParameters": {
    "internalQueryFacetBufferSizeBytes": 104857600,
    "internalQueryFacetMaxOutputDocSizeBytes": 104857600,
    "internalLookupStageIntermediateDocumentMaxSizeBytes": 104857600,
    "internalDocumentSourceGroupMaxMemoryBytes": 104857600,
    "internalQueryMaxBlockingSortMemoryUsageBytes": 104857600,
    "internalQueryProhibitBlockingMergeOnMongoS": 0,
    "internalQueryMaxAddToSetBytes": 104857600,
    "internalDocumentSourceSetWindowFieldsMaxMemoryBytes": 104857600
  },
  "command": {
    "aggregate": "clicks",
    "pipeline": [
      {
        "$match": {
          "log_time": {
            "$lt": {
              "$date": {
                "$numberLong": "1654317033667"
              }
            },
            "$gte": {
              "$date": {
                "$numberLong": "1653712233667"
              }
            }
          }
        }
      },
      {
        "$group": {
          "_id": {
            "$dateToString": {
              "format": "%Y-%m-%d",
              "date": "$log_time"
            }
          },
          "totalPayout": {
            "$sum": "$payout"
          },
          "totalClicks": {
            "$sum": "$click"
          },
          "totalConversions": {
            "$sum": "$conversions"
          },
          "totalCost": {
            "$sum": "$cost"
          },
          "totalVisitor": {
            "$sum": 1
          }
        }
      },
      {
        "$sort": {
          "_id": 1
        }
      }
    ],
    "allowDiskUse": true,
    "cursor": {},
    "maxTimeMS": 60000,
    "$db": "click_data"
  },
  "ok": 1
}

With log_time as index (21s):

{
  "explainVersion": "1",
  "stages": [
    {
      "$cursor": {
        "queryPlanner": {
          "namespace": "click_data.clicks",
          "indexFilterSet": false,
          "parsedQuery": {
            "$and": [
              {
                "log_time": {
                  "$lt": {
                    "$date": {
                      "$numberLong": "1654317109655"
                    }
                  }
                }
              },
              {
                "log_time": {
                  "$gte": {
                    "$date": {
                      "$numberLong": "1653712309655"
                    }
                  }
                }
              }
            ]
          },
          "queryHash": "D218AE91",
          "planCacheKey": "3A53374A",
          "maxIndexedOrSolutionsReached": false,
          "maxIndexedAndSolutionsReached": false,
          "maxScansToExplodeReached": false,
          "winningPlan": {
            "stage": "PROJECTION_SIMPLE",
            "transformBy": {
              "click": 1,
              "conversions": 1,
              "cost": 1,
              "log_time": 1,
              "payout": 1,
              "_id": 0
            },
            "inputStage": {
              "stage": "FETCH",
              "inputStage": {
                "stage": "IXSCAN",
                "keyPattern": {
                  "log_time": 1
                },
                "indexName": "Timestamp",
                "isMultiKey": false,
                "multiKeyPaths": {
                  "log_time": []
                },
                "isUnique": false,
                "isSparse": false,
                "isPartial": false,
                "indexVersion": 2,
                "direction": "forward",
                "indexBounds": {
                  "log_time": [
                    "[new Date(1653712309655), new Date(1654317109655))"
                  ]
                }
              }
            }
          },
          "rejectedPlans": []
        },
        "executionStats": {
          "executionSuccess": true,
          "nReturned": 1448472,
          "executionTimeMillis": 7844,
          "totalKeysExamined": 1448472,
          "totalDocsExamined": 1448472,
          "executionStages": {
            "stage": "PROJECTION_SIMPLE",
            "nReturned": 1448472,
            "executionTimeMillisEstimate": 1436,
            "works": 1448473,
            "advanced": 1448472,
            "needTime": 0,
            "needYield": 0,
            "saveState": 1536,
            "restoreState": 1536,
            "isEOF": 1,
            "transformBy": {
              "click": 1,
              "conversions": 1,
              "cost": 1,
              "log_time": 1,
              "payout": 1,
              "_id": 0
            },
            "inputStage": {
              "stage": "FETCH",
              "nReturned": 1448472,
              "executionTimeMillisEstimate": 1221,
              "works": 1448473,
              "advanced": 1448472,
              "needTime": 0,
              "needYield": 0,
              "saveState": 1536,
              "restoreState": 1536,
              "isEOF": 1,
              "docsExamined": 1448472,
              "alreadyHasObj": 0,
              "inputStage": {
                "stage": "IXSCAN",
                "nReturned": 1448472,
                "executionTimeMillisEstimate": 314,
                "works": 1448473,
                "advanced": 1448472,
                "needTime": 0,
                "needYield": 0,
                "saveState": 1536,
                "restoreState": 1536,
                "isEOF": 1,
                "keyPattern": {
                  "log_time": 1
                },
                "indexName": "Timestamp",
                "isMultiKey": false,
                "multiKeyPaths": {
                  "log_time": []
                },
                "isUnique": false,
                "isSparse": false,
                "isPartial": false,
                "indexVersion": 2,
                "direction": "forward",
                "indexBounds": {
                  "log_time": [
                    "[new Date(1653712309655), new Date(1654317109655))"
                  ]
                },
                "keysExamined": 1448472,
                "seeks": 1,
                "dupsTested": 0,
                "dupsDropped": 0
              }
            }
          },
          "allPlansExecution": []
        }
      },
      "nReturned": 1448472,
      "executionTimeMillisEstimate": 5734
    },
    {
      "$group": {
        "_id": {
          "$dateToString": {
            "date": "$log_time",
            "format": {
              "$const": "%Y-%m-%d"
            }
          }
        },
        "totalPayout": {
          "$sum": "$payout"
        },
        "totalClicks": {
          "$sum": "$click"
        },
        "totalConversions": {
          "$sum": "$conversions"
        },
        "totalCost": {
          "$sum": "$cost"
        },
        "totalVisitor": {
          "$sum": {
            "$const": 1
          }
        }
      },
      "maxAccumulatorMemoryUsageBytes": {
        "totalPayout": 504,
        "totalClicks": 504,
        "totalConversions": 504,
        "totalCost": 504,
        "totalVisitor": 504
      },
      "totalOutputDataSizeBytes": 4291,
      "usedDisk": false,
      "nReturned": 7,
      "executionTimeMillisEstimate": 7842
    },
    {
      "$sort": {
        "sortKey": {
          "_id": 1
        }
      },
      "totalDataSizeSortedBytesEstimate": 4403,
      "usedDisk": false,
      "nReturned": 7,
      "executionTimeMillisEstimate": 7842
    }
  ],
  "serverInfo": {
    "host": "ubuntu-s-2vcpu-2gb-sgp1-01",
    "port": 27017,
    "version": "5.0.9",
    "gitVersion": "6f7dae919422dcd7f4892c10ff20cdc721ad00e6"
  },
  "serverParameters": {
    "internalQueryFacetBufferSizeBytes": 104857600,
    "internalQueryFacetMaxOutputDocSizeBytes": 104857600,
    "internalLookupStageIntermediateDocumentMaxSizeBytes": 104857600,
    "internalDocumentSourceGroupMaxMemoryBytes": 104857600,
    "internalQueryMaxBlockingSortMemoryUsageBytes": 104857600,
    "internalQueryProhibitBlockingMergeOnMongoS": 0,
    "internalQueryMaxAddToSetBytes": 104857600,
    "internalDocumentSourceSetWindowFieldsMaxMemoryBytes": 104857600
  },
  "command": {
    "aggregate": "clicks",
    "pipeline": [
      {
        "$match": {
          "log_time": {
            "$lt": {
              "$date": {
                "$numberLong": "1654317109655"
              }
            },
            "$gte": {
              "$date": {
                "$numberLong": "1653712309655"
              }
            }
          }
        }
      },
      {
        "$group": {
          "_id": {
            "$dateToString": {
              "format": "%Y-%m-%d",
              "date": "$log_time"
            }
          },
          "totalPayout": {
            "$sum": "$payout"
          },
          "totalClicks": {
            "$sum": "$click"
          },
          "totalConversions": {
            "$sum": "$conversions"
          },
          "totalCost": {
            "$sum": "$cost"
          },
          "totalVisitor": {
            "$sum": 1
          }
        }
      },
      {
        "$sort": {
          "_id": 1
        }
      }
    ],
    "allowDiskUse": true,
    "cursor": {},
    "maxTimeMS": 60000,
    "$db": "click_data"
  },
  "ok": 1
}``` 


Would love some help here!!!  

Thanks a lot in advance,

KK

This went to the code block, and I couldn’t find the edit post button…

You have an IXSCAN and it looks adequate.

You have something else slowing the overall system. The same collection with an extra index might require more RAM to cache both the index and the data. So if your system is low in resource, which seems to be the case with “host”: “ubuntu-s-2vcpu-2gb-sgp1-01” , the extra disk I/O eats up all the performance.

Also your date selection is not really selective since nReturned:1448441 is almost the whole collection which is

You also do $sum of fields that are not present (except cost) in the sample document.

1 Like

Thanks for your reply.

I had resized the VM from 2GB 2CPU to 4CPU 8GB prior to running those test. I don’t think it was swapping but I will confirm. Afterall the whole database was under 100MB.
I’ll report in if it wasn’t swapping.

Only some of the documents has the field in $sum, would it be better if they all had 0?

So I guess, to get better performance I will need to change the schema / database strategy.

THanks!

I don’t think so. A non-existing field take a lot less space than a field with 0. I just mentioned it in case the documents have been redacted to hide some details.

  1. Your string _id takes a lot more space than an ObjectId
  2. Some fields are not used in this specific use-case, so they could be move into an accessory collection

But before changing anything, you have to make sure that you do not have another issue., like write traffic occurring at the same time?

What are the total databases size?
What are the total indexes size?

1 Like

I had loaded the database with test data. Just to confirm speed/usability and how much space I need to provide for each collection.

So no other queries was going on at the same time.

I was mistaken earlier, total database size for 1.5M record was about 150MB. Index size about 15Mb.

I’d redone some more tests, it seems if I use index on a smaller portion of the documents, it will benefit, but if I’m doing queries that involves 80%+ of the documents, it will actually be slower.

Makes sense.

2 Likes