With the first function, I query the metadata database for a list of operations to work with and save that list to a file for later use. Here $1 equals “opi”, short for “operation identifier”.
With the second function, I get a list of “archiving units” from the metadata database; they are a subdivision of the aforementioned “opi”. In that context, $1 equals “id”, i.e. the archiving unit identifier. At first I wanted a complete list of archiving units, similar to my list of OPIs, but then I realized I didn’t need the list itself: a simple total count was sufficient.
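Since the original functions aren’t shown, here is a hedged sketch of what those two lookups might look like using pymongo-style calls. The field names (“opi”, “id”) come from the description above, but the collection shape is an assumption; a tiny in-memory stand-in replaces the real collection so the example is self-contained.

```python
class FakeCollection:
    """Minimal stand-in mimicking the two pymongo-style calls used below.
    A real implementation would use an actual pymongo Collection."""

    def __init__(self, docs):
        self.docs = docs

    def distinct(self, field):
        # Deduplicated values of one field across the collection.
        return sorted({d[field] for d in self.docs})

    def count_documents(self, query):
        # Number of documents matching a simple equality query.
        return sum(all(d.get(k) == v for k, v in query.items())
                   for d in self.docs)


# Hypothetical sample data: archiving units grouped under OPIs.
coll = FakeCollection([
    {"opi": "OPI-1", "id": "AU-1"},
    {"opi": "OPI-1", "id": "AU-2"},
    {"opi": "OPI-2", "id": "AU-3"},
])

# First function: the list of operation identifiers, saved for later use.
opis = coll.distinct("opi")          # ['OPI-1', 'OPI-2']

# Second function: a per-OPI total of archiving units. A count is enough
# here, so there is no need to materialize the full list of units.
unit_count = coll.count_documents({"opi": "OPI-1"})  # 2
```

The second call illustrates the simplification described above: asking the database for a count directly, rather than fetching every unit and counting client-side.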
These two functions work great on small databases, but they don’t scale up well when millions of documents (~7 million) are involved.
If the queries are fast for small amounts of data but struggle with larger amounts, please make sure that all your queries are backed by indexes. To investigate the observed slowness further, could you share the following details:
- The queries you are using.
- Any error messages observed when executing the queries against the larger dataset.
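To see why an index matters here, the usual check in MongoDB is `db.collection.createIndex({ opi: 1 })` followed by `explain("executionStats")` on the query, looking for IXSCAN rather than COLLSCAN. The toy model below (not MongoDB itself, just an illustration) shows the difference between the two access patterns: an unindexed query scans every document, while an indexed one jumps straight to the matching subset.

```python
from collections import defaultdict

# Hypothetical collection: 10,000 documents spread across 100 OPIs.
docs = [{"opi": f"OPI-{i % 100}", "id": i} for i in range(10_000)]

# Without an index: every query walks all 10,000 documents,
# analogous to a COLLSCAN.
scan_hits = [d for d in docs if d["opi"] == "OPI-7"]

# With an index: one pass builds the lookup structure, after which each
# query is a direct jump to the matching documents, analogous to an IXSCAN.
index = defaultdict(list)
for d in docs:
    index[d["opi"]].append(d)
indexed_hits = index["OPI-7"]

# Both approaches return the same documents; only the work done differs.
assert scan_hits == indexed_hits
```

The index costs one build pass and some memory, but every subsequent query avoids the full scan, which is exactly the trade-off that stops working implicitly once the collection reaches millions of documents.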
As @Tarun_Gaur mentioned, further details would be useful in order to provide relevant advice.
However, I noticed your aggregation queries use $group to count all documents, so the amount of work will scale with the size of the collection.
Maintain a count in your application logic when documents are inserted or deleted so you don’t have to calculate the total dynamically.
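A minimal sketch of that pattern, assuming a wrapper in your application owns all writes (the class and method names here are illustrative, not from the original functions): the running total is adjusted on each insert or delete, so reading the count is O(1) instead of a $group over the whole collection.

```python
class CountedCollection:
    """Toy wrapper that keeps a running document count alongside the data,
    so the total never has to be recomputed by scanning the collection."""

    def __init__(self):
        self.docs = {}
        self.total = 0

    def insert(self, doc_id, doc):
        # Only bump the counter for genuinely new documents.
        if doc_id not in self.docs:
            self.total += 1
        self.docs[doc_id] = doc

    def delete(self, doc_id):
        # Only decrement when something was actually removed.
        if self.docs.pop(doc_id, None) is not None:
            self.total -= 1


coll = CountedCollection()
coll.insert("a", {"opi": "OPI-1"})
coll.insert("b", {"opi": "OPI-2"})
coll.delete("a")
print(coll.total)  # 1
```

The same idea applies with a real driver: perform the write and the counter update together (e.g. in the same request path or transaction) so the stored count stays consistent with the data.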
If you need a count of all documents in a collection and speed is more important than accuracy, use estimatedDocumentCount(), which reads the collection’s metadata instead of scanning documents.
If you have other functions that return larger result sets, consider implementing them with an officially supported MongoDB driver instead of piping output through the mongo shell. You will likely get better performance and can do any extra transformations within a single implementation instead of piping through other tools like cut or tr.