I’m currently thinking about this: how do I know whether it’s ideal for use in my application?
I ask because my application works as follows:
I have several collections, each with at least 220 million documents. I use it as a pipeline that then processes all the other documents together into a single document, but I’m finding searches within it slow.
My second problem, which I don’t understand: how could I save historical data in MongoDB? What would be the best way?
If queries against your MongoDB collections are slow, you can try the following to improve performance:
Use indexes: Ensure that you have appropriate indexes on the fields being used in the queries. Indexes can significantly improve query performance by allowing MongoDB to quickly locate the relevant documents.
Query optimization: Review your queries and try to reduce the number of documents they process. For example, in an aggregation pipeline, place a $match stage as early as possible so that documents are filtered before the later stages run. This reduces the load on the server, and a $match at the start of a pipeline can use an index.
Use .explain(): Use the .explain() method to analyze the execution plan of a query and understand how MongoDB is executing it. This can help you identify areas for optimization.
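The three points above can be sketched together in Python with PyMongo. This is only a rough illustration, not a recommendation for your schema: the field names customer_id and created_at are hypothetical placeholders, and the collection argument is assumed to be a PyMongo collection. The helper inspects the winning plan returned by .explain() and reports whether it contains a COLLSCAN stage, which means MongoDB is scanning the whole collection instead of using an index.

```python
from typing import Any

# Hypothetical field names, used only for illustration.
INDEX_KEYS = [("customer_id", 1), ("created_at", -1)]


def build_pipeline(customer_id: str, limit: int = 100) -> list[dict[str, Any]]:
    """Filter first, then sort and limit, so later stages see fewer documents."""
    return [
        {"$match": {"customer_id": customer_id}},  # first stage, so it can use the index
        {"$sort": {"created_at": -1}},
        {"$limit": limit},
    ]


def plan_stages(plan: Any) -> list[str]:
    """Collect every "stage" name found anywhere inside a (nested) query plan."""
    stages: list[str] = []
    if isinstance(plan, dict):
        for key, value in plan.items():
            if key == "stage" and isinstance(value, str):
                stages.append(value)
            else:
                stages.extend(plan_stages(value))
    elif isinstance(plan, list):
        for item in plan:
            stages.extend(plan_stages(item))
    return stages


def uses_collection_scan(explain_output: dict[str, Any]) -> bool:
    """True if the winning plan contains a COLLSCAN stage (no index was used)."""
    winning = explain_output.get("queryPlanner", {}).get("winningPlan", {})
    return "COLLSCAN" in plan_stages(winning)


def check_query(collection: Any, customer_id: str) -> bool:
    """Create the index, explain a find() on the indexed field, report full scans."""
    # Note: building an index on a collection with hundreds of millions of
    # documents can take a long time.
    collection.create_index(INDEX_KEYS)
    plan = collection.find({"customer_id": customer_id}).explain()
    return uses_collection_scan(plan)
```

With a real connection, check_query(db["orders"], "c1") would return False once the index is in use (the collection name orders is also just a placeholder), and collection.aggregate(build_pipeline("c1")) would run the filtered pipeline.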
Keep in mind that the actual improvement will depend on the specifics of your data, queries, and hardware. If you can share some sample documents and the queries you’re running, we can suggest a more definitive approach. Also, could you please specify which pipeline you’re referring to?
For storing historical data, two options are worth considering:
Time-series data: Time series collections store time series data efficiently. In a time series collection, writes are organized so that data from the same source is stored alongside other data points from a similar point in time. You can read more in the documentation: Time Series Collection
Archive data: Consider moving historical data to a separate collection or database, so that the collection you query most often stays smaller.
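As a rough sketch of both options in Python with PyMongo: the field names timestamp and source, the collection name events_archive, and the 365-day cutoff are all assumptions made for illustration. The archive step assumes a regular (non-time-series) source collection. It copies old documents into the archive with a $merge stage and only then deletes them from the source.

```python
from datetime import datetime, timedelta, timezone
from typing import Any, Optional


def timeseries_options(time_field: str = "timestamp", meta_field: str = "source") -> dict[str, str]:
    """Options to pass as db.create_collection(name, timeseries=...)."""
    return {"timeField": time_field, "metaField": meta_field, "granularity": "hours"}


def archive_pipeline(
    cutoff: datetime,
    archive_name: str = "events_archive",
    time_field: str = "timestamp",
) -> list[dict[str, Any]]:
    """Select documents older than the cutoff and write them to the archive collection."""
    return [
        {"$match": {time_field: {"$lt": cutoff}}},
        {"$merge": {
            "into": archive_name,
            "whenMatched": "keepExisting",  # safe to re-run after a partial failure
            "whenNotMatched": "insert",
        }},
    ]


def archive_old_documents(
    collection: Any,
    days: int = 365,
    now: Optional[datetime] = None,
    time_field: str = "timestamp",
) -> int:
    """Copy documents older than `days` to the archive, then delete them from the source."""
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=days)
    collection.aggregate(archive_pipeline(cutoff, time_field=time_field))
    # Delete only after the copy has completed, using the same cutoff.
    result = collection.delete_many({time_field: {"$lt": cutoff}})
    return result.deleted_count


# With a real database handle (time series collections need MongoDB 5.0+):
#   db.create_collection("readings", timeseries=timeseries_options())
#   archive_old_documents(db["events"], days=365)
```

On collections of your size, a single delete_many over a large range can be heavy, so in practice you would probably archive in smaller time windows.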
Please note that all the points listed above are general in nature. The best approach will depend on your specific requirements, your queries, your documents, and the amount of data being queried.
Hoping this helps. Please feel free to reach out for anything else as well.