I have an application written in Node with Electron. Each client installs it on their own computer alongside a MongoDB database. Each database has 3 collections (coll_A, coll_B, and coll_C).
So, if we have 20 clients, we’ll have 20 different databases. We are building a dashboard that will show metrics about all clients. The dashboard will read from a central remote MongoDB database, hosted in the cloud, for example.
I need infrastructure or some feature in MongoDB that will allow me to merge all documents from the coll_A of all clients into the coll_A located in the central database. It’s important to note that the dataflow is unidirectional (n clients → remote server)
Can I solve this problem with sharding or replica sets? As far as I understand from the docs, it is not possible, unless I am missing something.
If there is no such tool, the best way would be to write an algorithm to merge from clients, right?
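To make the "write an algorithm to merge" option concrete, here is a minimal sketch of one way it could look. It is only an illustration under stated assumptions: `buildUpsertOps` and `clientId` are hypothetical names, and the idea of storing a composite `_id` (client label plus original `_id`) in the central coll_A is an assumption made here so that documents from different clients cannot collide.

```javascript
// Hypothetical helper: turn one client's coll_A documents into bulkWrite
// operations for the central coll_A. `clientId` is an assumed unique label
// for each installation; it is not something MongoDB provides.
function buildUpsertOps(clientId, docs) {
  return docs.map((doc) => {
    const { _id, ...fields } = doc;
    // Composite _id so documents from different clients never collide.
    const centralId = { clientId, sourceId: _id };
    return {
      replaceOne: {
        filter: { _id: centralId },
        replacement: { _id: centralId, ...fields },
        upsert: true, // insert if missing, overwrite if already merged
      },
    };
  });
}

// Example with in-memory data (no database connection needed).
const ops = buildUpsertOps('client-07', [
  { _id: 1, metric: 'logins', value: 42 },
  { _id: 2, metric: 'errors', value: 3 },
]);
console.log(JSON.stringify(ops[0], null, 2));
```

With the Node.js driver, the resulting array could then be passed to `bulkWrite` on the central collection, for example `centralDb.collection('coll_A').bulkWrite(ops, { ordered: false })`, where `centralDb` is an assumed, already connected database handle. Because every operation is an upsert, re-sending the same documents should be harmless.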
You have clients with separate MongoDB installations
You want metrics from each client’s MongoDB database to be visible from a central dashboard that you control (and no control from any client)
Is this accurate? You also mention:
I need infrastructure or some feature in MongoDB that will allow me to merge all documents from the coll_A of all clients into the coll_A located in the central database.
Does this mean that you want a copy of all clients’ data in the central server?
If this is the case, then I don’t think any built-in MongoDB feature is suitable for the use case.
I’m wondering, if having a central server is the main use case, why do you need to install a local MongoDB on the client at all? Is it possible to just use a single centralized database and have all clients connect to it instead?
Does this mean that you want a copy of all clients’ data in the central server?
Yes, this is what we want.
I’m wondering, if having a central server is the main use case, why do you need to install a local MongoDB on the client at all? Is it possible to just use a single centralized database and have all clients connect to it instead?
Unfortunately, each client needs its own local database. I agree that a single centralized database would be the best scenario.
I have almost achieved my goal with Debezium and the Kafka MongoDB Sink Connector. With Debezium, I have a CDC pipeline, and Kafka streams all my data to the global database.
For now, my setup only fails for update operations. I am going to create a new topic with all the information regarding this issue.
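For anyone following along, here is a rough sketch of how change events, including updates, could be mapped to writes on the central collection. It assumes the event shape of native MongoDB change streams (`operationType`, `documentKey`, `fullDocument`, `updateDescription`), not the Debezium envelope, whose format differs and should be checked in its own documentation. `toCentralOp` and `clientId` are hypothetical names, and the composite `_id` is an assumption of this sketch.

```javascript
// Hypothetical mapper: one change-stream-style event -> one bulkWrite
// operation for the central collection (or null if nothing to apply).
function toCentralOp(clientId, event) {
  const centralId = { clientId, sourceId: event.documentKey._id };
  const filter = { _id: centralId };

  switch (event.operationType) {
    case 'insert':
    case 'replace': {
      const { _id, ...fields } = event.fullDocument;
      return {
        replaceOne: { filter, replacement: { _id: centralId, ...fields }, upsert: true },
      };
    }
    case 'update': {
      const { updatedFields = {}, removedFields = [] } = event.updateDescription;
      const update = {};
      // Only add operators that have something to do.
      if (Object.keys(updatedFields).length > 0) update.$set = updatedFields;
      if (removedFields.length > 0) {
        update.$unset = Object.fromEntries(removedFields.map((f) => [f, '']));
      }
      if (Object.keys(update).length === 0) return null;
      return { updateOne: { filter, update, upsert: true } };
    }
    case 'delete':
      return { deleteOne: { filter } };
    default:
      return null; // other event types are ignored in this sketch
  }
}

// Example: an update that changes one field and removes another.
const op = toCentralOp('client-07', {
  operationType: 'update',
  documentKey: { _id: 1 },
  updateDescription: { updatedFields: { value: 43 }, removedFields: ['stale'] },
});
console.log(JSON.stringify(op));
```

The sketch deliberately ignores array truncation details in `updateDescription`; if partial updates prove fragile, replacing the whole document on every update event is a simpler alternative, provided the pipeline delivers the full document.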