Hi all,
I am new to MongoDB, and I created a large collection of ~1 TB (I read the guide, which says there is no limit on the number of documents in a collection but suggests keeping the number of collections finite…). After a month of data acquisition I started working with it and ran into a huge problem: querying the data is extremely slow (it basically takes forever). The data is hosted on a Dockerized MongoDB 4.4 running on my NAS (4-core CPU), and I work with it from my M2 Mac using 6.0 client tools.
Now I have the following questions:
- I didn't create indexes when inserting these documents. Now I've tried to create an index on the populated collection from the mongosh command line on the Mac, but it complains that "standalones can't specify commitQuorum". I didn't find any replica or commitQuorum settings in my mongo.conf, so I don't know how to turn it off.
Alternatively, I tried MongoDB Compass, which does seem able to run the index creation, but it is extremely slow. I have streaming data being written into the collection every 4 hours; I am not sure if that is the cause, but after a long time the index build in Compass eventually failed.
- I am still writing new data to the collection: I first accumulate it on my Mac, then dump and restore it onto the NAS. In the future I would like to create the index before dumping and restoring. Can I restore an indexed collection into a populated, un-indexed collection?
- What is the proper way to create an index on this large collection running in a standalone instance?
- If the above is hard, is there an efficient way to split this large collection into multiple smaller collections by one of its categorical fields?
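To make the first question concrete, this is roughly what I ran in mongosh (database/collection/field names below are placeholders for my real ones), plus the raw `createIndexes` command I was considering instead, since I suspect the shell helper may be what adds `commitQuorum`:

```javascript
// What I ran in mongosh (names are placeholders):
db.readings.createIndex({ timestamp: 1 })
// -> fails with: standalones can't specify commitQuorum

// The raw command I was thinking of trying instead, in case the shell
// helper is what appends commitQuorum (not sure this is the right fix):
db.runCommand({
  createIndexes: "readings",
  indexes: [{ key: { timestamp: 1 }, name: "timestamp_1" }]
})
```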
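For context on the dump/restore question, my current workflow looks roughly like this (hosts, paths, and names are placeholders). My understanding is that mongorestore builds whatever indexes are listed in the dump's metadata.json after loading the data, and that `--noIndexRestore` would skip them, but I'm not sure how this behaves against an already-populated, un-indexed target collection:

```shell
# 1) dump the staged data on the Mac (names/paths are placeholders)
mongodump --db=sensors --collection=readings --out=/tmp/dump

# 2) restore into the NAS instance; indexes listed in the dump's
#    metadata.json should be built after the data is loaded
mongorestore --host=nas.local:27017 --db=sensors --collection=readings \
  /tmp/dump/sensors/readings.bson
```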
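For the splitting question, this is the kind of approach I had in mind, if it is at all sensible (field and collection names are placeholders; I realize `distinct()` on an un-indexed 1 TB collection may itself be very slow):

```javascript
// For each distinct value of a categorical field, copy the matching
// documents into their own collection via $out (placeholder names):
db.readings.distinct("category").forEach(function (value) {
  db.readings.aggregate([
    { $match: { category: value } },
    { $out: "readings_" + value }  // one smaller collection per category
  ]);
});
```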
Many thanks for any comments/suggestions!