Accidentally overwrote collection

Ok, I think I messed up royally.

I have (or rather, I had…) a collection with 50M documents. I have another collection with only 20K documents that I wanted to merge into the bigger one.

Hence, on the smaller collection I ran an aggregation pipeline consisting of a $out stage targeting the bigger collection.

I read the warning text about overwriting the documents in the collection, but apparently it didn’t say that…it said it will overwrite THE collection!!!

End result: 50M+ documents gone.

This is a 3-member replica set, for which I have a 3-week-old mongodump backup (with all 40+ collections).

How screwed am I? Is there any way to get back my collection?

If you’re running on Atlas you may have an automatic backup. Failing that, if you have a very large oplog you could look at replaying it on top of the backup, but that’s a massive stretch: the oplog would have to reach all the way back to when the dump was taken.
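To see whether the oplog route is even on the table, you can compare the oldest oplog entry with the time of the backup. A rough sketch in Python, assuming a pymongo-style collection handle for `local.oplog.rs`; the function name and the `backup_epoch_seconds` parameter are made up for illustration:

```python
def oplog_covers_backup(oplog, backup_epoch_seconds):
    """Return True if the oldest oplog entry predates the backup.

    `oplog` is assumed to be a pymongo-style collection handle for
    local.oplog.rs, e.g. MongoClient(uri)["local"]["oplog.rs"].
    """
    # Ascending $natural order yields the oldest entry still in the capped oplog.
    oldest = oplog.find_one(sort=[("$natural", 1)])
    if oldest is None:
        return False  # empty oplog: nothing to replay
    # A BSON Timestamp exposes .time as seconds since the Unix epoch.
    return oldest["ts"].time <= backup_epoch_seconds
```

If this returns False, the oplog no longer contains everything written since the dump, so a replay would leave gaps.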

I think it could be a painful lesson.

Lesson learned: before using anything, check the docs first. They say the collection will be replaced.
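For anyone finding this later: the stage that adds documents to an existing collection, and only touches documents with a matching `_id`, is `$merge` (MongoDB 4.2+), not `$out`. A minimal sketch in Python, assuming a pymongo-style collection object; the helper names and the collection names in the usage note are hypothetical:

```python
def build_merge_pipeline(target_collection, when_matched="replace"):
    """Pipeline that upserts every source document into `target_collection`."""
    return [
        {
            "$merge": {
                "into": target_collection,
                "on": "_id",
                # "replace" overwrites a target document with the same _id;
                # "keepExisting" would leave the target document untouched.
                "whenMatched": when_matched,
                "whenNotMatched": "insert",
            }
        }
    ]


def merge_into(source, target_collection):
    """Run the merge on `source`, assumed to be a pymongo-style collection."""
    pipeline = build_merge_pipeline(target_collection)
    # $merge writes to the target and returns no documents, so this is empty.
    return list(source.aggregate(pipeline))
```

With a real connection this would be called as something like `merge_into(db["small"], "big")`. Unlike `$out`, the other documents in the target collection stay where they are.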

I did read the docs first, but misread this as: any document with the same id will be overwritten…

Anyway, I managed to use a mongorestore incantation on my archive to get back most data (still running though, +13GB).

On restore in a replica set, where do you restore to? Always the primary? I’ve given the uri with all three nodes to the mongorestore cmd.
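As far as I know, giving mongorestore a replica set connection string is the right approach: it locates the primary and writes there, and the secondaries pick the data up through normal replication. Below is a small sketch in Python that assembles such a command for a single collection. The URI, archive path and namespace in the usage note are placeholders, and the helper name is made up; the flags (`--uri`, `--archive`, `--nsInclude`, `--dryRun`) are standard mongorestore options.

```python
def build_restore_command(uri, archive_path, namespace, dry_run=True):
    """Build a mongorestore argument list for restoring one namespace."""
    args = [
        "mongorestore",
        f"--uri={uri}",              # replica set URI listing the members
        f"--archive={archive_path}",  # archive file produced by mongodump
        f"--nsInclude={namespace}",   # restore only this db.collection
    ]
    if dry_run:
        # Report what would be restored without importing any data.
        args.append("--dryRun")
    return args
```

For example, `build_restore_command("mongodb://host1,host2,host3/?replicaSet=rs0", "backup.archive", "mydb.bigcoll")` returns a list that can be handed to `subprocess.run(args, check=True)`. Starting with the dry run and then switching `dry_run=False` is just a cautious default I picked for the sketch.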

A very painful lesson indeed.

It may be worth setting up a second environment for testing. This could be a simpler setup than the main one, e.g. a single-node replica set.
Obviously, if you want a real “pre live” testing environment it should mirror the prod/live environment closely, but for a plain testing environment it can be a cheaper setup.
You could then automate a process to restore this testing environment from the real one.

We have multiple environments for different kinds of testing (dev/sit/uat and prod), which is probably overkill for most people but needed for various regulations and auditing. It does mean the developers can happily destroy the dev environment with little effect other than getting the cluster refreshed by the DBA team.

Also regular backups 🙂

Sorry for the loss of data, but it’s time to look at putting prevention in place. Also, TEST YOUR BACKUP STRATEGY! If you’ve not tested a restore then the backups should be viewed as useless (this goes for any type of backup really).
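One cheap way to test a restore is to load the backup into a scratch instance and compare document counts against the source. A minimal sketch in Python, assuming pymongo-style database objects (anything where `db[name].count_documents({})` works); the function name is made up, and matching counts are only a sanity check, not proof that the contents are identical:

```python
def compare_counts(source_db, restored_db, collection_names):
    """Return {name: (source_count, restored_count)} for collections that differ."""
    mismatches = {}
    for name in collection_names:
        # count_documents({}) scans the collection; on very large collections
        # estimated_document_count() is faster but only approximate.
        src = source_db[name].count_documents({})
        dst = restored_db[name].count_documents({})
        if src != dst:
            mismatches[name] = (src, dst)
    return mismatches
```

An empty result means every listed collection has the same number of documents on both sides.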

I’ve not done much with backup strategies myself (lucky enough to have a DBA team, and we moved to Atlas, which does a lot out of the box), but you should be able to do incremental backups of some description (again, test anything you put in place). Or you could keep monthly / weekly / daily backups, so you can always go back to last month, last week, or any day within the last 7, etc. Given the cost of spinning disks these days, you could hold a lot of backups on a cheap disk.
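The monthly / weekly / daily idea can be sketched as a small retention rule. This is only an illustration in Python: the function name and the default numbers (7 dailies, 4 weeklies, 12 monthlies) are my own assumptions, so adjust them to whatever your disk budget allows.

```python
from datetime import date, timedelta


def backups_to_keep(backup_dates, today, daily=7, weekly=4, monthly=12):
    """Pick which backup dates to retain under a daily/weekly/monthly scheme."""
    keep = set()
    weeks_seen, months_seen = [], []
    for d in sorted(set(backup_dates), reverse=True):  # newest first
        if d > today:
            continue  # ignore dates in the future
        # Keep every backup taken within the last `daily` days.
        if (today - d).days < daily:
            keep.add(d)
        # Keep the newest backup of each of the most recent `weekly` ISO weeks.
        week = tuple(d.isocalendar()[:2])
        if week not in weeks_seen and len(weeks_seen) < weekly:
            weeks_seen.append(week)
            keep.add(d)
        # Keep the newest backup of each of the most recent `monthly` months.
        month = (d.year, d.month)
        if month not in months_seen and len(months_seen) < monthly:
            months_seen.append(month)
            keep.add(d)
    return keep
```

Anything not in the returned set is a candidate for deletion, for example `set(all_dates) - backups_to_keep(all_dates, date.today())`.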
