Change the data type of fields for 10 million documents

Hi,
I have two fields in my documents that are stored as strings, and I want to change them to the date data type,
but the collection has 10 million documents and 12 indexes.

May I run the following to do so:

[
  {
    $addFields: {
      created_at: {
        $convert: {
          input: "$created_at",
          to: "date",
        },
      },
      updated_at: {
        $convert: {
          input: "$updated_at",
          to: "date",
        },
      },
    },
  },
  {
    $out: "test_date",
  },
]

After it creates a new collection, I will drop the current one and rename the new one.
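
For context, here is roughly how I plan to run it and swap the collections in mongosh. events is just a placeholder for my real collection name, and I know $convert without onError will abort the aggregation if it hits a string it cannot parse:

// "events" is a placeholder for the real collection name
db.events.aggregate(
  [
    { $addFields: {
        created_at: { $convert: { input: "$created_at", to: "date" } },
        updated_at: { $convert: { input: "$updated_at", to: "date" } },
    } },
    { $out: "test_date" },
  ],
  { allowDiskUse: true }   // spill to disk if the pipeline exceeds the memory limit
);

// Once test_date is built, swap it over the original collection
db.test_date.renameCollection("events", true);   // second argument = dropTarget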

Is it okay to run this aggregation pipeline for 10 million documents, or is there a better way?

Thanks


It is.

It depends on your goals.

Is that a production system?

Could you afford downtime?

Do you want the fastest way, or the way that puts the least load on the server?

Do you have indexes? [EDITED] It was mentioned in the OP that the collection has 12.

Thanks for replying.

I tried my solution and it took <10 mins.

Yes, it is a production system, so is there a better solution that avoids downtime and updates the values in place instead of creating a new collection?

I am looking for the fastest way.

Yes, it has indexes, so I would need to create those on the new collection as well, which takes more time.
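
If I stay with the new-collection approach, I assume I can copy the existing index definitions over with something like this (events again being a placeholder name):

// Recreate every index from the old collection on the new one
db.events.getIndexes().forEach(function (idx) {
  if (idx.name === "_id_") return;             // the _id index is created automatically
  var options = Object.assign({}, idx);
  delete options.v;                            // internal fields, not index options
  delete options.key;
  delete options.ns;
  db.test_date.createIndex(idx.key, options);  // keeps name, unique, sparse, TTL, etc.
});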

On a production system I would try to throttle the operation by converting a limited number of documents at a time directly in the original collection, rather than going through a temporary one. This should reduce the usage spike caused by both the conversion and the index rebuilding.

The first stage would be a $match on the $type of created_at and updated_at, so that only documents still storing strings are selected. I would then $limit to maybe 100,000. With 10 million documents you would have to run the pipeline about 100 times to convert them all. The last stage would be a $merge back into the same collection.
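
A rough sketch of one such batch, assuming the collection is called events (placeholder) and that a string $type match is enough to skip documents that were already converted. Note that $merge into the collection being aggregated requires MongoDB 4.4 or newer:

// One throttled batch: convert up to 100,000 not-yet-converted documents in place
db.events.aggregate([
  { $match: { created_at: { $type: "string" } } },   // only documents still holding strings
  { $limit: 100000 },
  { $addFields: {
      created_at: { $convert: { input: "$created_at", to: "date" } },
      updated_at: { $convert: { input: "$updated_at", to: "date" } },
  } },
  { $merge: { into: "events", on: "_id", whenMatched: "merge", whenNotMatched: "discard" } },
]);
// Repeat (e.g. in a loop with a short pause) until $match finds no more string values.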

Converting your string dates to the date data type is the smart thing to do. Kudos!