Internals - Checkpoints and Journaling

Hello folks,

WiredTiger triggers a checkpoint every 60 seconds, writing data from the cache to disk, and syncs journal data to disk every 100 milliseconds. My question is: when journaling happens, does the process check whether there is a dirty page in the cache and replace that page with data from the journal files? Or is it the checkpoint process that checks whether the page exists in the journal files and then updates the cache?

Thank you and cheers !!

Alexandre Araujo

Hi Alexandre,

A checkpoint in WiredTiger is basically a snapshot of the state of the database where all data files are consistent with each other.

WiredTiger implements a write-ahead log in the form of a journal. That is, once a write has been synced to the journal on disk, it is considered durable (i.e. it will survive restarts).
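To make the durability point concrete, here is a minimal sketch of a journal with an in-memory buffer that gets synced to disk periodically. This is a toy model, not WiredTiger's actual code: the class `ToyJournal` and its methods are made-up names for illustration, and `sync()` stands in for the periodic journal sync (every 100 ms by default).

```python
class ToyJournal:
    """Toy model of a write-ahead log; hypothetical, not WiredTiger's API."""

    def __init__(self):
        self.buffer = []   # records accepted but not yet synced (in memory)
        self.on_disk = []  # records already synced to the journal files

    def append(self, record):
        # A new write first lands in the in-memory journal buffer.
        self.buffer.append(record)

    def sync(self):
        # Stand-in for the periodic journal sync: buffered records become durable.
        self.on_disk.extend(self.buffer)
        self.buffer.clear()

    def crash(self):
        # An unclean shutdown loses whatever was only in memory.
        self.buffer.clear()
        return list(self.on_disk)


journal = ToyJournal()
journal.append("insert x")
journal.sync()                # "insert x" is now durable
journal.append("insert y")    # accepted, but not synced yet
survivors = journal.crash()
print(survivors)              # ['insert x']
```

In this model only the record that was synced before the crash survives, which is why a write counts as durable once it has reached the journal on disk.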

See WiredTiger journal and WiredTiger Storage Engine for a high level description of how the process works.

The journal files are only used for recovery purposes in case of unclean shutdown. During normal operation, once the dirty pages in the cache (i.e. in memory) are checkpointed and marked clean, WiredTiger will clean up the now-unneeded journal entries.

Please note that these are quite specific implementation details, and may change between MongoDB versions. Out of curiosity, what is the reason for the question? Are you simply curious about how things work under the hood, or is there another reason?

Best regards,
Kevin

Hi Kevin,

Thank you for all the clarifications. I'm simply curious about how things work under the hood.

If I may, one more question: between checkpoints, does the dirty data live only in the journal, or does the write-ahead logging step also update the pages in memory?

Regards,
Alexandre Araujo

Hi Alexandre,

Between checkpoints, the dirty pages stay in the WiredTiger cache. These dirty pages will then be flushed to the data files during checkpoint.

So when a write comes in, it is recorded in two places: the journal (synced to disk every 100 ms) and the pages in the cache, which are marked dirty. The journal is only for safekeeping and is not involved in checkpoints during normal operation. It only comes into play after an unclean shutdown: WiredTiger restarts from the latest known good checkpoint, then replays any journal entries written after that checkpoint.
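The flow above can be sketched roughly as follows. This is only a toy model under simplifying assumptions, not how WiredTiger is actually implemented: `ToyEngine` and all of its attributes are hypothetical names, each journal append is treated as if it were already synced to disk, and "pages" are just dictionary entries.

```python
class ToyEngine:
    """Toy model of journal + cache + checkpoint; hypothetical, not WiredTiger's API."""

    def __init__(self):
        self.data_files = {}  # on disk: state as of the last checkpoint
        self.journal = []     # on disk: writes logged since that checkpoint
        self.cache = {}       # in memory: current pages
        self.dirty = set()    # in memory: keys changed since the last checkpoint

    def write(self, key, value):
        self.journal.append((key, value))  # 1. record the write in the journal
        self.cache[key] = value            # 2. update the page in the cache
        self.dirty.add(key)                #    and mark it dirty

    def checkpoint(self):
        # Flush dirty pages from the cache to the data files; the journal is not read.
        for key in self.dirty:
            self.data_files[key] = self.cache[key]
        self.dirty.clear()
        self.journal.clear()  # entries covered by the checkpoint are no longer needed

    def crash(self):
        # Unclean shutdown: everything in memory is lost.
        self.cache = {}
        self.dirty = set()

    def recover(self):
        # Start from the last checkpoint, then replay the journal on top of it.
        self.cache = dict(self.data_files)
        for key, value in self.journal:
            self.cache[key] = value
            self.dirty.add(key)


engine = ToyEngine()
engine.write("a", 1)
engine.checkpoint()      # "a" is now in the data files
engine.write("b", 2)     # "b" exists only in the journal and the cache
engine.crash()
engine.recover()
print(engine.cache)      # {'a': 1, 'b': 2}
print(engine.data_files) # {'a': 1}
```

Note that `checkpoint()` never reads the journal, and `recover()` is the only place where journal entries are applied to the cache. After recovery, "b" is a dirty page again and would reach the data files at the next checkpoint.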

Hopefully it’s clear & helpful :slight_smile:

Best regards,
Kevin

Hi Kevin,

Perfectly clear & helpful :grinning:

Thank you and have a great day.

Alexandre Araujo
