WiredTiger history_store.file_max causes catastrophic replica set failures with no recovery path

venkataraman_r · August 12, 2025, 6:39pm

The WiredTiger history_store.file_max configuration option can cause complete replica set unavailability in distributed deployments. When the history store file exceeds the configured limit, MongoDB immediately panics and terminates, potentially causing loss of quorum and complete service outage with no automatic recovery mechanism.

This is an extension of the issue I reported earlier, "Mongod doesnt have control over WiredTigerHS.wt History Store - #4 by venkataraman_r", which MongoDB closed via SERVER ticket https://jira.mongodb.org/browse/SERVER-84108 on the grounds that 5.0 was EOL. But this issue exists in all versions that support the history store.

Bug Type

Severity: Critical
Priority: High
Category: Storage Engine / Replica Sets
Component: WiredTiger History Store

Environment

MongoDB Version: All versions with WiredTiger history store support (tested in 7.0 as well)
Storage Engine: WiredTiger
Deployment: Replica Set

Problem Description

Current Behavior

When history_store.file_max is configured and exceeded, MongoDB immediately panics with WT_PANIC
The panic causes immediate process termination via fassert()
No graceful degradation, warnings, or recovery options are available
After restart, the oversized history store file persists, causing immediate re-panic on first write operation
This creates an infinite restart loop until manual intervention

Critical Failure Scenario

In a 5-member replica set across 3 sites (2+2+1 arbiter):

One site (2 members) goes down
Remaining primary handles increased load → history store grows
History store exceeds file_max → primary panics and shuts down
Loss of primary + previous site failure = no quorum
Entire replica set becomes unavailable
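
To make the vote arithmetic in this scenario concrete, here is a small sketch. It assumes each of the five members, including the arbiter, holds exactly one vote; the helper names are illustrative and not part of any MongoDB API.

```python
# Vote arithmetic for the 2+2+1 (arbiter) scenario.
# Assumption: all five members, including the arbiter, hold one vote each.

def majority(voting_members: int) -> int:
    """Votes needed to elect or keep a primary: a strict majority of configured voters."""
    return voting_members // 2 + 1


def has_quorum(configured: int, reachable: int) -> bool:
    """True if the reachable voters still form a majority of the configured voters."""
    return reachable >= majority(configured)


CONFIGURED = 5  # 2 data-bearing + 2 data-bearing + 1 arbiter

steps = [
    ("all sites healthy", 5),
    ("one 2-member site down", 3),
    ("primary panics on file_max", 2),
]

for label, reachable in steps:
    print(f"{label}: {reachable}/{CONFIGURED} votes, "
          f"quorum={has_quorum(CONFIGURED, reachable)}")
```

After the site failure the set is exactly at the majority threshold (3 of 5), so the single additional loss caused by the panic is enough to leave the remaining secondary and arbiter unable to elect a primary.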

Root Cause Analysis

Design Flaws

No graceful degradation: Immediate panic instead of warnings or throttling
No startup validation: Size check only occurs during write operations, not at startup
No automatic cleanup: No mechanism to reduce history store size during emergencies
Poor failure isolation: Storage limit can cause replica set quorum loss

Code References

// src/third_party/wiredtiger/src/history/hs_rec.c:766
if ((uint64_t)hs_size > max_hs_size)
    WT_ERR_PANIC(session, WT_PANIC,
      "WiredTigerHS: file size of %" PRIu64 " exceeds maximum size %" PRIu64,
      (uint64_t)hs_size, max_hs_size);

// src/mongo/db/storage/wiredtiger/wiredtiger_util.cpp:191
fassert(28559, retCode != WT_PANIC || storageGlobalParams.repair);

Expected Behavior

Immediate Fixes Needed

Configurable panic behavior:

history_store=(file_max=10GB, on_limit=warn|throttle|panic)

Graceful degradation options:

warn: Log warnings but continue operations
throttle: Slow writes, reject long-running reads
cleanup: Auto-truncate oldest history entries

Startup validation: Check file size during startup and provide recovery options
Emergency recovery mode: Allow startup with temporary limit override
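
The requested behavior can be sketched as a small decision function. This is purely a model of the proposal: on_limit is not an existing WiredTiger option, and the enum, function name, and returned strings are all hypothetical. The only parts taken from current behavior are the strict "greater than" comparison seen in hs_rec.c and file_max=0 meaning unbounded.

```python
from enum import Enum


class OnLimit(Enum):
    """Hypothetical values for the proposed on_limit setting (not a real option)."""
    WARN = "warn"
    THROTTLE = "throttle"
    CLEANUP = "cleanup"
    PANIC = "panic"


def action_for(hs_size: int, file_max: int, on_limit: OnLimit) -> str:
    """Return what the storage engine would do under the proposed policy.

    file_max == 0 is treated as unbounded. The limit trips only when the
    size is strictly greater than file_max, mirroring the existing check.
    """
    if file_max == 0 or hs_size <= file_max:
        return "continue"
    actions = {
        OnLimit.WARN: "log warning and continue",
        OnLimit.THROTTLE: "slow writes and reject long-running reads",
        OnLimit.CLEANUP: "truncate oldest history entries",
        OnLimit.PANIC: "panic and terminate",  # today's only behavior
    }
    return actions[on_limit]


GIB = 1024 ** 3
print(action_for(11 * GIB, 10 * GIB, OnLimit.THROTTLE))
```

Keeping panic as one of the selectable values would preserve today's behavior for deployments that rely on the hard stop.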

Long-term Improvements

Proactive monitoring: Built-in metrics and alerting before hitting limits
Automatic cleanup: Background process to manage history store size
Better documentation: Clear warnings about replica set availability risks
Default behavior change: Consider making file_max=0 (unbounded) the recommended production setting
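
Until something like this is built in, an external check can approximate the proactive monitoring item. The sketch below uses only the Python standard library and compares the on-disk size of WiredTigerHS.wt with the configured limit. It assumes the file sits directly under dbPath and that its on-disk size is a reasonable proxy for the size WiredTiger compares against file_max; the function name, the 80% warning ratio, and the example path are illustrative choices, not MongoDB defaults.

```python
import os

HS_FILE = "WiredTigerHS.wt"  # history store file, assumed to sit directly under dbPath


def hs_status(db_path: str, file_max_bytes: int, warn_ratio: float = 0.8) -> str:
    """Classify history store usage as 'ok', 'warning', or 'over_limit'.

    file_max_bytes == 0 is treated as unbounded, so the result is always 'ok'.
    warn_ratio is an arbitrary early-warning threshold (assumption).
    """
    size = os.path.getsize(os.path.join(db_path, HS_FILE))
    if file_max_bytes == 0:
        return "ok"
    if size > file_max_bytes:
        return "over_limit"
    if size >= warn_ratio * file_max_bytes:
        return "warning"
    return "ok"


if __name__ == "__main__":
    # Example path and limit; adjust to the deployment being checked.
    db_path = "/var/lib/mongodb"
    if os.path.exists(os.path.join(db_path, HS_FILE)):
        print(hs_status(db_path, 10 * 1024 ** 3))
```

Run periodically, a check like this gives operators a chance to act while the node is still up. Run before starting mongod, the same function can flag an already oversized file, which is the startup validation requested above.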