You can use the Atlas UI and API to simulate an outage on your Atlas multi-region cluster and observe how your application handles an outage in one or more regions.
When you submit a request to test an outage using the Atlas UI or API, Atlas simulates an outage event. During this process, Atlas removes network connectivity to nodes in the selected regions.
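For the API route, the request can be sketched as follows. This is a minimal illustration, not a definitive client: it assumes the Atlas Administration API v2 exposes an `outageSimulation` resource under the cluster path and accepts an `outageFilters` array, so confirm the path and field names against the current API reference. The helper name `build_outage_request` and the example IDs are hypothetical, and the sketch only builds the request without sending it.

```python
import json

# Assumed base URL and resource path for the Atlas Administration API v2;
# confirm both against the current API reference before use.
BASE_URL = "https://cloud.mongodb.com/api/atlas/v2"


def build_outage_request(group_id, cluster_name, regions):
    """Return (url, json_body) for a request that starts an outage simulation.

    regions is a list of (cloud_provider, region_name) pairs.
    """
    url = f"{BASE_URL}/groups/{group_id}/clusters/{cluster_name}/outageSimulation"
    body = {
        "outageFilters": [
            # One filter per region to take offline (assumed field names).
            {"cloudProvider": provider, "regionName": region, "type": "REGION"}
            for provider, region in regions
        ]
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    # Hypothetical project ID and cluster name, for illustration only.
    url, body = build_outage_request("abc123", "Cluster0", [("AWS", "US_EAST_1")])
    print("POST", url)
    print(body)
```

Sending the request would also require authentication and the versioned media type headers that the API expects, which are left out here.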
If your application takes more than 15 minutes to notice connection loss to some nodes, we recommend that you reduce your TCP retransmission timeout values. To learn more, see how to modify the tcp_retries2 value.
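To see where a figure of roughly 15 minutes can come from, the following sketch estimates how long a Linux host keeps retransmitting before it gives up on a connection. It assumes exponential backoff starting at 200 ms and capped at 120 seconds, which are typical Linux values; the real timeout also depends on the measured round-trip time. The function names are illustrative.

```python
from pathlib import Path

RTO_MIN_S = 0.2    # assumed initial retransmission timeout (typical Linux minimum)
RTO_MAX_S = 120.0  # assumed cap on the exponential backoff (typical Linux maximum)


def retransmission_timeout_s(retries):
    """Approximate seconds before TCP gives up on an unacknowledged segment."""
    # The timer doubles after every retransmission until it reaches the cap.
    return sum(min(RTO_MIN_S * 2 ** i, RTO_MAX_S) for i in range(retries + 1))


def current_tcp_retries2(path="/proc/sys/net/ipv4/tcp_retries2"):
    """Read the live setting on Linux; return None where the file does not exist."""
    p = Path(path)
    return int(p.read_text().strip()) if p.exists() else None


if __name__ == "__main__":
    for retries in (5, 8, 15):
        print(retries, round(retransmission_timeout_s(retries), 1), "seconds")
    print("current tcp_retries2:", current_tcp_retries2())
```

Under these assumptions, a value of 15 works out to about 924.6 seconds, a little over 15 minutes, while lower values shorten the time it takes the application to notice the lost connection.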
To simulate a Regional Outage in the Atlas UI:
Log in to the Atlas UI.
For the cluster on which you want to perform outage testing, click the ... button.
Click Test Resilience.
Select Regional Outage. Atlas displays a Test Resilience modal with the steps Atlas takes to simulate an outage event. To learn more, see Simulate Regional Outage Process.
Click Select Regions.
Select the tab corresponding to the type of outage you want to simulate:
Select Simulate Regional Outage to begin the test. Atlas notifies you when the outage occurs.
Select a tab corresponding to the type of outage you are performing:
To verify that the outage is successful, monitor your application and ensure your read and write operations are working as expected.
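One way to monitor read and write operations during the test is a small probe loop like the sketch below. It is driver-agnostic: `probe` and its parameters are hypothetical names, and each operation is any zero-argument callable that raises an exception on failure, for example a function wrapping one of your application's queries or inserts.

```python
import time


def probe(operations, attempts=5, delay_s=1.0, sleep=time.sleep):
    """Run each named operation repeatedly and count successes and failures.

    operations maps a label (for example "read" or "write") to a
    zero-argument callable that raises an exception when it fails.
    """
    results = {name: {"ok": 0, "failed": 0} for name in operations}
    for attempt in range(attempts):
        for name, op in operations.items():
            try:
                op()
                results[name]["ok"] += 1
            except Exception:
                # Any exception counts as a failed attempt for this label.
                results[name]["failed"] += 1
        if attempt < attempts - 1:
            sleep(delay_s)  # pause between rounds, but not after the last one
    return results


if __name__ == "__main__":
    # Stand-in callables; replace them with real reads and writes.
    print(probe({"read": lambda: None, "write": lambda: None}, attempts=3, delay_s=0.1))
```

Running the probe before, during, and after the simulated outage gives a simple count of how many reads and writes succeeded in each phase.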
A regional outage or regional outage simulation that affects the highest priority regions in a sharded cluster could cause the cluster to become inoperable for read operations. To restore the config servers, do the following: