Chaos Engineering

We’ve all heard about the recent Facebook outage. While the details remain unclear, any major outage is a worthwhile opportunity to discuss the benefits of chaos engineering: specifically, the principle of breaking things on purpose to build confidence in a system’s ability to perform.

This isn’t a new concept. A noteworthy example is Netflix, which tested the reliability of its systems by failing servers and clusters, filling up random hard drives, and so on during its move to AWS in 2011. Doing so enabled Netflix to reduce the mean time to resolution (MTTR) for critical environment incidents.

What’s new is that we can now fail fast and learn fast on the facility side too, thanks to data center digital twins. A digital twin accelerates this process by providing a risk-free, virtual environment in which to break things.

A data center digital twin lets you test data center nightmares in a virtual environment: the fallout from a cooling unit failure, the implications of racks running the passive half of an active-passive redundant application ramping up, or even what happens if you just switch a few servers on. By doing so, you can gain confidence that your systems are ready for whatever comes next.

I’ve discussed this concept in depth with DCD here, if you’re interested in hearing more of my thoughts on chaos engineering. You can also check out a recent video case study of the 6Sigma Digital Twin if you’d like to learn more about digital twin technology.

Blog written by: Dave King, Product Manager

8 October, 2021
