Chaos Engineering is becoming one of the core practices of modern cloud infrastructure. More and more companies are embracing it internally; some call it Chaos Engineering, while others call it Reliability Engineering. Regardless of the name, the underlying goal is to build reliable systems. In this episode of Let’s Talk, we sat down with Kolton Andrus, Co-Founder and CEO of Gremlin, to talk about Chaos Engineering.
The topics we covered in this discussion include the origin of Chaos Engineering, how Gremlin came into existence, how Chaos Engineering can help companies avoid unexpected outages like the ones we experienced recently, how companies are embracing the practice internally, what Gremlin is doing to help engineers get trained and certified in Chaos Engineering, and the future of both Chaos Engineering and Gremlin.
About the guest: Kolton Andrus is co-founder and CEO of Gremlin. Prior to this, he was an engineer at Amazon and Netflix, focused on reliability and performance. At both companies, he built Chaos Engineering platforms and served as ‘Call Leader’, managing the resolution of company-wide incidents.
About the company: Gremlin is the world’s first hosted Chaos Engineering service with a mission to help build a more reliable internet. It turns failure into resilience by offering engineers tools to safely experiment on complex systems, in order to identify weaknesses before they impact customers and cause revenue loss.