
John Egan Gets Observant With Incident Management


TFiR discusses the importance of incident management with John Egan.

John Egan is the CEO and Founder of Kintaba, an incident management suite designed so that any organization can implement incident response. Egan defines incident management as a service that “is really there to make sure that you follow those best practices every time you have a major incident, in such a way that you’re not thinking about that overhead and administration, you’re focusing on fixing the problem, learning and becoming more resilient as an organization.”

Kintaba focuses primarily on organizations that are trying to follow the lead of tech unicorns like Facebook and Google. These tend to be rapidly growing companies that lean on modern infrastructure practices. Egan indicates that they are typically built on top of cloud services such as AWS and are, to some degree, practicing DevOps for automated deployment. At that level, companies run into more and more incidents, so they need internal tools to deal with them day to day.

Egan points out that organizations often don’t know what will trigger an incident. In fact, Egan says, “…a big part of setting up an incident management team inside of your company in incident response process is understanding that you can really only bucket these things at generally their top level.” An incident can be triggered by anything from a cascading database failure to a privacy situation. Either one can constitute a risk to the business and therefore, according to Egan, “doesn’t just belong in a project management’s task tracking application…”

On the cultural side, Egan discusses chaos engineering and how it ties into the basic idea behind incident management: the more resilient companies understand that incidents are going to happen, so it is critical that their practices are well built and in place before one occurs. Egan illustrates the point by highlighting how both Netflix and Google approach incident response and how they build the kind of culture needed to succeed at that level.

Summary by Jack Wallen
