
Impact Of Great Resignation On High Availability & Disaster Recovery


Guest: Cassius Rhue
Company: SIOS Technology
Show: Let’s Talk

Every industry has been affected by the great resignation (or shuffle), and the world has discovered brand new opportunities, thanks to remote work. But when it comes to technology, such a mass exodus has immediate and lasting ramifications. We invited Cassius Rhue, VP of Customer Experience at SIOS Technology, to talk about the company’s experience, how the great resignation has affected high availability and disaster recovery, and what can be done about it.

Key highlights of the discussion:

  • In 2021, we saw what many call the big shuffle or massive resignations across industries. What impact is it going to have on the high availability space, specifically on keeping critical applications up and running?

“Every industry has been impacted by the great resignation or great shuffle or some of this awakening to new opportunities that are now available because the whole world went into this kind of remote work for many different industries. It created these opportunities for people to begin to consider work in places they hadn’t before and that has had a huge impact on high availability (HA) and disaster recovery (DR). Mainly as teams and people change roles and jobs and leave roles within an IT team responsible for critical HA infrastructure that leaves a lot of those teams scrambling to cover that loss of a person.”

  • Are there any trends that worry or surprise you?

“One of the trends we saw as well, we saw people leaving to take other jobs. We also saw a trend where companies trying to protect themselves from that or to recover from the loss of key staff and the loss of key knowledge choosing to go more the route of professional services or contracting, right? So losing resources and rather than replacing cloud expertise in-house, choosing to augment them with cloud experts, consultants, hiring professional services to help them build out their HA architecture and IT administration.”

  • What impact has it had on day-to-day operations of businesses?

“A lot of what we’ve had to work on with those businesses in the day-to-day operations is helping them understand the HA space. They’ve lost some critical resource, and so helping them understand their architecture, helping them understand the products that they have deployed, and then being HA experts in the space, reminding them of what are the critical tasks and operations that are a part of making sure applications are highly available.”

  • And if we just narrowed it to high availability and disaster recovery, can you also talk about what kind of impact it is having on this segment?

“One real impact we’ve seen in the last weeks is that, in the short term, you’ve lost a critical resource, your team’s now short-staffed. We’re also seeing people working longer hours. You have people trying to make sure that they mitigate the risk of having lost team members in HA. It’s a business where you can’t afford downtime, so you’re looking to make sure that even at the loss of knowledge, even at the loss of resources and staffing, that you have plans in place, and that your partners, your contractors, your consultants are all on board to make sure your applications stay available.”

  • I think that’s where the challenge of tribal knowledge also comes into play: avoiding knowledge going away with that team and also creating a lot of technical debt in-house. Can you talk about this issue?

“You start assessing your team proactively, you start looking at where are things documented? What are your procedures that you have in place? Proactively looking at can you onboard new resources and start thinking about staff augmentation or growing your staff to de-risk those critical pieces and parts. I had read before the great resignation took place, some experts in our field talked about doing the sort of mock testing and chaos testing to simulate disasters, and then trying to have your team walk through handling that disaster, using their existing playbooks and figure out where the gaps are. So that’s an option for de-risking or managing your team so that you close some of these gaps that could occur as a result of the great resignation.”

  • Can you also talk about the importance of chaos testing for high availability and disaster recovery, so that it becomes part of their strategy?

“We worked with a customer that introduced sort of the same chaos testing, and then realized somewhere midway through that they did not have the person with the right administrative permissions in the exercise. And so that’s a discovery that they made that yes, you had admins on there, but they didn’t have the level of permissions needed to perform certain operations on the storage. That’s something you want to find out when everyone’s working their normal hours, not something you want to find out at 2:00 AM or 3:00 AM when there is a real disaster going on.”
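
The permissions gap described in that exercise is the kind of thing a simple pre-drill check might surface before anyone starts simulating a failure. What follows is only an illustrative sketch in Python, not a SIOS tool or anything the customer used: the names `RunbookStep`, `Participant`, and `find_permission_gaps`, as well as the permission strings, are hypothetical. It assumes each playbook step lists the permissions it needs and each person on the exercise roster lists the permissions they hold, then reports the steps that nobody on the roster can cover.

```python
from dataclasses import dataclass, field


@dataclass
class RunbookStep:
    # Hypothetical model of one step in a DR playbook.
    name: str
    required_permissions: set


@dataclass
class Participant:
    # Hypothetical model of one person taking part in the exercise.
    name: str
    permissions: set = field(default_factory=set)


def find_permission_gaps(steps, participants):
    """Map each step name to the permissions nobody on the roster holds."""
    # Pool every permission held by at least one participant.
    held = set().union(*(p.permissions for p in participants))
    gaps = {}
    for step in steps:
        missing = step.required_permissions - held
        if missing:
            gaps[step.name] = sorted(missing)
    return gaps


# Example roster: two admins, neither with storage-level rights.
steps = [
    RunbookStep("fail over application", {"cluster:admin"}),
    RunbookStep("remount replicated storage", {"storage:admin"}),
]
roster = [
    Participant("admin-a", {"cluster:admin"}),
    Participant("admin-b", {"cluster:admin"}),
]

gaps = find_permission_gaps(steps, roster)
for step_name, missing in gaps.items():
    print(f"No participant can perform '{step_name}': missing {missing}")
```

Run against the roster above, the check flags the storage step, which mirrors the situation in the quote: admins were present, but none had the storage permissions the exercise needed. In practice the permission data would have to come from whatever identity or access system the team actually uses.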

The summary of the show is written by Jack Wallen
