SIOS Technology Corp. offers high availability and disaster recovery solutions to ensure availability and eliminate data loss for critical Windows and Linux applications operating across physical, virtual, cloud, and hybrid cloud environments. The company recently announced the SIOS Protection Suite for Linux Version 9.6.0 which enhances data integrity protection in Microsoft Azure and enables up to five times faster recovery time for SAP HANA databases. To learn more about this release and its new capabilities, we hosted Adrienne Cooley, Senior Product Owner at SIOS Technology Corp.
Topics we covered:
- What exactly is a split-brain scenario in a clustered environment?
- What are the methods used to avoid Split Brain?
- Are there other ways to avoid Split Brain?
- So in addition to protecting data integrity, you also need to ensure fast recovery time. What kind of things does your product do to ensure faster recovery?
- Are there any other new application recovery kit (ARK) features in the latest version?
Swapnil Bhartiya: Hi, this is your host Swapnil Bhartiya and welcome to TFR Let’s Talk. SIOS recently announced SIOS Protection Suite for Linux version 9.6.0, which enhances data integrity protection in Microsoft Azure and enables up to five times faster recovery time for SAP HANA databases. To talk about some of these new capabilities and features, today we have with us Adrienne Cooley, Senior Product Owner at SIOS Technology. Adrienne, welcome to the show.
Adrienne Cooley: It’s great to be here. Thank you for having me.
Swapnil Bhartiya: Adrienne, one of the features of this release is close integration with the native Microsoft Azure fencing agent to eliminate a potential threat to data integrity called split brain. This feature not only protects data integrity, but it also saves customers money by eliminating the need to deploy and pay for an additional server for fencing capabilities. Can you tell us what exactly a split-brain scenario is in a clustering environment?
Adrienne Cooley: As you’re aware, in a clustering environment you’ll have your critical applications running on your primary node, and then your high availability software will detect a failure and fail over to the secondary node to keep operations seamless. What happens in a split-brain scenario is communication may be lost between the primary and the secondary node, so the high availability software doesn’t necessarily know which one is primary and which one is secondary. This is a condition called split brain. And in these cases, if both nodes think they’re primary, they’ll both start editing the same data, causing data integrity issues.
Swapnil Bhartiya: Can you talk about the methods that are used to avoid split brain?
Adrienne Cooley: Sure. One popular method you may have heard referred to as a witness or quorum server. In these types of situations, an additional node or storage service such as AWS S3 would be used to perform a witness or quorum function, where it judges which node should be the primary. That lets high availability software like SIOS Protection Suite or other applications know what the primary node is and how to handle the situation.
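To make the witness idea concrete, here is a minimal sketch of how an arbiter could break the tie when two nodes lose contact with each other. It is an illustration only, not how SIOS Protection Suite implements quorum: the `Witness` class, the `resolve_role` function, and the node names are all hypothetical, and the witness is an in-memory object rather than a separate server or storage bucket.

```python
class Witness:
    """Hypothetical in-memory arbiter.

    A real witness would be a separate node or shared storage that both
    cluster nodes can reach independently of each other.
    """

    def __init__(self):
        self.primary = None  # no node has claimed the primary role yet

    def claim(self, node_id):
        """Grant the primary role to the first claimant; refuse all others."""
        if self.primary is None:
            self.primary = node_id
        return self.primary == node_id


def resolve_role(node_id, witness, witness_reachable):
    """Decide a node's role after it loses contact with its peer."""
    if not witness_reachable:
        # Isolated from both the peer and the witness: stay passive
        # rather than risk two primaries writing the same data.
        return "standby"
    return "primary" if witness.claim(node_id) else "standby"


witness = Witness()
print(resolve_role("node-a", witness, witness_reachable=True))
print(resolve_role("node-b", witness, witness_reachable=True))
```

Because only the first claim succeeds, at most one node can conclude that it is primary, which is the property that prevents split brain.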
Swapnil Bhartiya: Beyond these, are there any other ways to avoid split brain?
Adrienne Cooley: Yeah. So one other method that we support is called STONITH, and in our 9.6.0 release of the Protection Suite for Linux we’ve recently integrated with the cloud-native version, called the Azure fencing agent. STONITH basically determines which node should be active, and then turns the other one off so that there’s no confusion and the data isn’t updated from both sides.
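The fencing step described here can be sketched in a few lines. This is a simplified illustration under stated assumptions, not the Azure fencing agent or the SIOS integration: `stonith` and the `power_off` callback are hypothetical names, and the callback stands in for whatever mechanism actually powers a node down.

```python
def stonith(survivor, peer, power_off):
    """Fence `peer`, then report `survivor` as the only active node.

    `power_off` is a hypothetical callback standing in for a real fencing
    agent; it must return True only when the peer is confirmed to be off.
    """
    if not power_off(peer):
        # Without confirmation the peer may still be writing data,
        # so promoting the survivor would be unsafe.
        raise RuntimeError(f"could not fence {peer}; refusing to promote {survivor}")
    return survivor


fenced = []

def fake_power_off(node):
    """Stand-in fencing action that only records which node was fenced."""
    fenced.append(node)
    return True

active = stonith("node-a", "node-b", fake_power_off)
print(active, fenced)
```

The ordering is the point of the sketch: the survivor is only declared active after the other node is confirmed off.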
Swapnil Bhartiya: The fact is that in addition to protecting data integrity, you also need to ensure fast recovery time. Can you talk about what this suite does to ensure faster recovery as well?
Adrienne Cooley: As you’re aware, with critical applications fast recovery time is the most important thing. People need those applications up all the time. So what we do in the SIOS Protection Suite for Linux is we make these modules called application recovery kits, or ARKs. They are application aware and give us the most efficient, reliable way to orchestrate failovers for these applications. One example of that, which we’ve enhanced in the 9.6.0 release that’s coming out, is the SAP HANA ARK that protects SAP HANA databases. We’ve added two enhancements to it to really help with orchestration. One is determining when we should fail over immediately to the secondary database, and when we should let the HANA system try to repair the database.
There are times when a repair is faster and there are times when you just need to fail over quickly, so we’ve added this capability into the ARK to handle these situations. We’ve also added the ability to add temporal recovery policies to the HANA ARK so that, for example, a user could say that if there have been, let’s say, three failures in the last 15 minutes, even if our high availability software was able to recover from them, we still want to fail over, because those failures could be indicative of a larger problem. And we don’t want to wait until something bad happens for the system to fail over.
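The "three failures in the last 15 minutes" example amounts to a sliding-window counter. The sketch below shows one way such a rule could be evaluated. It is not the HANA ARK's actual implementation or configuration format: the `TemporalRecoveryPolicy` class, its parameter names, and the use of plain seconds as timestamps are assumptions made for illustration.

```python
from collections import deque


class TemporalRecoveryPolicy:
    """Hypothetical sliding-window rule: fail over once `max_failures`
    failures have occurred within the last `window_seconds`, even if
    each individual failure was recovered locally."""

    def __init__(self, max_failures=3, window_seconds=15 * 60):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        self._failures = deque()  # timestamps of recent failures, oldest first

    def record_failure(self, timestamp):
        """Record a failure at `timestamp` (seconds); return True if the
        policy says the cluster should fail over now."""
        self._failures.append(timestamp)
        cutoff = timestamp - self.window_seconds
        # Discard failures that have fallen out of the window.
        while self._failures[0] < cutoff:
            self._failures.popleft()
        return len(self._failures) >= self.max_failures


policy = TemporalRecoveryPolicy()
for t in (0, 300, 600):  # three failures, five minutes apart
    should_fail_over = policy.record_failure(t)
print(should_fail_over)
```

With failures spaced ten minutes apart instead, no more than two would ever fall inside a 15-minute window, so the policy would keep allowing local recovery.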
Swapnil Bhartiya: You mentioned ARK, or the application recovery kit. We have covered that a lot in our discussions with SIOS. Have you folks added any new ARK features in the latest version?
Adrienne Cooley: Yeah, we’ve also enhanced the ARK that we have for AWS EC2. There are some configuration settings that can be a little tricky, especially if you’re not a cluster maintainer. Those are the source and destination checks, which need to be turned off when you’re in a clustering environment. It can be easy to leave those on if you’re not familiar with them. There are also some situations where the EC2 console will try to be helpful and turn those on for you when it doesn’t need to. So we’ve enhanced our ARK to go back and turn those settings off, just to help these maintainers avoid situations that they shouldn’t be in.
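For readers curious what "going back and turning those settings off" might look like, here is a minimal sketch. It is not the SIOS EC2 ARK's code. It assumes a boto3-style EC2 client is passed in, and relies on the `describe_instance_attribute` and `modify_instance_attribute` calls with the `sourceDestCheck` attribute; the function name and the instance ID in the comment are hypothetical. Note that the check can also be set per network interface, which this sketch does not cover.

```python
def ensure_source_dest_check_disabled(ec2_client, instance_id):
    """Turn the source/destination check off if it is currently on.

    `ec2_client` is assumed to behave like boto3.client("ec2").
    Returns True if a change was made, False if the check was already off.
    """
    attr = ec2_client.describe_instance_attribute(
        InstanceId=instance_id, Attribute="sourceDestCheck"
    )
    if not attr["SourceDestCheck"]["Value"]:
        return False  # already disabled, nothing to do
    ec2_client.modify_instance_attribute(
        InstanceId=instance_id, SourceDestCheck={"Value": False}
    )
    return True


# Example call, assuming boto3 is installed and credentials are configured:
#   import boto3
#   ensure_source_dest_check_disabled(boto3.client("ec2"), "i-0123456789abcdef0")
```

Running a check like this periodically, rather than once at setup, is what would catch the case where the setting is switched back on later.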
Swapnil Bhartiya: Adrienne, thank you so much for taking time out today to talk about some of the features of this release. I would love to have you back on the show. Thank you.
Adrienne Cooley: Oh, thanks so much for having me. I look forward to coming back.