Moving IT systems to the cloud has a wide range of economic and technical benefits. However, one key challenge that remains is meeting high availability (HA) service-level agreements (SLAs) for mission-critical applications such as SAP. SIOS Technology provides solutions that address these HA challenges and deliver failover protection for SAP components and the entire SAP landscape.
“Although moving to the cloud can present a wide range of economic and technical benefits, there are some areas where the public cloud platforms are still a little bit lacking specifically with regard to meeting high availability SLAs for mission-critical applications like SAP,” says Harry Aujla, EMEA Technical Director at SIOS Technology, on this episode of TFiR Insights.
Key highlights from this video interview are:
- Aujla discusses the use-case of one of their customers in the polymer manufacturing sector. He explains the challenges the customer is facing with high availability in SAP.
- While many are moving all of their IT operations to the cloud, meeting demanding high availability SLAs for mission-critical applications like SAP can present challenges. Aujla explains why in some instances it is important for customers to have high availability technology that is certified by SAP themselves and how that influenced their high availability strategy.
- Aujla details how SIOS Technology helped their customer in the polymer manufacturing sector overcome their challenges with high availability in SAP and how the solution worked.
- Although some of SIOS Technology’s customers are in the manufacturing industry, the company serves other sectors too. Aujla discusses the challenges with high availability in the healthcare industry and why it is increasingly being targeted by ransomware attacks.
- Aujla discusses a healthcare use-case from Australia, why it was so important for them to have high availability of their systems and how SIOS Technology’s DataKeeper Cluster Edition helped them achieve this.
- Aujla imparts some of the best practices for implementing high availability strategies for both manufacturing and healthcare sectors.
About Harry Aujla: With over 20 years of experience in the IT business continuity sector, Harry brings a wealth of expertise to enterprises that are looking to build IT continuity strategies across their businesses. Consulting across many vertical sectors such as manufacturing, transportation, government, and more, he has been privileged to represent vendors who specialize in high availability, disaster recovery, and fault-tolerant computing techniques. During his career, Harry has spent time in various roles such as technical consultant, salesperson, and technical trainer.
About SIOS Technology: SIOS Technology high availability and disaster recovery solutions ensure availability and eliminate data loss for critical Windows and Linux applications operating across physical, virtual, cloud, and hybrid cloud environments. SIOS clustering software is essential for any IT infrastructure with applications requiring a high degree of resiliency, ensuring uptime without sacrificing performance or data – protecting businesses from local failures and regional outages, planned and unplanned. Founded in 1999, SIOS Technology Corp. is headquartered in San Mateo, California, with offices worldwide.
The summary of the show is written by Emily Nicholls.
Here is the automated and unedited transcript of the recording. Please note that the transcript has not been edited or reviewed.
Swapnil Bhartiya: Hi, this is your host Swapnil Bhartiya and welcome to another episode of TFiR Insights. And today we have with us, once again, Harry Aujla, EMEA Technical Director at SIOS Technology. Harry, it’s great to have you back on the show.
Harry Aujla: Yeah, it’s great to be here again as well. Thank you.
Swapnil Bhartiya: Yeah. And today we are going to talk about best practices for high availability in SAP. If I’m not wrong, you folks have a customer in the polymer manufacturing industry that had some unique challenges related to providing high availability protection for the SAP landscape. First of all, tell us a bit about this customer and the challenges that they were facing there.
Harry Aujla: Yeah, of course. So this was a customer in the polymer manufacturing sector, as you mentioned, and they operate at a global level. So they were running a wide network of production sites around the world, and each one of these production sites had its own associated IT landscape as well. And what the customer was previously doing is they were running these IT operations within a group of data centers that were dotted around the globe. The infrastructure at the time was based on a mix of physical servers and virtual servers. And as their business grew, they found this infrastructure was getting a little bit expensive to run and a bit difficult to manage.
So, like a lot of customers, they made the bold decision to shut down all of their data centers and move their entire IT operation into the cloud. Now, the challenge they came across when migrating into the cloud was in relation to their SAP landscape. Although moving to the cloud can present a wide range of economic and technical benefits, there are some areas where the public cloud platforms are still a little bit lacking, specifically with regard to meeting demanding high availability SLAs for mission-critical applications like SAP.
So the customer realized that the cloud provider that they were working with was not going to be able to deliver the availability SLA that they were looking for. And to try to achieve the SLAs that they used to get when they were running on-premises, they were going to have to consider a third-party approach to meet some of those high availability needs. But in addition to that, there were also a couple of other factors that they needed to consider if they were going to go down this third-party road. The SAP landscape that they were running was very much of a heterogeneous nature, where they were running a little bit of SUSE Linux in one side of the infrastructure, there was a little bit of Oracle Linux elsewhere, and it was a similar picture from a database standpoint as well. There was a bit of SAP HANA, a bit of Sybase, and also a bit of Oracle Database as well.
So if you imagine trying to manage separate high availability solutions for all of these different OSes and databases, that’s going to be a little bit problematic on a day-to-day administration level. And then one of the other things they were looking at was that, seeing that SAP was such a core part of their operations within the organization, it was absolutely vital for them that they adopt a high availability technology that was also going to be certified by SAP themselves. So those were the areas; there were several key criteria that they really needed to meet if they were going to consider this HA strategy with a third party.
Swapnil Bhartiya: You talked about the challenges that this customer was facing. Now, can you also talk about how you helped them overcome these challenges?
Harry Aujla: What we ended up doing, to deliver that high availability and failover protection for those SAP components and the entire SAP landscape, was deploying a range of Linux-based clusters using our SIOS LifeKeeper and SIOS DataKeeper solutions. So this was all deployed into the cloud platform of their choice. And the clusters were deployed in a way where they were spread out across multiple availability zones. And the idea behind that was just to help them avoid any single points of failure in that architecture. So what it ended up giving them was a level of continuous availability on all of the major SAP components and all of those associated operating systems, databases, and also the underlying networking. And it also gave them that single high availability solution that provided them with complete HA coverage for the entire SAP landscape.
So to some extent, you could say they standardized all of their SAP high availability on SIOS. So to capture that in a nutshell, what the SIOS approach provided them with was a comprehensive high availability solution that, firstly, supported the mission criticality of the SAP landscape in its entirety, delivering a 99.99% availability SLA, which is what they were looking for, whilst at the same time delivering that capability to protect a mixed environment of different operating systems and different databases. And then also giving them the assurance that the high availability solution of choice was also certified by SAP themselves.
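To put the 99.99% availability figure in perspective, the short Python sketch below converts an availability target into the downtime it allows per year. It is only an illustration of the arithmetic: the function name and the 365-day year are assumptions made here, not anything taken from SIOS or the customer’s actual SLA terms.

```python
# Illustrative arithmetic only: converts an availability target into a
# yearly downtime budget. The 365-day year is an assumption of this sketch.
MINUTES_PER_YEAR = 365 * 24 * 60


def downtime_minutes_per_year(availability_percent: float) -> float:
    """Return the maximum downtime per year allowed by an availability target."""
    if not 0.0 <= availability_percent <= 100.0:
        raise ValueError("availability must be between 0 and 100 percent")
    unavailable_fraction = 1.0 - availability_percent / 100.0
    return unavailable_fraction * MINUTES_PER_YEAR


if __name__ == "__main__":
    # Compare a few common targets, including the 99.99% from the interview.
    for target in (99.9, 99.95, 99.99):
        minutes = downtime_minutes_per_year(target)
        print(f"{target}% -> {minutes:.1f} minutes of downtime per year")
```

Under these assumptions, a 99.99% target leaves a budget of roughly 52.6 minutes of downtime per year, which is why each extra "nine" is significantly harder to deliver.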
Swapnil Bhartiya: If I’m not wrong, this is just one example of the many kinds of customers that you help. Last time, we talked about business maintenance systems as well. So the fact is that the manufacturing industry is not the only one that you folks help. So can you also share with us some examples of customers in a different industry who are facing either similar or different high availability challenges?
Harry Aujla: Yeah, certainly. So another interesting example I can share with you today actually comes from the healthcare sector. So if you think about it, downtime for applications and storage in this industry, without meaning to exaggerate, can literally be a matter of life and death. So it’s imperative to ensure that you have reliable access to critical systems. And we’re talking about systems like electronic health records and medical imaging technology, picture archiving, and any associated communication systems as well. These are the kinds of systems that need to be running all of the time. And the healthcare industry, from a technology perspective, is also quite vulnerable in that it’s been increasingly targeted in things like ransomware attacks, which, as we all know, can also lead to significant downtime on an infrastructure.
And in Australia, we work very closely with the Chris O’Brien Lifehouse Hospital, who specialize in state-of-the-art research and treatment of rare and complex cancer cases. So they provide a wide range of treatments, services, and additional support that a person with cancer might need. So what Lifehouse were doing, they were using an application called MEDITECH, which was being used for things like patient administration and central storage of patients’ electronic health records. So this system was actually quite vital for them, because if you think about it, if this kind of system goes down, you can’t access the patients’ records, and that could potentially paralyze the hospital’s operations.
So within their data center, they were running this MEDITECH application on a Windows Server Failover Cluster, and this was based on a traditional SAN storage configuration. And like a lot of organizations, Lifehouse planned to migrate this application to the cloud because they wanted to take advantage of cloud agility and affordability. So Lifehouse chose one of the common public cloud providers, and their expectation was to be able to take the existing on-premises environment and just do a simple lift and shift into the cloud environment.
And to be able to simulate the on-premises SAN storage configuration, they chose a cloud volume service that was being offered by the cloud vendor itself through their marketplace. And they took these cloud volumes and they attached them to their Windows Server Failover Cluster. But what they found really quickly was that there was a substantial adverse impact on performance and throughput when using this cloud volume storage service. And they realized very quickly that this service was not going to be suitable for their MEDITECH application. So this meant they had to go back to the drawing board, and after conducting an exhaustive search, they concluded that the best solution that could meet both their availability and performance requirements was the SIOS DataKeeper Cluster Edition solution.
Swapnil Bhartiya: And now, can you go a bit deeper into how you actually helped them with their MEDITECH system?
Harry Aujla: Yeah. So what DataKeeper Cluster Edition provided was that high performance synchronous data replication that Lifehouse were looking for, what they needed. So by using real time block-level replication, or block-level mirroring between the local storage attached to the active and the standby instances of the cluster, the solution overcame the performance issues that they were previously experiencing. And the resulting SANless cluster, shall we call it, is compatible with Windows Server Failover Clustering.
It supplies continuous monitoring for detecting failures, both at the application level and at the database level. And it also offers configurable policies for failing over and failing back. And where they were pleasantly surprised with the solution was how easy it was to implement and operate. So the DataKeeper interface is very much like the Failover Cluster Manager interface. So if you are familiar with administering failover clusters, you are in very familiar territory when you are working with the DataKeeper Cluster Edition interface.
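The idea of synchronous block-level mirroring between the local storage of an active and a standby node can be sketched in a few lines of Python. This is a toy, in-memory model written for illustration only; the class names `BlockDevice` and `SynchronousMirror` are hypothetical and do not reflect how SIOS DataKeeper is actually implemented or configured.

```python
# Toy model of synchronous block-level mirroring. All names are hypothetical
# and the "disks" are plain dictionaries; this is not DataKeeper's design.


class BlockDevice:
    """In-memory stand-in for a node's local disk, addressed by block number."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.blocks: dict[int, bytes] = {}
        self.online = True

    def write_block(self, block_no: int, data: bytes) -> None:
        if not self.online:
            raise OSError(f"{self.name} is offline")
        self.blocks[block_no] = data


class SynchronousMirror:
    """Acknowledge a write only after both copies hold the block."""

    def __init__(self, source: BlockDevice, target: BlockDevice) -> None:
        self.source = source
        self.target = target

    def write(self, block_no: int, data: bytes) -> None:
        # Write the standby copy first: if it is unreachable, the call raises
        # before the active copy changes, so the two copies never diverge.
        self.target.write_block(block_no, data)
        self.source.write_block(block_no, data)
        # Returning normally is the acknowledgement to the application.


if __name__ == "__main__":
    active = BlockDevice("active-node-disk")
    standby = BlockDevice("standby-node-disk")
    mirror = SynchronousMirror(active, standby)
    mirror.write(0, b"patient-record")
    print(active.blocks == standby.blocks)
```

The point of the sketch is the ordering guarantee: because a write is acknowledged only once both copies are updated, the standby can take over without data loss. A real product also has to handle resynchronization after an outage, which this model leaves out.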
Swapnil Bhartiya: Excellent. When we look at all these use cases and these examples, the best thing is that there’s always something to learn from them. As you try to solve a problem, there is a lesson to be learned, which helps with the next customer, or the same customers as well. So can you talk about what lessons we can learn from these examples, which are from two different industries, and what are the key takeaways here?
Harry Aujla: Yeah. That’s a great question. And thinking about some of the examples we’ve just discussed there, I think firstly, what I would say is, pay very close attention to the high availability SLAs that are being offered by the cloud vendors. Keep in mind that their SLAs are only related to the underlying cloud architecture, such as the VM instances, their storage, and the underlying physical architecture. And depending upon the type of service that you’re consuming from the cloud vendor, the availability of the applications and databases running within the instances is actually your responsibility as the customer, not the cloud vendor’s. So if you’ve got a highly critical app or database that you need to protect in the cloud, you may also need to consider an application-level availability solution that will supplement the SLAs that the cloud vendor is delivering at an infrastructure level.
The other thing I think I would say is, try to adopt a high availability strategy that can adapt and grow as and when your IT environment evolves as well. So in other words, avoid falling into the trap of using high availability solutions that are designed for specific operating systems, applications, or databases. Otherwise, you might find yourself managing different availability solutions in a mixed IT landscape. So try to remain as agnostic as possible for your availability needs, and that will be very beneficial for you.
And then the last point I think I would consider is, you’d also want to ensure that your choice of availability technology, and the way it’s architected and the way it’s deployed, doesn’t ultimately have an adverse impact on the performance of the application or the database that you’re trying to protect. If you recall, that’s what the team at Lifehouse experienced. Yeah, it’s all very well having a highly redundant application and database architecture, but you don’t want to deliver that at the cost of a painfully slow service. So it’s key to find that balance between high availability and application performance, which in most cases can only truly be established through testing or proof of concept exercises.
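The first takeaway above, that a cloud vendor’s SLA covers only the infrastructure layer, can be illustrated with a small calculation. The Python sketch below multiplies the availability of layers that all have to be up at the same time. The percentages in the example are hypothetical, and the model assumes the layers fail independently, which is a simplification.

```python
# Illustrative only: end-to-end availability of layers stacked in series
# (every layer must be up for the service to be up). Assumes independent
# failures; the example percentages are hypothetical, not vendor figures.
from math import prod


def serial_availability(*layer_percentages: float) -> float:
    """Return the combined availability, in percent, of layers in series."""
    if not layer_percentages:
        raise ValueError("at least one layer is required")
    return prod(p / 100.0 for p in layer_percentages) * 100.0


if __name__ == "__main__":
    infrastructure = 99.99  # hypothetical infrastructure-level figure
    application = 99.9      # hypothetical application and database layer
    combined = serial_availability(infrastructure, application)
    print(f"end-to-end availability: {combined:.3f}%")
```

Under these assumptions the end-to-end figure is always lower than the weakest layer, so an infrastructure SLA on its own says little about what the application will actually deliver.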
Swapnil Bhartiya: Harry, once again, thank you so much for joining me today and sharing these insights. And as usual, I would love to have you back on the show. Thank you.
Harry Aujla: Great. Thank you as well.