
A Different Approach to High Availability in the Cloud


Guest: Bobby Jagdev
Company: SIOS Technology Corp.
Show: Let’s Talk | SIOS Cloud Availability Symposium

The cloud is everywhere, from end users to enterprise businesses. Every sector of business knows this and has migrated its systems to include the cloud at some level, whether in finance, development, or SAP. So when Bobby Jagdev, Independent SAP Technical Lead and Architect at SIOS Technology Corp., talks about trends, they apply to every business.

One of the most obvious trends Jagdev talks about is cost and suitability. But he also looks at high availability in an upcoming talk. His point is that, traditionally, businesses focus on the hardware side of availability and ask, “If this bit of infrastructure fails, how do I fail the services over to another bit of infrastructure that is there and ready to run those services?” Jagdev believes application awareness is an area that is often misunderstood and sometimes neglected. To that end, he brings up databases in the SAP world: “If we look at databases, then we look at SAP systems as applications, they can also fail without the underlying hardware itself failing.” He adds, “So application awareness in clustering allows for those services to be monitored, and those clustering solutions to effectively be able to monitor a failure of a service and then make a decision on what to do with that service. Now that is fairly crucial in being able to actually run redundant application solutions.”
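To make the distinction between hardware-level and application-level monitoring concrete, here is a minimal Python sketch. It is not taken from SIOS Protection Suite or SUSE HAE; the names `NodeStatus`, `Action`, and `decide`, and the restart-before-failover policy, are assumptions chosen purely for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Action(Enum):
    NONE = "none"
    RESTART_SERVICE = "restart_service"
    FAIL_OVER = "fail_over"


@dataclass
class NodeStatus:
    host_up: bool              # infrastructure-level check, e.g. a heartbeat
    service_healthy: bool      # application-level check, e.g. a database probe
    restart_attempts: int = 0  # local restarts already tried for this failure


def decide(status: NodeStatus, max_restarts: int = 2) -> Action:
    """Pick an action for one node (hypothetical policy, for illustration only)."""
    if not status.host_up:
        # Traditional hardware-focused availability: the node is gone, move the service.
        return Action.FAIL_OVER
    if status.service_healthy:
        return Action.NONE
    # The host is fine but the application has failed: only an
    # application-aware monitor ever reaches this branch.
    if status.restart_attempts < max_restarts:
        return Action.RESTART_SERVICE
    return Action.FAIL_OVER
```

A monitor that only checks `host_up` would report a healthy node in the third case, which is the gap Jagdev describes.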

Jagdev recently compared SUSE Linux Enterprise High Availability Extension (SUSE HAE) with SIOS Protection Suite for Linux. On one side, his team took two HANA databases and implemented HANA system replication between them, then implemented SIOS Protection Suite for HANA to enable automated failover. On the other side, they installed two more HANA databases but used SUSE HAE as the cluster.

With this experiment, they wanted to understand how easy it is to install each product, from reading the documentation to downloading media to executing the installs; how easy it is to configure the cluster packages between the two applications; and how easy the two clusters are to maintain and operate.

But how does this apply to the cloud? Jagdev explains, “The cloud has made it very easy for us to actually spin up servers and to implement infrastructure quite quickly. So now I talk to some of my customers and just guide them through the process of effectively doing similar bake-offs themselves.” The team ran the bake-off in Azure but kept it as cloud-agnostic as possible, so the same comparison could be run on another cloud such as AWS. The experiment also addressed another point: “a lot of organizations have implemented clustering solutions, often some of those other clustering solutions fail to deliver some of the benefits and value that’s required or what would be expected from the outset. When we look at the poor implementations of clusters, they have a tendency to introduce more downtimes than are necessary and also take longer to actually initiate service than could be required, or should I say, that might be the case if there was a manual intervention to resume service.”

Jagdev is excited about the various industry minds coming together in upcoming panels at the Cloud Availability Symposium, where they can engage in insightful conversations and share experiences. The panels will approach high availability differently from what most people have been accustomed to over the last few years.

The summary of the show was written by Jack Wallen.


Here is the rough, unedited transcript of the show:

Swapnil Bhartiya: This is your host Swapnil Bhartiya and welcome to a special episode of Let’s Talk for SIOS Cloud Availability Symposium. And my next guest is Bobby Jagdev, Independent SAP Technical Lead and Architect at SIOS Technology Corp. Bobby, it’s great to have you on the show.

Bobby Jagdev: It’s nice to be here, Swapnil. Nice to meet you.

Swapnil Bhartiya: This is the first time we’re talking, so tell me a bit about your background.

Bobby Jagdev: I’m an independent freelance technology advisor. I’ve been working in the SAP space for nearly two decades now, so good 20 years. So my specialism is technical architecture around SAP. So over that long period of time, now I’ve been designing, implementing SAP technical solutions from new implementations, to upgrades, to migrations across a variety of industry sectors. So usually my work involves me designing all the way from requirement gathering, blueprinting, to buying hardware, system migrations and moving customers into steady state.

Swapnil Bhartiya: Across industries, digital transformation companies are moving towards the cloud. What kind of trends are you seeing looking at your own area of scope and expertise?

Bobby Jagdev: I think a lot of the trends, really, is the transformation to the cloud. And in particular, I think a lot of focus around cloud migrations is around cost and availability. Cloud is largely the area where a lot of organizations are attempting to land at the moment. So that definitely is the biggest trend, and I think the biggest focus at the moment around that is cost and suitability.

Swapnil Bhartiya: But you are also speaking at the upcoming Cloud Availability Symposium by SIOS. Talk a bit about your participation there.

Bobby Jagdev: I’m going to be involved in a few sessions at the symposium. And in one of the sessions I’ll be presenting what I call a requirement-led solution design approach to high availability. And as you touched on, with the move to the cloud, availability of systems is starting to become more apparent for various reasons. Now, in one of the sessions there, I’m going to be sharing what I call an RPO and RTO design-led approach. So for those that are unaware, RPO is Recovery Point Objective and RTO is Recovery Time Objective. So the former being how much data you could potentially lose in the event of a high availability event, as in hardware failure, and the RTO being how long it takes to resume those services. So in one of those sessions, Swapnil, I’m going to be sharing, I guess, my approach to how to actually design for high availability in the cloud.

And as I mentioned, it’s a requirement led approach. Now I’ve always been a very strong advocate that as a solution designer, we shouldn’t necessarily just be there translating requirements into solutions, that we should really be there helping organizations in actually defining their requirements based on the available solutions and then the pros and cons of those solutions as well. So the focus of one of those sessions will be my advocacy in not necessarily having solution design as a transactional process, but something where we engage with clients to help them define their requirements and design the right solutions at the same time.
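As an editorial illustration of the RPO and RTO definitions given in this answer, the following Python sketch measures the data-loss window and downtime of a single failover test and compares them with stated objectives. The function names, timestamps, and targets are hypothetical and do not come from the interview.

```python
from datetime import datetime, timedelta


def measure_outage(last_replicated: datetime,
                   failed_at: datetime,
                   resumed_at: datetime) -> tuple[timedelta, timedelta]:
    """Return (data-loss window, downtime) for one failover event."""
    data_loss = failed_at - last_replicated  # what RPO puts a limit on
    downtime = resumed_at - failed_at        # what RTO puts a limit on
    return data_loss, downtime


def meets_objectives(data_loss: timedelta, downtime: timedelta,
                     rpo: timedelta, rto: timedelta) -> bool:
    """True when both measured values fall within the stated objectives."""
    return data_loss <= rpo and downtime <= rto


# Hypothetical failover test: the last replicated change landed 30 seconds
# before the failure, and service resumed 4 minutes after it.
loss, down = measure_outage(
    last_replicated=datetime(2021, 1, 1, 12, 0, 0),
    failed_at=datetime(2021, 1, 1, 12, 0, 30),
    resumed_at=datetime(2021, 1, 1, 12, 4, 30),
)
ok = meets_objectives(loss, down,
                      rpo=timedelta(minutes=1), rto=timedelta(minutes=5))
```

In a requirement-led design, the `rpo` and `rto` values would be agreed with the business first and then used to judge candidate solutions.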

Swapnil Bhartiya: And as you mentioned earlier, you will be participating in a lot of talks, panels, and other sessions. You’re also participating on a panel about the importance of application awareness in highly available environments. Talk a bit about that panel.

Bobby Jagdev: That’s going to be an interesting and insightful conversation. So application awareness is an area that I have a lot of personal interest in.

Again, as I mentioned, with many years of experience in actually implementing high availability solutions as well, it’s an area that’s also largely misunderstood, because when we look at availability, most of the time organizations are looking at availability from a hardware failure perspective. So if this bit of infrastructure fails, how do I fail the services over to another bit of infrastructure that is there and ready to run those services? So traditional availability looks at it from hardware, to provide redundancy for hardware failure. But application awareness is an area that is often misunderstood and sometimes neglected. But inevitably, if we look at databases, then we look at SAP systems as applications, they can also fail without the underlying hardware itself failing. So application awareness in clustering allows for those services to be monitored, and those clustering solutions to effectively be able to monitor a failure of a service and then make a decision on what to do with that service. Now that is fairly crucial in being able to actually run redundant application solutions.

Swapnil Bhartiya: Excellent. And if I’m not wrong, you recently also did a side-by-side comparison of SUSE’s High Availability Extension and SIOS Protection Suite for Linux. Talk a bit about the setup and what were your findings of that comparison?

Bobby Jagdev: That’s an interesting question and it was certainly an interesting experiment, Swapnil. I like to use the term experiment because it’s effectively what we’re doing. So what we did is we took two HANA databases, let’s say two HANA databases on this side. And we implemented HANA system replication between those two HANA databases and then implemented SIOS Protection Suite for HANA to allow for automated failover and takeover between those two HANA databases. Now, at the same time, on the other side, we installed two other HANA databases. But in this example, we implemented SUSE HAE as a cluster against that cluster pair. And the objective, really, Swapnil, and I think we ended up calling it a bake-off in the end between them both, was to actually do that comparison, to understand the merits between the two solutions.

Now, the key areas that we wanted to understand was when we’ve got SIOS protection suite over here, SUSE HAE over here, to compare and contrast how easy is it to actually install the products straight from documentation, to downloading media, and then executing the installs, then to how easy is it to actually configure the cluster packages between the two applications and then finally, ease of maintenance and operation of those two clusters as well. Now, the interesting thing is Swapnil, this is a question that we’re often asked out there is we’re looking at comparing two different product sets, but we very rarely get the opportunity to actually run side by side tests of two different products at the same time. So it was something that I was very interested in doing, because I often talk to my clients around the advantages of using the SIOS protection suite.

But when you run a POC in itself and you have actual data by standing up two examples together, having those data points certainly helps in being able to communicate some of those advantages. And one of the things that I’ve seen now off the back of that, is as I talk to customers about the bake off, is they want to try something themselves. Now the cloud has made it very easy for us to actually spin up servers and to implement infrastructure quite quickly. So now I talk to some of my customers and just guide them through the process of effectively doing similar bake offs themselves. And we executed this in the Azure Cloud because we needed somewhere to do it. But one of the things that was a principle that we applied from the outset actually to this was, we need somewhere to run this from, but can we make the bake off as agnostic as possible to a cloud based solution?

That’s effectively what we did. So if I have customers that are doing something in AWS, we can effectively run the bake-off in AWS as well, as an example. So yeah, it was a very interesting experiment, something that I think we all got very excited about actually doing, and actually seeing the results from that. And it also addressed, I think, another point, which was where a lot of organizations have implemented clustering solutions, often some of those other clustering solutions fail to deliver some of the benefits and value that’s required or what would be expected from the outset. When we look at the poor implementations of clusters, they have a tendency to introduce more downtimes than are necessary and also take longer to actually initiate service than could be required, or should I say, that might be the case if there was a manual intervention to resume service.

So one of the things that I get asked a lot, and we also tested this in the bake-off, is if I’ve got a product clustering SAP over here, how can I retrofit a different product onto that existing solution? Now, one of the things that we did in the bake-off is we did exactly that. We had a running system, running SUSE HAE, and we implemented SIOS under the covers as a retrofit to take over the monitoring and the failover of those services. Now there is a slight disclaimer on that: anyone wanting to do anything like this in a production environment would want to rehearse and test these things, of course, but it was another value add from the bake-off. So the goal, answering your question, Swapnil, and I hope I answered it, was that we wanted to do a compare and contrast between two products and aim to answer some of the questions that we get asked a lot, but actually doing it in real life.

Swapnil Bhartiya: Of course, you are participating in a lot of sessions that day, but if I ask you, what are the things that you are excited about when we look at this symposium?

Bobby Jagdev: I’m most excited about just the various industry minds coming together over a couple of days, engaging in some insightful conversations, sharing some experiences. And those sharing of those experiences is one of the most, I think, valuable areas. And also the fact that we are going to be touching on some topics and hopefully cutting through high availability from a different approach than I think most people have been accustomed to over the last few years.

Swapnil Bhartiya: Bobby, thank you so much for taking time out today to talk about not only your own background, but also your participation in the upcoming symposium, and also the comparison, the bake-off that you talked about. Thank you for your time today and I would love to have you back on the show. Thank you.

Bobby Jagdev: You’re welcome. Thank you, Swapnil.
