How Cloudify Simplifies Things In The Cloud-Native World

When things can get complicated very quickly in the cloud-native, Kubernetes-centric world, Cloudify aims at simplifying end-to-end cloud services and network automation with its open-source, intent-based orchestration platform connecting any cloud, device, or third-party automation tool. In this episode of Let’s Talk, we sat down with Nati Shalom, CTO and Founder of Cloudify, to talk about the company and how it’s helping customers with the open-source ‘Environment as a Service’ platform.

Here are some of the topics we covered in this episode:

What specific problem were you trying to solve when you created the company?
How do you define ‘Environment as a Service’ and what does its next generation look like?
How is Cloudify actually helping customers with the Environment as a Service platform? Shalom talks about the solutions his company offers.
In the cloud-native world, things get complicated very quickly. Shalom talks about simplifying things for users.
In a recent podcast, Shalom talked about the trillion-dollar paradox, arguing for a focus on cloud optimization instead of repatriation; he shares his insights into that discussion.
We then talked about the 5G rollout and the role of Communications Service Providers (CSPs).

Guest: Nati Shalom (LinkedIn, Twitter)
Company: Cloudify (Twitter)
Show: Let’s Talk

Swapnil Bhartiya: Hi, this is your host, Swapnil Bhartiya, and welcome to another episode of Let’s Talk. Today we have with us Nati Shalom, CTO and founder of Cloudify. Nati, it is good to have you on the show.

Nati Shalom: Thank you very much. Glad to be on the show. I’ve read a lot about you and the show, and I must admit that I really enjoy following all the content that you guys are delivering. So I’m excited to be here.

Swapnil Bhartiya: And I’m equally excited to have you on the show. This is the first time we are talking to each other, so I would love to know a bit more about the company, since you’re the founder. What is the specific problem you were trying to solve when you created the company? Tell me the story of the company.

Nati Shalom: Sounds good. I’ll start with maybe a quick introduction of myself. As you mentioned, I’m the CTO and founder of Cloudify, previously GigaSpaces, and the background is that what excites me is really the intersection between technology and business. In general, as we probably know, open source and cloud really introduce a lot of opportunities in this regard. And in that context, I’m actively pursuing a lot of thought leadership type of activities, podcasting, et cetera.

I also find all the complexity of dealing with highly distributed systems quite fascinating. That explains the background that I also had with GigaSpaces, starting with NoSQL, grid computing, and HPC, and now it’s multi-cloud and edge. What we’re going to talk about is environment as a service, which is the next generation of DevOps in my view. And really it’s all about how to drive new innovation and make a lot of those distributed systems as simple as local systems. That’s, I would say in a nutshell, the type of thing that drives a lot of the ideas that I’ve been working on through my career, and also a lot of the innovation that I’m working on with strategic partners: AWS, ServiceNow, Wind River, VMware, et cetera.

Swapnil Bhartiya: Excellent, thanks for sharing that background. Now, you used the term ‘environment as a service’. Today we do look at most things as a service, and the goal is to make it easier and lower the barrier of entry for companies so they can quickly jump in. Otherwise, if they had to do everything themselves, it would take them years to get up and running. So how do you define environment as a service? I think you also call it version 2.0. So let’s start with what it is and how you see the next generation of it.

Nati Shalom: Yeah, so maybe again, a quick background on how we came up with the term. When we started with the terms orchestration and automation, I would say generation one was really about moving manual processes into automated processes. A lot of the focus was on that. Fast forward to 2020, 2021: we’re now at a point in time in which there is no shortage of automation. Everyone, I think, understands the concept of automation.

And we are actually at the point in which we have too much automation. A lot of that is driven by the fact that we need faster time to value. Therefore we find that having one automation to rule them all isn’t enough, and we’re going to see a lot of specialized automation that is specific to each problem. We are now seeing more of that coming along with DevSecOps, then with MLOps and FinOps, and that list grows and grows. Environment as a service 2.0, I would say, is really positioned for that world, where we have a lot of automation tools and a lot of machine templates.

People have started the journey with Kubernetes and Terraform as part of that, but are now finding themselves dealing with a lot of Kubernetes clusters and a lot of Terraform templates, and are looking for a way to solve a lot of that complexity. Traditionally, what happened is that they put DevOps teams on that problem. What we found is that a lot of those teams were writing a lot of glue code to integrate all those automation tools, building the automation around the environment through a lot of custom work. And obviously, as the system grows, that doesn’t scale. That’s how the product, environment as a service, came along: to provide a built-in solution rather than doing a lot of that custom integration.

Swapnil Bhartiya: Well, let’s talk about the company and its offerings, so that we can also understand, when you talk about environment as a service, how you actually help. What kind of offerings do you have for your users and customers?

Nati Shalom: Yeah. So let’s use a use case; I think that helps to explain the concept. Think about development and production, which everyone has. In most development and production systems you’d have Kubernetes, you’d have Terraform, you’d have a database, you’d have networking. The production environment would have requirements to deliver everything in a highly available fashion, because we optimize for availability and scalability. The development environment, on the other hand, has very different requirements. It would use the same stack, but we want it to be completely isolated. We want to run it very fast and roll it out very fast so that we can run a lot of tests around those types of environments. And we also want to optimize it for cost, because a lot of the resources don’t have to be redundant. We just need the API endpoint. We don’t really care if it’s highly available or not for the testing period.

So those are two, I would say, very interesting examples of the same environment, if you like, but with very different needs per environment. And right now we are managing a lot of those variations of this type of environment. It could be development and production, but it could also be different products.

It could also be the same environment across different customers that we’re serving in our SaaS environments. There are a lot of cases in which we have almost the same thing, but it needs to behave differently. That leads to a lot of complexity, because when we have this variation, it really means that we are duplicating a lot of the work: we’re duplicating a lot of the maintenance work, a lot of the configuration work, a lot of the high availability work. That leads organizations to run in an environment that is, I would say, far less optimized because of that complexity. Environment as a service is really all about decoupling the infrastructure from the workload and allowing organizations to do the dynamic matchmaking between the right infrastructure and the job. That’s the whole concept behind it.

So the idea is that if we can decouple it, we can now, one, simplify. We can have the dev environment as one environment and the production environment as one environment, and it would be dynamically adjustable: it would fit the SLA of development when we’re running it in development, running the stack in a single-instance mode, for example, or it would be optimized for production if we’re running it in production. The developer who is running a workload against this environment doesn’t need to be exposed to all those differences.
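To make the decoupling described above concrete, here is a minimal Python sketch. It is an illustration only, not Cloudify’s API: the `Workload` and `Profile` classes, the `resolve` function, and the replica counts are all hypothetical. The workload lists the logical services it needs once, and a per-environment profile decides how each service is provisioned.

```python
from dataclasses import dataclass

# Hypothetical names and numbers throughout: an illustration of the idea,
# not Cloudify's actual API or defaults.


@dataclass(frozen=True)
class Workload:
    name: str
    needs: tuple  # logical services the workload depends on


@dataclass(frozen=True)
class Profile:
    replicas: int            # instances provisioned per service
    highly_available: bool   # whether redundancy is required


# One profile per target environment (assumed values).
PROFILES = {
    "development": Profile(replicas=1, highly_available=False),
    "production": Profile(replicas=3, highly_available=True),
}


def resolve(workload, target):
    """Match a workload to infrastructure settings for the target environment."""
    profile = PROFILES[target]
    return {
        service: {
            "replicas": profile.replicas,
            "highly_available": profile.highly_available,
        }
        for service in workload.needs
    }


app = Workload(name="shop", needs=("kubernetes", "database", "network"))
dev_env = resolve(app, "development")
prod_env = resolve(app, "production")

print(dev_env["database"])
print(prod_env["database"])
```

The workload definition is identical in both calls; only the target name changes, which is the sense in which the developer is not exposed to the differences between environments.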

So the whole idea of environment as a service is really, through that decoupling, to provide the right infrastructure for the job and to simplify how this matchmaking happens behind the scenes. It is an open source orchestration product, and it’s actually running today in, I would say, highly demanding types of environments in enterprises and software companies. It has been in production for a couple of years already, running at the AT&Ts of the world, the Morgan Stanleys of the world, the TD Banks of the world, a lot of those types of organizations. Interestingly enough, we are starting to hit the scale-out companies, the web-scale type of companies, which are software vendors. And that’s, I would say, the next wave of adoption of Cloudify that we’re seeing right now in the market.

Swapnil Bhartiya: Now I’ll go back to the point that you were making about Kubernetes and how things get complicated very quickly. I think by design you cannot avoid complications when you are talking about cloud or Kubernetes. And that’s when you also talk about looking at it as an orchestrator of orchestrators. Once again, it feels like you’re trying to simplify and make things easier for users. So explain, what do you really mean there? Of course you alluded to how you’re helping them, but let’s talk about that.

Nati Shalom: Yeah. So again, let’s start with a real use case. Usually when you start the journey with cloud native, Terraform, and infrastructure as code, you start with a few templates and a single Kubernetes cluster. Very quickly, you get to the point where you have multiple Kubernetes clusters, many Terraform templates, and many other templates that you already have in scripts, et cetera. So the first thing is really what organizations have done: they put DevOps into the problem. They put people to actually write glue code. Glue code meaning: how do I take this Terraform template and Kubernetes cluster and move a workload here and move a workload there, run it in a pipeline for this use case and that use case. A lot of that glue code is done in a way that is very proprietary to one use case.

And once you need to make changes to that glue code, that’s where things become very complex, because it’s like spaghetti. What environment as a service really does is provide a lot of that integration: with Kubernetes, with Terraform, with [inaudible 00:09:14], with CloudFormation, with Azure. A lot of that integration is out of the box. We have building blocks that we’ve already invested in, and you don’t have to write this integration yourself. That’s, I would say, number one. Second, we allow you to organize it in a templatized environment. The entire environment has a state behind it, and we can therefore manage update scenarios in a much simpler and more generic way. So instead of an update being going to a pipeline and changing a script, an update is just running an operation that says, oh, just add another component into the system, add another firewall, change configuration. It will know how to deal with that delta and apply it to the end-to-end environment to reach the end state that you’re looking for.
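The delta-based update described above can be sketched in a few lines of Python. This is a simplified illustration under stated assumptions, not how Cloudify implements it: the environment state is modeled as a plain dictionary of component names to configuration, and `plan_update` and `apply_plan` are hypothetical helper names.

```python
def plan_update(current, desired):
    """Compute the operations that move the `current` state to `desired`."""
    ops = []
    # Components that exist only in the desired state must be added.
    for name in sorted(desired.keys() - current.keys()):
        ops.append(("add", name, desired[name]))
    # Components that exist only in the current state must be removed.
    for name in sorted(current.keys() - desired.keys()):
        ops.append(("remove", name, None))
    # Components present in both are changed only if their config differs.
    for name in sorted(current.keys() & desired.keys()):
        if current[name] != desired[name]:
            ops.append(("change", name, desired[name]))
    return ops


def apply_plan(current, ops):
    """Return a new state with the planned operations applied."""
    new_state = dict(current)
    for action, name, config in ops:
        if action == "remove":
            new_state.pop(name)
        else:
            new_state[name] = config
    return new_state


# Hypothetical environment: add a second firewall and resize the cluster.
current = {"cluster": {"nodes": 3}, "firewall-a": {"port": 443}}
desired = {
    "cluster": {"nodes": 5},
    "firewall-a": {"port": 443},
    "firewall-b": {"port": 8443},
}

ops = plan_update(current, desired)
for op in ops:
    print(op)
```

Because the plan is derived from the stored state rather than from a hand-edited pipeline script, the untouched `firewall-a` produces no operation at all, and applying the plan to `current` yields exactly `desired`.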

Swapnil Bhartiya: Now I want to quickly change topic and talk about one of the use cases you mentioned, which is something that we cover closely here as well and is personally interesting to me: 5G. I would like to understand, from your perspective, what is the role of communications service providers, or CSPs, in the 5G rollout? Also, last year, if you remember, the US government released some spectrum to further democratize the 5G rollout. So talk about that, and of course the role of CSPs there.

Nati Shalom: On the things that you mentioned about 5G: people think about 5G as a better version of 4G, but you touched on something that I think is fundamentally different. The first is private 5G; think about it as a better version of Wi-Fi. The second is something called network slicing, which means that instead of having one network that is shared between everyone, you can attach or adjust a network to fit specific users. The relevance to Cloudify is that, at the end of the day, this leads to a highly complex and dynamic type of network that, unlike any other distributed system, requires a lot of automation that previously was done half manually, semi-manually, and things along that line. With 5G, CSPs are now forced to do two major things that are very different from their nature, if you like.

One of them is to have everything fully automated and forget about a lot of that manual work. The second is to collaborate with public clouds to support private 5G, because they need to collaborate not just to get scale in the infrastructure, but also to get the mindshare of developers. That changes a lot of the dynamics. In my view, the change is really analogous to the change that we’ve seen between Netflix and Blockbuster. If you recall the analogy, Netflix realized that the change happening in the media vertical was so dramatic that they moved to public cloud and adopted a lot of the DevOps practices quite aggressively, while Blockbuster was left behind, and we all know where each one of them ended up. So I think that CSPs are getting to a very similar type of intersection point to the one of Netflix and Blockbuster in that regard.

Swapnil Bhartiya: Now let’s talk a bit about a recent podcast that you folks did with Martin [inaudible 00:12:32]. It was an interesting podcast, up there on the site, about the trillion-dollar paradox, where you argued that the focus should be on cloud optimization instead of repatriation. Of course, people can go and listen to the podcast and also read the whole transcript there, but I want to hear your views on that and go a bit deeper into that podcast.

Nati Shalom: Yeah, definitely. So I think Martin [inaudible 00:13:02] called it in the podcast cloud war 1.0 versus cloud war 2.0, which I liked. The idea of cloud war 1.0 was really cloud providers seeing the check that they needed to pay to hardware vendors like Dell, HP, et cetera. At the point at which it reached 50% of their total sales revenue, they came to the conclusion that they had to look at the full supply chain and optimize that stack. There was no way they could afford paying such an amount to a single supplier. And that’s how they started to run their own hardware ventures, if you’d like. That was 10 years ago, and at that time it looked almost insane. I think when we all look back at it now, it looks much more sensible than it did back then.

He points to a similar trend that is happening right now with the hyperscale software vendors. What he points out is that the cloud providers have become the equivalent of those hardware vendors: SaaS providers now pay almost 50% of their revenue, or even more, to public cloud providers because of the lack of efficiency. He pointed to Spotify, Dropbox, and other companies that have got to the point where, similar to what cloud providers did in the past, they looked at their supply chain, found out how much they were spending on public cloud, and said, okay, we’ll repatriate some of that workload. By repatriation, I mean we’ll take some of that workload outside of the cloud and manage it in a much more optimized stack ourselves, because it’s very special. It doesn’t mean that they moved everything into their private cloud.

It means that they took the workload that is, I would say, highly sensitive to cost and very special in terms of its usage. It’s not a generic workload, and repatriating it obviously has an impact on cost. That’s the part that they started to optimize. My argument is that if you really look at it, it talks about efficiency in more general terms, and about how companies are not really paying attention to efficiency because they’re very much focused on velocity, meaning adding more features and new products. They are not incentivized or measured on how well they keep the margin high, only on how they grow and add more customers, features, and products. I think he points to that in what he called cloud war 2.0. And I took that point and generalized it and said, it’s not just repatriation.

Repatriation is one option to get to that level of optimization. There’s much more to it. Even if you look at Amazon itself, there are 400 VM options in EC2. 400 options to run EC2 on Amazon itself, let alone Azure and other clouds. There are many different platforms to run the same workload: ECS, EKS. We now have Lambda services, and the list goes on and on.

And if you look at efficiency from that angle, even within the same cloud, even if we don’t repatriate, there are a lot of options that most organizations don’t have a clue how to deal with. The main lesson there is that, one, you have to gain that practice, especially when you reach that scale. And second, it’s continuous engineering work that you have to be prepared to put resources and effort into. It’s not like you’re going to throw a tool at it that’s going to do some magic and all of a sudden optimize what you’re doing. It’s going to be a lot of engineering work, and you have to prioritize it in the same way that you prioritize other product features and customers. And it has to be higher on the priority list of most of those companies than it is right now.

Swapnil Bhartiya: If you look at the edge use case, that’s where we talk more about hardware. When I look at cloud native, I look at it as more of a process, a way of doing things, rather than a thing itself. You can have a purely cloud-native environment in your data center as well. But going back to the point of hardware, how do you look at the edge use case? That’s where developers have to deal with hardware, because it is sitting in some remote location. You can, of course, orchestrate it using capabilities, but from Cloudify’s perspective, how do you look at that? Because that’s something we are talking about very little.

Nati Shalom: Yes. I think that’s actually interesting, because we usually think of edge as network use cases. But now, if you look at a lot of security companies, you see that they have to put a lot of points of presence, called POPs, around the globe for latency; think of the Zscaler type of scenario and things along that line. There are a lot of use cases where you need that low-latency access, where it becomes a good practice. Now, there are a couple of things that drive that hardware type of optimization. One of them is cost. When you get to that volume of instances at the edge, whether it’s for a content delivery network or for those POP instances, you’re usually talking about thousands of instances that you have to manage.

And obviously, if you can optimize every element there, the gain in terms of cost becomes substantial. In many cases you can optimize it because you’re not looking for something generic; you’re looking for something very specific, and there are a lot of ways to optimize it. So that’s the business case for going that route when it comes to edge. The other thing that I think is important when it comes to edge is the fact that it is, in many cases, running in someone else’s environment, not necessarily just in the cloud. Because of that, there’s going to be some placement, if you like, and other types of things that need to be done in terms of provisioning and managing that provisioning. I think that also drives the thinking of edge as a separate use case, a different use case than, I would say, other workloads that are running in the cloud.

That’s, I would say, the use case itself. Now, from a management perspective, we actually take a very different view, in which we want to look at edge as just another compute instance in the cloud. Why is that? Because we want simplicity. We don’t want complexity. If we look at edge as yet another compute instance in the cloud, then we can apply the same practices and tools that we have in the cloud, and edge becomes just another compute instance. So there is this interesting tension, if you like, between having a specialized use case of edge, optimizing it for efficiency and cost, versus managing it in a simplified way, where you want to look at edge as just another compute. And that’s, I would say, my view on that.

Swapnil Bhartiya: Excellent. Nati, thank you so much for taking time out today to talk about Cloudify, environment as a service, and other points. Thanks for those insights. As I said, I would love to have you back on the show. Thank you.

Nati Shalom: Thank you very much.
