The Data on Kubernetes Community (DoKC) started last year to address the challenges of running data workloads on Kubernetes. After discussions with thousands of companies and individuals running data workloads on Kubernetes, a need emerged for sharing patterns and concerns about how to build and operate data-centric applications.
Bart Farrell, CNCF Ambassador and Data on Kubernetes Community Leader, says, “The idea is to make this process simpler by pooling resources from different practitioners around the world who can speak from their experience.” Within the DoKC are SREs, DBAs, and just about any stakeholder working with Kubernetes (especially on the data side).
DoKC recently announced founding sponsors and a governance structure to accelerate the emergence and development of techniques for the use of Kubernetes for data. As to why anynines joined the DoKC project, Julian Fischer, CEO of anynines says, “I need a mission, I need something that I believe that the world needs to have or there must be a particular way the world needs to be changed.” He adds, “My personal gain is the feeling that I do something that’s meaningful for the organization. We are exploring opportunities in the Kubernetes ecosystem by providing automation tooling that solves these problems we are talking about.”
Fischer poses two very important questions, with regards to data on Kubernetes: What kind of challenges do you see and what are the trends you feel the need to create? Fischer also intimates that it’s not just about being able to answer those questions quickly, but with enough details to know where someone is coming from.
To that, Farrell adds, “For a lot of practitioners, the standard practice up until now has been to maintain everything statelessly…to keep the data out of Kubernetes.” He mentions Kelsey Hightower, who tweeted in June that the folks at Google had been working on the El Carro operator, which allowed the Oracle database to run on Kubernetes. Farrell says, “So now we’re even seeing a shift in terms of the mindset and the way of looking at this, even from folks, such as himself, that previously had gone against that.”
Farrell then points out that more and more companies and end users are running Kubernetes in production and are taking this notion very seriously. He says, “There are different reasons as to why folks would want to run data on Kubernetes and those were the questions that we’re trying to answer in the community.”
According to Fischer, Kubernetes has gone a long way to address stateful workloads directly. With that in mind, Fischer believes Kubernetes is killing OpenStack. “And that was basically addressing the idea that OpenStack once was a potential candidate for providing a unified API to access infrastructure. And it appears that Kubernetes is actually consuming crowds thinking and believing in this idea,” explains Fischer.
One of the challenges Kubernetes faces is when customers need to run workloads across the globe. On that, Fischer says, “Depending on the particular needs, sometimes data should be operated on virtual machines and sometimes it’s more in favor of a Kubernetes-based installation.” Fischer offers an example of the Cloud Foundry platform running as Kubernetes-native workloads. To that, Fischer says, “This idea can also be transferred to a Kubernetes stack that doesn’t use Cloud Foundry. So if you were to run a distributed system purely on Kubernetes, you’d rely on the infrastructure abstraction offered by Kubernetes, assuming there will be a Kubernetes on every infrastructure.”
Once a workload is described in Kubernetes, it can be run anywhere. That’s the charm of the technology and it’s perfectly compliant with the anynines mission to provide automation to application developer platforms, no matter the underlying technology.
anynines @ KubeCon
As for what to expect at KubeCon, Farrell says, “Well, in terms of the event, we’re very excited to be having folks like Julian who really come to the table with the level that we expect in our community because it’s one thing to understand Kubernetes and that’s already limited, but to really understand what we’re talking about when we’re speaking about Data on Kubernetes, that’s what we’re looking for.” Farrell goes on to detail what the community is bringing to the event, such as panels from vendors and end users (including Zalando, Macquarie Bank, and Dreamworks), as well as talks about running stateful workloads on Kubernetes, running Kafka on Kubernetes, storage, operators, databases, and more.
Farrell does mention that the event will be 100% virtual, though he will be onsite in Los Angeles with the DoKC team. Over 2,500 people have already signed up.
The summary of the show is written by Jack Wallen
Here is the rough, unedited transcript of the show…
Swapnil Bhartiya: Hi, this is your host, Swapnil Bhartiya, and welcome to TFiR Let’s Talk about Kubernetes, a special show for KubeCon and CloudNativeCon. The Data on Kubernetes Community is organizing DoK Day at the upcoming KubeCon and anynines is one of the sponsors of the event. To learn more about the event and the community, today we have with us Bart Farrell, CNCF Ambassador and Data on Kubernetes Community Leader, and Julian Fischer, CEO of anynines. Bart, Julian, it’s good to have you both on the show.
Bart Farrell: Thanks for having me.
Julian Fischer: Hi, thanks for having me too.
Swapnil Bhartiya: Bart, let’s start with you. What is the Data on Kubernetes Community? I mean, what kind of folks are a part of this community? So give us a quick overview of the community.
Bart Farrell: Okay. The community started last year and is basically addressing the issue of running data, stateful workloads, on Kubernetes. This has been a challenge up until now. So the idea is to make this process simpler by pooling resources from different practitioners around the world who can speak from their experience. So in terms of the folks that are in our community, we have SREs, DBAs, DBREs. Anybody in terms of stakeholders that are going to be working with Kubernetes and specifically on the data side.
Julian Fischer: Right. If we just look at Kubernetes, Kubernetes is a big giant monolith, no pun intended there. There are so many components, but let’s just focus on data. What kind of challenges did you see, or what kind of trends did you see, that made you feel the need to create this community? I want to go quickly into the details so that we know where you’re coming from.
Bart Farrell: Of course. For a lot of practitioners, the standard practice up until now has been to maintain everything statelessly. That is to say, keep the data out of Kubernetes. The sort of trend coming from folks like Kelsey Hightower has been that it’s much easier, simpler, to keep your data out of Kubernetes. So just have everything be stateless.
Until this year, when Kelsey himself tweeted in June, that, the folks at Google had been working on the El Carro operator and it was allowing Oracle databases to run on Kubernetes. So now we’re even seeing a shift in terms of the mindset and the way of looking at this, even from folks, such as himself, that previously had gone against that.
We’re now seeing more and more companies, more and more end-users that all are running Kubernetes in production and are really taking this notion seriously. Perhaps they want to have all of my stack in the same place, perhaps for cost reasons it’s better or perhaps I have sensitive data. There are different reasons as to why folks would want to run data on Kubernetes and those were the questions that we’re trying to answer in the community.
Swapnil Bhartiya: Excellent. Thanks for explaining that. Julian, I’ll come to you now. We have talked about, data anynines, you folks have anynines data service platform as well, but what did you see in this community and of course this event that you decided to support it?
Julian Fischer: Well, Kubernetes is going a long way and I think it’s fair to say that after the introduction of stateful sets, Kubernetes has definitely taken steps to address stateful workloads more directly. I have a few, I don’t know, a few months ago, I think I’ve written an article about whether Kubernetes is killing OpenStack. And that was basically addressing the idea that OpenStack once was a potential candidate for providing unified API to access infrastructure. And it appears that Kubernetes is actually consuming crowds thinking and believing in this idea. So accumulating members in a community believing that data could be operated on Kubernetes is an idea that I personally find meaningful because the role Kubernetes plays in decorating infrastructures.
Swapnil Bhartiya: And how does this participation in the event and supporting the community fit into the larger picture of anynines?
Julian Fischer: Well, if I think about the customers that we have, we often see global organizations who need to run workloads across the globe. Sometimes they have existing platform technologies such as Cloud Foundry, sometimes they come up with greenfield Kubernetes environments.
Depending on the particular needs, sometimes data should be operated on virtual machines and sometimes it’s more in favor of a Kubernetes-based installation.
For example, if a Cloud Foundry environment is to be reduced to its minimum. A Cloud Foundry based on Kubernetes, currently the trending Cloud Foundry as well would be desirable to use and then it would also be interesting to have an operator instead of standalone virtual machine automation so that the entire Cloud Foundry platform can run as Kubernetes native loads.
Now, of course, this idea can also be transferred to a Kubernetes stack that doesn’t use Cloud Foundry. So if you were to run a distributed system purely on Kubernetes, you’d rely on the infrastructure abstraction offered by Kubernetes, assuming there will be a Kubernetes on every infrastructure. And once a workload is described in Kubernetes, you can run it everywhere. That’s basically the charm of it and it’s perfectly compliant with our mission to provide automation to application developer platforms, no matter what the underlying technology is.
Swapnil Bhartiya: You mentioned that Kubernetes will be everywhere. I want to quickly talk about something we have talked about earlier, Edge use cases, because they pose a different challenge than a data center where you have all the resources handy. You can send your team into the data center as well. So do you see any unique challenges when it comes to data and Edge that Kubernetes will be handling? Because there are a lot of lightweight Kubernetes distributions out there, all the way from SUSE to Mirantis to Canonical and all those companies out there.
Julian Fischer: Yeah, sure. I mean, if you look into a data center, we often see data centers with a lot of redundancy. So there are, let’s say, three availability zones, and they are strictly isolated from each other to contain fires or outages in the network or in the power supply. And if you deploy a Kubernetes cluster, I’d assume that nodes are spread across availability zones too. If you then schedule a database, let’s say it’s a stateful set, I’d also annotate the pods accordingly so that these pods are spread across availability zones. And with such a configuration, a database could survive the failure of an availability zone, assuming it’s clustered and has automatic failover.
Now, if you take Kubernetes into an Edge scenario where there is no such redundancy, because, for example, you’re running Kubernetes on a physical server running in a rack somewhere in a plant, then you may have degraded availability compared to a regular data center. However, the charm of it is, if you think that you’re running an organization with a lot of different plants where there’s a lot of IT everywhere anyway, and you would like to have a more uniform way of managing the life cycle of that software, then Kubernetes is a big step in the right direction, even if there are some compromises to accept compared to regular, more cloud native data centers.
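The zone spreading Julian describes can be sketched in a few lines. The Python example below is a minimal illustration, not anynines tooling: it builds a StatefulSet manifest as a plain dictionary using the standard `topologySpreadConstraints` field and the well-known `topology.kubernetes.io/zone` node label, and adds a small helper that checks whether a majority of replicas survives the loss of one zone. The names `demo-db`, `example/db:1.0`, `statefulset_manifest`, and `survives_zone_loss` are hypothetical, and the strict-majority rule is an assumption about how a clustered database keeps quorum.

```python
import json

# Well-known Kubernetes node label that identifies the availability zone.
ZONE_KEY = "topology.kubernetes.io/zone"


def statefulset_manifest(name, replicas, image):
    """Build a StatefulSet manifest whose pods are spread evenly across zones."""
    labels = {"app": name}
    return {
        "apiVersion": "apps/v1",
        "kind": "StatefulSet",
        "metadata": {"name": name},
        "spec": {
            "serviceName": name,
            "replicas": replicas,
            "selector": {"matchLabels": labels},
            "template": {
                "metadata": {"labels": labels},
                "spec": {
                    # Allow at most one pod of difference between any two zones.
                    "topologySpreadConstraints": [
                        {
                            "maxSkew": 1,
                            "topologyKey": ZONE_KEY,
                            "whenUnsatisfiable": "DoNotSchedule",
                            "labelSelector": {"matchLabels": labels},
                        }
                    ],
                    "containers": [{"name": name, "image": image}],
                },
            },
        },
    }


def survives_zone_loss(pod_zones, lost_zone):
    """Return True if a strict majority of replicas remains after one zone fails.

    Assumption: the database needs more than half of its replicas for quorum.
    """
    remaining = sum(1 for zone in pod_zones if zone != lost_zone)
    return remaining > len(pod_zones) // 2


if __name__ == "__main__":
    # Hypothetical database name and image, used only for illustration.
    print(json.dumps(statefulset_manifest("demo-db", 3, "example/db:1.0"), indent=2))
    # Data center case: three replicas in three zones tolerate one zone failure.
    print(survives_zone_loss(["zone-a", "zone-b", "zone-c"], "zone-a"))
    # Edge case: every replica sits in the same single location.
    print(survives_zone_loss(["plant-1", "plant-1", "plant-1"], "plant-1"))
```

With three replicas in three zones the helper reports that the database keeps its majority after losing one zone, while the single-location Edge layout does not, which is the degraded availability Julian mentions.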
Swapnil Bhartiya: If you can, just give us a quick glimpse of what should be expected there. Are there going to be sessions, demos, talks? Is it going to be both in-person and virtual? So could you tell us about what we should expect from the event?
Bart Farrell: Yeah. Well, in terms of the event, very excited to be having folks like Julian who really come to the table with the level that we expect in our community because it’s one thing to understand Kubernetes and that’s already limited, but to really understand what we’re talking about when we’re speaking about Data on Kubernetes, that’s what we’re looking for.
So there will be different talks as well as panels from vendors, as well as end users. End users such as Zalando, Macquarie Bank, Dreamworks. We’re going to have different folks from those companies come on and speak about their experience running stateful workloads on Kubernetes, and then a mixture of other practitioners that will be coming on and talking about some of the different elements, ranging from running Kafka on Kubernetes to, obviously, issues related to storage, things related to operators, databases, etc.
There’ll be kind of a little bit of everything. In terms of the format, it’s going to be a hundred percent virtual. I will be onsite in Los Angeles with the DoK team, with Melissa Logan, our director and [inaudible 00:08:55], our head of content. And so we’ll be doing a live stream from there and then obviously interacting with all the folks who will be attending the main KubeCon event as well.
Swapnil Bhartiya: Excellent. Now, I remember the old days when I used to be a print journalist and I would go to events and try to attend almost all the sessions. But now, as you said, it’s virtual, so people have also become more picky and they try to be more targeted. So who should attend these sessions, talks, or demos? What is your target audience?
Bart Farrell: Great question. Because we understand the nature of the technologies that we’re talking about are still quite innovative. Anyone who anticipates within the next year or two years to be tackling this issue of starting to run stateful workloads on Kubernetes should definitely be there.
We’re already looking at the attendee list. We’ve got over 2,500 people signed up, which we’re very excited about. We’ve got everything from DevOps all the way to CTOs and CIOs. So it’s quite a range, and that’s also why quite a few of the talks that are planned for the day are rather short. That way people can get a sampling from a wide cross-section of different folks and have a deeper and richer understanding of exactly what we’re talking about when we refer to Data on Kubernetes.
Swapnil Bhartiya: Julian, I want to quickly ask you, how is anynines participating? Are you folks delivering any sessions, talks, or panels? Tell us about your participation at the event.
Julian Fischer: Yeah, I mean, the talk about principles in designing operators is addressing some of the challenges that Bart has mentioned. If engineers are confronted for the first time with writing those operators, it’s not only about taking the Operator SDK and putting a stateful set together; there is a large set of challenges that should be addressed.
By sharing experience in a more general way, we’d like to accelerate the adoption of Data on Kubernetes. And the talk is one way to do that, contributing to meetups is another way to do that, and sponsoring the community is another way to do that. So it’s all about getting the mission done, which is about managing data services across infrastructures for application developer platforms at scale. The same mission we’ve been committed to for years is now manifesting in the Kubernetes ecosystem and we think that the DoK community is a perfect environment to participate in.
Swapnil Bhartiya: If I look at anynines, you are getting involved in this community. What is in it for anynines?
Julian Fischer: I think that if you get up in the morning where these… that’s how I am, I need a mission, I need something that I believe that the world needs to have or there must be a particular way the world needs to be changed.
And I think that the data service challenge in general has been extremely underestimated by all application developer development platform technologies, whether it’s Cloud Foundry or Kubernetes.
So first of all, my personal gain is the feeling that I do something that’s meaningful for the company, the organization of anynines. We are exploring opportunities in the Kubernetes ecosystem by providing automation tooling that solves these problems we are talking about, is, I think helping a community to move forward we’ve been sharing openly the ideas of our products. For example, since we’ve started doing data service automation like nearly a decade ago.
I think that people will, some people will take the ideas and implement it, some people will ask us and give us consulting opportunities and others will buy our products. So I think by sharing knowledge, everybody gains and wins something. And that’s what we get out of such a community too, is achieving a mission and collaborating with people who have demand.
Swapnil Bhartiya: You also mentioned that as a company you’re exploring opportunities. Does that mean that you will be involved in, or you will be sponsoring and participating in, more projects also? Of course you cannot talk a lot about them at this point, but just give us a kind of glimpse.
Julian Fischer: Well, the customers we serve are usually large organizations who want to build a developer experience as part of their digital transformation. They don’t want to run a single Kubernetes cluster with, let’s say, two operators to have four or five different databases. They run organizations with hundreds, sometimes thousands of developers, organizations with hundreds of thousands of employees, and they have serious demands in software development.
And at this scale, managing platforms is very different from managing just the fiscal of data services and dozens of microservices. And we focus on building automation that makes that possible and reduces the operational friction for platform operators as well as the friction for application developers.
And if you look at the adoption of Kubernetes, for example, in large organizations, we’ve talked about this quite a few times, it’s often the case that there are numerous Kubernetes clusters. And the question will be, do you want to have data services centralized in a particular cluster? Or do you deploy your operators on every cluster? Do you want to have them in containers? Do you want to have them in virtual machines?
So for these organizations, there are so many non-trivial challenges and questions to be asked that they are absolutely overwhelmed. And I think if you provide them with meaningful tooling, that has some opinionation built in with respect to operational best practices around automating data services regardless whether they are on the VMs or whether they are containers, that’s a big added value. And that’s basically what we do and keep on doing. It’s just adopting a new technology that provides interesting opportunities.
Swapnil Bhartiya: Julian, Bart, thank you so much for taking time out today to talk about not only the [inaudible 00:15:19] community, but also the coming event. And I look forward to our next conversation soon. Thank you.
Julian Fischer: Well, thanks for having us and it’s always great to talk to you and see you next time.
Bart Farrell: Thanks for having me.