2022 Will See Emergence Of Open Source Snowflake Stack | Dipti Borkar

Guest: Dipti Borkar (LinkedIn, Twitter)
Company: Ahana (Twitter)
Show: 2022 Prediction Series

Dipti Borkar, Co-founder and Chief Product Officer at Ahana, joins our 2022 Prediction Series with a prediction about open data lake analytics for data warehouse workloads. Borkar sees an “open flake” stack emerging: an open source, Snowflake-style stack that lets users run similar data warehouse workloads with open source software and open formats, without some of the lock-in and high costs that can come with data warehouses today. “That certainly I want to look forward to, and we’ll see transition initially in augmentation of Snowflake warehouses, and over time transition into this open data lake model,” says Borkar.

She also believes that the next generation of databases will be even more powerful, able to handle the kind of data everyone is generating and to analyze it at great speed. Watch the video above to hear what else Borkar predicts for the year ahead.

[expander_maker]

Swapnil Bhartiya: Hi, this is Swapnil Bhartiya and welcome to our series on predictions videos for 2022. And today we have with us once again, Dipti Borkar, Co-founder and Chief Product Officer at Ahana. Dipti, it’s great to have you on this show.

Dipti Borkar: Always a pleasure as well. Nice to be here.

Swapnil Bhartiya: Before I ask you to grab a crystal ball and share your predictions, I would love to know a bit about Ahana. Tell us about the company.

Dipti Borkar: Absolutely. Ahana is a managed service for Presto in the cloud. It makes SQL analytics on data lakes very, very easy, and that brings the power of Presto to every data platform team, no matter how big or small it is. Presto was built at Meta as a replacement for Hadoop and Hive, and is an incredible engine that can power analytics, interactive querying, and increasingly newer use cases like transformation as well.

Swapnil Bhartiya: Excellent. Now it’s time for you to pick up your crystal ball and share with us what predictions do you have for the next year?

Dipti Borkar: Yeah, I particularly enjoy this time of the year, thinking through what happened in the past year, as well as what’s going to change in the future, and particularly next year. And there are three areas that I think are very, very interesting coming up.

The first one I would say is open data lake analytics for data warehouse workloads. Now, if you look back, Snowflake was one of the biggest IPOs in history. There’s a lot of users for the warehouse. In 2022, the next year, I see an open flake stack emerging, which means open source Snowflake, an open source Snowflake stack that will allow users to run similar data warehouse workloads, but with open source, open formats and without some of the lock-in and the high costs that might be a part of the data warehouses these days. That certainly I want to look forward to, and we’ll see transition initially in augmentation of Snowflake warehouses, and over time transition into this open data lake model.

My next prediction is also around databases. Recently, there was a benchmark that was published called TPC-DS. It’s an industry benchmark that was published by Databricks, and Snowflake responded to it. There was a bit of back and forth. It seems like the database wars are back again, which means database engineering is cool again. Database engineers have been developing core databases and distributed systems for many, many years, but given these benchmarks that have been published, we see that the next generation of databases is going to be even more powerful for the kind of data that everyone is generating, and able to analyze that data at great speed, as well as for the best analytics.

My next prediction and my final prediction is around this pandemic, and hopefully the post-pandemic era, if we can even call it that. Things are evolving. But what’s happening is that data platform teams and infrastructure teams are really looking at out-of-the-box solutions rather than building it themselves, rather than the DIY model. And why is that? With the pandemic, a lot of businesses have had the need to evolve and change what they’re doing very, very fast. Just take, for example, Uber.

Uber had a big impact on their primary business and they had to switch over overnight to Uber Eats or other kinds of products that could keep them going. With the need to be able to change and evolve this fast, managed services and out-of-the-box solutions in the cloud are becoming very, very useful for IT teams and data platform teams, and give them the ability to move at the speed that they need. We see that managed services and easy-to-use products, which save operational time and operational costs and give a faster time to market, will really, really evolve and be adopted really widely in 2022 as well.

Swapnil Bhartiya: Thanks for sharing these predictions. And now if I ask you, what is going to be the focus of the company in 2022?

Dipti Borkar: Yeah, great question. This is also the time we look back and look at how we want to invest, and change and focus as a company. Ahana is, as I mentioned earlier, a managed service for Presto to make SQL on data lakes very, very easy. And given these predictions actually, our focus and our areas are along these lines. We are very much focused on making Ahana easier to use, continuing on that ease-of-use dimension so that data platform teams spend a lot less time actually managing operational aspects for these platforms, and really get value from the data itself and do more with the time that they have.

And the second big area is performance. We work very much with the open source community and the Presto Foundation and members of the foundation like Meta and Uber and others, and are evolving Presto to make it even faster. There are many projects that we are working on to improve core performance of Presto on the data warehouse use cases for these workloads.

There’s a major rewrite of Presto that’s ongoing from Java to C++, which we see bringing orders of magnitude better performance, more bandwidth, lower utilization. This is a big area that we’re looking forward to in 2022 as well.

Swapnil Bhartiya: Dipti, thank you so much for not only sharing those predictions, but also sharing the focus of the company. And I would love to have you back on the show next year, to not only check how many of these predictions turn out to be true, but also get the next set of predictions for the next year. But thanks for your time today.

Dipti Borkar: Absolutely. I’m keeping a scorecard for myself to see how we are doing. I’ll be back again, Swap. Thanks so much. Take care. Bye-bye.

[/expander_maker]