Guest: Vid Jain
Company: Wallaroo Labs
Keywords: Machine Learning, MLOps
Show: Let’s Talk
Summary: Wallaroo Labs, a New York-based MLOps platform for enterprises, enables push-button deployment, fast compute, and advanced model insights in one integrated platform. The company recently raised $25 million in a Series A round led by Microsoft’s venture arm, M12. We invited Vid Jain, Founder and CEO of Wallaroo Labs, to learn more about the company, the recent $25 million Series A funding, the key areas Wallaroo plans to target with the funds, and more.
“We built something from the ground up specifically for high-performance machine learning workflows. It really gives us a fundamental breakthrough in terms of speed, efficiency and simplicity,” said Jain. “If you’re dealing with complex models, large amounts of data, or lots of models, we enable you to do things that you couldn’t do before and drive much better business results.”
Highlights of the show:
- Intro to the company. What led to the creation of Wallaroo Labs?
- What are the areas Wallaroo plans to focus on with the latest funding round?
- What sets Wallaroo apart from other players in the MLOps market?
About Vid Jain: Vid Jain is the founder and CEO of Wallaroo Labs, which enables enterprises to launch AI much faster, more simply, and at far lower cost, and provides the ability to measure, test, and iterate to create value. Vid started his career on Wall Street in Strategic Product Development for Global Execution Services at Merrill Lynch.
About Wallaroo: Wallaroo’s breakthrough platform facilitates the last mile of the machine learning journey: getting ML into your production environment to impact the bottom line. Headquartered in New York City, Wallaroo Labs is privately held and backed by leading venture capital firms including Microsoft’s M12 and Boldstart Ventures.
Here is the full unedited transcript of the show:
- Swapnil Bhartiya: Hi, this is Swapnil Bhartiya and welcome to TFiR Let’s Talk. And today we have with us Vid Jain, Founder and CEO of Wallaroo. Vid, it’s great to have you on the show.
Vid Jain: Hey, thanks for having us or having me. I’m really excited to be chatting about what we’re up to and our recent fundraise.
- Swapnil Bhartiya: Exactly. So this is the first time we are talking to each other. I would love to know a bit more about the company, especially since you are also a founder. So tell me what problem you saw in the space that you wanted to solve, that led you to create the company. And also tell me, what is the story behind the name Wallaroo?
Vid Jain: Well, I think the best way to start is actually a little bit of my background. So I came out of academia. I have a PhD from Berkeley, but ended up in New York and actually ended up at Merrill Lynch. And at Merrill Lynch, I helped build their high-frequency trading business. There was a very specialized small group of us that took that business from nothing to about a billion dollars a year in revenue. And that is all about machine learning at scale, right? We had dozens of exchanges throughout the world that we ran these trading models on, plus risk management and trade surveillance. And while we had 25 data scientists (we used to call them quants back then) who were developing all these different algorithms, we had about 600 people doing other stuff. We had about a hundred million a year in infrastructure costs.
And so, it’s already hard to do machine learning end to end, but what’s really hard is to deploy it at scale. And so when I left Merrill Lynch, at that point everybody, in a variety of different industries, was working on hiring data scientists: manufacturing, retail, ad tech, e-commerce, security. And so again, the problem that we set out to solve at Wallaroo was really not about cleaning your data, not about data wrangling or creating data lakes. It was really that once you have machine learning models or algorithms that you want to have running in production and really impacting the bottom line, that part, doing it in a mature way, doing it so you get business value, doing it at scale, that was hard, and that’s the problem that we set out to solve. And the way we set out to solve that was to take a totally different approach, right?
There are a lot of tools out there that people use, all the way from Apache Spark to containerization to other open source solutions. At Merrill Lynch, and after I left Merrill Lynch, as a consultancy, we tried a lot of these things. They didn’t work very well. So our goal was to really re-architect, redesign, and re-implement how these things were done from the ground up. As far as the name Wallaroo, there’s a cute little story there. So, our primary users are developers, data science developers, and machine learning engineers. And we have tons of those folks internally at the company. And so when we were like, oh, what should we call the product? We had a little competition. And basically we asked everybody internally, hey, we need to name this product. This was right after we raised our initial seed round. And we ended up with three or four names, but eventually, through process of elimination, we ended up with Wallaroo, a cute cuddly animal out of Australia, sort of a small kangaroo. And the name sticks; people remember the name. And so it just stuck with us.
- Swapnil Bhartiya: Can you also talk about how you have seen the market evolve all the way? Because now things are progressing so fast; especially after COVID, things have changed. Every company wants to have a digital transformation. Every company wants to have a cloud strategy. So from your perspective, how have you seen things change, or did they change gradually?
Vid Jain: I think you’re right, actually. For a couple of years we were doing this, and I think we were very early to the market. And while people were very interested in what we were working on, they didn’t have the need. And then when COVID hit, what was strange to us in some ways was that, while it was personally bad for everybody and obviously has caused a lot of havoc and deaths, for the company as a whole I think it was actually an accelerator. And I think one of the things that changed is this: if you think about driving value from your data science and your machine learning, all the stuff up to the point of getting into production is cost. Building a giant data lake is costly. Hiring a bunch of data scientists to develop models is costly. Hiring a bunch of data engineers to clean up the data and get it ready for processing is costly. But none of those things by themselves generate any ROI. And I think what happened when COVID hit is two things. One is that a lot of companies had to decide: is this a research project that we can’t afford, because the markets have become volatile and uncertain, or is this something that we really need to now focus on generating business value out of, right? So people had to basically decide that they had to take this out of R&D and really create business value from all the money they’d spent up to that point. And of course, since we’re focused on that last mile, which is getting it into production and generating business impact, that made the conversations that we were having go much faster and smoother.
That was one thing that happened that was good for us. The second thing that happened was that a lot of companies started to leverage their data much more. So if you think about things like supply chain, logistics, e-commerce, they were so affected by the disruptions and the volatility that was happening that they had to double down on the use of data to manage those disruptions, right? And so not only did the overall market shift towards, hey, we have to actually get some value out of all of this money we spent, which obviously drove production machine learning, but there was more emphasis put on using data to manage the disruptions caused by COVID, right? And so, again, strangely, we didn’t expect this, we didn’t really anticipate this, but in hindsight, that was actually an accelerator for us and our company.
And so, I guess the side effect, or the thing that caused it again, is the fact that getting those projects, those R&D projects, into production has become a lot more important in places over the last two years, 18 months, right? And so it is accelerating, but I would still say what we’re seeing is that most organizations are still very early on in their machine learning journey, right? We’re probably at the 15% point in that journey. We’re not at the 50% point. We’re not at the 33% point. We are still very early in terms of what people should be able to achieve with machine learning. And the reason is that there are so many obstacles, and it’s so difficult, and a lot of organizations haven’t thought through the entire end-to-end support and skillset needed to actually operationalize and deliver on the promise of AI, so they’re addressing each of these bottlenecks or roadblocks one by one as they come to them, right?
And so, there are a few places, obviously Google, Microsoft, Apple, Goldman Sachs, Merrill Lynch, that use machine learning at scale, as do, obviously, places like Amazon or DoorDash, right? Their businesses are built on machine learning, but for the most part, 95% of organizations are still early in their journey. And so the good thing is, there’s a lot of upside for us, but obviously it means that there’s also some education involved in terms of the market and the customers, to try and explain the value proposition and what they need to do to actually get ROI from their investments.
- Swapnil Bhartiya: Thank you so much for explaining that in detail. Now, I want to just talk a bit about the company itself. Of course, you folks announced $25 million in Series A funding, and you also added two capabilities called model insights. So let’s start with the funding. Talk a bit about who the investors are and what areas you’re trying to grow with this funding. What are the plans you have there?
Vid Jain: Yeah. Look, I’ve been really fortunate with some great investors. A lot of companies, they get investors, they get money. That’s what they get, right? And building an infrastructure company, whether a data infrastructure company or a machine learning infrastructure company, is hard. And you need investors that are not only giving you money and supporting you with their money, but who are smart about go-to-market. They’re smart about how to build a network around you. They’ve got other portfolio companies that you can learn from, and they’re bringing other investors to the table. And so I’ve been very fortunate with my initial set of seed investors, including Boldstart, Eniac, Contour, Greycroft, etc. Fantastic, fantastic investors. And then, I’m really excited about partnering with Microsoft’s venture arm, M12. I met them towards the end of September last year, and we just started a conversation.
And I really felt like they would not distract from my existing set of investors. They would add on top of that, beyond the money they could bring. And it’s been great, just through the due diligence process and interacting with them in the last couple of weeks after the round closed. I’m really excited. I think they’re going to be an excellent, excellent partner. And they’re already opening doors within Microsoft for us, because I think there’s a lot we can do there as well with Microsoft, not only with the Azure ecosystem and being part of that, but also because some of our existing customers are Azure customers and there’s a lot we could do together. So I’m really excited so far. And I think it’s going to be a great partnership. Now, what are we doing from a funds perspective?
We have three key initiatives. One is around a community edition. So we want to get Wallaroo into the hands of as many users as possible. And the best way to do that is to release a free version that anybody can download. Data scientists and machine learning engineers can kick the tires on it. And what that’s going to do for us is give us really solid, rapid feedback on the product, which will let us iterate very quickly and make it even better. And the second thing is it’s going to create champions for the product, whether it’s within a small startup, a larger internet SaaS company, or a large enterprise. The more champions that we have internally, the more usage and commercial use that’s going to drive for us. So the community edition is a huge initiative for us this year, and we finally will have the resources to do it.
The second thing, as you mentioned, is that we already had some of this, but we’re going to double down and accelerate our investment in what we’re calling model insights. And the reason that’s important is because, once you get a model into production, that’s really not sufficient, right? So again, thinking back to my trading days, if we got a trading model into production, something that was trading using algorithms, and that thing actually did a bad job, you could lose money, you could lose a lot of money, right? And this is the case with many of these other use cases as well. Think about a recommendation algorithm that’s recommending what customers should buy. If that recommendation algorithm is bad, it will decrease your sales. It will not increase them, right?
And so you have to be able to understand not just how to get these models into production, but you have to be able to run large-scale experiments and compare four or five models with each other to figure out which one’s better. You have to be able to understand if the model starts drifting. If, for example, it used to be good two weeks ago, and all of a sudden it’s producing results which are not as good or don’t have the same efficacy, you need to be able to get to the root cause very quickly. You can’t have 50 data scientists sitting around all day looking at the screens, trying to figure out what’s going on, right? You need to be able to understand, hey, these three models are having a problem. And within these three models, this is the root cause. Like, the data that’s going in here, these specific features or fields that are going in have changed fundamentally from what they were before.
And therefore the model is making these different predictions, right? And so you need to get to the root cause very quickly and be able to understand how to make them better without having to wait for your revenue system, right? Or your trading system to tell you, because usually those systems are not real time, right? They’re like, two weeks later, oh, by the way, your revenue dropped by 20%, right? At that point, somebody’s going to be ready to fire you or scream at you. And so you really need all those tools. And so I think, again, we are focused on getting to business value, getting to business results, right? You’ve got to get them in production. You have to be able to iterate on them and improve them quickly in production. You have to be able to see who did what, what was done, how they are performing, and what the reason for the underperformance is.
All those things are what we call the last mile of machine learning. That’s everything that we’re focused on. And the third key initiative this year is integration. So no matter what kind of machine learning training environment they use, no matter what kind of cloud environment or on-prem environment they use to move their data around or store things, we want anybody in any of those ecosystems to be able to have a very simple and delightful experience in using Wallaroo, right? So that requires systematic integration. We already have a bunch of integrations. For example, from AWS SageMaker, you can deploy into Wallaroo. From a Databricks notebook, you can deploy into Wallaroo. So we’ve already got some of these integrations underway, but we want to do more. We want to make it more seamless.
And so the idea is, if it’s delightful, if it’s simple, if it saves the CFO 80% (typically we’re seeing that we can save organizations and enterprises 80% in their inferencing infrastructure costs), people are going to be like, this is the best thing out there. I’m going to use this. I love using it. And that’s what we’re seeing so far, right? The data scientists just love using our product. And then the budget owners, the CIOs and the CFOs, love the fact that they can save significant infrastructure costs and labor overhead in using Wallaroo.
- Swapnil Bhartiya: Thanks for sharing all those insights. Now if I ask, this is kind of a crowded space. There are those companies, other solutions, what sets you folks apart from others?
Vid Jain: Good question. So we get this all the time from customers. So yes, there are a lot of folks out there. And in fact, one of the interesting things was that M12, the Microsoft investment arm, told us they speak to a huge number of these machine learning software startups, right? They’re coming, knocking on their door. And most of them basically sounded exactly the same. And from the beginning, we stood out because of two things. One is our technology differentiation. And the second thing is the excitement and sheer delight that our customers have had in using us. And so let me just talk about both of those things. So from a technology differentiation standpoint, we didn’t just take existing open source tools, right? Every other solution is either leveraging something like Apache Spark or some other strategy, or basically taking containerization or Docker containers and using that to deploy machine learning.
We built something from the ground up specifically for high-performance machine learning workflows. It’s written in a language called Rust. And it really gives us a fundamental breakthrough in terms of speed and efficiency and simplicity. And when we do POCs with customers, they can see that difference, right? And especially if you’re dealing with complex models or large amounts of data, or lots of models, dozens or hundreds or even thousands of models, that speed and efficiency that we provide is enabling. It lets you do things that you couldn’t do before and drive much better business results. The other thing is that we built this thing to delight the data scientists, to make the data scientists really excited to use it. A lot of these other solutions don’t really primarily put the data scientist’s hat on, right? So a lot of these things were built for machine learning engineers, or they’re open source things where you need some kind of deep engineering talent to make all the stuff work.
We really design for the simplicity and power that data scientists like, and when they use it, they really love it. And then secondly, as I said, we are significantly cheaper, or lower cost, let’s put it, to run and operate than any of these other solutions out there. So you’ve got all these big players out there, obviously DataRobot, anything with data in it, Databricks, DataRobot, data this, hundreds of salespeople running around trying to sell their stuff. But really, whenever we do bakeoffs, we win those bakeoffs. Customers love us, CFOs love us, CIOs love us. And so that’s really, I think, what drove Microsoft to invest. And that’s what we’re seeing in the marketplace.
- Swapnil Bhartiya: Vid, thank you so much for taking time out today to talk about the company, the problem area, and of course the growth area and your solutions. I would love to have you back on the show, because this is a topic where there’s so much to talk about; we could sit for hours and talk about it. But I really appreciate your time today. Thank you.
Vid Jain: Yep. Well, thanks for having me on and this was a lot of fun. Let’s do it again.