
Serverless could be the soul of Artificial Intelligence


The modern world is a data-driven world. But what good is data without machine learning and AI to make sense of it? We may not realize it, but machines are making low-level decisions for us, all around us, from the Nest thermostat to Tesla cars. Machine learning (ML) and AI are invading almost every aspect of modern IT infrastructure. They are helping find bugs in software. They are helping find a cure for cancer.

With the increasing adoption of ML and AI, there is a growing need for new business models and technologies that let customers consume ML and AI without having to worry about the baggage of managing the infrastructure these capabilities require.

Paperspace is a company that specializes in that.

A brief history of Paperspace

Daniel Kobran and Dillon Erb, co-founders of Paperspace, have backgrounds in the architecture world. When Dillon was working on a physics engine for building simulation using Nvidia’s CUDA, he was fascinated by the way GPUs were entering the cloud computing market, creating new workflows and new applications that could run on the GPU.

Dillon Erb, co-founder, Paperspace

“We witnessed how the GPU shifted from being a generic compute device to being the main component for training deep learning models,” said Dillon. This shift created an opportunity to help customers get started with integrating GPU compute into their workflows. The two founded Paperspace to tap into the emerging market. Today, Paperspace is a three-year-old, New York-based company that’s looking at GPU compute in the cloud as a potential market.

Paperspace started off offering just GPU compute, for workloads such as CAD/CAM simulation, HPC, and visual effects. But thanks to changing market dynamics, AI and training deep learning models became Paperspace’s largest audience. The market is still evolving and changing. Today, serverless computing is the hottest buzzword after containers, and people have started to use serverless in production. Once again, Paperspace saw a huge opportunity in the space and launched a product called Gradient, a tool stack that lets machine learning and AI developers more quickly and easily leverage GPUs in the cloud.

What’s Serverless?

Serverless computing is a relatively new buzzword, and there is quite some confusion around what it means. The Cloud Native Computing Foundation (CNCF) has published a white paper defining the term, but different people still look at serverless computing differently.

When Paperspace looks at serverless computing in the context of Gradient, it sees it as serverless AI for end users. “A developer or a company that’s looking to integrate machine learning into their existing stack can use Gradient to deploy very complicated models. They can do complex things like training, deployment and model versioning without ever thinking about the infrastructure,” said Dillon.

That’s the crux of serverless. Developers don’t have to concern themselves with the nitty-gritty of provisioning, scaling, and monitoring – all the baggage that comes with servers. “Our expertise is in infrastructure. We have done all the necessary plumbing that makes it all work. We abstract all of it from users and offer tools that they can use without having to interact with that infrastructure,” said Dillon.

Paperspace has a Python integration, so a customer can easily run their code on the cluster of GPUs that Paperspace manages. “As it’s evidently clear, serverless doesn’t mean there aren’t servers. There are servers, but we manage them. What it means for customers is that they don’t have to worry about web servers and can start leveraging this higher-level tool stack that we offer,” said Dillon.
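To make that concrete, here is a minimal sketch of what a “serverless” GPU training submission can look like from the developer’s seat. The client class, its methods, and its parameters are hypothetical illustrations of the pattern, not Paperspace’s actual Python SDK.

# Hypothetical sketch: the class and method names below are illustrative
# of the workflow described above, not Paperspace's real Python SDK.

class ManagedGPUCluster:
    """Stand-in for a client to a provider-managed GPU cluster."""

    def __init__(self, api_key):
        self.api_key = api_key  # credentials are the only setup required

    def run(self, script, machine_type="gpu"):
        # A real service would upload the script, provision a GPU machine,
        # stream logs back, and tear the machine down afterwards; none of
        # that provisioning, scaling, or monitoring is visible to the user.
        print("submitting %s to a managed %s machine" % (script, machine_type))
        return "job-0001"  # handle for polling status or fetching results

cluster = ManagedGPUCluster(api_key="...")
job_id = cluster.run("train_model.py")  # no servers to set up or patch
print("running as %s" % job_id)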

Who is using serverless for AI?

Paperspace still has traditional customers who use GPUs for things like visual effects and desktop delivery, but that customer base is shrinking. Now people want to run TensorFlow and need access to GPU capabilities. “Today, our biggest audience is the machine learning audience and, responding to that demand, we are moving deeper into that space,” said Dillon.

It’s an untapped market. There is a huge opportunity to develop in this space. “What’s happening is that people who want to get started with building cloud machine learning pipelines don’t have a lot of good tools at their disposal. There are many open source projects; if you are a DevOps kind of person, you can start using Amazon or Google, but if you want to build a more sophisticated pipeline, then things get complicated. There’s a lot of value in creating that as a layer on top of GPU compute that really abstracts away all the complexity of managing the actual hardware itself,” said Dillon.

It’s all about multi-cloud

Paperspace runs in two kinds of environments: managed colocation GPU clusters and the public cloud. In the case of a managed cluster, Paperspace takes care of running it for customers. On the public cloud, Paperspace leans more towards Google Cloud. “The reason is that if a customer already has a lot of data in the cloud, you want compute tools as close as possible to the data,” said Dillon.

Paperspace is designed with multi-cloud in mind, so customers can easily move their workloads from one environment to another. A good example is Google’s Tensor Processing Unit (TPU), which competes with Nvidia’s GPUs. The TPU is also integrated with Paperspace: by changing one line of code, a customer’s workload runs on a TPU instead of a GPU. “As the ecosystem becomes more robust and more hardware comes out, we still need an orchestration layer to abstract the low-level details of compute infrastructure so customers don’t have to worry about multi-cloud support,” said Dillon.
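The “one line” swap is easiest to picture in plain TensorFlow terms. The sketch below is a generic TensorFlow 2.x illustration of retargeting the same model code from GPUs to a TPU by swapping the distribution strategy; it is not Paperspace’s own integration, and the TPU address is a placeholder.

import tensorflow as tf

# Default: train on whatever local GPUs are available.
strategy = tf.distribute.MirroredStrategy()

# The swap: point the same code at a TPU instead (address is a placeholder).
# resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu="my-tpu")
# tf.config.experimental_connect_to_cluster(resolver)
# tf.tpu.experimental.initialize_tpu_system(resolver)
# strategy = tf.distribute.TPUStrategy(resolver)

with strategy.scope():
    # The model definition itself is identical on GPU and TPU.
    model = tf.keras.Sequential([
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10),
    ])
    model.compile(
        optimizer="adam",
        loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    )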

Customers don’t like to be locked into a single vendor. Paperspace has solved that problem by creating cloud gateway connectors that can connect the Paperspace network to any cloud provider, including AWS and GCP.

“When customers train deep learning models, they don’t train them on the entire collection of that data,” said Dillon. A customer may have three terabytes of data. A security company, for example, may be doing some anomaly detection. They would plug Paperspace into their Amazon VPC, pull the data for machine learning into Paperspace, and perform the desired tasks: train the models, evaluate the models, collaborate on new technologies… or whatever they want to do.
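As a rough sketch of that “pull only what you need” pattern, the snippet below uses boto3, AWS’s standard Python SDK, to copy one slice of logs out of an S3 bucket for training. The bucket name, prefix, and local path are hypothetical, and this stands in for whatever transfer mechanism the gateway connector actually uses.

import boto3

# Hypothetical bucket and prefix: pull only the slice of data needed for
# this training run, not the customer's entire three terabytes.
BUCKET = "acme-security-logs"
PREFIX = "netflow/2018-06/"

s3 = boto3.client("s3")

# List every object under the prefix and download it for local training
# (assumes /data exists and AWS credentials are configured).
paginator = s3.get_paginator("list_objects_v2")
for page in paginator.paginate(Bucket=BUCKET, Prefix=PREFIX):
    for obj in page.get("Contents", []):
        key = obj["Key"]
        s3.download_file(BUCKET, key, "/data/" + key.rsplit("/", 1)[-1])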

Paperspace enables them to plug into different environments, thus supporting the multi-cloud strategy and bringing AI capabilities within their reach.