OctoML, the MLOps automation company, has announced early access to Octomizer, its machine learning platform for automated model optimization and deployment.
Octomizer brings the power of Apache TVM, an open-source deep learning compiler project, to machine learning engineers facing long model deployment timelines, inference latency and throughput problems, or high cloud inference costs.
Accessible through both a SaaS platform and an API, the Octomizer accepts serialized models, lets users select specific hardware targets, and optimizes and packages models for the selected hardware.
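The workflow just described can be pictured as a small client sketch. This is purely illustrative: the class, method, and target names below are assumptions invented for this example and do not come from the actual Octomizer API, and the "optimization" step is mocked rather than performed.

```python
from dataclasses import dataclass


@dataclass
class OptimizedPackage:
    """Result of an optimization run: a deployable artifact for one target."""
    model_name: str
    target: str
    artifact: str  # e.g. a path to a downloadable package


class OctomizerClientSketch:
    """Hypothetical client mirroring the described flow: submit a serialized
    model, select a hardware target, get back an optimized package.
    All names here are invented for illustration."""

    # Example target identifiers drawn from the hardware mentioned in the
    # article; the real service's target names may differ.
    SUPPORTED_TARGETS = {"v100", "k80", "t4", "cascade-lake", "skylake",
                         "broadwell", "epyc-rome"}

    def optimize(self, model_path: str, target: str) -> OptimizedPackage:
        if target not in self.SUPPORTED_TARGETS:
            raise ValueError(f"unsupported hardware target: {target}")
        # A real service would upload the model, run TVM-based tuning on the
        # selected hardware, and return a deployable package; here we just
        # fabricate an artifact name to show the shape of the result.
        name = model_path.rsplit("/", 1)[-1]
        return OptimizedPackage(model_name=name, target=target,
                                artifact=f"{name}.{target}.tar.gz")
```

A caller would hand the sketch a serialized model file and a target string, then deploy the returned artifact, e.g. `OctomizerClientSketch().optimize("models/resnet50.onnx", "t4")`.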
According to the company, the Octomizer draws on TVM’s performance capabilities to deliver up to 10x model performance improvements, enabling deep learning teams to improve model performance, cut inference costs, and reduce the time and effort required for model deployment.
The Octomizer currently supports all cloud-based CPU and GPU instances as well as ARM A-class hardware targets, with additional targets planned for early 2021.
OctoML’s early partners include computer vision (CV) and natural language processing (NLP) machine learning teams focused on improving model performance on targets such as NVIDIA’s V100, K80, and T4 GPUs; Intel’s Cascade Lake, Skylake, and Broadwell x86 CPUs; and AMD’s EPYC Rome x86 CPUs.