Open source MLOps framework ZenML raises $2.7M

Munich-based ZenML, a startup providing an extensible, open source MLOps framework to accelerate and simplify the delivery of machine learning models from research to production, today announced it has raised $2.7 million in a seed round of funding. The company plans to use the investment, which was led by Crane Venture Partners and multiple notable AI researchers, to strengthen its technology team and further build out its tooling suite for data scientists.

Despite the ever-evolving MLOps landscape, taking a machine learning project to production or live environments remains extremely hard. Unlike traditional applications, ML systems depend on both code and data, which adds considerable complexity. Data, in particular, is very hard to wrangle and can change in unexpected ways, affecting the performance of the model. As a result, data science teams have to handle a deluge of tooling options and processes to ship their models, which not only adds to the confusion and fragmentation but also requires multiple skill sets.

“Most tools separate workflows into islands that mainly concentrate on the early development phase for data scientists, or the later deployment phase, which is largely owned by engineering. This causes systemic failures in the entire system like a lack of reproducibility or provenance across the pipeline,” Hamza Tahir, cofounder of ZenML, told VentureBeat.

A standardization layer for MLOps

To solve this particular problem, Tahir started ZenML with Adam Probst in July 2021. The startup offers a tooling- and infrastructure-agnostic framework that acts as a standardization layer and allows data scientists to iterate on promising ideas and create production-ready machine learning pipelines.

Available as a lightweight Python library, ZenML’s framework enables data scientists to express their ML workflows as pipelines. The steps within can be defined as simple Python functions that handle arbitrary tasks such as preprocessing data or training a model. Teams can then plug their infrastructure and tooling choices into the ML pipeline with a few simple configuration changes.
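To make the idea concrete, here is a minimal sketch of a workflow expressed as a pipeline of plain Python functions, with the execution backend chosen by configuration. It is written in standard Python only and is not ZenML's actual API; the names `run_pipeline`, `RUNNERS`, and the `"orchestrator"` config key are hypothetical and chosen purely for illustration.

```python
# Illustrative sketch only: plain Python, NOT ZenML's real API.
# All names below (run_pipeline, RUNNERS, "orchestrator") are hypothetical.
from typing import Any, Callable, Dict, List, Tuple

Rows = List[Tuple[float, float]]


def load_data() -> Rows:
    """Step 1: produce (x, y) pairs; a real step would read from storage."""
    return [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (-1.0, 0.0)]


def preprocess(rows: Rows) -> Rows:
    """Step 2: keep only rows with a positive x value."""
    return [(x, y) for x, y in rows if x > 0]


def train(rows: Rows) -> float:
    """Step 3: fit y = w * x by least squares through the origin."""
    numerator = sum(x * y for x, y in rows)
    denominator = sum(x * x for x, _ in rows)
    return numerator / denominator


def run_local(steps: List[Callable[..., Any]]) -> Any:
    """Run steps in order, feeding each output into the next step."""
    value = steps[0]()
    for step in steps[1:]:
        value = step(value)
    return value


def run_verbose(steps: List[Callable[..., Any]]) -> Any:
    """Same execution order as run_local, but logs each step name."""
    print(f"running {steps[0].__name__}")
    value = steps[0]()
    for step in steps[1:]:
        print(f"running {step.__name__}")
        value = step(value)
    return value


# Swapping the backend is a configuration change, not a code change.
RUNNERS: Dict[str, Callable[[List[Callable[..., Any]]], Any]] = {
    "local": run_local,
    "verbose": run_verbose,
}


def run_pipeline(steps: List[Callable[..., Any]], config: Dict[str, str]) -> Any:
    """Look up the configured runner and execute the steps with it."""
    return RUNNERS[config["orchestrator"]](steps)


STEPS = [load_data, preprocess, train]
weight = run_pipeline(STEPS, {"orchestrator": "local"})
```

The step functions know nothing about where or how they run, so changing `"local"` to `"verbose"` in the config alters the execution backend without touching the steps themselves. A real framework would swap in heavier backends in the same spirit.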

“With ZenML, every ML project will have the same user experience as a simple Python project. The only difference is that you’re working on real machine learning use cases that instantly can be brought into production. Nobody will need to do the heavy lifting of setting up infrastructures or coordinating between DevOps teams and data scientists,” Tahir said.

Differentiation

While there are workflow automation tools that let users define workflows as pipelines, including players like Airflow, Prefect, and Luigi, ZenML claims to set itself apart by treating ML-specific artifacts like models, data drift, and feature statistics as first-class citizens. The framework then offers data scientists a path to solve complex problems such as reproducibility and versioning of data, code, and models.
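One common way to approach reproducibility across data, code, and models is to derive a single fingerprint from all the inputs of a run, so that any change to any of them is detectable. The sketch below illustrates that general technique with the standard library only; it is an assumption for illustration, not a description of how ZenML versions artifacts, and the function name `run_fingerprint` is hypothetical.

```python
# Illustrative sketch only: a generic content-hash fingerprint,
# NOT ZenML's versioning mechanism. run_fingerprint is a hypothetical name.
import hashlib
import json
from typing import Any, Dict, List


def run_fingerprint(data: List[Any], code_text: str, params: Dict[str, Any]) -> str:
    """Hash the data, the step source code, and the parameters together.

    Identical inputs always yield the same hex digest; changing any one
    of the three yields a different digest.
    """
    payload = json.dumps(
        {"data": data, "code": code_text, "params": params},
        sort_keys=True,  # stable key order so the digest is deterministic
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


code = "def train(rows): return sum(y for _, y in rows)"
first = run_fingerprint([[1, 2], [3, 4]], code, {"lr": 0.1})
same = run_fingerprint([[1, 2], [3, 4]], code, {"lr": 0.1})
changed_data = run_fingerprint([[1, 2], [3, 5]], code, {"lr": 0.1})
```

Here `first` and `same` match because nothing changed, while `changed_data` differs because a single data value changed. Storing such a fingerprint next to each trained model is one simple way to record which data and code produced it.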

“These tools are built on a hard-to-understand syntax, which often can be scary to the data scientist persona. We aim to do the exact opposite (with a unified syntax in familiar language) so our users can become more invested in working on their native solutions rather than learning how to use the tool they are using,” Tahir emphasized.

Though ZenML is still in the early stages of development, the company claims to have seen a tremendous response, with over 1,000 GitHub stars and downloads growing 20% to 40% every week. It has also successfully handled a couple of paid projects from Airbus Defence and Space, focusing on object detection on new high-resolution satellite images.

“In the last few months, we have rewritten the ZenML codebase to be more robust and user-friendly,” Tahir noted. “We have also tripled our team in the space of a few months and released ZenML 0.5 that includes support for writing pipelines with standard artifacts like TensorFlow or PyTorch models with Kubeflow.”

Moving ahead, the company plans to grow its team of MLOps technologists and expand the framework by integrating more tooling libraries to match the needs of data science teams across organizations. These would include libraries such as Evidently, whylogs, and Great Expectations for validation, and BentoML, Seldon, and KServe for deployment.
