Navigating MLOps Solutions

A narrative, with helpful tips, for navigating MLOps

Many organizations in recent years have been taking advantage of the power of machine learning (ML). From natural language processing (NLP) to image recognition, fields from advertising to zoology have benefitted from the still-emerging power of ML. Humans are getting better at making sense of a glut of data. Recent technical advances have improved our ability to clean, engineer, and label our data. New tools have also helped us design and train ML models.

While these tools have helped data engineers and data scientists accelerate their work, most act only as components of the machine learning pipeline. Data scientists and ML engineers ingest data in one tool, develop in another, train in a third, and so on. This adds cycles, slows the process, and does not scale. And even then, a gap often remains: model serving and model operations.

We just got our data science team up and running — there’s more to it?

Congratulations! You have hired some smart, talented people to comb through your data, find gaps, and engineer creative ML solutions to solve problems and surface insights. What many companies and government organizations are finding is that their headaches with data, hiring, model development, F1 scores, and model bias are only early-stage problems.

A celebratory happy hour is in order when your team produces its first functioning ML pipeline. Well done, bottoms up.

The Morning After — Stark Realization

For many organizations, the next day reveals emerging complexity that causes many teams to rethink how they're organized and scratch their heads over how they'll manage their success.


Our model worked! Now what….

  • Where do we put it? Do we have a model repository?
  • How do we deploy our model to production; will we use an API call? Will it be containerized?
  • How do we monitor for bias, accuracy, precision, drift, and effectiveness?
  • Are the models actually helping our business — have we linked them to business metrics?
  • How do we pull our models back and retrain them?
  • When do we retire a model altogether and build from the ground up?
  • Will all these be ad hoc decisions every time? Can we automate some of these processes?
  • Can this scale? How do we do all this with many models operating at the same time?
  • How does all this interact with our infrastructure — will we need more people or tools to scale up and down?

A Growing Sea of Choices

Many players, big and small, open-source and commercial, have recognized these issues. They’ve entered the market to answer these questions and offer solutions. You’ve typed MLOps into Google. What will work for your team today and what will scale with you?


One Tool to Rule Them All?

Here at Anno.Ai we have been experimenting with many of these platforms and will continue to do so in the coming months. We'll document our journey, identifying the strengths and weaknesses of popular MLOps platforms for different types of teams. For example, an MLOps solution that works well for a team of five may not work well for a team of 50 or a team of 2,000.


Kubeflow is a toolkit designed to automate and scale ML development operations; it began at Google as a simplified way to run TensorFlow jobs on Kubernetes. Kubeflow accomplishes this by harnessing container orchestration, typically Kubernetes. At its core, it offers end-to-end ML stack orchestration to deploy, scale, and manage complex machine learning workflows efficiently. Kubeflow allows for containerized deployment of models across an enterprise, agnostic to the production platform (GCP, AWS; OpenShift, Rancher, on-prem infrastructure; edge devices, etc.). It takes a microservices approach to building enterprise-scale machine learning workflows, letting data scientists collaborate effectively and build reusable machine learning solutions, thereby eliminating duplicated effort.
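To make the containerized-deployment idea concrete, here is a sketch of what a model deployment can look like in the Kubeflow ecosystem, using a KServe (formerly KFServing) InferenceService manifest. The service name and storage URI below are placeholders, not real resources, and the exact schema depends on the KServe version installed:

```yaml
# Hypothetical KServe InferenceService manifest.
# "demo-sklearn-model" and the gs:// path are placeholders.
apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: demo-sklearn-model
spec:
  predictor:
    model:
      modelFormat:
        name: sklearn        # tells KServe which model server to run
      storageUri: gs://example-bucket/models/demo  # where the trained model lives
```

Applying a manifest like this with `kubectl apply -f` asks the cluster to stand up a containerized model server behind an HTTP endpoint, which is what lets the same model artifact run on GCP, AWS, or on-prem Kubernetes without modification.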


MLflow is a platform that helps users manage ML operations through four components: tracking, projects, models, and a registry. MLflow works with the Apache Spark framework, is built to scale from a single user to large enterprise operations, and has an open interface designed to work with a variety of languages and tools. MLflow also seeks to enable greater collaboration among data scientists, allowing users to share components, code, and data preparation and training tasks.


DVC is for machine learning project version control — think Git for data. In fact, the DVC syntax and workflow patterns are very similar to Git, making it intuitive to incorporate into existing repos. DVC also supports pipelines to version control the steps in a typical ML workflow (e.g., data transformation, feature engineering, augmentation, and training). Beyond its niche of versioning and managing ML data, DVC addresses additional aspects of the MLOps workflow, including experiment tracking and visualization. The DVC team also recently launched a sister project, CML, for CI/CD of ML workflows.

We’ll not only be evaluating each of the above platforms, but we’ll also be looking under the hood of the open-source platforms to see if component parts from one platform can be removed and integrated with components from another.

Evaluation Criteria

We’ll be evaluating these open-source platforms against the following criteria:

  • Automated infrastructure provisioning
  • Data/model versioning
  • Model serving
  • Experiment management/logging
  • Scalability
  • User-friendliness & ease of integration
  • Collaboration support
  • Works across cloud platforms
  • Dependencies
Chart by Anno.AI

What’s Next

We’ll be providing some basic assessments of each platform as we take a look under the hood and test them out. Have some additional important criteria? See it through a different lens? Drop us a line and let us know!

Be sure to follow us on Twitter and LinkedIn!

Operationalizing applied machine learning for the mission.