👋 Hi, this is Sarah with the weekly issue of the Dutch Engineer Newsletter. In this newsletter, I cover data and machine learning concepts, code best practices, and career advice that will help you accelerate your career.
The data field is synonymous with broken pipelines. The business needs a metric, and the data team complies. That cycle continues until we end up with many custom pipelines. I am responsible for creating some of those custom pipelines too, but now I build platforms that standardize them. In this article, I share why there is a need for a machine learning platform and what the components of such a platform are.
Thanks to Delta for sponsoring this newsletter! I am a huge fan of Delta Lake and use it every day both in Data Engineering and Machine Learning.
The need for machine learning platforms
The business started out small and needed a pipeline here and there. Engineers were able to meet those requirements by building custom pipelines for each individual request.
Custom pipelines will break. The engineers who wrote a custom pipeline are usually the ones who maintain it. Why? Because teaching other engineers the specifics of these pipelines, or writing documentation, is time-consuming. Those activities do not build new features for the business and are often skipped. At a smaller scale, custom pipelines are manageable. If the business requires more pipelines than the current engineers can handle, adding more engineers may suffice for a while: they can continue building custom pipelines and sharing their knowledge. But every team reaches a point where adding more engineers is no longer viable. Engineers are a limited and expensive resource.
The alternative is to make engineers more productive. As you may have guessed, creating custom pipelines is an inefficient process. What if we could establish a standard method for building pipelines? That's what a platform is for. Platforms standardize pipelines so that engineers can solve most pipeline-related issues in one place, and the platform can be extended for cases that are not covered. This applies to engineering teams within and outside of data.
However, having a platform for machine learning teams is more crucial than for most engineering teams. Machine learning teams often construct their pipelines with the assistance of data scientists. The focus of data scientists is on researching new problems for the business and translating those results into features for the feature store, and then training models. As data scientists concentrate more on experimentation than standardization, machine learning engineers often receive a notebook with different functions and are required to translate those into pipelines. The platform standardizes most of these pipelines, but we also need to ensure that the data science team uses the standardized functions. This is the challenging part, but it is where the majority of the efficiency boost will come from.
Components of a machine learning platform
The machine learning platform caters to the needs of both data scientists and machine learning engineers. Its success depends on improving the efficiency of the data scientists' processes. Thus, when going through the machine learning components, keep in mind that their productivity is also a key factor to consider.
A machine learning platform often consists of these components — compute, feature store, model tracking, model registry, monitoring, model serving, and testing framework.
Compute - Computing power supports the interface where machine learning engineers and data scientists research and create their models. Meeting the changing computing needs of data scientists can save them a lot of time, as models vary in their computing requirements.
Feature store - A feature store is a centralized repository that stores data for models. Data scientists use a feature store to reduce their time in model training.
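To make the idea concrete, here is a minimal in-memory sketch of what a feature store interface looks like. The class name, method names, and the example features are all hypothetical; real feature stores add versioning, point-in-time correctness, and online/offline storage.

```python
from collections import defaultdict


class FeatureStore:
    """Toy in-memory feature store: feature values keyed by entity id."""

    def __init__(self):
        self._store = defaultdict(dict)

    def put(self, entity_id, features):
        """Write (or update) feature values for one entity."""
        self._store[entity_id].update(features)

    def get(self, entity_id, feature_names):
        """Read a consistent row of features for model training or inference."""
        row = self._store[entity_id]
        return {name: row.get(name) for name in feature_names}


store = FeatureStore()
store.put("user_42", {"avg_order_value": 37.5, "orders_last_30d": 4})

# Both the training pipeline and the serving path read the same definitions,
# which is what saves data scientists time and prevents training/serving skew.
training_row = store.get("user_42", ["avg_order_value", "orders_last_30d"])
```

The key design point is that features are computed once and reused, rather than re-derived in every model's notebook.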
Model tracking - Model tracking keeps a record of the models we have trained. By consolidating experiments in one place, data scientists can spend less time searching for previous experiments and more time devising new experiments.
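A minimal sketch of what model tracking provides, assuming a hypothetical `ExperimentTracker` class (real tools such as MLflow or Weights & Biases offer the same idea with persistence and a UI):

```python
import time


class ExperimentTracker:
    """Toy experiment tracker: every run's params and metrics in one place."""

    def __init__(self):
        self.runs = []

    def log_run(self, params, metrics):
        """Record one training run with its hyperparameters and results."""
        self.runs.append({"params": params, "metrics": metrics, "ts": time.time()})

    def best_run(self, metric, maximize=True):
        """Find the best run so far instead of digging through notebooks."""
        pick = max if maximize else min
        return pick(self.runs, key=lambda r: r["metrics"][metric])


tracker = ExperimentTracker()
tracker.log_run({"lr": 0.1}, {"accuracy": 0.81})
tracker.log_run({"lr": 0.01}, {"accuracy": 0.87})
best = tracker.best_run("accuracy")
```

With all experiments consolidated, "which settings worked last month?" becomes a query rather than an archaeology project.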
Model registry - A model registry stores and manages machine learning models, making them accessible to developers and data scientists. By saving models in a registry, data scientists and machine learning engineers can reduce the time it takes for a model to go from training to production.
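Here is a hypothetical sketch of the registry workflow: register a model version, then promote it through stages. The class and stage names are assumptions; real registries (e.g. the MLflow Model Registry) follow a similar version-and-stage model.

```python
class ModelRegistry:
    """Toy model registry: named models with versions and deployment stages."""

    def __init__(self):
        self._models = {}  # model name -> list of version records

    def register(self, name, model, stage="staging"):
        """Add a new version of a model; returns the version number."""
        versions = self._models.setdefault(name, [])
        versions.append({"version": len(versions) + 1, "model": model, "stage": stage})
        return versions[-1]["version"]

    def promote(self, name, version, stage="production"):
        """Move a specific version to a new stage (e.g. staging -> production)."""
        for record in self._models[name]:
            if record["version"] == version:
                record["stage"] = stage

    def get(self, name, stage="production"):
        """Fetch the latest version currently in the given stage."""
        for record in reversed(self._models.get(name, [])):
            if record["stage"] == stage:
                return record["model"]
        return None


registry = ModelRegistry()
# A stand-in "model": any callable taking a feature dict.
v1 = registry.register("churn", lambda features: 0)
registry.promote("churn", v1)
prod_model = registry.get("churn")
```

Serving code asks the registry for "the production churn model" instead of hard-coding a file path, which is what shortens the training-to-production path.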
Monitoring - Monitoring is a complex issue in the machine learning pipeline. We need to watch models in two ways.
The first method involves checking whether a pipeline failure has a clear cause, such as a new value appearing in a data column. That new value might produce an extra column during one-hot encoding, which the model was never trained to handle.
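This failure mode can be caught with a simple schema check before inference. The sketch below is a hypothetical validator; the column name and categories are made up for illustration.

```python
# Categories the model saw at training time; a new one would add an
# extra one-hot column the model has no weights for.
TRAINED_CATEGORIES = {"red", "green", "blue"}


def validate_batch(rows, column="color"):
    """Fail fast, with a clear error, when an unseen category appears."""
    unseen = {row[column] for row in rows} - TRAINED_CATEGORIES
    if unseen:
        raise ValueError(f"Unseen categories in '{column}': {sorted(unseen)}")


validate_batch([{"color": "red"}, {"color": "blue"}])  # passes silently

try:
    validate_batch([{"color": "purple"}])  # a value the model never saw
    caught = False
except ValueError:
    caught = True
```

Failing loudly at the platform boundary is far cheaper than debugging a shape mismatch deep inside a model.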
The next method involves recognizing subtle shifts in the underlying data that lead to a decline in a model's performance. It is important to note that these shifts do not manifest as outright errors; rather, the model's accuracy gradually diminishes. For instance, imagine a weather prediction model trained on historical weather data from a specific region. If the climate patterns in that region change gradually due to unforeseen factors like urban development or climate change, the model's accuracy might deteriorate over time as it struggles to adapt to these new patterns it has not encountered in its training data.
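A minimal sketch of drift detection for the weather example: compare live feature values against the training distribution. The function name, threshold, and temperature numbers are assumptions; production systems use richer statistics (e.g. population stability index or KS tests).

```python
import statistics


def drift_score(train_values, live_values):
    """How many training standard deviations the live mean has shifted."""
    mu = statistics.mean(train_values)
    sigma = statistics.stdev(train_values)
    return abs(statistics.mean(live_values) - mu) / sigma


# Historical temperatures the model was trained on vs. recent readings.
train_temps = [14.0, 15.2, 13.8, 14.9, 15.1, 14.4]
live_temps = [17.9, 18.3, 18.1, 17.6]

score = drift_score(train_temps, live_temps)
drift_alert = score > 2.0  # hypothetical alerting threshold
```

Note that nothing here "errors": the pipeline runs fine while predictions quietly degrade, which is exactly why this second kind of monitoring needs its own alerting.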
Model serving - Model serving provides the capability (e.g. adding an endpoint) to embed machine learning models in other systems.
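The core of model serving can be sketched as wrapping a model in a JSON-in/JSON-out handler, which is the shape most serving layers (REST endpoints, gRPC services) ultimately expose. The names below are hypothetical:

```python
import json


def make_endpoint(model):
    """Wrap any callable model in a JSON request/response handler."""

    def handler(request_body: str) -> str:
        features = json.loads(request_body)
        prediction = model(features)
        return json.dumps({"prediction": prediction})

    return handler


# A stand-in model: predicts churn when a user placed no recent orders.
churn_model = lambda f: 1 if f["orders_last_30d"] == 0 else 0

endpoint = make_endpoint(churn_model)
response = endpoint('{"orders_last_30d": 0}')
```

Because the platform owns this wrapper, any other system can call a model over HTTP without knowing which framework trained it.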
Model testing framework - The model testing framework is a structured methodology for evaluating the performance of different machine learning models (think like A/B testing for a feature in the UI but then for models).
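The A/B analogy can be sketched with deterministic traffic splitting: each user is hashed into a bucket so they consistently see the same model variant. The variant names and split are hypothetical:

```python
import hashlib


def assign_variant(user_id, variants=("champion", "challenger"), split=0.5):
    """Deterministically bucket a user so repeated requests hit the same model."""
    digest = hashlib.md5(user_id.encode()).hexdigest()
    bucket = (int(digest, 16) % 1000) / 1000  # stable value in [0, 1)
    return variants[0] if bucket < split else variants[1]


# The same user always lands on the same variant, which keeps
# the comparison between models clean.
first = assign_variant("user_42")
second = assign_variant("user_42")
```

Once traffic is split, the framework compares each variant's live metrics (accuracy, latency, business KPIs) before promoting the challenger.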
I listed compute and the model testing framework separately, even though they could have been folded into other components. The model testing framework compares the effectiveness of different machine learning models, so we can improve models that already answer business questions. Separating compute also serves as a reminder that the machine learning platform matters to both data scientists and machine learning engineers.
Final Thoughts
In this article, we dove into what makes up a machine learning platform, breaking down its key components and why we need this kind of platform in the first place.
Other interesting items
An article on Platforms in Uber from a Manager's perspective.