Planning a machine learning (ML) project requires more careful consideration than planning a standard product or service. By their nature, ML projects are never truly finished or fully handed over to the customer; they are ongoing projects that adapt to the changing needs of users.
To simplify the explanation of the most important points, we divide a machine learning project into three main parts: prototyping, deployment, and monitoring. Each part lists the items you should consider during planning.
Prototyping
The goal of prototyping is to decide whether the application is workable and worth deploying. During this phase, manual preprocessing of data is acceptable, but take extensive notes and comments so the steps can be reproduced later. The prototyping process should include the following steps:
1. Obtain dataset:
- Define the list of datasets needed for the project
- Define the strategy for labeling data: in-house, outsourced, or crowdsourced
- Describe other datasets you believe are important to this project, especially meta-data (the data about data).
2. Define a baseline: a baseline is a simple model that gives reasonable results on a task and does not require much expertise or time to build. A baseline gives an estimate of the irreducible error and indicates what might be possible. It also helps determine the time and effort required to develop the final model, and it clarifies which data is required, which data is missing, and even what hardware is needed.
Ways to establish a baseline:
- Human level performance (HLP)
- Literature search for state-of-the-art/open source
- Quick-and-dirty implementation
- Performance of older system
Common baseline models include:
- Linear regression when predicting continuous values
- Logistic regression when classifying structured data
- Pre-trained convolutional neural networks for vision related tasks
- Recurrent neural networks and gradient boosted trees for sequence modeling
3. Clarify the auditing framework: check for accuracy, fairness, and bias.
- Brainstorm the ways the system might go wrong.
- Performance on subsets of data (e.g., ethnicity, gender).
- Prevalence of specific errors/outputs (e.g., FP, FN).
- Performance on rare classes.
- Establish metrics to assess performance against these issues on appropriate slices of data.
- Get business/product owner buy-in.
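As a rough illustration of steps 2 and 3 above, the sketch below builds a quick-and-dirty majority-class baseline and audits it on slices of data. The example records, the dictionary keys (`features`, `label`, `slice`), and the slice names are hypothetical and chosen only for illustration; a real project would use its own data and fairness-relevant slices.

```python
from collections import Counter, defaultdict


def majority_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return lambda features: most_common_label


def audit_by_slice(examples, predict):
    """Compute accuracy, false positives (FP) and false negatives (FN) per slice.

    Each example is a dict with hypothetical keys:
    'features', 'label' (0 or 1) and 'slice' (name of the data subset).
    """
    stats = defaultdict(lambda: {"n": 0, "correct": 0, "fp": 0, "fn": 0})
    for ex in examples:
        pred = predict(ex["features"])
        s = stats[ex["slice"]]
        s["n"] += 1
        s["correct"] += int(pred == ex["label"])
        s["fp"] += int(pred == 1 and ex["label"] == 0)
        s["fn"] += int(pred == 0 and ex["label"] == 1)
    return {
        name: {"n": s["n"], "accuracy": s["correct"] / s["n"],
               "fp": s["fp"], "fn": s["fn"]}
        for name, s in stats.items()
    }


# Toy data (made up). For brevity the baseline is fit and audited on the
# same records; a real audit would use held-out data.
examples = [
    {"features": [0.2], "label": 0, "slice": "group_a"},
    {"features": [0.4], "label": 0, "slice": "group_a"},
    {"features": [0.9], "label": 1, "slice": "group_a"},
    {"features": [0.3], "label": 0, "slice": "group_a"},
    {"features": [0.7], "label": 1, "slice": "group_b"},
    {"features": [0.8], "label": 1, "slice": "group_b"},
    {"features": [0.1], "label": 0, "slice": "group_b"},
]

baseline = majority_baseline([ex["label"] for ex in examples])
report = audit_by_slice(examples, baseline)
for slice_name, metrics in sorted(report.items()):
    print(slice_name, metrics)
```

On this toy data the baseline looks acceptable on one slice and poor on the other, which is the kind of gap that overall accuracy alone would hide.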
Deployment
Deployment is an iterative process. Consider the following points when planning it:
1. Clarify the software engineering requirements and issues involved in designing a prediction service
- Realtime or Batch
- Cloud vs. Edge/Browser
- Compute resources (CPU/GPU/memory)
- Latency, throughput (QPS)
- Logging
- Security and privacy
2. Clarify the data pipeline to make sure the data is replicable
- Tools such as TensorFlow Transform, Apache Beam, Airflow, …
- Keep track of data provenance (where it comes from) and lineage (the sequence of steps)
3. Clarify the type of deployment:
- New product/capability
- Automate/assist with manual task (shadow deployment)
- Replace previous ML system
4. Clarify the deployment pattern:
- Canary deployment: monitor the system and ramp up traffic gradually.
- Blue-green deployment: the old version is called the blue environment and the new version the green environment. While you deploy to and test the green environment, the blue environment keeps serving production users without interruption, until deployment and testing on the green environment succeed.
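The canary pattern can be sketched as a small routing function. This is a minimal sketch under stated assumptions, not a production router: the ramp schedule, the `"old"`/`"new"` labels, and the function names are hypothetical, and the health signal is assumed to come from your own monitoring.

```python
import hashlib

# Hypothetical ramp-up schedule: share of traffic sent to the new version.
RAMP_SCHEDULE = (0.05, 0.25, 0.5, 1.0)


def canary_bucket(request_id: str) -> float:
    """Map a request ID to a stable value in [0, 1) using a hash."""
    digest = hashlib.sha256(request_id.encode("utf-8")).hexdigest()
    return int(digest[:8], 16) / 0x100000000


def route(request_id: str, canary_fraction: float) -> str:
    """Send the request to the new version if its bucket is below the fraction."""
    return "new" if canary_bucket(request_id) < canary_fraction else "old"


def next_fraction(current: float, healthy: bool) -> float:
    """Move to the next ramp step while healthy; roll back to 0 otherwise."""
    if not healthy:
        return 0.0
    for step in RAMP_SCHEDULE:
        if step > current:
            return step
    return 1.0


# Usage: start small, ramp up while monitoring reports a healthy system.
fraction = next_fraction(0.0, healthy=True)
request_ids = [f"request-{i}" for i in range(10000)]
share_new = sum(route(r, fraction) == "new" for r in request_ids) / len(request_ids)
print(f"canary fraction={fraction}, observed share on new version={share_new:.3f}")
```

Hashing the request ID keeps routing stable, so a request that reaches the new version at a small fraction still reaches it after the fraction is increased.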
Monitoring
1. Examine concept drift and data drift, based on the investigation carried out in the Productization phase. Concept drift means that the definition of what y is given x changes; data drift means that the distribution of the input x changes.
2. Provide a monitoring dashboard
- Software metrics
  - Memory
  - Compute
  - Latency
  - Throughput
  - Server load
- Input metrics
  - Average image brightness
  - Number of missing values
  - Average input volume
- Output metrics
  - Number of times the system returns null
  - Number of times the user redoes a search
  - Number of times the user switches to typing
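One simple way to act on the monitoring items above is to compare input metrics from live traffic against a reference batch and raise an alert when they move too far. The sketch below assumes records are dictionaries with a hypothetical `brightness` field; the tolerances are placeholder values that a team would tune, not recommended settings.

```python
from statistics import mean


def missing_fraction(batch, field):
    """Fraction of records in the batch where the field is missing (None)."""
    return sum(1 for record in batch if record.get(field) is None) / len(batch)


def mean_value(batch, field):
    """Mean of the field over records where it is present."""
    return mean(r[field] for r in batch if r.get(field) is not None)


def check_metric(name, baseline_value, current_value, tolerance):
    """Return an alert message if the metric moved more than `tolerance`."""
    if abs(current_value - baseline_value) > tolerance:
        return (f"ALERT {name}: baseline={baseline_value:.3f} "
                f"current={current_value:.3f}")
    return None


# Toy batches (made up): the live images are darker and one value is missing.
reference = [{"brightness": 0.50}, {"brightness": 0.55},
             {"brightness": 0.45}, {"brightness": 0.50}]
live = [{"brightness": 0.20}, {"brightness": None},
        {"brightness": 0.25}, {"brightness": 0.15}]

checks = [
    check_metric("avg_brightness",
                 mean_value(reference, "brightness"),
                 mean_value(live, "brightness"),
                 tolerance=0.10),
    check_metric("missing_fraction",
                 missing_fraction(reference, "brightness"),
                 missing_fraction(live, "brightness"),
                 tolerance=0.05),
]
alerts = [message for message in checks if message is not None]
for message in alerts:
    print(message)
```

A shift in input metrics like these points to data drift; detecting concept drift usually also requires fresh labels or output metrics, since the inputs alone may look unchanged.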
Success criteria
A final point concerns the success criteria of each project. Both the technical and business teams should agree on success criteria that they are comfortable with. In order to achieve this, the machine learning team might stretch a little bit further to business metrics, and the business teams might stretch a little bit further to the machine learning metrics. Generally, the closer one gets to business metrics, the harder it becomes for a machine learning team to make a guarantee.
Key metrics:
- ML metrics (accuracy, precision/recall, etc.)
- Software metrics (latency, throughput, etc. given compute resources)
- Business metrics (revenue, etc.)
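To make agreed success criteria checkable, the ML and software metrics above can be computed and compared against thresholds. The following is a minimal sketch: the labels, predictions, latencies, and threshold values are invented for illustration, and business metrics such as revenue are left out because they are measured outside the model code.

```python
import math


def precision_recall(labels, predictions):
    """Precision and recall for binary labels (1 = positive)."""
    pairs = list(zip(labels, predictions))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


def percentile(values, q):
    """Nearest-rank percentile, q in (0, 100]."""
    ordered = sorted(values)
    rank = max(1, math.ceil(q / 100 * len(ordered)))
    return ordered[rank - 1]


# Made-up evaluation data.
labels = [1, 1, 1, 0, 0, 0, 1, 0]
predictions = [1, 1, 0, 0, 1, 0, 1, 0]
latencies_ms = [40, 55, 60, 45, 300, 50, 52, 48, 47, 51]

precision, recall = precision_recall(labels, predictions)
p95_latency = percentile(latencies_ms, 95)

# Hypothetical thresholds agreed between the technical and business teams.
results = {
    "precision": precision >= 0.70,
    "recall": recall >= 0.80,
    "p95_latency_ms": p95_latency <= 200,
}
for name, passed in results.items():
    print(f"{name}: {'pass' if passed else 'fail'}")
```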
Reference:
MLOps Specialization, a course recently developed by DeepLearning.AI and Coursera
Written by the team at PhazeRo.