Live on the cloud
Living on the cloud goes beyond cloud migration. This is where the cloud underpins innovation across your enterprise. Teams need to adapt to developing, testing, and training models on cloud services and resources to achieve scalability, optimal resource use, and cost control.
The following steps support a successful strategy for living on the cloud:
- Develop candidate ML models and choose the best one based on quantitative and qualitative measures, ensuring reproducibility through version control of data, models, and parameters in the ML system
- Deploy the chosen model to production
- Integrate with the required output, such as business intelligence (BI) dashboards, custom applications, or third-party APIs
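The reproducibility requirement in the first step can be sketched as follows. This is a minimal illustration, not a prescribed tool: the helper names (`fingerprint`, `make_run_record`, `choose_best`), the record layout, and the metric values are all hypothetical placeholders. It assumes the training data and parameters are JSON-serializable, and it uses content hashes as version identifiers so that any run can be traced back to the exact data and parameters that produced it.

```python
import hashlib
import json


def fingerprint(obj):
    """Return a stable SHA-256 digest of a JSON-serializable object."""
    # sort_keys makes the digest independent of dictionary key order.
    payload = json.dumps(obj, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_run_record(data_rows, params, metrics):
    """Bundle the versions of data and parameters with the run's metrics."""
    return {
        "data_version": fingerprint(data_rows),
        "params_version": fingerprint(params),
        "params": params,
        "metrics": metrics,
    }


def choose_best(records, metric):
    """Pick the run with the highest value of the given metric."""
    return max(records, key=lambda r: r["metrics"][metric])


# Hypothetical data and placeholder metric values, for illustration only.
data = [{"x": 1.0, "y": 0}, {"x": 2.5, "y": 1}]
runs = [
    make_run_record(data, {"depth": 3, "lr": 0.1}, {"accuracy": 0.81}),
    make_run_record(data, {"depth": 5, "lr": 0.05}, {"accuracy": 0.86}),
]
best = choose_best(runs, "accuracy")
print(best["params"], best["data_version"][:12])
```

Because both runs were trained on the same data, they share a `data_version`, while their `params_version` values differ. In practice a dedicated data and model versioning tool would typically take the place of these hand-rolled hashes.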
Optimization and model monitoring are essential for maintaining a feedback loop from the deployed model back to model building. The ML and DevOps engineers must set up a model monitoring metrics stack and automate monitoring in real time to ensure that models remain relevant in the context of the most recent production data.
Three broad categories of metrics must be monitored:
- Stability metrics to capture data distribution shifts in production
- Performance metrics to identify concept shifts, that is, changes in the relationship between independent and dependent variables
- Operational metrics to identify ML system health issues such as IO/memory/CPU usage, disk utilization, ML endpoint calls, and latency
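As one possible example of a stability metric, the sketch below computes the Population Stability Index (PSI) between a baseline sample of a feature and a production sample. PSI is a swapped-in illustration, not a metric the text prescribes. The function name, the choice of 10 equal-width bins, and the synthetic Gaussian data are assumptions made for this example.

```python
import math
import random


def population_stability_index(expected, actual, bins=10, eps=1e-6):
    """PSI between a baseline sample and a production sample of one feature."""
    lo, hi = min(expected), max(expected)
    # Equal-width bins over the baseline range; fall back to 1.0 if constant.
    width = (hi - lo) / bins or 1.0

    def bin_fractions(values):
        counts = [0] * bins
        for v in values:
            # Clamp out-of-range values into the first or last bin.
            idx = int((v - lo) / width)
            counts[max(0, min(bins - 1, idx))] += 1
        # eps keeps empty bins from causing log(0) or division by zero.
        return [max(c / len(values), eps) for c in counts]

    e = bin_fractions(expected)
    a = bin_fractions(actual)
    return sum((ai - ei) * math.log(ai / ei) for ai, ei in zip(a, e))


# Synthetic data for illustration: one unshifted and one shifted sample.
rng = random.Random(0)
baseline = [rng.gauss(0.0, 1.0) for _ in range(5000)]
same = [rng.gauss(0.0, 1.0) for _ in range(5000)]
shifted = [rng.gauss(1.0, 1.0) for _ in range(5000)]

psi_same = population_stability_index(baseline, same)
psi_shifted = population_stability_index(baseline, shifted)
print(f"no shift: {psi_same:.4f}, mean shift: {psi_shifted:.4f}")
```

A commonly cited rule of thumb treats PSI below 0.1 as stable and above 0.25 as a significant shift. These cutoffs are conventions rather than guarantees, so alert thresholds should be tuned per feature. An automated monitor could run this check on a schedule against recent production data.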