Data Source Mismatch: Training & Production Data

From The Foundation for Best Practices in Machine Learning

Control

Define and deploy methods to detect how closely the data sources and Features used in Model training match those used in production. If a mismatch is detected, take measures to ensure that data sources and Features are adequately matched across Model training and production data.


Aim

To (a) reduce nonsensical Model predictions caused by (i) missing data, (ii) data that has not been incorporated, or (iii) differences in data measurement scaling, encoding and/or meaning; (b) reduce the discrepancy between training and production data; and (c) highlight associated risks that might occur in the Product Lifecycle.


Additional Information
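
As an illustration only, the kind of detection method the Control calls for might be sketched as follows. This is a minimal sketch, not a prescribed implementation: the names `MismatchReport` and `compare_sources`, the row-as-dictionary data layout, the example records, and both thresholds are assumptions made for this example. It checks for Features absent from production, increases in missing values, shifts in numeric scale, and category encodings not seen in training.

```python
from dataclasses import dataclass, field
from statistics import mean, pstdev
from typing import Any

Row = dict[str, Any]  # assumption: one record is a mapping of Feature name -> value


@dataclass
class MismatchReport:
    """Hypothetical container for detected training/production mismatches."""
    missing_features: list[str] = field(default_factory=list)   # in training only
    extra_features: list[str] = field(default_factory=list)     # in production only
    missing_rate_increase: dict[str, float] = field(default_factory=dict)
    scale_shift: dict[str, float] = field(default_factory=dict)
    unseen_categories: dict[str, set[Any]] = field(default_factory=dict)

    @property
    def has_mismatch(self) -> bool:
        return any((self.missing_features, self.extra_features,
                    self.missing_rate_increase, self.scale_shift,
                    self.unseen_categories))


def _present(rows: list[Row], name: str) -> list[Any]:
    """Values of one Feature, skipping absent keys and None."""
    return [r[name] for r in rows if r.get(name) is not None]


def _is_number(value: Any) -> bool:
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def compare_sources(train_rows: list[Row], prod_rows: list[Row],
                    max_missing_increase: float = 0.05,  # illustrative threshold
                    max_mean_shift: float = 3.0          # in training std deviations
                    ) -> MismatchReport:
    train_cols = {key for row in train_rows for key in row}
    prod_cols = {key for row in prod_rows for key in row}
    report = MismatchReport(missing_features=sorted(train_cols - prod_cols),
                            extra_features=sorted(prod_cols - train_cols))

    for name in sorted(train_cols & prod_cols):
        train_vals, prod_vals = _present(train_rows, name), _present(prod_rows, name)

        # Missing data: has the share of missing values grown in production?
        train_missing = 1 - len(train_vals) / len(train_rows)
        prod_missing = 1 - len(prod_vals) / len(prod_rows)
        if prod_missing - train_missing > max_missing_increase:
            report.missing_rate_increase[name] = prod_missing - train_missing

        if not train_vals or not prod_vals:
            continue

        if all(_is_number(v) for v in train_vals + prod_vals):
            # Scaling: production mean far from training mean, in training std units.
            spread = pstdev(train_vals)
            gap = abs(mean(prod_vals) - mean(train_vals))
            shift = gap / spread if spread > 0 else (float("inf") if gap > 0 else 0.0)
            if shift > max_mean_shift:
                report.scale_shift[name] = shift
        else:
            # Encoding: values seen in production but never in training.
            unseen = set(prod_vals) - set(train_vals)
            if unseen:
                report.unseen_categories[name] = unseen
    return report


# Made-up example: income switches from thousands to units, one country label
# changes encoding, and age is mostly missing in production.
train = [{"age": 34, "income": 52.0, "country": "NL"},
         {"age": 45, "income": 61.0, "country": "DE"},
         {"age": 29, "income": 48.0, "country": "NL"},
         {"age": 51, "income": 75.0, "country": "FR"}]
prod = [{"age": None, "income": 52000.0, "country": "Netherlands"},
        {"age": 40, "income": 61000.0, "country": "DE"},
        {"age": None, "income": 58000.0, "country": "NL"}]

report = compare_sources(train, prod)
print(report)
```

A mean-shift check of this kind is deliberately crude. In practice, the choice of comparison statistic and thresholds would depend on the Model and on the risks identified for the Product.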