Missing and Bad Data Assessment

From The Foundation for Best Practices in Machine Learning

Control

Document and assess (a) the occurrence rates and (b) the covariances of missing values and nonsensical values throughout the Model data. If either is significant, investigate the causes and consider either discarding the affected data dimension(s) or committing dedicated research and development to mitigating measures for them. (See Section 12.3.1 - Live Data Quality for further information.)
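
As an illustration only, the two quantities named in this Control could be computed along the following lines. This is a minimal sketch, not a prescribed method: the column names (age, income), the validity rules inside is_bad, and the sample rows are hypothetical, and what counts as "significant" is left to the reader because this Control does not set a threshold. Each value is reduced to a 0/1 indicator (1 = missing or nonsensical); the occurrence rate is the mean of that indicator, and the covariance between two columns is the population covariance of their indicators, E[XY] - E[X]E[Y].

```python
from itertools import combinations
from statistics import mean


def is_bad(column, value):
    """Hypothetical validity rules: None is missing, out-of-range is nonsensical."""
    if value is None:
        return True
    if column == "age":
        return not (0 <= value <= 120)
    if column == "income":
        return value < 0
    return False


def bad_value_indicators(rows, columns, is_bad_fn):
    """Return {column: [0/1 per row]}, where 1 marks a missing or nonsensical value."""
    return {
        c: [1 if is_bad_fn(c, row.get(c)) else 0 for row in rows]
        for c in columns
    }


def occurrence_rates(indicators):
    """Fraction of rows per column whose value is missing or nonsensical."""
    return {c: mean(flags) for c, flags in indicators.items()}


def indicator_covariances(indicators):
    """Population covariance of the bad-value indicators for each column pair."""
    rates = occurrence_rates(indicators)
    cov = {}
    for a, b in combinations(indicators, 2):
        joint = mean(x * y for x, y in zip(indicators[a], indicators[b]))
        cov[(a, b)] = joint - rates[a] * rates[b]
    return cov


# Hypothetical sample data (requires at least one row).
rows = [
    {"age": 34, "income": 52000},
    {"age": None, "income": None},   # both missing
    {"age": -5, "income": None},     # nonsensical age, missing income
    {"age": 51, "income": 41000},
]

indicators = bad_value_indicators(rows, ["age", "income"], is_bad)
rates = occurrence_rates(indicators)      # {'age': 0.5, 'income': 0.5}
cov = indicator_covariances(indicators)   # {('age', 'income'): 0.25}
```

A positive covariance, as in this sample, means bad values in the two columns tend to appear in the same rows, which may point to a shared upstream cause worth investigating before deciding whether to discard or repair the affected dimension(s).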


Aim

To (a) assess the risk of low-quality data introducing bias into Model data and/or Outcomes; (b) assess whether the quality of the Model dataset(s) is sufficient for the Product Definitions; and (c) highlight associated risks that might occur in the Product Lifecycle.


Additional Information