Data Lineage Manifest

From The Foundation for Best Practices in Machine Learning

Data Lineage Manifest

Control

Utilise a data lake for production data, intermediate results, and end results. Each step should be documented in a manifest that is passed from one step of the process to the next and always accompanies stored data and results.


Aim

To (a) create a structured way for tracing where data has been, what was done to it, and results; and (b) highlight associated risks in the Product Lifecycle.

Additional Information