Product Traceability

From The Foundation for Best Practices in Machine Learning



Objective
To ensure the clear and complete Traceability of Products, Models and their assets (inclusive of, amongst other things, data, code, artifacts, output, and documentation) for as long as is reasonably practical.


21.1. Product Definition(s)

Objective
To document and maintain an overview of the requirements necessary to complete the Product and the interdependencies in the Product design phase.
Item nr. Item Name and Page Control Aim
21.1.1. Document Storage

Define a single fixed storage solution for all reports, documents, and other traceability files.

To (a) prevent the usage and dissemination of outdated and/or incorrect files; (b) prevent the haphazard storage of Product reports, documents and/or files; and (c) highlight associated risks in the Product Lifecycle.

21.1.2. Version Control of Documents

Ensure that every change to a document is tracked. Each subsequent version ought to list the version number, author, date of change, and a short description of the changes made.

To (a) track changes to any and all documents; (b) ensure everyone is using the same and latest document version; and (c) highlight associated risks in the Product Lifecycle.
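
As a minimal sketch of such a change record, the snippet below stores the four fields named in the control and picks out the latest version. The `DocumentVersion` class, the dotted version format, and the sample entries are illustrative assumptions, not part of the guideline.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class DocumentVersion:
    """One entry in a document's change history (hypothetical schema)."""
    version: str       # dotted numeric version, e.g. "1.1" (assumed format)
    author: str
    changed_on: date
    description: str   # short description of the changes made


def latest_version(history: list[DocumentVersion]) -> DocumentVersion:
    """Return the newest entry, comparing version components numerically."""
    if not history:
        raise ValueError("document has no recorded versions")
    return max(history, key=lambda v: tuple(int(p) for p in v.version.split(".")))


# Placeholder entries for illustration only.
history = [
    DocumentVersion("1.0", "Author A", date(2024, 1, 10), "Initial draft"),
    DocumentVersion("1.1", "Author B", date(2024, 2, 2), "Clarified scope"),
    DocumentVersion("1.10", "Author A", date(2024, 3, 5), "Added cost section"),
]
current = latest_version(history)
```

Comparing the components as integers rather than as strings keeps "1.10" ordered after "1.1", which a plain string sort would get wrong.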

21.1.3. Architectural Requirements Document

Document which information technology resources are necessary for each element of the Product, to provide an overview of system requirements and cost distribution. Document why each resource was chosen.

To (a) provide clear documentation of which system resources are used, where they are used, why they are used, and costs; and (b) highlight associated risks in the Product Lifecycle.

21.2. Exploration

Objective
To document the impact analysis of each requirement.
Item nr. Item Name and Page Control Aim
21.2.1. Document Impact Analysis of Requirements

Document and complete an impact analysis on the resources and design of the Product that can result in technical debt.

To (a) avoid Product failures due to unresolved technical debt by documenting potential sources of friction and the solutions; and (b) highlight associated risks in the Product Lifecycle.

21.2.2. Resource Traceability Matrix

Provide and keep up to date a clear view of the relationships and interdependencies between resources in a documented matrix.

To (a) document and show resource coverage for each use case; and (b) highlight associated risks in the Product Lifecycle.
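
One possible shape for such a matrix, sketched under assumptions: use cases are rows, resources are columns, and each cell records whether the use case depends on the resource. The function names and the sample use cases and resources are hypothetical.

```python
def build_matrix(use_case_resources: dict[str, set[str]]) -> dict[str, dict[str, bool]]:
    """Build a use-case x resource matrix of booleans."""
    resources = sorted(set().union(*use_case_resources.values()))
    return {
        use_case: {resource: resource in needed for resource in resources}
        for use_case, needed in use_case_resources.items()
    }


def uncovered_use_cases(matrix: dict[str, dict[str, bool]]) -> list[str]:
    """List use cases that have no resource assigned at all."""
    return [uc for uc, row in matrix.items() if not any(row.values())]


def dependents(matrix: dict[str, dict[str, bool]], resource: str) -> list[str]:
    """List use cases affected if the given resource changes."""
    return [uc for uc, row in matrix.items() if row.get(resource, False)]


# Hypothetical requirements for illustration only.
requirements = {
    "batch_scoring": {"feature_store", "gpu_cluster"},
    "monitoring": {"metrics_db", "feature_store"},
    "retraining": set(),
}
matrix = build_matrix(requirements)
gaps = uncovered_use_cases(matrix)
impacted = dependents(matrix, "feature_store")
```

Reading a row shows resource coverage for one use case; reading a column shows which use cases are exposed when a resource changes.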

21.2.3. Design Traceability Matrix

Provide and keep up to date a clear view of the relationships and interdependencies between designs and interactions thereof in a documented matrix.

To (a) document design and execution status; (b) clearly trace current work and what can be pursued next; and (c) highlight associated risks in the Product Lifecycle.

21.2.4. Results Reproducibility Logs

Throughout the entire Product Lifecycle, whenever a Product component (inclusive of Models, experiments, analyses, transformations, and evaluations) is run, all parameters, hyperparameters and results ought to be logged and/or tracked, including unique identifier(s) for runs, artifacts, code and environments.

To (a) enable Absolute Reproducibility; and (b) validate Models and Outcomes by enabling analysis of logs, run comparisons and reproducibility.
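
The logging described above could look roughly like the following sketch, which uses only the Python standard library. The record layout, the field names, and the `code_version` argument (for example a commit hash supplied by the caller) are assumptions; a dedicated experiment-tracking tool would serve the same purpose.

```python
import hashlib
import json
import platform
import sys
import uuid
from datetime import datetime, timezone


def fingerprint(payload: dict) -> str:
    """Stable SHA-256 of a JSON-serialisable dict, independent of key order."""
    canonical = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def make_run_record(params: dict, results: dict, code_version: str) -> dict:
    """Assemble one log record for a single run (hypothetical schema)."""
    return {
        "run_id": uuid.uuid4().hex,                      # unique per run
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "code_version": code_version,                    # supplied by the caller
        "environment": {
            "python": sys.version.split()[0],
            "platform": platform.platform(),
        },
        "params": params,
        "params_hash": fingerprint(params),              # eases run comparison
        "results": results,
    }


def to_log_line(record: dict) -> str:
    """Serialise a record as one JSON line, ready to append to a log file."""
    return json.dumps(record, sort_keys=True)


# Placeholder values for illustration only.
record = make_run_record({"learning_rate": 0.1, "epochs": 5}, {"accuracy": 0.9}, "abc123")
line = to_log_line(record)
```

Two runs with identical parameters share a `params_hash` but never a `run_id`, so repeated runs can be grouped and compared without being confused with one another.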

21.3. Development

Objective
To document and maintain the status of each Product and its testing results; ensure 100% test coverage; and prevent inconsistencies between project elements as well as feature creep.
Item nr. Item Name and Page Control Aim
21.3.1. Backlog

Ensure that an effective backlog is maintained to track work items and serve as a historical representation and timeline of completed features and velocity.

To (a) ensure a comprehensive breakdown of Features and tasks necessary to achieve full product functionality; (b) provide highly readable coarse-grained versioning; and (c) highlight associated risks in the Product Lifecycle.

21.3.2. Documentation for Technical Contributors

Maintain technical documentation that enables all current and future contributors to develop and maintain the Product effectively and safely, including a description of each file, the workflow, authors, environments, and accrued technical debt.

To (a) maintain Product technical integrity by ensuring safe contribution and maintenance practices; and (b) highlight associated risks in the Product Lifecycle.

21.3.3. Version Control of Code

Maintain uninterrupted version control systems and practices of all code used by, in and during the Product and its Lifecycle.

To (a) contribute to Absolute Reproducibility; (b) ensure change lineage; (c) ensure Product Outcome lineage and traceability; and (d) highlight associated risks in the Product Lifecycle.

21.3.4. Docstrings and Code Comments

In each function, document the author, purpose, inputs, Output, and improvements to be made. Also document the source of the inputs and, where useful, a short business description of the data used.

To (a) ensure Model clarity as to technical progress; and (b) highlight associated risks in the Product Lifecycle.
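
A docstring covering these fields might look like the sketch below. The function, the field labels, and the `transactions` source are hypothetical; teams would normally adapt the layout to their own docstring convention.

```python
def normalise_amounts(amounts: list[float]) -> list[float]:
    """Scale transaction amounts to the range [0, 1].

    Author: <contributor name>
    Purpose: Put amounts on a common scale before they are used as a feature.
    Input: amounts, a list of transaction amounts.
    Input source: hypothetical `transactions` extract; one amount per
        customer purchase (short business description of the data).
    Output: list of floats in [0, 1], in the same order as the input.
    Improvements: handle missing values; make the target range configurable.
    """
    if not amounts:
        return []
    low, high = min(amounts), max(amounts)
    if high == low:
        # All values identical: avoid division by zero.
        return [0.0 for _ in amounts]
    return [(a - low) / (high - low) for a in amounts]


scaled = normalise_amounts([10.0, 20.0, 30.0])
```

Because the fields live in the docstring, they remain available at runtime through `normalise_amounts.__doc__` and to documentation generators.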

21.3.5. Project Status Reports

Ensure that all status reports and similar communications to Management and Stakeholders are stored and maintained, inclusive of team updates, reports to the Product Manager, and Stakeholder reports by request.

To (a) maintain a formal written record of decisions, progress and context evolution; and (b) highlight associated risks in the Product Lifecycle.

21.4. Production

Objective
To document the observed impact of updates to the product. Document product runs and their input for reproducibility.
Item nr. Item Name and Page Control Aim
21.4.1. Version control through CI/CD

Maintain distinct production versions so that the Product can easily be reverted or rolled back to a previous working version if production issues arise. A properly configured CI/CD pipeline enables easy redeployment of any artifact and version.

To (a) provide functional Product to users at all times; (b) seamlessly redeploy Product versions if needed; and (c) highlight associated risks in the Product Lifecycle.
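
The rollback idea can be illustrated with a small in-memory sketch. It is a stand-in for the release records a CI/CD system would keep, not a description of any particular tool; the `ReleaseRegistry` class, its methods, and the artifact names are all hypothetical.

```python
from typing import Optional


class ReleaseRegistry:
    """Keeps every registered version so any of them can be redeployed."""

    def __init__(self) -> None:
        self._artifacts: dict[str, str] = {}   # version -> artifact reference
        self._activations: list[str] = []      # every activation, oldest first

    def register(self, version: str, artifact: str) -> None:
        """Record the artifact for a version; versions are never overwritten."""
        if version in self._artifacts:
            raise ValueError(f"version {version} is already registered")
        self._artifacts[version] = artifact

    def activate(self, version: str) -> str:
        """Make a registered version active and return its artifact."""
        if version not in self._artifacts:
            raise KeyError(f"unknown version {version}")
        self._activations.append(version)
        return self._artifacts[version]

    @property
    def active(self) -> Optional[str]:
        return self._activations[-1] if self._activations else None

    def rollback(self) -> str:
        """Reactivate the most recent version that differs from the active one."""
        current = self.active
        for version in reversed(self._activations):
            if version != current:
                self.activate(version)
                return version
        raise RuntimeError("no earlier version to roll back to")


# Placeholder versions and artifact names for illustration only.
registry = ReleaseRegistry()
registry.register("1.0.0", "product-1.0.0.tar")
registry.register("1.1.0", "product-1.1.0.tar")
registry.activate("1.0.0")
registry.activate("1.1.0")
restored = registry.rollback()
```

The rollback is recorded as a new activation rather than by deleting history, so the sequence of what ran in production stays traceable.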

21.4.2. Data Lineage Manifest

Utilise a data lake for production data, intermediate results, and end results. Each step should be documented in a manifest that is passed from one step of the process to the next and always accompanies stored data and results.

To (a) create a structured way for tracing where data has been, what was done to it, and results; and (b) highlight associated risks in the Product Lifecycle.
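
A manifest that travels with the data from step to step could be sketched as follows. The entry fields and the `run_step` helper are assumptions, and the sketch leaves out the storage layer (the data lake) entirely; it only shows how each step appends its own record.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Any, Callable


def content_hash(data: Any) -> str:
    """SHA-256 of JSON-serialisable data, used to identify inputs and outputs."""
    encoded = json.dumps(data, sort_keys=True).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def run_step(
    name: str,
    func: Callable[[Any], Any],
    data: Any,
    manifest: list[dict],
) -> tuple[Any, list[dict]]:
    """Apply one processing step and return the result with an extended manifest."""
    result = func(data)
    entry = {
        "step": name,
        "input_hash": content_hash(data),
        "output_hash": content_hash(result),
        "finished_at": datetime.now(timezone.utc).isoformat(),
    }
    # Return a new list so earlier manifests are left unchanged.
    return result, [*manifest, entry]


# Toy pipeline for illustration only.
raw = [3, 1, 2]
ordered, manifest = run_step("sort", sorted, raw, [])
doubled, manifest = run_step("double", lambda xs: [x * 2 for x in xs], ordered, manifest)
```

Each entry's `input_hash` should equal the previous entry's `output_hash`; a mismatch indicates that data was altered between steps without being recorded.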