Definitions

From The Foundation for Best Practices in Machine Learning

Introduction

As used in this Best Practice, the following terms shall have the following meanings where capitalised. All references to the singular shall include references to the plural, where applicable, and vice versa. Any terms not defined or capitalised in this Best Practice shall hold their plain meaning as understood in English and in data science.


Dictionary

Absolute Reproducibility
means a guarantee that any and all results, outputs, outcomes, artifacts, etc. can be exactly reproduced under any circumstances.

Adversarial Action
means actions characterised by mala fide (malicious) intent and/or bad faith.

Assessment
means the action or process of making a series of determinations and judgments after taking deliberate steps to test, measure and collectively deliberate the objects of concern and their outcomes.

Assets
means information technology hardware that concerns Products Machine Learning.

Best Practice Guideline
means this document.

Business Stakeholders
means the departments and/or teams within the Organisation who do not conduct data science and/or technical Machine Learning, but have a material interest in Products Machine Learning.

Confidence Value
means a measure of a Model's self-reported certainty that the given Output is correct.
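
As an illustration only, one common way a classification Model reports a Confidence Value is the largest softmax probability over its raw class scores. The sketch below assumes that convention; the function names and the example scores are hypothetical and are not prescribed by this Best Practice.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def confidence_value(scores):
    """Return (Output, Confidence Value) under the softmax assumption.

    The Output is the index of the most probable class; the Confidence
    Value is the probability the Model assigns to that class.
    """
    probs = softmax(scores)
    output = max(range(len(probs)), key=probs.__getitem__)
    return output, probs[output]

# Hypothetical raw scores for three classes.
output, confidence = confidence_value([2.0, 1.0, 0.1])
```

A self-reported value of this kind is not necessarily calibrated, so it may need to be checked against observed Error Rates before it is relied upon.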

Corporate Governance Principles
mean the structure of rules, practices and processes used to direct and manage a company in terms of industry recognised and published legal guidelines.

Data Generating Process
means the process, through physical and digital means, by which Records of data are created (usually representing events, objects or persons).

Data Governance
means the systems of governance and/or management over data assets and/or processes within an Organisation.

Data Quality
means the calibre of qualitative or quantitative data.

Data Science
means an interdisciplinary field that uses scientific methods, processes, algorithms and computational systems to extract knowledge and insights from structured and/or unstructured data.

Domain
means the societal and/or commercial environment within which the Product will be and/or is operationalised.

Edge Case
means an outlier in the space of both input Features and Model Outputs.

Error Rate
means the frequency of occurrence of errors in the (Sub)population relative to the size of the (Sub)population.
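
A minimal sketch of this definition, assuming each Record has already been marked as an error or not: the Error Rate of a (Sub)population is its error count divided by its size. The record layout and the group names "A" and "B" are hypothetical.

```python
from collections import Counter

def error_rates(records):
    """Compute the Error Rate per (Sub)population.

    `records` is an iterable of (subpopulation, is_error) pairs.
    """
    totals = Counter()
    errors = Counter()
    for group, is_error in records:
        totals[group] += 1
        if is_error:
            errors[group] += 1
    # Errors relative to the size of each (Sub)population.
    return {group: errors[group] / totals[group] for group in totals}

# Hypothetical records: four from group "A", two from group "B".
records = [
    ("A", True), ("A", False), ("A", False), ("A", False),
    ("B", True), ("B", False),
]
rates = error_rates(records)
```

Reporting the rate per (Sub)population, rather than one overall figure, keeps differences between groups visible.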

Ethical Practices
means the ethical principles, values and/or practices that are encapsulated and promoted in an 'artificial intelligence' ethics guideline and/or framework, such as (a) The Asilomar AI Principles (Asilomar AI Principles, 2017), (b) The Montreal Declaration for Responsible AI (Montreal Declaration, 2017), (c) The Ethically Aligned Design: A Vision for Prioritizing Human Well-being with Autonomous and Intelligent Systems (IEEE, 2017), and/or (d) any other analogous guideline and/or framework.

Ethics Committee
means the committee within the Organisation charged with managing and/or directing Organisation Ethical Practices.

Evaluation Error
means the difference between the ground truth and a Model's prediction or output.
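
For numeric Outputs, the definition can be sketched as a per-Record difference. The sign convention used here (prediction minus ground truth) and the mean-absolute-error summary are assumptions made for illustration; the definition itself fixes neither.

```python
def evaluation_errors(ground_truth, predictions):
    """Return the per-Record Evaluation Error as prediction minus ground truth."""
    if len(ground_truth) != len(predictions):
        raise ValueError("ground_truth and predictions must have equal length")
    return [pred - truth for truth, pred in zip(ground_truth, predictions)]

# Hypothetical ground-truth values and Model predictions.
truth = [3.0, 5.0, 2.0]
preds = [2.5, 5.0, 4.0]

errors = evaluation_errors(truth, preds)
# One possible aggregate: the mean of the absolute errors.
mean_absolute_error = sum(abs(e) for e in errors) / len(errors)
```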

Executive Management
means the managerial team at the highest level of management within the Organisation.

Explainability
means the property of Models and Model outcomes to be interpreted and/or explained by humans in a comprehensible manner.

Fairness & Non-Discrimination
means the property of Models and Model outcomes to be free from bias against protected classes.

Features
mean the different attributes of datapoints as recorded in the data.

Guide
means an established and clearly documented series of actions or process(es) conducted in a certain order or manner to achieve particular outcomes.

Hidden Variable
means an attribute of a datapoint or an attribute of a system that has a causal relation to other attributes, but is itself not measured or unmeasurable.

Human-Centric Design & Redress
means orienting Products and/or Models to focus on humans and their environments through promoting human and/or environment centric values and resources for redress.

Implementation
means every aspect of the insertion and/or application of the Product and Model(s) into Organisation systems, infrastructure, processes and culture, and into Domains and Society.

Incident
means the occurrence of a technical event that affects the integrity of a Product and/or Model.

Label
means the Feature that represents the (supposed) ground-truth values corresponding to the Target Variable.

Machine Learning
means the use and development of computer systems and Models that are able to learn and adapt with minimal explicit human instructions by using algorithms and statistical modelling to analyse, draw inferences, and derive outputs from data.

Model
means Machine Learning algorithms and data processing designed, developed, trained and implemented to achieve set outputs, inclusive of datasets used for said purposes unless otherwise stated.

Organisation
means the concerned juristic entity designing, developing and/or implementing Machine Learning.

Outcome
means the resultant effect of applying Models and/or Products.

Output
means that which Models produce, typically (but not exclusively) predictions or decisions.

Performance Robustness
means the propensity of Products and/or Models to retain their desired performance over diverse and wide operational conditions.

Policy
means a documented course of normative actions or set of principles adopted to achieve a particular outcome.

Procedure
means an established and defined series of actions or process(es) conducted in a certain order or manner to achieve a particular outcome.

Product
means the collective and broad process of design, development, implementation and operationalisation of Models, and associated processes, to execute and achieve Product Definitions, inclusive of, inter alia, the integration of such operations and/or Models into organisation products, software and/or systems.

Product Lifecycle
means the collective phases of Products from initiation to termination - such as design, exploration, experimentation, development, implementation, operationalisation, and decommissioning - and their mutual iterations.

Product Manager
means either a Design Owner and/or Run Owner as identified in the Organisation Best Practice Guideline in Sections 3.1.4. & 3.1.7. respectively.

Product Owner
means the employee charged with (a) managing and maximising the value of the Product and its Product Team; and (b) engaging with various Business Stakeholders concerning the Product and its Product Definitions.

Product Subjects
means the entities and/or objects that are represented as data points in datasets and/or Models, and who may be the subject of Product and/or Model outcomes.

Product Team
means the collective group of Organisation employees directly charged with designing, developing and/or implementing the Product.

Project Lifecycle
means the collective phases of Products from initiation to termination - such as design, exploration, experimentation, development, implementation, operationalisation, and decommissioning - and their mutual iterations.

Protected Classes
mean (Sub)populations of Product Subjects, typically persons, that are protected by law, regulation, policy or based on Product Definition(s).

Public
means society at large.

Public Interest
means the welfare or well-being of the Public.

Representativeness
means the degree to which datasets and Models reflect the true distribution and conditions of Subjects, Subject populations, and/or Domains.

Root Cause Analysis
means the activity and/or report of the investigation into the primary causal reasons for the existence of some behaviour (usually an error or deviation).

Safety
means real, Product Domain-based physical harms that result from the application of Products and/or Models.

Security
means the resilience of Products and/or Models against malicious and/or negligent activities that result in Organisational loss of control over concerned Products and/or Models.

Selection Function
means a description (mathematical where possible) of the probability that, or the proportion in which, real Subjects who could potentially be recorded in a dataset are actually recorded in it.
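
When the size of each group in the real population is known, a simple empirical form of this description is the recorded proportion per group. The sketch below assumes such counts are available; the age bands and numbers are hypothetical.

```python
def selection_function(population_counts, recorded_counts):
    """Estimate, per group, the proportion of real Subjects that were recorded.

    `population_counts` maps each group to the number of real Subjects that
    could have been recorded; `recorded_counts` maps each group to the number
    actually present in the dataset (missing groups count as zero).
    """
    return {
        group: recorded_counts.get(group, 0) / size
        for group, size in population_counts.items()
        if size > 0  # skip empty groups to avoid division by zero
    }

# Hypothetical population and dataset counts by age band.
population = {"18-34": 400, "35-64": 500, "65+": 100}
recorded = {"18-34": 200, "35-64": 100}

selection = selection_function(population, recorded)
```

Proportions that differ strongly between groups, as in this example, suggest that the dataset may fall short on Representativeness.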

Social Corporate Responsibilities
means the structure of rules, practices and processes used to direct and manage a company in terms of industry recognised and published legal guidelines to positively contribute to economic, environmental and social progress.

Software
means information technology software that concerns Products Machine Learning.

Special Interest Groups
means a specific body politic, or a particular collective of citizens, who can reasonably be determined to have a material interest in the Product.

Specification
means the accuracy, completeness and exactness of Products, Models and/or datasets in reflecting Product Definitions, Product Domains and/or Product Subjects, either in their design and development and/or operationalisation.

Stakeholders
mean the department(s) and/or team(s) within the Organisation who do not conduct data science and/or technical Machine Learning, but have a material interest in Product Machine Learning.

Subjects
means the entities and/or objects that are represented as data points in datasets and/or Models, and who may be the subject of Product and/or Model outcomes.

(Sub)population
means any group of persons, animals, or any other entities represented by a piece of data, that is part of a larger (potential) dataset and characterised by any (combination of) attributes. The importance of (Sub)populations is particularly high when some (Sub)populations are vulnerable or protected (Protected Classes).

Systemic Stability
means the stability of Organisation, Domain, society and environments as a collective ecosystem.

Target Variable
means the Variable which a Model is made to predict and/or output.

Target of Interest
means the fundamental concept that the Product is truly interested in when all is said and done, even if it is something that is not (objectively) measurable.

Traceability
means the ability to trace, recount, and reproduce Product outcomes, reports, intermediate products, and other artifacts, inclusive of Models, datasets and codebases.

Transparency
means the provision of an informed target audience's understanding of Organisation and/or Products Machine Learning, and their workings, based on documented Organisation information.

Variables
mean the different attributes of subjects or systems which may or may not be measured.

Workflows
means the coordinated and standardised sequences of employee work activities, processes, and tasks.