From The Foundation for Best Practices in Machine Learning

About This Document and Wiki

Over the last decade, the adoption of Machine Learning within organisations has risen rapidly. Although the promise and value of Machine Learning are great, and ought to be encouraged, a myriad of pitfalls currently accompany this modern practice of data science. These include, amongst other things,

  1. a lack of organisational maturity in implementing and facilitating Machine Learning,
  2. the immediately bulky and burdensome process of automating organisational process(es) and decision-making through Machine Learning,
  3. the lack of transparency, and probable bias, Machine Learning fosters, and
  4. the scarcity of experienced and skilled Machine Learning professionals.

The above shortcomings have generated acknowledged operational, ethical, legal and governance risks. This, in turn, has created a need for a clear and thoughtful guideline of best practices on how to ethically and responsibly govern, manage and implement Machine Learning within any organisation. This Best Practice Guideline attempts to fill this void through an “organisational governance” perspective, while still encouraging the maximisation of Machine Learning.

The Best Practice Guideline advises on the structure(s) and control(s) that ought to be implemented within organisations when designing, developing and/or operationalising Machine Learning. These structure(s) and control(s) have been sourced from:

  1. legal precedents,
  2. analogous compliance frameworks, and
  3. best practices as experienced in industry.

The Best Practice Guideline has been developed principally by senior Machine Learning engineers, data science managers, and governance professionals for Machine Learning engineers, data science managers, governance professionals, legal practitioners, and, more broadly, management.

In approaching Machine Learning governance, the Best Practice Guideline adopts an “integrated-holistic” approach. This means that it attempts to:

  1. cater for all the organisational features of Machine Learning and its practice; as well as
  2. locate these within organisations as part of an integrated whole - as opposed to an isolated department.

In turn, it envisions that the structures and controls codified herein will be implemented proportionately, depending on organisation size, business needs, business risks, finances, technical feasibility, and, most importantly, the anticipated societal impact of any Machine Learning Product. Pragmatically, this means that small organisations looking to implement this Best Practice Guideline can merge the roles defined herein (such as Data Science Manager(s), Design Owner(s), and/or Run Owner(s)) into one another, if necessary. The same applies to its various policies and/or procedures. However, wherever avoidable, compressing this Best Practice Guideline - and, in turn, not implementing it in its entirety - is ill-advised. This is because the Best Practice Guideline has been specifically designed to ensure sufficient organisational structure(s) and control(s) to ethically and responsibly govern, manage, and implement Machine Learning, which must always be of principal concern.

The Best Practice Guideline is premised on the assumption that concerned organisations implement a hybrid of managerial “philosophies”, specifically Waterfall, Agile, and Scrum, when approaching Machine Learning. It, therefore, incorporates aspects of each of these “philosophies” within its structure(s) and control(s). The Best Practice Guideline also assumes that organisations operationally divide Machine Learning into (a) design and (b) run stages. Should any organisation not follow the above, the structure(s) and control(s) codified herein can be adjusted accordingly. However, any adjustments should be made cautiously and with due consideration, as they might compromise the integrity of this Best Practice Guideline.

It is important for readers to note that, in Section 5, this Best Practice Guideline clusters themes relating to Model and/or Product structures and controls together, instead of approaching them “linearly” based on Model design, development and implementation. The reason for this is twofold:

  1. it prevents unnecessary repetition, which would complicate this Best Practice Guideline further than is necessary; and
  2. it promotes an appreciation that, when approaching Model design, development and implementation, clustered themes ought to be continuously addressed and assessed.

This further affirms the Best Practice Guideline’s “integrated-holistic” approach.

Note that although this Best Practice Guideline has been specifically designed and developed to responsibly and ethically govern, manage and implement Machine Learning, it can - and ought to - be applied to all automated and/or algorithmic decision-making processes. Note further that for clarity on specific Machine Learning Model structures and controls, we refer you to our Machine Learning Model Governance: Objectives, Controls and Aims Guideline.