(Sub)population Proxies and Relationships

From The Foundation for Best Practices in Machine Learning



Document and assess the relationship between potential input Features and (membership of) (Sub)populations of interest, based on, among other things: (i) reviews with diverse Domain experts; (ii) explicit encoding of (Sub)population membership; (iii) correlation analyses; and (iv) visualization methods. If a relationship exists, exclude the concerned input Features from Model datasets, unless a convincing case can be made that the input Feature (or an adapted version of it) will not adversely affect any (Sub)populations. Document that case.


To (a) prevent Model decisions based directly or indirectly on protected attributes or protected class membership; (b) reduce the risk of Model bias against relevant (Sub)populations; (c) understand any differences in data distributions across (Sub)populations before development begins; and (d) highlight associated risks that might occur in the Product Lifecycle.

Additional Information
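
As an illustration of the correlation analyses named in the control, the following is a minimal sketch, not a prescribed method. It assumes (Sub)population membership has been explicitly encoded as a 0/1 indicator, and it flags any input Feature whose Pearson correlation with that indicator meets a cutoff. The function names, the feature names, the toy data, and the 0.3 threshold are all hypothetical choices made for this example. Pearson correlation captures only linear association, so a Feature that passes this check can still act as a proxy; the expert reviews and visualization methods listed in the control remain necessary.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # a constant column carries no linear association
    return cov / sqrt(var_x * var_y)


def flag_proxy_features(features, membership, threshold=0.3):
    """Return {feature_name: correlation} for candidate proxy Features.

    `features` maps a feature name to its list of values; `membership` is
    the 0/1 (Sub)population indicator. The default threshold is an
    arbitrary illustrative cutoff, not a recommended value.
    """
    flagged = {}
    for name, values in features.items():
        r = pearson(values, membership)
        if abs(r) >= threshold:
            flagged[name] = round(r, 3)
    return flagged


# Toy data (hypothetical): 1 = member of the (Sub)population of interest.
membership = [1, 1, 1, 0, 0, 0]
features = {
    "postcode_income_index": [2.0, 2.5, 3.0, 7.0, 7.5, 8.0],
    "account_age_years": [1.0, 5.0, 3.0, 3.0, 1.0, 5.0],
}

flagged = flag_proxy_features(features, membership)
print(flagged)
```

In this toy data, `postcode_income_index` is strongly associated with membership and is flagged, while `account_age_years` is not. A flagged Feature is a candidate for exclusion or adaptation, and the decision and its justification would then be documented as the control requires.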