Five Tools for Fairness
So, you already have an ML model (or are building one) and want to investigate the options available for evaluating its fairness. Where do you start?
If you are new to fairness metrics, I strongly recommend the engaging and easy reading at MachinesGoneWrong to understand why fairness is so difficult to define. At a very basic level, fairness depends on the context (use case/application), the stakeholders, and how you navigate the Impossibility Theorem, which shows that common fairness metrics can be mathematically at odds with one another. The more I read about the topic, the deeper my appreciation grew for the complexity and challenges of fairness metrics.

If you want to geek out further, check out the Tutorial on 21 Fairness definitions – it is an hour long but worth the time investment. You can also delve deeper into the still-incomplete FairML online book or review this Survey Paper on Bias and Fairness. The big takeaway from going down this rabbit hole is that while scientists and mathematicians are attempting to quantitatively define algorithmic fairness, the overarching conversation really needs to happen within a multi-disciplinary stakeholder group to help drive the best outcomes for all concerned parties.
Once I better understood the subject and its limitations, I researched open source toolkits that help examine models and extract quantitative fairness metrics. Here are my top five.
- The AIF360 open source toolkit from IBM is the earliest published and most comprehensive tool for evaluating AI fairness. It has been revised multiple times since its initial launch, and IBM recently donated it, along with other Trusted AI tools, to the Linux Foundation. The live demo, tutorials, and guidance materials that accompany the tool make it easy not just to understand the concepts but also to integrate the code (a minimal metrics sketch appears after this list).
- Microsoft’s Fairlearn toolkit, released in late 2019, is playing catch-up on available metrics and algorithms. Start with this paper before you dive in (a short metrics sketch follows the list).
- Google’s What-If Tool takes an interesting visualization approach to analyzing ML models but is more limited in the number of metrics and the types of analyses it supports.
- Aequitas began as an academic project and enables easy integration into the ML workflow – more details in this paper (a sketch of its audit workflow follows the list).
- FairML is a simple Python toolbox for auditing black-box models, based on this project (a sketch appears below).
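To make the entries above concrete, here is a minimal sketch of computing two common group fairness metrics with AIF360 on its bundled Adult census dataset. This assumes `pip install aif360` and that the raw Adult data files have been downloaded where the library expects them (it prints instructions if they are missing); treating `sex` as the protected attribute is just one illustrative choice.

```python
# Minimal AIF360 sketch: group fairness metrics on the Adult dataset.
from aif360.datasets import AdultDataset
from aif360.metrics import BinaryLabelDatasetMetric

dataset = AdultDataset()  # protected attributes include 'sex' and 'race'

metric = BinaryLabelDatasetMetric(
    dataset,
    privileged_groups=[{'sex': 1}],    # 1 = Male in the encoded data
    unprivileged_groups=[{'sex': 0}],
)

# Difference in favorable-outcome rates between groups (0 means parity)
print(metric.statistical_parity_difference())
# Ratio of favorable-outcome rates (1 means parity)
print(metric.disparate_impact())
```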
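A comparable sketch with Fairlearn, assuming a recent release that includes `MetricFrame` (0.5 or later); the `y_true`, `y_pred`, and `sex` arrays are illustrative stand-ins for your model's outputs.

```python
# Minimal Fairlearn sketch: disaggregate a metric by a sensitive feature.
from fairlearn.metrics import MetricFrame, demographic_parity_difference
from sklearn.metrics import accuracy_score

y_true = [0, 1, 1, 0, 1, 0]
y_pred = [0, 1, 0, 0, 1, 1]
sex    = ['F', 'F', 'F', 'M', 'M', 'M']

mf = MetricFrame(metrics=accuracy_score,
                 y_true=y_true, y_pred=y_pred,
                 sensitive_features=sex)
print(mf.by_group)      # accuracy computed per group
print(mf.difference())  # largest gap between groups

# Built-in fairness metric: gap in selection rates between groups
print(demographic_parity_difference(y_true, y_pred, sensitive_features=sex))
```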
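Aequitas expects a pandas DataFrame with `score` and `label_value` columns and treats the remaining columns as group attributes. Here is a sketch of its crosstab-and-disparity workflow, assuming the classic `aequitas` Python API; the toy data and the choice of group 'a' as the reference are illustrative.

```python
# Minimal Aequitas sketch: per-group rates and disparities vs a reference group.
import pandas as pd
from aequitas.group import Group
from aequitas.bias import Bias

df = pd.DataFrame({
    'score':       [1, 0, 1, 0, 1, 1, 0, 0, 1, 0],  # model decisions
    'label_value': [1, 1, 0, 0, 1, 0, 0, 1, 1, 0],  # ground truth
    'race':        ['a', 'a', 'a', 'a', 'b', 'b', 'b', 'b', 'b', 'b'],
})

# Confusion-matrix counts and rates (FPR, FNR, ...) for each group
xtab, _ = Group().get_crosstabs(df)

# Each group's rates relative to the chosen reference group
bdf = Bias().get_disparity_predefined_groups(
    xtab, original_df=df, ref_groups_dict={'race': 'a'})
print(bdf[['attribute_value', 'fpr_disparity', 'fnr_disparity']])
```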
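Finally, FairML audits a trained model as a black box by perturbing each input column and measuring how much the predictions move. A sketch assuming `pip install fairml`; the toy model and data are illustrative.

```python
# Minimal FairML sketch: audit a fitted model's dependence on each input.
import pandas as pd
from sklearn.linear_model import LogisticRegression
from fairml import audit_model

# Toy tabular data with a potentially sensitive column
X = pd.DataFrame({'income': [30, 60, 45, 80, 20, 55],
                  'sex':    [0, 1, 0, 1, 0, 1]})
y = [0, 1, 0, 1, 0, 1]

clf = LogisticRegression().fit(X, y)

# Perturb each column to estimate the model's reliance on it
importances, _ = audit_model(clf.predict, X)
print(importances)
```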
I also found references to internal tools from Facebook (Fairness Flow) and Accenture (ResponsibleAI); however, I wasn’t able to locate open source releases of either.
While these tools help provide some transparency into ML models, I believe they should always be used within a broader context of Responsible AI practices.