Do we trust the plane or the pilot? The problem with ‘trustworthy’ AI
Cédric Lombion - October 19, 2020 in Open Knowledge
- Human agency and oversight
- Technical robustness and safety
- Privacy and data governance
- Transparency
- Diversity, non-discrimination and fairness
- Societal and environmental well-being
- Accountability
“Public Impact Algorithms are algorithms which are used in a context where they have the potential for causing harm to individuals or communities due to technical and/or non-technical issues in their implementation. Potential harmful outcomes include the reinforcement of systemic discrimination (such as structural racism or sexism), the introduction of bias at scale in public services or the infringement of fundamental rights (such as the right to dignity).”

The problem does not lie in the definition of trustworthiness: the ethical principles and key requirements are sound and comprehensive. Instead, it arises from aggregating behind a single label concepts whose implementation presents extremely different challenges. Going back to the seven principles outlined above, two dimensions are conflated: the technical performance of the AI and the effectiveness of the oversight and accountability ecosystem which surrounds it. The principles fall overwhelmingly under the Oversight and Accountability category.
| Technical performance | Oversight and Accountability |
| --- | --- |
| Technical robustness and safety | Human agency and oversight |
| Transparency | Privacy and data governance |
| | Transparency |
| | Diversity, non-discrimination and fairness |
| | Societal and environmental well-being |
| | Accountability |
Building a trustworthy plane
The reason no one uses the expression ‘trustworthy plane’(1) or ‘trustworthy car’(2) is not that trust is unimportant to the aviation or automotive industries. It’s that trust is not a useful concept for legislative or technical discussions. Instead, more operational terms such as safety, compliance or suitability are used. Trust exists in the discourse around these industries, but it is placed in the ecosystem of practices, regulations and actors which drive the industry: for civil aviation this includes the quality of pilot training, the oversight of airplane design, or the safety standards written into legislation(3).

The concept of ‘trustworthy AI’ displaces the trust from the ecosystem to the tool. This has several potential consequences:

- Trust could become embedded in the discourse and legislation on the issue, pushing aside other concepts that are more operational (safety, privacy, explicability) or essential (power, agency(4)).
- Trustworthy AI could become an all-encompassing label, akin to an organic fruit label, which would legitimize AI-enabled tools, cutting off discussions about the suitability of a tool for a specific context or about whether it should be deployed at all. Why do the hard work of building accountable processes when a label can be used as a shortcut?
- Minorities and disenfranchised groups would again be left out of the conversation: the trust that a public official places in an AI tool is extended by default to their constituents.
We should not trust AI
Behind Open Knowledge’s Open AI and Algorithms programme is the core belief that we can’t and shouldn’t trust Public Impact Algorithms by default. Instead, we need to build an ecosystem of regulation, practices and actors in which we can place our trust. The principles behind this ecosystem will resonate with the definition of ‘trustworthy’ AI given above: human agency and oversight, privacy, transparency, accountability… But while a team of computer science researchers may deliver a breakthrough in explainable deep learning, the work needed to set up and maintain this ecosystem will not come from a breakthrough: it will be a years-long, multi-stakeholder, cross-sector effort that will face its share of opponents and headwinds. This work cannot, and should not, simply be a bullet point under a meaningless label. Concretely, this ecosystem would emphasize:

- Meaningful transparency: at the design level (explainable statistical models vs. black-box algorithms)(7), before deployment (clarifying goals, indicators, risks and remediations)(8), and during the tool’s lifecycle (open performance data, audit reports).
- Mandatory auditing: although algorithms deployed in public services should be open source, intellectual property laws mean that some of them will not be. The second-best option is consequently to mandate auditing by regulators (who would have access to the source code) and by external auditors using APIs designed to monitor key indicators (some of them mandated by law, others defined with stakeholders)(9).
- Clear redress and accountability processes: multiple actors intervene between the design and the deployment of an AI-enabled tool. Who is accountable for what will have to be clarified.
- Stakeholder engagement: algorithms used in public services should be proactively discussed with the people they will affect, and the possibility of not deploying the tool should be on the table.
- Privacy by design: the implementation of algorithms in the public sector often leads to more data centralisation and sharing, with little oversight or even impact assessment.
(1) The aviation industry talks about ‘airworthiness’, which is technical jargon for safety and legal compliance: https://www.easa.europa.eu/regulations#regulations-basic-regulation
(2) The automotive industry mainly talks about safety: https://ec.europa.eu/growth/sectors/automotive/legislation/motor-vehicles-trailers_en
(3) This is why national civil aviation authorities generally do not re-certify a plane already certified by the USA’s Federal Aviation Administration (FAA): they trust its oversight. The Boeing 737 MAX scandal broke that trust, and certification agencies around the world asked to re-certify the plane themselves: https://en.wikipedia.org/wiki/Boeing_737_MAX_groundings
(4) I purposefully did not mention fairness here. See this paper discussing the problems with using fairness in the AI debate: https://www.cs.cornell.edu/~red/fairness_equality_power.pdf
(5) It was published in February 2020, which means they already had access to the draft version of the Ethics Guidelines for Trustworthy AI: https://algorithmwatch.org/en/story/ai-white-paper/
(6) See also the report from the German government’s Data Ethics Commission, which defines five risk levels: https://algorithmwatch.org/en/germanys-data-ethics-commission-releases-75-recommendations-with-eu-wide-application-in-mind/
(7) Too little scrutiny is given to the relative performance of black-box algorithms vs. explainable statistical models. This paper discusses the issue: https://hdsr.mitpress.mit.edu/pub/f9kuryi8/release/5
(8) As of October 2020, Amsterdam (The Netherlands), Helsinki (Finland) and Nantes (France) are the only governments to have deployed algorithm registers. In all cases, however, the algorithms were deployed before being publicized.
(9) Oversight through investigation will still be needed. AlgorithmWatch has several projects in that direction, including a report on Instagram. This kind of work relies on volunteers sharing data about their social media feeds; Mozilla is also involved, helping them structure this kind of ‘data donation’ project: https://algorithmwatch.org/en/story/instagram-algorithm-nudity/