Innovating Public Procurement to Mitigate the Risks of Artificial Intelligence Systems
Artificial Intelligence (AI) systems are increasingly deployed in the public sector. Existing public procurement processes and standards are in urgent need of innovation to address potential risks and harms to citizens. Read our primer based on our research and on input from leading experts in the public sector, data science, civil society, policy, social science, and the law to learn about pathways forward.
Supporters: Northeast Big Data Innovation Hub, NYU Center for Responsible AI, Parity, Institute of Electrical and Electronics Engineers (IEEE), NYU Tandon School of Engineering, NYU Alliance for Public Interest Technology
The Challenges with AI Procurement
The COVID-19 pandemic has underlined how biases can manifest in many different aspects of public-use technology. For example, federal COVID-19 funding allocation algorithms have favored high-income communities over low-income communities due to historical biases prevalent in the training data. AI solutions that can be implemented quickly are typically provided by private companies. As more and more aspects of public service are infused with AI systems and other technologies provided by private companies, we see a growing network of privately owned infrastructure. As government entities outsource critical technological infrastructure (such as data storage and cloud-based systems for data sharing and analysis) to private companies under the guise of modernizing public services, we see a trend toward losing control over critical infrastructure and decreasing accountability to the public that relies on it. Unlike private companies, which have a responsibility to serve their shareholders and to generate profit, public entities must consider their entire population when delivering a solution and are tasked with mitigating harms introduced by AI to the communities they serve.
There are six tension points that emerge in the context of procuring AI systems. Acknowledging these points of tension can serve as a way to identify the most urgent areas of improvement within existing procurement processes and practices in order to address the societal impact of AI technologies. Read about the tension points below and download the primer to learn about how to address them.
The AI space in general is challenged by a panoply of terms that remain undefined. The first set of definitional challenges pertains to technologies and procedures: there is no agreed-upon notion of AI, or even algorithm. This can hinder, for example, the cataloging of existing socio-technical systems within government, but also the development of procurement innovation that is specific to these technologies. Agencies and local governments sometimes define these technologies for themselves, for example in registries or compliance reporting, or as part of new regulation, but there is no cross-agency coordination. Similarly, there are no agreed-upon definitions or procedures for AI impact and risk assessments or audits.
The second set of definitional challenges pertains to legal frameworks and principles, particularly justice. There are vastly different notions of what constitutes justice in the context of AI. In order to arrive at a workable definition of justice that, indeed, is just, it is necessary to include those who are affected by AI injustice.
The third set of definitional challenges pertains to metrics, particularly metrics of success, both for the AI system and for the process through which it was procured. There can be an absence of appropriate metrics of success, or the metrics of success can be conflicting. For example, financial fraud detection models measure success based on identifying anomalous behavior (e.g., out of the norm for an individual), which may mean (but is not limited to) fraudulent behavior. Anyone who has had their credit card frozen while on vacation likely appreciates this distinction. These systems are not used only by banks; public agencies may also deploy them to detect benefit fraud. While the technical definition of success may revolve around detecting as much fraud as possible, the practical definition of success may be to only detect those cases that are the most likely to constitute fraud.
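The gap between these two definitions of success can be made concrete with a small sketch. The example below uses hypothetical toy data (the transactions and their labels are illustrative, not drawn from any real system) to show how a detector can score perfectly on the technical metric of catching anomalies while half of its flags land on legitimate behavior:

```python
# Illustrative sketch (hypothetical data): an "anomaly" is not the same
# as actual fraud, so a model judged by how many anomalies it catches
# can look successful while flagging many legitimate cases.

# Each transaction: (is_anomalous, is_actually_fraud)
transactions = [
    (True,  True),   # unusual and fraudulent
    (True,  False),  # unusual but legitimate (e.g., a purchase while traveling)
    (True,  False),
    (True,  True),
    (False, False),
    (False, False),
]

# A detector that flags every anomalous transaction.
flagged = [t for t in transactions if t[0]]

# "Technical" success: share of anomalies caught (here, all of them).
anomaly_recall = len(flagged) / sum(1 for t in transactions if t[0])

# "Practical" success: share of flagged cases that are actually fraud.
fraud_precision = sum(1 for t in flagged if t[1]) / len(flagged)

print(anomaly_recall)   # 1.0 -- perfect by the technical definition
print(fraud_precision)  # 0.5 -- half the flags hit legitimate behavior
```

By the first metric the detector is flawless; by the second, it wrongly disrupts as many legitimate users as it catches fraudsters. Which metric a contract specifies is therefore a substantive policy choice, not a technical detail.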
The procurement process has been designed to prevent abuse, but as a result has solidified sets of procedures that tend to allow only large vendors to meet standards and compete for government contracts. Long-term contracts that result from these sets of procedures lock in government agencies over years and can create path dependencies. These path dependencies can also spill over into the newly emerging space of algorithmic auditing, where large vendors that already have contracts with government agencies add algorithmic auditing to their service portfolios, incentivizing government agencies to contract these services through their existing vendors. Tension in the procurement process also occurs with regard to bottlenecks that can exist in the procurement of various services, for example when there are delays or distortions in one of the procurement stages (particularly bid solicitation, vendor selection, contracting, and execution). As the disproportionate impact of AI systems on individual citizens and communities becomes more evident, there also is a pressing need to define at what point risk and impact assessments should be required in the procurement process, which – importantly – should also include the execution stage. As the procurement process is redefined and redesigned, the points at which public participation, expertise, and reasoned deliberation can happen and be integrated need to be defined.
The incentives that underpin both the procurement process in general, as well as the different organizations that come together through procurement, can be detrimental to creating AI impact assessment structures. Vendors are driven by capitalist incentives and are focused on profit generation in the offering of their services to public entities. The primary responsibility most vendors have is toward their shareholders, not necessarily their clients or the people who use their services and systems. This means the current procedural and cultural set-ups of procurement are propped up by incentive structures that are not conducive to measures that would protect the public or mitigate algorithmic harm, not least because these could potentially slow down the procurement process. At this point, government and vendor incentives actually are aligned: both are incentivized to solve a problem fast and contract it out once it has been identified and budgets have been approved. Both actors are also incentivized to frame a technological solution as the most efficient solution to any given problem. Government employees who are tasked with managing the procurement process may also not be incentivized to change the procurement process in order to account for the potential harms of AI technologies. Their task is to procure as efficiently as possible; there is no organizational or career reward for changing existing procedures.
The institutional structure of government can challenge procurement innovation to account for the specifics of AI systems, particularly when it comes to the timeframe for solving problems and achieving such innovation. According to former U.S. Chief Data Scientist DJ Patil, government policies are generally constructed within a 10-year time frame. Policymakers seek to create impact within their (much shorter) tenure, often with a view toward reelection. Industry is moving at an even faster pace. Policymaker and industry timelines converge when decision-makers are confronted with having to solve issues within their tenure, which does not necessarily align with the organizational priorities and timeframes of government. Another issue is that “government” is often treated as monolithic, excluding other forms of government (such as Indigenous governments) from the conversation, and therefore from efforts to innovate. Furthermore, the ways in which procurement is currently organized can negate and foreclose infrastructural sovereignties for First Nations.
Procurement is the gateway for technology infrastructure implementation, and therefore has long-term effects on cities, communities, and on agencies themselves. Due to the lack of capacity and/or resources to build technology in-house, procurement invites the building of large-scale technology infrastructure, including AI, through private vendors. This dynamic does not foster transparency but prevents it: often protected by trade secret law, private vendors are not obligated to open up the “black box” and share insight into their training data or their models. Relatedly, a promised outcome or a narrative often is treated as more important than the technological backend. These promised outcomes and narratives often are a point of convergence for agencies and the private sector, and they are treated with higher priority than the establishment of accountability in the procurement process, for example in the context of the climate emergency and “clean tech”, or technologies that are deployed for public health management in the context of the COVID-19 pandemic.
AI systems that are implemented in this way, and that can cause harm to communities, such as through over-surveillance and -policing, become infrastructural and therefore are unlikely to be taken down, even when harm is proven, for example in the context of smart sensors installed in street lights. More transparency and oversight in the procurement process (vs. attention to the promise of a technology) can prevent the implementation of AI infrastructures that can prove to be harmful by design. Agencies are in need of resources to build up literacies and capacities around the impacts of procuring AI systems, which includes knowledge sharing across agencies.
Procurement innovation is impossible without considering legal implications, particularly pertaining to protecting the procuring institutions from liabilities. Public services are often held to a higher standard for their services and outcomes than private companies, and liabilities are often treated as net risk, rather than as an adaptive framework for risk management. Additionally, the impact of AI systems creates new complexities that challenge existing liability practices and regimes.
Currently there are few, if any, meaningful legislative safeguards to protect against the evolving discriminatory impacts of AI systems, such as the violation of human rights or antidiscrimination laws. Even when public agencies are subject to legal obligations covering AI or other technologies, these obligations are not necessarily considered by private companies in the design and testing of a technology product for use in the public sector. Rather, companies tend to use spaces with little citizen and public sector protection as a liability-free testing ground. The technologies that are tested in these liability-free spaces are then deployed by local agencies.
In addition, the inherent uncertainty about the capabilities and functionalities (ex ante) and impacts (ex post) of AI systems may require a re-evaluation of the distribution of liabilities across AI supply chains. Direct contractors of public agencies may have many different suppliers themselves, calling into question who holds the liability for the performance of the AI system that gets deployed. Important measures that can help develop more clarity on the distribution of liabilities across this AI supply chain include improved clarity in applicable vendor contracts (e.g., allocation of liability, warranties, clarity on trade secret protections, or insurance for AI incidents); rigorous vendor diligence; post-deployment monitoring; and quality standards.
There are five narrative traps that hinder innovation in AI procurement. Paying attention to these narrative traps is important, because narratives are powerful tools that shape the trajectory of policies and practices. Read about the narrative traps below and download the primer to learn about how to avoid them.
“We must engage the public.”
This statement deliberately diffuses what “engagement” means – does it refer to a democratic process, citizen participation in administrative decision-making processes, public oversight, or something else? Leaving engagement undefined means that communities will find it difficult to demand it. It also leaves unclear what “the public” means and ignores those communities who have historically been excluded from that frame (see narrative trap 2). Those communities may even be excluded further in spaces that are centered on public engagement and participation. And lastly, it unloads accountability onto “the public.” When there is a call for the public to get engaged in order to mitigate the risk of AI systems, this is an indicator that elected officials and administrators have failed, or are about to fail, in adequately representing the interests of the public.
“We must find simple definitions of ‘X.’”
Narratives that call for simple definitions of the complex contexts and issues around AI drive the deployment of monolithic framings of key terms such as “the public,” “government,” “bias,” and “algorithm.” These monolithic framings tend to exclude nuance and derive from dominance in the public discourse, therefore upholding historic power structures. For example, historically the term “the public” did not include communities of color, and “government” did not mean “Indigenous government.” A continued use of those terms without nuance and historic context will perpetuate exclusion. Similarly, calls for simplification can function as incubators for new narrative traps, because they generally set the expectation that generalization – over nuance – is the key to successful technology design and policy processes. For example, a call for generalizing “bias” can actually cause terms like “harm” or “impact” to be generalized to a degree that they are removed from the actual lived experiences of individuals.
“The main threat is the government use of data.”
The idea that government use of data is potentially a bigger threat than the private use of data is a fallacy. In many contexts, governments or government agencies do not possess data archives of the same size as industry, nor do they have the capacity to collect, clean, and/or analyze that data. Relatedly, they may have stricter rules as to the collection and processing of data, limiting their ability to use data effectively where it would be needed. Narratives that frame government as the greater threat are a trap, because they divert regulatory attention and scrutiny away from private companies. They also distract from the government capture of citizen data through private companies via vendor contracts or public-private partnerships.
“One incentive shared across all actors can initiate change.”
This narrative is a trap because it implies that a silver-bullet approach can serve as a solution for very complex and emerging problems that occur in the context of AI and procurement. It perpetuates the idea that goal-alignment and compliance can be achieved through introducing a shared incentive (e.g., avoiding fines), a strategy that ignores that there are larger systems of (capitalist or bureaucratic) incentives at play for different actors that may override any single incentive. The narrative is also a trap because it can lead to oversimplified goals that are poorly defined and therefore hard to create accountability mechanisms for (such as “help citizens”). Lastly, it can distract from other change mechanisms that may be more effective for innovating AI procurement.
“We can create change in AI design and deployment through procurement alone.”
This last narrative trap is one that we do not want to fall into with this project and with this primer. Narratives postulating that procurement can solve any and all issues related to government use of AI systems (ranging from potential harms to liabilities) are based on a silver-bullet approach that is likely to promise innovation and change in ways it cannot deliver, especially across the many different government agencies. They also ignore the fundamental fact that it is exceptionally difficult to change and improve the procurement process iteratively in order to address the emergent nature of AI risk.
CALLS FOR ACTION
There are actions that can help kick off and sustain innovation in AI procurement to address and mitigate the risks AI systems can pose to citizens and particularly marginalized communities. Read about these actions below and in our primer.
Re-define the process.
The growing need for developing and enforcing accountability structures in the context of the procurement and deployment of AI systems by public agencies suggests that said processes should be redefined so that public agencies can understand the tradeoffs and benefits of AI systems more quickly and with greater certainty. There needs to be ample time to define and document the problem that the AI system should address, and how it should address it. Affected communities must be heard. New AI legislation, on both the national and international levels (such as the new EU AI regulation), must be absorbed effectively and translated into change in AI design, procurement, and use. A redefinition of this process can create space for the notion of collective accountability to evolve through innovation in AI procurement, whereby government agencies, as buyers, can extend their power to demand accountability and transparency of vendors, and where agencies can go back and iterate when problems occur. It can also address the false dichotomy between fast approaches that are “wrong” and slower approaches that are “right.” This redefinition should recall that policy decisions are often encoded in the definition and construction of AI systems, and therefore procurement of these systems should be undertaken with the nuance and consideration given to other policy determinations.
Create meaningful transparency.
There is a need to improve how AI systems, and the risks and harms they can pose, are communicated and presented. Procurement officers, policymakers, citizens, and vendors must gain a better understanding of how individual situations of AI harm and risk connect to bigger structural problems, and vice versa. In order to create meaningful transparency, standards for the communication of the goals and assumptions baked into an AI system, as well as the risks and harms the system can pose, should be established, alongside guidelines for the documentation and record keeping of such communication.
Build a network.
There is a need for interagency communication, as well as exchange and capacity building on issues related to the procurement of AI systems. There also is a need to more clearly define intra-agency responsibilities for the procurement of AI systems and the impact they can have on citizens as well as on the agencies themselves. Resources must be developed and shared for supporting individuals and communities within agencies who are working toward improving procurement processes in order to mitigate AI harm. Similarly, assistance for communities outside of agencies who are surfacing AI harms and issues should be made available. These resources and opportunities for capacity building should be pooled in a network of procurement officers, AI researchers, representatives of advocacy groups, and more.
The field of public interest technology is growing significantly. Government agencies are increasingly using technology, including AI systems, across all aspects of their work. This means that there is a growing need for public interest technologists: professionals who are trained in both technical and social science fields and are able to adequately assess the social impacts of technology as they continually emerge. Luckily, there also is a growing desire among the next generation of technologists to engage in meaningful work that takes into account the social impact of technology. It is therefore paramount that this talent is cultivated early and equitably. Education must become more interdisciplinary and applied, while being grounded in theories of not just ethics and moral philosophy, but scholarship of inequality, racism, feminism, intersectionality, and more. This education must continue beyond academia and be afforded to practitioners and communities alike. At the same time, government agencies and the private sector should focus on creating a whole range of public interest technology jobs, recruit and retain diverse talent early in their careers, and support public interest technology teams internally and externally.
JOIN THE COMMUNITY
If you are interested in gaining insight, sharing news, and collectively building capacity on the public procurement of AI systems, request access to the AI Procurement Google Group!
The AI Procurement Roundtables Project was hosted at New York University (virtually) in the winter of 2020-2021 and brought together leading experts in the public sector, data science, civil society, policy, social science, and the law to generate a structured understanding of existing public procurement processes and identify how they can best mitigate risk and support community needs. These experts had three separate conversations focused on mapping data science solutions used by public institutions, algorithmic justice and responsible AI, and governance innovation and procurement in the context of AI.
Rather than prescribing abstract pathways to procurement innovation in order to account for AI risk, the material presented on this site and in the primer sets out to equip individuals, teams, and organizations with the knowledge and tools they need to kick off procurement innovation as it is relevant to their field and circumstances.
The team extends their sincere gratitude to the Northeast Big Data Innovation Hub for funding this project, and to the roundtable participants for bringing their expertise and critical voices to the table. Roundtable participants included: Stephanie Russo Carroll, Jonas Eliasson, Brandeis Marshall, Deirdre Mulligan, Clara Neppel, Kholoud Odeh, Andrea Renda, Renee Sieber, Jeff Thamkittikasem, Linda van de Fliert, Stefaan Verhulst, Meg Young, Christina J. Colclough, Chris Gilliard, Noel Hidalgo, Katie Kinsey, Safiya Noble, Kathleen Riegelhaupt, Fabian Rogers, Andrew Strait, Chris Albon, Ashley Casovan, Alex Engler, Mark Latonero, Graham MacDonald, Sean McDonald, Emanuel Moss, Eli Pariser, DJ Patil, Desmond Patton, Adrianna Tan, Hana Schank, Julia Stoyanovich, Vincent Southerland and Bianca Wylie.
The material presented on this website and in the primer is the authors’ reflection on the conversations that took place during the roundtables, as well as the research the project team has undertaken on the topic of AI procurement in the United States. This primer does not represent the opinions of the roundtable participants.
Mona Sloane, Ph.D., Principal Investigator
Mona Sloane, Ph.D., is a sociologist working on design and inequality, specifically in the context of AI design and policy. She frequently publishes and speaks about AI, ethics, equitability, and policy in a global context. Mona is a Senior Research Scientist at the NYU Center for Responsible AI, and an Adjunct Professor at NYU’s Tandon School of Engineering, as well as a Fellow with NYU’s Institute for Public Knowledge (IPK), where she convenes the Co-Opting AI series and co-curates The Shift series. She also is the technology editor for Public Books, and a fellow with The GovLab.
From fall 2021, Mona will serve as the director of the *This Is Not A Drill* program, which will develop a public pedagogy on art, equity, technology, and the climate emergency. Recent projects she has led as principal investigator include the Terra Incognita NYC project, an investigation of New York City’s digital public spaces in the pandemic, as well as the Procurement Roundtables project, which she led together with Dr. Rumman Chowdhury (Director of Machine Learning Ethics, Transparency & Accountability at Twitter, Founder of Parity) in collaboration with John C. Havens (IEEE Standards Association). Currently, Mona works with Emmy Award-winning journalist and NYU journalism professor Hilke Schellmann on hiring algorithms, auditing, and new tools for investigative journalism and research on AI. With Dr. Matt Statler (NYU Stern), she also leads the Public Interest Technology Convention and Career Fair project which will bring together students and organizations in the public interest technology space across the United States and beyond.
Mona is also affiliated with the Tübingen AI Center in Germany where she leads a 3-year federally funded research project on the operationalization of ethics in German AI startups. From 2020-2021 she was part of the inaugural cohort of the Future Imagination Collaboratory (FIC) Fellows at NYU’s Tisch School of the Arts.
Mona holds a PhD from the London School of Economics and Political Science and has completed fellowships at the University of California, Berkeley, and at the University of Cape Town. She has written for The Guardian, MIT Technology Review, Frankfurter Allgemeine Zeitung, OneZero Medium, and other outlets. You can follow her on Twitter @mona_sloane.
Rumman Chowdhury, Ph.D., Co-Principal Investigator
Dr. Rumman Chowdhury’s passion lies at the intersection of artificial intelligence and humanity. She is a pioneer in the field of applied algorithmic ethics, creating cutting-edge enterprise technical solutions for ethical, explainable, and transparent AI since 2017.
She is currently the Director of the META (ML Ethics, Transparency, and Accountability) team at Twitter, as well as GP of a venture capital fund, Parity Responsible Innovation Fund, that invests in early-stage responsible technology startups. She was previously CEO and founder of Parity AI, an enterprise algorithmic audit platform company and formerly served as Global Lead for Responsible AI at Accenture Applied Intelligence.
Rumman has been featured in international media, including the Wall Street Journal, Financial Times, Harvard Business Review, NPR, MIT Sloan Magazine, MIT Technology Review, BBC, Axios, Cheddar TV, CRN, The Verge, Fast Company, Quartz, Corriere della Sera, Optio, Australian Broadcasting Channel, and Nikkei Business Times. She has been recognized as one of BBC’s 100 Women, named one of the Bay Area’s Top 40 Under 40, and inducted into the British Royal Society of the Arts (RSA). She has also been named by Forbes as one of the Five Who are Shaping AI.
As service to the field and the larger community, she serves on the board of Oxford University’s Commission on AI and Governance, the University of Virginia’s Data Science Program, and Patterns, a data science journal by the publishers of Cell.
Dr. Chowdhury holds two undergraduate degrees from MIT, a master’s degree in Quantitative Methods of the Social Sciences from Columbia University, and a doctorate in political science from the University of California, San Diego. You can follow her on Twitter @ruchowdh.
John C. Havens, Collaborator
John C. Havens is Executive Director of The IEEE Global Initiative on Ethics of Autonomous and Intelligent Systems, which has two primary outputs – the creation and iteration of a body of work known as Ethically Aligned Design: A Vision for Prioritizing Human Well-being with Autonomous and Intelligent Systems and the recommendation of ideas for Standards Projects focused on prioritizing ethical considerations in A/IS. Currently there are thirteen approved Standards Working Groups and one completed Standard in the IEEE P7000™ series. He is also Executive Director for The Council on Extended Intelligence (CXI), which was created to ensure society prioritizes people, planet, and prosperity by promoting responsible participant design, control of data rights, and holistic metrics of prosperity. CXI is a program founded by The IEEE Standards Association and MIT, whose members include representatives from the EU Parliament, the UK House of Lords, and dozens of global policy, academic, and business leaders. Previously, John was an EVP of Social Media at the PR firm Porter Novelli and a professional actor for over 15 years. John has written for Mashable and The Guardian and is the author of the books Heartificial Intelligence: Embracing Our Humanity To Maximize Machines and Hacking Happiness: Why Your Personal Data Counts and How Tracking it Can Change the World. You can follow John on Twitter @johnchavens.
Tomo Lazovich, Ph.D., Collaborator
Tomo Lazovich, Ph.D., is a machine learning scientist focused on issues at the intersection of technology and equity, in particular working to eliminate the disproportionate harms that AI systems have on marginalized communities. Tomo is currently a Senior Machine Learning Researcher on Twitter’s ML Ethics, Transparency, and Accountability (META) team. They have previously done research at the Charles Stark Draper Laboratory and Lightmatter. Tomo received their PhD in Physics from Harvard University in 2016. You can follow them on Twitter @laughsovich.
Luis C. Rincon Alba, Research Assistant
Luis Rincon Alba is a Colombian artist and scholar based in New York City since 2010. He has taught at the departments of Art and Public Policy and Performance Studies at New York University’s Tisch School of the Arts. He is currently a doctoral candidate in the Performance Studies Department at New York University and a Public Humanities Fellow at Humanities NY and the Urban Democracy Lab. As an actor, performer, and oral narrator, he has collaborated with different artistic collectives in his home country and also in Brazil, Argentina, Mexico, the United States, and Italy. He is also the artistic director of the collective MUSA Presents.
Additional support was given by the Institute of Electrical and Electronics Engineers (IEEE), the NYU Tandon School of Engineering, and the NYU Alliance for Public Interest Technology.
In addition to extending our heartfelt thanks to our supporters and roundtable participants, we want to thank David Rubenstein and Anna Gressel for their insightful comments on the primer, as well as Janina Zakrzewski, Kayla Krieger, Melissa Lucas-Ludwig, Luis C. Rincon Alba, and Clara McMichael for their invaluable support.
© Innovating AI Procurement 2021