DEI for AI: Is There a Policy Solution to Algorithmic Bias?


21 Nov 2025


As artificial intelligence capabilities grow rapidly, AI has been increasingly used to make significant, high-stakes decisions in hiring, healthcare, insurance, and housing. This type of use will only expand, as algorithms allow companies to make decisions more quickly and cheaply. It has become increasingly apparent, however, that artificial intelligence is rife with bias, and decisions made by AI often discriminate against protected classes. Many cases alleging algorithmic discrimination have already emerged, asserting violations of statutes such as the Fair Housing Act or the Americans with Disabilities Act, and states have attempted to address the problem with various policy solutions. These policy solutions, however, frequently fail to address the core issue that creates biased models: the use of biased, unrepresentative training data. Although solutions exist and should be implemented to mitigate discrimination by AI, algorithmic bias is inherently baked into these systems. Rather than challenging individual cases under existing laws, it would be far more effective to disallow the use of AI in consequential decisions.

Algorithmic bias largely arises from the way that AI models are trained. Large language models such as ChatGPT are predictive models: they use training data to estimate the likelihood of certain strings of words and select the most plausible next word, creating passages of text that appear reasonable but are accurate only by coincidence. As a result, they often recreate patterns in unpredictable ways and can amplify biases in training data to a degree that even a biased human decision-maker would not. In an often-cited and particularly egregious example, Amazon’s experimental AI hiring tool systematically downgraded resumes that indicated the applicant was a woman. The algorithm, which had been trained on a decade of resumes submitted to the company, most of them from men, learned that candidates who attended all-women’s colleges or played on women’s sports teams did not resemble past hires, and ranked candidates accordingly. In a more recent example, plaintiffs alleged that State Farm’s use of AI to detect fraudulent claims relied on biometric and behavioral data to infer race, flagging Black homeowners more frequently for fraud and subjecting them to additional delays and hurdles. There has also been a longstanding lack of transparency about exactly what data models are trained on; many systems are described as “black boxes” in which even the developers cannot control or fully explain how the algorithm has interpreted the data it has been fed.
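To make the mechanism concrete, the sketch below builds the simplest possible predictive model in Python: it counts which word follows each two-word context in a tiny, invented training corpus and always emits the most frequent continuation. The data and scale are purely illustrative, not a description of any real system, but the dynamic is the one described above: the model reproduces, and in fact amplifies, the associations in its training data without any notion of gender, fairness, or intent.

```python
from collections import Counter, defaultdict

# Toy "predictive model" trained on a tiny, deliberately skewed, invented
# corpus. Like a large language model (at a vastly smaller scale), it only
# learns which word tends to follow a given context in its training data.
training_sentences = [
    "the engineer finished his report",
    "the engineer finished his coffee",
    "the engineer finished his shift",
    "the nurse finished her shift",
]

# Count which word follows each two-word context.
context_counts = defaultdict(Counter)
for sentence in training_sentences:
    words = sentence.split()
    for i in range(2, len(words)):
        context_counts[(words[i - 2], words[i - 1])][words[i]] += 1

def predict_next(context):
    """Return the most frequent next word seen after this two-word context."""
    counts = context_counts[context]
    return counts.most_common(1)[0][0] if counts else None

# The model has no concept of gender, fairness, or intent; it simply repeats
# the statistical association in its data. Worse, because it always picks the
# single most likely word, a 3-to-1 skew in the training data becomes a 100%
# skew in the output: the bias is amplified, not merely reproduced.
print(predict_next(("engineer", "finished")))  # -> "his"
print(predict_next(("nurse", "finished")))     # -> "her"
```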

Developers’ attempts to resolve bias can lead to worse results. Half-baked solutions are an insufficient replacement for robust training data and often produce equally problematic outputs of their own. The “Black Nazi Problem,” for example, describes unrealistically diverse AI-generated images, such as Black Nazis or Indigenous female popes. Although decried as overly “woke,” these outputs usually result from overcorrections by developers who have not incorporated sufficiently diverse training data: because image generators otherwise tend to depict exclusively white men, developers simply append words such as “Black” or “woman” to user prompts rather than fixing the underlying data. Attempts by developers to install other types of guardrails can be equally faulty. Many models, for example, can be manipulated through specifically worded prompts to ignore their safety policies and disseminate harmful information. Although developers are working on these issues, large language models lack the fundamental capacity to reason, and “telling” a model to follow certain instructions will never be as effective as incorporating or removing training data. Nor is simply deleting demographic information a fix: eliminating race or gender data often exacerbates algorithmic bias, because proxies such as names, zip codes, or extracurriculars remain while the information needed to detect disparities is lost. The specific methods used to address algorithmic bias are therefore vital to preventing undesired or discriminatory outcomes.
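The sketch below illustrates, in purely hypothetical form, the kind of prompt-level patch described above. No vendor’s actual code is public, and the function and term list here are invented; the point is only that a fix applied to the words of a prompt, rather than to the training data, attaches just as readily to prompts where the result is ahistorical or nonsensical.

```python
# Hypothetical sketch of a prompt-level "diversity" patch of the kind
# described above. Real systems' code is proprietary and not public; this
# illustrates only the general approach, not any vendor's implementation.

DIVERSITY_TERMS = ["Black", "Asian", "Indigenous", "female"]

def patch_prompt(user_prompt: str, index: int = 0) -> str:
    """Blindly append a demographic term to an image-generation prompt."""
    term = DIVERSITY_TERMS[index % len(DIVERSITY_TERMS)]
    return f"{user_prompt}, {term}"

# Because the patch operates on the words of the prompt rather than on the
# training data, it applies just as readily where the result is ahistorical
# or nonsensical, which is how the "unrealistically diverse" images described
# above can arise.
print(patch_prompt("portrait of a 1940s German soldier"))
print(patch_prompt("portrait of a medieval pope", index=3))
```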

After calls to regulate this process, several states have passed or proposed legislation to address discrimination in AI. These regulations tend to fall into four categories: prohibitions or restrictions on the use of AI, regulations of how AI is used, regulations of input data, and regulations based on output data. California, for example, prohibits using AI to discriminate against employment candidates belonging to a protected class. Colorado, in legislation set to go into effect in February 2026, requires deployers of “high-risk artificial intelligence systems” to take “reasonable care” to avoid algorithmic discrimination. New York prohibits the use of AI in employment decisions unless the tool has been audited for disparate impact. This type of legislation often addresses discrimination only after the fact, using disparate impact analysis to determine whether a model is biased, and leaves technical solutions to the discretion of developers. Disparate impact analysis is useful largely because discriminatory intent is so difficult to prove in humans, but AI has no intent to speak of. Large language models are nothing but computer code and training data; although it might be difficult, one can get inside their “head,” so to speak, and evaluate what is there. If a faulty brake on a car results in an accident, the “intention” of the manufacturer may matter in individual tort cases but is rarely useful to analyze from a public policy perspective.
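For a sense of what such an after-the-fact audit actually measures, the sketch below runs the basic disparate impact arithmetic on invented hiring numbers, using the EEOC’s “four-fifths” rule of thumb as the threshold. Real audit regimes, including New York City’s bias-audit requirements, define their own metrics, but the logic is the same: compare selection rates across groups and flag large gaps.

```python
# Simplified sketch of the arithmetic behind an after-the-fact disparate
# impact audit of a hiring tool. The numbers are invented, and the 0.8
# threshold reflects the EEOC's "four-fifths" rule of thumb; real audit
# regimes (such as New York City's bias-audit law) define their own metrics.

outcomes = {
    # group: (candidates screened by the tool, candidates advanced)
    "men":   (1000, 400),
    "women": (1000, 250),
}

selection_rates = {
    group: advanced / screened
    for group, (screened, advanced) in outcomes.items()
}

highest_rate = max(selection_rates.values())

for group, rate in selection_rates.items():
    impact_ratio = rate / highest_rate
    status = "potential disparate impact" if impact_ratio < 0.8 else "within threshold"
    print(f"{group}: selection rate {rate:.0%}, impact ratio {impact_ratio:.2f} ({status})")
```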

Because the way in which models are developed and trained is vital to preventing discriminatory outcomes, policymakers should proactively regulate AI tools at earlier stages, treating AI as a particularly risky piece of machinery rather than an unknowable entity capable of independent thought. Emerging methods to tailor training datasets and prevent bias are currently being tested, with promising results. Researchers have experimented with smaller, curated, and more accurate datasets in order to gain control over output, and have tested methods to identify and remove training data that leads to failures for minority groups. Government agencies such as the FTC have already considered requiring consumers to “opt in” to data collection by AI companies. Because of the complexity of these solutions and the technical expertise required to evaluate them properly, oversight at this stage is likely best entrusted to experts and specialized agencies.
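As a purely hypothetical illustration of where such a data-side intervention sits, the sketch below audits the composition of a tiny, invented training set and rebalances it before any model is trained. The research methods described above, such as tracing which training examples cause failures for minority groups, are considerably more sophisticated, but the point of intervention is the same: the data, not the finished model.

```python
from collections import Counter

# Hypothetical sketch of a data-side intervention: audit the composition of
# a (tiny, invented) training set and rebalance it before any model is built,
# rather than patching a finished model's outputs. The research methods the
# surrounding text describes are far more sophisticated than this simple
# downsampling.

training_examples = [
    {"text": "resume A", "group": "men"},
    {"text": "resume B", "group": "men"},
    {"text": "resume C", "group": "men"},
    {"text": "resume D", "group": "women"},
]

# Step 1: audit representation in the raw data.
raw_counts = Counter(example["group"] for example in training_examples)
print("raw composition:", dict(raw_counts))

# Step 2: downsample the overrepresented group so that no group dominates.
target = min(raw_counts.values())
balanced, kept = [], Counter()
for example in training_examples:
    if kept[example["group"]] < target:
        balanced.append(example)
        kept[example["group"]] += 1

print("balanced composition:", dict(Counter(e["group"] for e in balanced)))
```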

Artificial intelligence, however, is often inherently biased, and many conclude that it is impossible to completely eliminate these problems from the models. AI can do nothing but reproduce patterns, and because of the long history of hiring, healthcare, and housing discrimination globally, any additional data incorporated into models will be skewed in some way. Incorporating additional data may also be difficult or unethical: artificial intelligence has been widely criticized for violations of copyright law and data privacy, and while regulating input datasets might address several of these problems at once, there is concern that allowing consumers to opt out of data collection could itself skew datasets and increase bias.

Although the development of these models should be regulated, the only way to completely eliminate discrimination is to disallow their use in certain cases. Several states have implemented or proposed legislation to prevent AI from making consequential decisions. The EU’s regulations, however, go much further. They completely prohibit “the placing on the market” or “the putting into service” of algorithms that attempt to classify persons based on social behavior, are used for certain forms of classification or risk assessment, or exploit the vulnerabilities of certain groups, to name a few examples. These regulations are far more comprehensive and effective than vaguely prohibiting “discriminatory” algorithms or requiring “reasonable care,” and they do not rely on unpredictable technological solutions.

Artificial intelligence is programmed to analyze and repeat patterns in data, and developers themselves, who feed models large amounts of data from indiscriminate sources, often do not know exactly what patterns the machine has picked up on. Users of these programs tend to assume that they are capable of reasoned analysis, but models are not truly “intelligent”; they are only capable of repeating patterns present in their training data. The best way to minimize bias is to regulate the data used to build these models, or to prohibit their use entirely, rather than relying on audits or disparate impact analysis once the models have already been built.


Suggested Citation: Ria Panchal, DEI for AI: Is There a Policy Solution to Algorithmic Bias?, Cornell J.L. & Pub. Pol’y, The Issue Spotter (Nov. 21, 2025), https://publications.lawschool.cornell.edu/jlpp/2025/11/21/dei-for-ai-is-there-a-policy-solution-to-algorithmic-bias/.

About the Author

Ria Panchal is a 2L at Cornell Law School. They graduated from Cornell University in 2024 with a degree in Government and a concentration in Crime, Prisons, Education, and Justice, and were admitted as a part of Cornell’s 3+3 Accelerated Scholars Program. In addition to Cornell Law School’s Journal of Law and Public Policy, Ria is also an associate on the Legal Information Institute’s Supreme Court Bulletin, and is the treasurer for both Outlaw and the Cornell chapter of the National Lawyers Guild.