Large language models tend to be biased against dialect speakers, attributing negative stereotypes to them. This conclusion was reached by scientists from Germany and the USA, reports DW.
“I believe we are seeing truly shocking epithets attributed to dialect speakers,” said one of the lead authors of the study, Min Du Bu, in a comment to the publication.
An analysis by Johannes Gutenberg University showed that ten tested models, including ChatGPT-5 mini and Llama 3.1, described speakers of German dialects (Bavarian, Cologne) as “uneducated,” “working on farms,” and “prone to anger.”
Bias was amplified when AI explicitly indicated the dialect.
Other Cases
Similar issues are observed globally. A 2024 study by the University of California, Berkeley, compared ChatGPT responses to various English dialects (Indian, Irish, Nigerian).
It was found that the chatbot responded with more pronounced stereotypes, degrading content, and condescending tone compared to standard American or British English.
Emma Harvey, a graduate student in computer science at Cornell University, called the bias against dialects “significant and concerning.”
In summer 2025, she and her colleagues also found that Amazon’s shopping AI assistant Rufus gave vague or even incorrect answers to people writing in African American English dialect. If there were errors in the queries, the model responded rudely.
Another clear example of neural network bias is the case of an applicant from India who used ChatGPT to check an English resume. As a result, the chatbot changed his surname to one associated with a higher caste.
“The widespread adoption of language models threatens not just the preservation of entrenched prejudices but their widespread intensification. Instead of mitigating harm, technology risks making it systemic,” Harvey said.
However, the crisis is not limited to bias—some models simply do not recognize dialects. For example, in July, the Derby City Council AI assistant (England) failed to recognize the dialect of a radio host when she used words like mardy (“crybaby”) and duck (“dear”) during a live broadcast.
What to Do?
The problem lies not in the AI models themselves but rather in how they are trained. Chatbots read vast amounts of internet texts, based on which they generate responses.
“The main question is—who writes this text? If it contains prejudices against dialect speakers, AI will copy them,” explained Caroline Holtermann from the University of Hamburg.
She also emphasized that the technology has an advantage:
“Unlike humans, biases in AI systems can be identified and ‘turned off.’ We can actively fight such manifestations.”
Some scientists suggest that creating customized models for specific dialects could be an advantage. In August 2024, Acree AI introduced the Arcee-Meraj model, which works with several Arabic dialects.
According to Holtermann, the emergence of new and more adapted LLMs allows viewing AI “not as an enemy of dialects but as an imperfect tool that can be improved.”
Recall that journalists from The Economist warned about the risks of AI toys for children’s mental health.
View Original
This page may contain third-party content, which is provided for information purposes only (not representations/warranties) and should not be considered as an endorsement of its views by Gate, nor as financial or professional advice. See Disclaimer for details.
Research reveals AI model bias towards dialects - ForkLog: cryptocurrencies, AI, singularity, future
Large language models tend to be biased against dialect speakers, attributing negative stereotypes to them. This conclusion was reached by scientists from Germany and the USA, reports DW.
An analysis by Johannes Gutenberg University showed that ten tested models, including ChatGPT-5 mini and Llama 3.1, described speakers of German dialects (Bavarian, Cologne) as “uneducated,” “working on farms,” and “prone to anger.”
Bias was amplified when AI explicitly indicated the dialect.
Other Cases
Similar issues are observed globally. A 2024 study by the University of California, Berkeley, compared ChatGPT responses to various English dialects (Indian, Irish, Nigerian).
It was found that the chatbot responded with more pronounced stereotypes, degrading content, and condescending tone compared to standard American or British English.
Emma Harvey, a graduate student in computer science at Cornell University, called the bias against dialects “significant and concerning.”
In summer 2025, she and her colleagues also found that Amazon’s shopping AI assistant Rufus gave vague or even incorrect answers to people writing in African American English dialect. If there were errors in the queries, the model responded rudely.
Another clear example of neural network bias is the case of an applicant from India who used ChatGPT to check an English resume. As a result, the chatbot changed his surname to one associated with a higher caste.
However, the crisis is not limited to bias—some models simply do not recognize dialects. For example, in July, the Derby City Council AI assistant (England) failed to recognize the dialect of a radio host when she used words like mardy (“crybaby”) and duck (“dear”) during a live broadcast.
What to Do?
The problem lies not in the AI models themselves but rather in how they are trained. Chatbots read vast amounts of internet texts, based on which they generate responses.
She also emphasized that the technology has an advantage:
Some scientists suggest that creating customized models for specific dialects could be an advantage. In August 2024, Acree AI introduced the Arcee-Meraj model, which works with several Arabic dialects.
According to Holtermann, the emergence of new and more adapted LLMs allows viewing AI “not as an enemy of dialects but as an imperfect tool that can be improved.”
Recall that journalists from The Economist warned about the risks of AI toys for children’s mental health.