AI output depends entirely on its inputs: the prompt it is fed, the dataset used for training, and the engineers who create and develop the tool. Each of these can introduce explicit and implicit bias, both intentional and otherwise.
To “train” the system, generative AI ingests enormous amounts of data from across the internet. Because the internet serves as the training data, generative AI can replicate the biases, stereotypes, and hate speech found on the web. There have been numerous cases of algorithmic bias in generative AI systems, that is, cases in which algorithms make decisions that disadvantage certain groups.
There are ongoing privacy concerns and uncertainties about how AI systems harvest personal data from users. Some of this information, such as a phone number, is provided voluntarily by the user. However, users may not realize that the system also collects information such as their IP address and their activity while using the service. This is an important consideration when using AI in an educational context, as some students may not feel comfortable having their personal information tracked and saved.
Additionally, OpenAI may share aggregated personal information with third parties to analyze usage of ChatGPT. Although this information is only shared in aggregate after being de-identified (i.e., stripped of data that could identify individual users), users should be aware that they lose control of their personal information once it is provided to a system like ChatGPT.