AI output depends entirely on its inputs: the prompt it is fed, the dataset used for training, and the engineers who create and develop the tool. Each of these can introduce explicit and implicit bias, both intentional and otherwise.
To “train” the system, generative AI ingests enormous amounts of data from across the internet. Because the internet serves as the training data, generative AI can replicate the biases, stereotypes, and hate speech found on the web. There have been numerous cases of algorithmic bias in generative AI systems, that is, cases in which algorithms make decisions that disadvantage certain groups.
There are ongoing privacy concerns and uncertainties about how AI systems harvest personal data from users. Some of this information, such as a phone number, is provided voluntarily by the user. However, users may not realize that the system also collects information such as their IP address and their activity while using the service. This is an important consideration when using AI in an educational context, as some students may not feel comfortable having their personal information tracked and saved.
Additionally, OpenAI may share aggregated personal information with third parties to analyze usage of ChatGPT. Although this information is only shared in aggregate after being de-identified (i.e., stripped of data that could identify individual users), users should be aware that they lose control of their personal information once it is provided to a system like ChatGPT.