Content created by generative AI tools may not be reliable; some outputs have been found to be incorrect. The reliability of generative AI tools depends on the quality of the data used to train them. If the training data is biased or incomplete, the output may be unreliable.
AI hallucination and incorrect citations
AI hallucination refers to incorrect or misleading information generated by a large language model (LLM), usually a generative AI chatbot such as ChatGPT or Google Bard.
Source: IBM - What are AI hallucinations?
One example of AI hallucination involved a New York law firm that used ChatGPT for legal research. A filing submitted to the court was found to cite legal cases that did not exist. The lawyer who used the tool told the court he was “unaware that its content could be false”.
Source: BBC News - ChatGPT: US lawyer admits using AI for case research
Deepfakes
Deepfake is a technology that uses generative AI to create content, such as images, videos, and audio recordings, that appears to be authentic but is actually fake.
Source: Dagar, D., Vishwakarma, D.K. A literature review and perspectives in deepfakes: generation, detection, and applications. Int J Multimed Info Retr 11, 219–289 (2022). https://doi.org/10.1007/s13735-022-00241-w
Bias
Generative AI tools depend on the quality of the data used to train them. If AI tools are trained on data that contains discriminatory patterns or biased information, those biases and prejudices can become embedded in the AI tool itself, which may then generate outputs that reproduce those biases and inequalities. For example, if an AI system is trained on data with gender bias, it may produce output with a similar bias.
Therefore, it is important to ensure that the data used to train AI models is diverse, unbiased, and representative of the population it aims to serve. By doing so, we can help prevent the perpetuation of existing biases and inequalities in AI systems.
Example of an image generated by a biased AI tool.
Source: United Nations Development Programme - Reproducing inequality: How AI image generators show biases against women in STEM
Privacy
Generative AI tools may require access to personal or sensitive data for training or generating outputs. However, there are concerns about these tools not obtaining proper consent from users or failing to provide transparent information about how data is collected, used, shared, and retained. In some cases, generative AI tools may share user data with third parties without consent.
The techniques used in generative AI tools are not always transparently disclosed, and there is a risk of re-identification, where individuals’ private information can be recovered from the generated data. This can lead to unintended data sharing and potential privacy breaches.
There could be a risk of unauthorized access or unintended use of personal information if generative AI tools do not implement a proper user data retention policy.
Therefore, when using generative AI tools, it is important to be aware of the personal information you input into the system and to review the tool’s data policy.
Source: PECB Insights - Generative AI and Data Privacy
Copyright
Generative AI models are trained to create content, such as music or images, using large datasets. There is a risk that the generated output may include unlicensed content, such as copyrighted images, potentially leading to copyright infringement issues. Users of generative AI should take care to adhere to copyright laws and avoid infringing on existing works.
Determining whether such generated content violates copyrights is a complex task. Additionally, the question of authorship arises when AI produces content, challenging traditional copyright assumptions that attribute authorship solely to humans. Establishing clear ownership rights for AI-generated works remains an ongoing challenge.
Source: Appel, G., Neelbauer, J., & Schweidel, D. A. (2023, April 7). Generative AI Has an Intellectual Property Problem. Harvard Business Review