AI Hallucinations and How To Avoid Them
FAQs
What is an AI hallucination?
An AI hallucination is incorrect content generated by artificial intelligence (AI) that is presented to the user as factual and correct.
Why do AI hallucinations occur?
AI hallucinations most commonly occur for the following reasons:
- Incomplete or low-quality training data
- Data lag
- Ambiguous or misunderstood prompt language
- Biased training data
What are some different types of AI hallucinations?
Some commonly encountered types of AI hallucinations include:
- Factual errors: These occur when the training data behind the AI tool contains factual inaccuracies, which the model then reproduces in its responses.
- Fictional responses: In this type of AI hallucination, a large language model (LLM) generates fluent output that closely resembles its training data but has no basis in fact.
- Response misalignment: Response misalignments are hallucinations where the AI model provides an answer that is tangentially related to the prompt but largely irrelevant to what was actually asked.
- Response contradictions: In these types of AI hallucinations, the response to the user prompt will actually contradict itself.
- Non sequitur responses: Non sequitur responses are a type of hallucination in which one part of the AI chatbot's response has no logical connection to the rest.
- Data lag errors: Many AI chatbot models are trained on historical rather than real-time data, so responses can be inaccurate simply because technology has advanced or events have occurred since the training data was captured.
- Inappropriate responses: Sometimes referred to as "creepy AI," these types of responses can be disturbing to users. These hallucinations can be the result of benign training data issues, misunderstood prompts, fictional responses fueled by an errant language model, or adversarial attacks.
What are some ways to avoid AI hallucinations?
Some of the easiest ways to avoid AI hallucination problems include the following:
- Know the AI model's limits and understand where it has historically had issues.
- Be specific in user queries; avoid slang and overly complex language.
- Filter and limit responses by giving the AI specific parameters to operate within.
- Provide context regarding the desired output by using a vector database or vector search solution (such as MongoDB Atlas Vector Search).
- Provide samples of desired output.
- Fact-check.
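Several of the techniques above can be combined in code. The sketch below, in plain Python, illustrates three of them: retrieving relevant context with a vector similarity search, constraining the model with explicit parameters (answer only from the context, or say "I don't know"), and providing samples of the desired output. The function names, document store, and toy two-dimensional vectors are illustrative assumptions; a production system would use a real embedding model and a managed vector search service such as MongoDB Atlas Vector Search rather than this in-memory ranking.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)


def retrieve_context(query_vector, documents, top_k=2):
    """Rank stored {vector, text} records by similarity to the query
    vector and return the text of the top_k closest matches."""
    ranked = sorted(
        documents,
        key=lambda doc: cosine_similarity(query_vector, doc["vector"]),
        reverse=True,
    )
    return [doc["text"] for doc in ranked[:top_k]]


def build_grounded_prompt(question, context_passages, examples):
    """Assemble a prompt that (1) pins the model to the retrieved
    context and (2) shows samples of the desired answer format."""
    lines = [
        "Answer using ONLY the context below. If the answer is not "
        "in the context, say 'I don't know.'",
        "",
        "Context:",
    ]
    lines += [f"- {passage}" for passage in context_passages]
    lines += ["", "Examples of the desired answer format:"]
    lines += [f"Q: {q}\nA: {a}" for q, a in examples]
    lines += ["", f"Q: {question}\nA:"]
    return "\n".join(lines)


# Hypothetical usage with a tiny in-memory "vector store":
docs = [
    {"vector": [0.9, 0.1],
     "text": "Atlas Vector Search stores embeddings alongside documents."},
    {"vector": [0.1, 0.9],
     "text": "MongoDB supports ACID transactions."},
]
context = retrieve_context([1.0, 0.0], docs, top_k=1)
prompt = build_grounded_prompt(
    "Where are embeddings stored?",
    context,
    examples=[("What is an index?",
               "A data structure that speeds up queries.")],
)
```

The resulting `prompt` string would then be sent to the LLM; because the instructions restrict the model to the supplied context and give it an explicit "I don't know" escape hatch, it has far less room to fabricate an answer.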