Baidu’s Self-Reasoning Approach to Curbing AI Hallucinations
Language models like ChatGPT and Gemini can generate diverse text on a wide range of topics. However, they produce text by predicting plausible continuations from the provided context rather than by verifying facts, so their outputs are not always accurate and can sometimes contradict the truth.
This phenomenon, where language models fabricate information, is known as hallucination. For instance, in 2023, a lawyer cited fictitious cases generated by ChatGPT, which were later exposed by a judge.
Baidu is attempting to mitigate this issue by equipping its language models with a self-reasoning mechanism. This approach enables a model to check the information it generates against an external knowledge base. The self-reasoning approach involves three steps, sketched in code after the list:
- Source Finding: The model identifies documents in the external knowledge base that are relevant to answering the given query.
- Evidence Selection: The model selects the specific sentences from these documents that directly address the query and cites them as references for the final answer.
- Trajectory Analysis: The model generates a summary of the selected sentences and uses this summary to construct the final response.
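To make the pipeline concrete, here is a minimal sketch of how the three steps could be chained together. It assumes a generic `llm` callable that takes a prompt string and returns the model's text response; the function names, prompts, and citation format are illustrative assumptions, not Baidu's actual implementation.

```python
from typing import Callable, List

# Hypothetical pipeline: each step prompts the model and feeds its output forward.
# `llm` is any text-in/text-out completion function (e.g. a wrapper around an API call).

def find_sources(llm: Callable[[str], str], query: str, documents: List[str]) -> List[str]:
    """Step 1 - Source Finding: keep only documents the model judges relevant to the query."""
    relevant = []
    for doc in documents:
        verdict = llm(
            f"Question: {query}\nDocument: {doc}\n"
            "Is this document relevant to answering the question? Reply YES or NO."
        )
        if verdict.strip().upper().startswith("YES"):
            relevant.append(doc)
    return relevant

def select_evidence(llm: Callable[[str], str], query: str, sources: List[str]) -> str:
    """Step 2 - Evidence Selection: quote sentences that address the query, with citations."""
    joined = "\n\n".join(f"[{i + 1}] {doc}" for i, doc in enumerate(sources))
    return llm(
        f"Question: {query}\nDocuments:\n{joined}\n"
        "Quote the specific sentences that answer the question, "
        "citing each one with its document number."
    )

def analyse_trajectory(llm: Callable[[str], str], query: str, evidence: str) -> str:
    """Step 3 - Trajectory Analysis: summarise the cited evidence and produce the final answer."""
    return llm(
        f"Question: {query}\nCited evidence:\n{evidence}\n"
        "Summarise the evidence, then answer the question using only that summary."
    )

def self_reasoning_answer(llm: Callable[[str], str], query: str, documents: List[str]) -> str:
    sources = find_sources(llm, query, documents)
    evidence = select_evidence(llm, query, sources)
    return analyse_trajectory(llm, query, evidence)
```

Because each step feeds the next, the final answer is grounded only in the evidence the model itself selected and summarized, which is what gives the approach its self-checking character.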