Please improve referencing abilities

A major problem with GPT today is that it doesn't reference or credit the people who produced the information it uses. This creates anger toward AI and toward AI users.

I'd propose that you train your LLM to respect references more and not make them up. GPT itself proposes these kinds of measures to do that:

  1. Better training data: Ensuring that the training data is diverse, accurate, and well-referenced is crucial. Curating high-quality sources that emphasize proper referencing and citation practices can help the model learn the importance of accurate references.

  2. Reward proper referencing: During the fine-tuning process, use reinforcement learning techniques to reward the model when it generates text with accurate references and citations. This can help the model understand that accurate referencing is valuable and should be prioritized.

  3. Penalize false references: Similarly, during fine-tuning, penalize the model when it generates text with incorrect or made-up references. This negative reinforcement can help guide the model away from generating false references (a minimal sketch of such a reward/penalty signal, covering this item and the previous one, follows the list).

  4. Custom prompt engineering: Design prompts that explicitly instruct the model to provide accurate references or to include only well-sourced information. This may lead to more responsible generation of text with better referencing (an example system prompt follows the list).

  5. Post-processing: Develop post-processing techniques that review the generated text for references and validate their accuracy. If a false reference is detected, either remove it or replace it with a valid one. This step can help ensure that the final output includes accurate references (see the validation sketch after the list).

  6. Monitor and correct hallucinations: Implement mechanisms that can detect when the model is hallucinating or making up information. Once identified, correct the hallucination or prompt the model to generate a more accurate response.

  7. External fact-checking: Integrate the LLM with external fact-checking tools or databases to validate generated references. This can help improve the overall accuracy of the references and minimize the chances of making them up.

  8. Incremental model updates: Continuously update and fine-tune the model using feedback and new data sources to enhance its ability to generate accurate references.

  9. User feedback: Encourage users to provide feedback on generated text, especially when false references are identified. This feedback can be invaluable for further fine-tuning and improving the model’s performance.
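
Items 2 and 3 amount to reward shaping during fine-tuning. Here is a minimal sketch of what such a reward term could look like, assuming author-year citations and some database of known sources to check against (both the regex and the `KNOWN_SOURCES` set below are illustrative placeholders, not a real bibliographic index):

```python
import re

# Placeholder for a real bibliographic database lookup.
KNOWN_SOURCES = {
    "Vaswani et al., 2017",
    "Devlin et al., 2019",
}

CITATION_PATTERN = re.compile(r"\(([A-Z][A-Za-z.\- ]+ et al\., \d{4})\)")

def reference_reward(generated_text: str) -> float:
    """+1 per citation that resolves against the database (item 2),
    -1 per citation that does not (item 3)."""
    citations = CITATION_PATTERN.findall(generated_text)
    if not citations:
        return 0.0  # no sourcing claimed: no bonus, no penalty
    return sum(1.0 if c in KNOWN_SOURCES else -1.0 for c in citations)

# One real citation plus one fabricated one nets out to 0.0.
text = ("Transformers were introduced in (Vaswani et al., 2017) "
        "and refined in (Nobody et al., 2099).")
print(reference_reward(text))
```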
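
For item 4, the instruction can live in the system prompt. The wording below is illustrative rather than a tested recipe; it uses the standard chat-completions message format:

```python
# Nudge the model to admit uncertainty rather than invent a source.
messages = [
    {"role": "system", "content": (
        "When you state a fact, cite a specific, real source "
        "(author, title, year). If you cannot recall a verifiable "
        "source, say 'no reliable source available' instead of "
        "inventing one.")},
    {"role": "user", "content": "Who introduced the transformer architecture?"},
]
```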
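
Items 5 and 7 can be combined into a single post-processing pass: extract every citation from the output and keep only those an external check confirms. A minimal sketch, again assuming author-year citations; the `verify` callback stands in for a real fact-checking lookup, and the `fake_db` set exists only for the example:

```python
import re
from typing import Callable

CITATION_PATTERN = re.compile(r"\([A-Z][A-Za-z.\- ]+ et al\., \d{4}\)")

def strip_unverified_references(text: str,
                                verify: Callable[[str], bool]) -> str:
    """Keep citations that `verify` confirms (item 7) and replace the
    rest with a visible marker rather than a fabricated source (item 5)."""
    def _check(match: re.Match) -> str:
        citation = match.group(0)
        return citation if verify(citation) else "[unverified reference removed]"
    return CITATION_PATTERN.sub(_check, text)

# Stand-in verifier; a real one might query a bibliographic database.
fake_db = {"(Vaswani et al., 2017)"}
print(strip_unverified_references(
    "See (Vaswani et al., 2017) and (Nobody et al., 2050).",
    verify=lambda c: c in fake_db,
))
# -> See (Vaswani et al., 2017) and [unverified reference removed].
```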

I fully agree with @henri.tuhola. References help confirm the accuracy of an answer. I am still hesitant to trust every piece of content returned by GPT. In some business cases, a single wrong answer can damage the user's trust, and it will be hard to win it back. Wrong answers may also cause legal issues. References give users a way to double-check a response from GPT before acting on it, especially when they are in doubt.

Cheers
Gary from encodegpt.com