Ground Truths: Curious About LLM Accuracy Research & Methods of Detecting/Avoiding Confabulation


Hey OpenAI Community,

I’m grateful to be part of this group. Thank you for all of the incredible insights; I’ve learned a lot in my first few hours since joining.

I’ve got a question about how OpenAI handles tagging parts of text in LLM inputs and outputs. How can you distinguish ground-truth knowledge from what the LLM simply predicts as the next token? More generally, I’m curious how confabulation is avoided. My intuition is that external knowledge bases might be the key here, and that there may be a way to tag LLM output to mark what is trustworthy and what still needs ground-truth verification. Any thoughts or resources on this would be super helpful!
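To make the tagging idea concrete, here is a toy sketch (purely my own illustration, not any OpenAI feature): each sentence of model output is checked against a small external knowledge base and labeled "verified" or "unverified". The knowledge base and the exact-match rule are stand-in assumptions; a real system would need retrieval and entailment checking.

```python
# Toy knowledge base of verified facts (an assumption for illustration).
KNOWLEDGE_BASE = {
    "water boils at 100 degrees celsius at sea level",
    "the earth orbits the sun",
}

def tag_output(sentences):
    """Label each sentence 'verified' if it matches the knowledge base,
    otherwise 'unverified' (i.e., it is pure model prediction)."""
    tagged = []
    for s in sentences:
        key = s.strip().lower().rstrip(".")
        label = "verified" if key in KNOWLEDGE_BASE else "unverified"
        tagged.append((label, s))
    return tagged

output = [
    "The Earth orbits the Sun.",
    "The treaty was signed in 1742.",  # plausible-sounding but unchecked
]
for label, sentence in tag_output(output):
    print(f"[{label}] {sentence}")
```

In practice the hard part is deciding which spans even contain checkable claims; exact string matching is only a placeholder for that step.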

Also, I know I need to be patient, but I’m also determined to get access to the plugins feature in a legal and moral way. If you have any tips or tricks on how to get off the waiting list faster, or even some good vibes to send my way, I’d really appreciate it. 🙂

Thanks for being kind and helpful (this is just a cool thing to be a part of, and I’m excited to keep on learning).




I wonder if a potential approach could weight content in LLM output, something like the following:

1) Generate the sentence from components like A + B + C.
2) Identify that component C requires ground-truth knowledge.
3) Replace C with an output that keeps the same sentence structure but is derived from an external knowledge base, so it aligns with verified ground truth rather than a generic LLM continuation.
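The generate-flag-replace idea could be sketched roughly like this (my own toy illustration; the knowledge base, the `fact_key` flag, and the component format are all assumptions, since detecting which component needs ground truth is the unsolved part):

```python
# Toy external knowledge base (an assumption for illustration).
KNOWLEDGE_BASE = {"boiling_point_of_water": "100 °C at sea level"}

def compose(components, kb):
    """components: list of (text, fact_key) pairs. fact_key is None for
    components the model may freely generate (A, B), or a knowledge-base
    key for components that must come from ground truth (C)."""
    parts = []
    for text, fact_key in components:
        if fact_key is not None:
            # Step 3: swap the generated component for a KB-derived value,
            # keeping its slot in the sentence structure; flag it if the
            # knowledge base has no entry.
            parts.append(kb.get(fact_key, f"[UNVERIFIED: {text}]"))
        else:
            parts.append(text)
    return " ".join(parts)

sentence = compose(
    [
        ("Water boils at", None),                    # A: safe phrasing
        ("around 90 °C", "boiling_point_of_water"),  # C: needs ground truth
    ],
    KNOWLEDGE_BASE,
)
print(sentence)  # -> Water boils at 100 °C at sea level
```

The interesting design question is step 2: here the flag is supplied by hand, whereas a real system would need the model (or a separate classifier) to decide which components carry factual claims.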

Recognizing when ground truth knowledge is necessary seems quite challenging, and I welcome any thoughts or insights on this approach.