Adding human in the loop (HITL) for language models

I am building a question-and-answer application using a GPT model and a retrieval-augmented generation (RAG) approach. I also want to add HITL to the solution, but due to the nature of the model there is no way to get a confidence score for the answer it gives, so I have nothing to compare against a threshold. For these types of models, how do we set up a human in the loop (HITL) in the solution?

Thanks

Did you ever figure this out? I’m interested in implementing something similar.
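
Not a definitive answer, but one pattern that might be worth trying when the model gives no confidence score is to build a proxy signal and route on that. The sketch below assumes you sample the same question several times (for example at a non-zero temperature) and also have some retrieval similarity score from your RAG step. If the sampled answers disagree too much, or the retrieved context looks weak, the answer goes to a human reviewer. The names `route`, `Decision`, `retrieval_score`, and both threshold values are hypothetical, made up for illustration; they are not part of any library, and the thresholds would need tuning on your own data.

```python
from collections import Counter
from dataclasses import dataclass
from typing import List


@dataclass
class Decision:
    answer: str        # most common (normalized) sampled answer
    agreement: float   # fraction of samples that match that answer
    needs_human: bool  # True if the answer should go to a reviewer


def normalize(text: str) -> str:
    # Crude normalization so trivially different strings count as one answer.
    return " ".join(text.lower().split())


def route(samples: List[str], retrieval_score: float,
          min_agreement: float = 0.6, min_retrieval: float = 0.5) -> Decision:
    """Flag an answer for human review using proxy signals.

    samples: answers from several model calls on the same question (assumed).
    retrieval_score: similarity of the best retrieved chunk, in [0, 1] (assumed).
    """
    if not samples:
        raise ValueError("need at least one sampled answer")
    counts = Counter(normalize(s) for s in samples)
    top_answer, votes = counts.most_common(1)[0]
    agreement = votes / len(samples)
    # Escalate when the samples disagree or the retrieved context is weak.
    needs_human = agreement < min_agreement or retrieval_score < min_retrieval
    return Decision(top_answer, agreement, needs_human)


if __name__ == "__main__":
    decision = route(["Paris", "paris ", "Lyon"], retrieval_score=0.8)
    print(decision)
```

Exact string matching is a blunt way to compare free-text answers, so in practice you would probably swap `normalize` for something semantic (embedding similarity, or asking a second model whether two answers agree). The routing logic stays the same either way.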