Grounding score for OpenAI RAG

I was wondering if I can get the grounding score for the generated answer when using OpenAI RAG, similar to the score that Google's RAG provides:
Check grounding | Vertex AI Search and Conversation | Google Cloud

Hey there!

This seems to be something specific to Google’s ecosystem. What exactly is attractive about this to your use case? Do you need something more like logprobs or citations?

Thanks for your quick response.
That is actually not specific to Google; most packages, like Ragas, MLflow, and TruLens, offer similar functionality. The interesting point about Google's offering is that it uses a dedicated LLM (apparently fine-tuned) for this task, which performs well. However, since Gemini's performance is not as good as GPT-4's, it would be great to have grounding in the OpenAI RAG offering as well.

Note that the grounding metric helps implement the ReAct method end-to-end. For example, if the grounding score on the generated text is not high enough, one can block the answer from being sent to the user, or do some rewriting/re-ranking/etc. to improve it before sending it.
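As a sketch, that gating step could look like the following. The `grounding_score` function here is purely hypothetical (a toy verbatim-match stand-in); in practice it would call an evaluation model or a package like TruLens.

```python
def grounding_score(answer: str, sources: list[str]) -> float:
    """Hypothetical scorer: fraction of answer sentences found verbatim
    in the retrieved sources. A toy stand-in for a real grounding model."""
    sentences = [s.strip() for s in answer.split(".") if s.strip()]
    if not sentences:
        return 0.0
    supported = sum(any(sent in src for src in sources) for sent in sentences)
    return supported / len(sentences)


def gate_answer(answer: str, sources: list[str], threshold: float = 0.8) -> str:
    """Block (or hand off to a rewriting step) answers whose score is too low."""
    if grounding_score(answer, sources) < threshold:
        return "I'm not confident enough in this answer."  # or rewrite/re-rank
    return answer
```

The threshold and the fallback message are of course application-specific choices.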

Yeah, OAI doesn’t really have anything like this out of the box; you would have to build something like this yourself using logprobs: Using logprobs | OpenAI Cookbook
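A minimal sketch of the logprobs route, assuming the Chat Completions API with `logprobs=True`. Note the helper just averages per-token probabilities as a rough confidence proxy, which is not the same thing as a grounding check against retrieved sources.

```python
import math

def mean_token_probability(token_logprobs: list[float]) -> float:
    """Average per-token probability: a crude confidence proxy,
    not a true grounding score."""
    if not token_logprobs:
        return 0.0
    return sum(math.exp(lp) for lp in token_logprobs) / len(token_logprobs)

# With the OpenAI client, the logprobs come back roughly like this
# (sketch only, not executed here):
#
#   from openai import OpenAI
#   client = OpenAI()
#   resp = client.chat.completions.create(
#       model="gpt-4o-mini",
#       messages=[{"role": "user", "content": "..."}],
#       logprobs=True,
#   )
#   lps = [t.logprob for t in resp.choices[0].logprobs.content]
#   confidence = mean_token_probability(lps)

print(mean_token_probability([0.0, math.log(0.5)]))  # 0.75
```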

Is TruLens incompatible with OAI’s knowledge retrieval?


TruLens is compatible with OpenAI. But the issue is that I am not sure about the quality of those packages. In other words, let's say TruLens tells me that the generated text is grounded; how do I know that is true? Basically, what is the accuracy of TruLens itself?

Usually, grounding is obtained through LLM calls, not through logprob calculations. The reason is that logprobs are usually small values,
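The LLM-call approach can be sketched roughly as follows. The prompt wording and the score format are assumptions, not taken from any particular package; only the prompt construction and score parsing are shown, with the actual API call left as a comment.

```python
JUDGE_TEMPLATE = """You are a grounding checker. Given the CONTEXT and the ANSWER,
reply with a single number between 0 and 1: the fraction of claims in the
ANSWER that are supported by the CONTEXT.

CONTEXT:
{context}

ANSWER:
{answer}

Score:"""

def build_judge_prompt(context: str, answer: str) -> str:
    """Fill the (hypothetical) judge template with the retrieved context."""
    return JUDGE_TEMPLATE.format(context=context, answer=answer)

def parse_score(reply: str) -> float:
    """Extract the numeric score from the judge's reply; clamp to [0, 1]."""
    try:
        return min(1.0, max(0.0, float(reply.strip().split()[0])))
    except (ValueError, IndexError):
        return 0.0

# The prompt would then be sent via client.chat.completions.create(...) and the
# reply passed through parse_score; a low score can trigger blocking or rewriting.
```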