Structured Output Confidence Score

dcraigmile · December 10, 2024, 6:02pm

I am using Structured Outputs and passing a JSON schema to the Chat Completions API. I want the LLM to provide a “confidence score” for each object in an array returned in the JSON. I have defined this element as follows:

    gen.writeFieldName('confidenceScore');
    gen.writeStartObject();
        gen.writeStringField('type', 'number');
        gen.writeStringField('description', 'Your confidence (0-100) that this load data is correct');
    gen.writeEndObject();

The values I get back from the model are far too confident: sometimes “100” even when I provide gibberish as the user input.

How can I craft a better description or definition in the JSON so I get a better confidence score?

Thanks for your help!!

mani.doraisamy · March 20, 2025, 2:39pm

We had the same requirement, so we created a small open-source project to calculate a confidence score:
NPM - @promptrepo/score
Source - GitHub - ManiDoraisamy/promptrepo-score: Calculate confidence score for structured output generated by LLMs like OpenAI
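
One way to get a score that does not rely on the model grading itself is to read the token log-probabilities that the Chat Completions API returns when the request sets `logprobs` to true. The Python sketch below is an illustration only and is not necessarily how the project above computes its score. `confidence_from_logprobs` is a hypothetical helper, and the sample values are made up. It converts a list of token log-probabilities into a 0-100 score by taking the geometric mean of the token probabilities.

```python
import math


def confidence_from_logprobs(token_logprobs):
    """Return the geometric-mean token probability, scaled to 0-100.

    token_logprobs: list of floats, one natural-log probability per
    generated token (assumed to come from the API's logprobs output).
    """
    if not token_logprobs:
        # No tokens means there is no evidence, so report zero confidence.
        return 0.0
    mean_logprob = sum(token_logprobs) / len(token_logprobs)
    # exp(mean of logs) is the geometric mean of the probabilities.
    return 100.0 * math.exp(mean_logprob)


# Made-up sample values, shaped like the per-token `logprob` fields
# found under choices[0].logprobs.content in a response.
confident_tokens = [-0.01, -0.02, -0.005]
uncertain_tokens = [-1.2, -0.9, -2.3]

print(round(confidence_from_logprobs(confident_tokens), 1))
print(round(confidence_from_logprobs(uncertain_tokens), 1))
```

To score each object in the array separately, you would first have to group the tokens that belong to each object, which this sketch does not attempt.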
