Hallucinations in structured output when the source data is light

Hi all:

I've got structured outputs working nicely. My main challenge right now is hallucination when the source data is sparse. My use case is parsing information about art objects. It works great with a rich data source, but if my data source/prompt contains only the single word “Ostade,” OpenAI picks a random work by the artist van Ostade and hallucinates data about it. Is there any way to minimize this? Will the temperature or seed settings make a difference? Or better prompt engineering? Thanks!


The AI pretty much HAS to write something. With structured outputs you back it into a corner: every key in the schema must be filled in. There is no way for it to write a denial when it doesn’t know the answer with certainty - or to refuse when filling in the schema’s single key would produce a policy violation.


The way I see to avoid this problem - the AI being forced to present an answer as truthful, with only fixed keys that describe a solution - is to use an anyOf choice between two different schemas (after a top level that is just a container).

When doing so, you can add a second schema that is specifically crafted as an escape hatch for the AI: a “can’t answer” path, perhaps made obvious with enums like “no information returned”, “too new”…

This is also a technique for getting out of a bad function call. You can offer a “no useful function, reply to user” function for the AI to pick when, with some uncontrollable probability, it decides to emit a function call even though none of the functions are of any use. A sketch of such an escape tool follows below.
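As an illustration of that idea (everything here is my own assumption, not from a real application - the function names, descriptions, and parameters are invented), a Chat Completions tools list with such an escape function might look like:

# Illustrative sketch: a tools list that includes an escape function the AI
# can call when no real tool fits, instead of forcing a bad function call.
tools = [
    {
        "type": "function",
        "function": {
            "name": "search_products",  # a hypothetical "real" tool
            "description": "Search the company product catalog.",
            "parameters": {
                "type": "object",
                "properties": {"query": {"type": "string"}},
                "required": ["query"],
                "additionalProperties": False,
            },
        },
    },
    {
        "type": "function",
        "function": {
            # the escape: "no useful function, reply to user"
            "name": "no_useful_function",
            "description": "Call this when none of the other functions apply, then answer the user directly.",
            "parameters": {
                "type": "object",
                "properties": {"reason": {"type": "string"}},
                "required": ["reason"],
                "additionalProperties": False,
            },
        },
    },
]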

  • By providing a way for the AI model to respond that it is uncertain, or even to decline output, you can avoid hallucinations.
  • Because a structured response schema is strict, the way to offer that option is anyOf.


No more output field for something unanswerable!

(Rest assured, there is no actual function behind this - but the AI will still treat the response format as the job it is designed to do. In this demo response_format I just wrote up, still running with “you are a helpful assistant”, the description-less “policy_violation” enum I provided is invoked far less often than expected.)


One-job schema, with escape schema

{
  "name": "my_response",
  "strict": true,
  "schema": {
    "type": "object",
    "properties": {
      "my_response": {
        "anyOf": [
          {
            "type": "object",
            "properties": {
              "company_product_search": {
                "type": "string"
              }
            },
            "required": [
              "company_product_search"
            ],
            "additionalProperties": false
          },
          {
            "type": "object",
            "properties": {
              "no_response_reason": {
                "type": "string",
                "enum": [
                  "not_parts_request",
                  "question_unclear",
                  "not_company_scope",
                  "policy_violation"
                ]
              }
            },
            "required": [
              "no_response_reason"
            ],
            "additionalProperties": false
          }
        ]
      }
    },
    "required": [
      "my_response"
    ],
    "additionalProperties": false
  }
}
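
For completeness, here is a minimal sketch of sending this response_format and branching on which of the two schemas the model chose. It assumes the official openai Python SDK and a model that supports structured outputs; the model name and messages are illustrative:

import json
from openai import OpenAI

client = OpenAI()

# The "my_response" schema shown above, as a Python dict
schema = {
    "name": "my_response",
    "strict": True,
    "schema": {
        "type": "object",
        "properties": {
            "my_response": {
                "anyOf": [
                    {   # the "one job" schema: a real answer
                        "type": "object",
                        "properties": {"company_product_search": {"type": "string"}},
                        "required": ["company_product_search"],
                        "additionalProperties": False,
                    },
                    {   # the escape schema: reasons for not answering
                        "type": "object",
                        "properties": {
                            "no_response_reason": {
                                "type": "string",
                                "enum": [
                                    "not_parts_request",
                                    "question_unclear",
                                    "not_company_scope",
                                    "policy_violation",
                                ],
                            }
                        },
                        "required": ["no_response_reason"],
                        "additionalProperties": False,
                    },
                ]
            }
        },
        "required": ["my_response"],
        "additionalProperties": False,
    },
}

completion = client.chat.completions.create(
    model="gpt-4o-2024-08-06",  # illustrative; any structured-outputs-capable model
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Ostade"},  # the sparse input from the question
    ],
    response_format={"type": "json_schema", "json_schema": schema},
)

result = json.loads(completion.choices[0].message.content)["my_response"]
if "no_response_reason" in result:
    # The AI took the escape hatch instead of inventing an answer
    print("Declined:", result["no_response_reason"])
else:
    print("Answer:", result["company_product_search"])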