I have been using function calling to generate structured output. However, I need to use logprobs - which is not supported with function calls. So I switched to specifying json schema in the response_format parameter in the api call.
Both methods show adherence to json format - thats not the problem. However, while using the “response_format” method - I can see an overall decrease in the reasoning/accuracy of answers provided. Function calling gives much higher quality answers. Why is this? Has anyone else faced this?
There have been reports that the quality of the model, particularly concerning structured outputs, decreases when using Structured Outputs. However, the exact cause is not yet clear.
It is suggested that keeping the structure simple when generating structured outputs might be better.
If you prioritize the quality of the model’s output, you might want to consider using JSON mode.
Its interesting that function calling - which I am using only to elicit structured outputs, has a 10% higher accuracy than providing the exact same schema in response_format during an api call. I had expected them to be equivalent given the same schema.