Function call validation approach?

Hey, I’m curious how others are handling validation of the input parameters supplied in function call responses. I’ve explored LangChain and Pydantic but found them lacking in flexibility, because my application constructs its functions dynamically based on the current state.

Currently, I use the jsonschema package. I intercept each function call and validate the arguments against the relevant schema. If validation fails, a new request is created from the original response plus a system message listing the errors, retrying up to three times. Though functional, this approach is slow and token-expensive with jsonschema. I’ve tried fastjsonschema as well, but it only reports the first error it encounters.
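For context, the shape of my current loop is roughly this (a simplified sketch: request_fn is a hypothetical wrapper around the actual chat completion request, and the response indexing assumes the dict-style responses from the openai package):

```python
import json
from jsonschema import Draft7Validator

def validate_arguments(arguments_json: str, schema: dict) -> list:
    """Return every validation problem as a readable string; empty list means the call is valid."""
    try:
        args = json.loads(arguments_json)
    except json.JSONDecodeError as e:
        return [f"arguments are not valid JSON: {e}"]
    validator = Draft7Validator(schema)
    # iter_errors reports *all* violations, unlike fastjsonschema's first-error-only behaviour
    return [
        f"{'/'.join(map(str, err.path)) or '<root>'}: {err.message}"
        for err in validator.iter_errors(args)
    ]

def call_until_valid(request_fn, schema, max_retries=3):
    """request_fn(extra_messages) is a hypothetical wrapper around the actual chat request."""
    extra_messages = []
    for _ in range(max_retries):
        response = request_fn(extra_messages)
        message = response["choices"][0]["message"]
        call = message["function_call"]
        errors = validate_arguments(call["arguments"], schema)
        if not errors:
            return call
        # resend the failed call plus a system message describing what was wrong
        extra_messages = [
            message,
            {"role": "system",
             "content": "The function call failed validation:\n" + "\n".join(errors)},
        ]
    raise ValueError("function call still invalid after retries")
```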

Moreover, for simpler argument errors, like an integer supplied as a string or a single string where an array is expected, I wish there were a way to auto-correct them; I haven’t found a satisfactory solution yet.
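Something like this best-effort coercion pass, run before schema validation, is the kind of thing I mean (just a sketch; expected_type is the property’s JSON-schema "type"):

```python
def coerce_simple(value, expected_type):
    """Best-effort fixes for the easy mistakes; anything else is returned unchanged."""
    try:
        if expected_type == "integer" and isinstance(value, str):
            return int(value)
        if expected_type == "number" and isinstance(value, str):
            return float(value)
        if expected_type == "array" and not isinstance(value, list):
            return [value]   # a lone value where an array was expected
        if expected_type == "string" and isinstance(value, (int, float)):
            return str(value)
    except (TypeError, ValueError):
        pass
    return value
```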

Has anyone discovered Python packages or other solutions for handling this, especially ones compatible with JSON schemas?

5 Likes

I validate the output, and if it has a problem, I give it back to the Completion API and let it decide how to explain it to the user.

user: what is the weather today?
function-call: { time: 'today', location: '' }
validation output: { error: 'invalid location', message: 'you need to submit location' }
...

In my experience the AI handles it well and tells the user what the problem is, so the conversation flow feels more natural.
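In code it’s roughly this (a sketch using the older openai Python SDK; the validation dict is whatever your own checker returns):

```python
import json
import openai

def report_validation_error(messages, function_call, validation):
    """Hand the validation result back through the function role and let the model phrase the reply."""
    messages.append({"role": "assistant", "content": None, "function_call": function_call})
    messages.append({
        "role": "function",
        "name": function_call["name"],
        # e.g. {"error": "invalid location", "message": "you need to submit location"}
        "content": json.dumps(validation),
    })
    followup = openai.ChatCompletion.create(model="gpt-3.5-turbo-0613", messages=messages)
    return followup["choices"][0]["message"]["content"]  # natural-language explanation for the user
```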

1 Like

What I’m doing is that, in the chat history, I include the function call message object and then produce and append a function response object {"role": "function", "name": function_name, "content": content}, where the content is either Success or, in the failure case, Fail: {{failure reason here}}.
On top of this, I have retry mechanics. The retry mechanics, together with telling the bot what it did wrong, work for me in 99.9% of cases.
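Roughly like this (a sketch; validate is your own checker and the model/functions details are placeholders):

```python
import openai

def call_with_retries(messages, functions, validate, max_retries=3):
    """Keep the function call and a Success/Fail function response in history, retrying on Fail."""
    for _ in range(max_retries):
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo-0613", messages=messages, functions=functions)
        message = response["choices"][0]["message"]
        messages.append(message)                      # the call itself stays in the history
        call = message.get("function_call")
        if call is None:
            return message                            # the model answered in plain text instead
        ok, reason = validate(call)                   # (True, "") or (False, "why it failed")
        messages.append({"role": "function", "name": call["name"],
                         "content": "Success" if ok else f"Fail: {reason}"})
        if ok:
            return call
    raise RuntimeError("function call still failing after retries")
```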

Maybe that doesn’t seem so different from what you are doing already, but I’m just not really seeing failure cases~

Some more details: https://platform.openai.com/docs/guides/fine-tuning/fine-tuning-examples (under the function calling section). This is what I’m using as a reference to keep the history accurate with regard to the base function calling pattern; it might help you produce the function response object accurately.

I think beyond this feedback pattern, along with correctly programmed retry mechanics, what you’d be looking at now is better prompt engineering for your functions~ try to address all your failure cases in either the system prompt or the function description prompt.

A good thing to know when creating more complicated structures is the take-aways from this thread: How to calculate the tokens when using function call - #9 by _j, where it shows, among other things, that the "description" of a property of type "object" is not included.
Basically your JSON gets parsed into a TypeScript-like interface structure, where things like description and default end up as // comments, but OpenAI’s "parser"/"transformer" from JSON to TS is very limited (you’ll find a couple of people, including myself, who are writing their own implementations to overcome these limitations)~
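To illustrate what I mean by writing your own: a very rough sketch of a converter that renders a function’s JSON schema into that TypeScript-like form while keeping // comments on nested properties. OpenAI’s real transformer is not public, so this is only an approximation of the format.

```python
def function_to_ts(name, description, parameters, indent=""):
    """Rough sketch: render one function's JSON schema as a TypeScript-like type declaration."""
    lines = []
    if description:
        lines.append(f"{indent}// {description}")
    lines.append(f"{indent}type {name} = (_: {{")
    lines.extend(_render_props(parameters.get("properties", {}),
                               set(parameters.get("required", [])), indent + "  "))
    lines.append(f"{indent}}}) => any;")
    return "\n".join(lines)

def _render_props(properties, required, indent):
    lines = []
    for prop, spec in properties.items():
        if spec.get("description"):
            lines.append(f"{indent}// {spec['description']}")   # kept even for nested objects
        opt = "" if prop in required else "?"
        if spec.get("type") == "object":
            lines.append(f"{indent}{prop}{opt}: {{")
            lines.extend(_render_props(spec.get("properties", {}),
                                       set(spec.get("required", [])), indent + "  "))
            lines.append(f"{indent}}},")
        elif "enum" in spec:
            lines.append(f"{indent}{prop}{opt}: " + " | ".join(f'"{v}"' for v in spec["enum"]) + ",")
        else:
            ts = {"string": "string", "integer": "number", "number": "number",
                  "boolean": "boolean", "array": "any[]"}.get(spec.get("type"), "any")
            lines.append(f"{indent}{prop}{opt}: {ts},")
    return lines
```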

Edit:
Actually, this got me interested and I decided to have a conversation with ChatGPT about it: https://chat.openai.com/share/b698c075-b36a-4739-9d78-a7845e3b3a11

Of course a lot of it might just be hallucinations, but it keeps telling me that it only expects output from the function role for a successful call, whereas otherwise it expects a system message. In this thread it’s talking about structures like the one @supershaneski mentioned (error: ... message: ...).

So perhaps the original approach of using system messages might actually be more in line with what the bot expects/is trained on. The docs don’t really address this~

1 Like

Erm … yeah! Did function calling exist before September 2021?! So I don’t see how you can ask ChatGPT about this and expect a sensible answer, unless it’s somehow prompted with the current OpenAI API docs … ?

If you are putting in the text of a training file, like that example’s "function_call": {"name": "get_current_weather",... then you are actually not making it clear what the AI emitted, and are retraining the AI on how to write functions wrong with every inclusion in chat history.

Unfortunately, there’s no way to know the right way until OpenAI releases the specification, or actually adds a role for putting a function call back into history. You can’t correctly put get_current_weather("city": "Miami") => str<different_stop_token> into the assistant role, yet identical text would be best for showing the AI what it previously emitted that didn’t work.

That chat share is a long sequence of confabulated implausibility.

No, that part comes from the AI itself. I was talking about the function response, not the call. For the call you just append the AI’s function call message to the history (which looks like the example in the docs, so I think if you built it yourself you’re still fine), but the response has its own role, like {"role": "function", "name": "get_current_weather", "content": "21.0"}. Except I’m starting to realize that errors should probably be system messages.
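To be concrete, the two objects I’m describing look roughly like this (values borrowed from the docs’ weather example):

```python
# the model's own function call, appended to the history as-is
assistant_call = {
    "role": "assistant",
    "content": None,
    "function_call": {
        "name": "get_current_weather",
        "arguments": '{"location": "Boston, MA"}',
    },
}

# your reply to that call, using the dedicated function role
function_result = {
    "role": "function",
    "name": "get_current_weather",
    "content": "21.0",
}
```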

Did function calling exist before September 2021?! So I don’t see how you can ask ChatGPT about this and expect a sensible answer, unless it’s somehow prompted with the current OpenAI API docs … ?

From what I understand, function calling is a fine-tuning of the model; it’s in the parameters.

The response that your function supplies back to the AI can be anything you want; it doesn’t have to follow a format. Plain language is more what the AI is attuned to, so that can work just as well as something in JSON.

“Validation” for this topic refers to the AI producing output that is not in the desired format. You can minimize these errors by minimizing the production options with a low top_p parameter.
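e.g. (older SDK shown; messages and functions assumed already built):

```python
import openai

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo-0613",
    messages=messages,
    functions=functions,
    top_p=0.1,   # narrow the sampling nucleus so only the most probable tokens are produced
)
```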

To have the AI iterate on what it did wrong and try again, it must know, from faithful conversational context, its prior attempts and how they failed.

1 Like

My personal experience is that following the pattern the model was fine-tuned on for function calling results in a higher success rate.

That’s the thing: we can’t know what kind of function returns the AI was actually tuned on, or, by extension, what the similar plugin returns look like.

There’s no “here’s how we made a function-calling, API-return-understanding, AI” white paper.

1 Like

It would be great if we could annotate the API we need to call and GPT could understand the parameter requirements.