I am currently working on an application where an assistant has to answer user messages, but sometimes it needs to return constant template messages, e.g. an order confirmation or patient information.
The problem I am stuck on is that I cannot return a template message when using function calling the way I can when using structured output.
Example of structured output:
User: how to solve 2x+7**x = 9
Structured Output:
{"steps": [{"type": "step", "explanation": "We start with the equation", "output": "2x + 7^x = 9."}, { "type": "step", "explanation": "To isolate terms, rewrite the equation", "output": "7^x = 9 - 2x."}, {"type": "step","explanation": "This is a transcendental equation and can't be solved algebraically. We will use numerical or graphical methods to find an approximate solution.", "output": "Use numerical methods or graph the functions."}, {"type": "step","explanation": "We can plot the functions f(x) = 2x + 7^x and g(x) = 9 to find the intersection.", "output": "Find intersection points.")],"final_answer": "The solution can be approximated numerically or graphically. The approximate solution is x ≈0.2."}}
Output:
Step 1. We start with the equation
Section Answer: 2x + 7^x = 9.
Step 2. To isolate terms, rewrite the equation
Section Answer: 7^x = 9 - 2x.
Step 3. This is a transcendental equation and can't be solved algebraically. We will use numerical or graphical methods to find an approximate solution.
Section Answer: Use numerical methods or graph the functions.
Step 4. We can plot the functions f(x) = 2x + 7^x and g(x) = 9 to find the intersection.
Section Answer: Find intersection points.
The solution can be approximated numerically or graphically. The approximate solution is x ≈ 0.2.
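The rendering itself is trivial; a helper like this (using the MathReasoning model from the sketch above) produces that text:

def render_steps(parsed: MathReasoning) -> str:
    # Turn the parsed structured output into the "Step N / Section Answer" template.
    lines = []
    for i, step in enumerate(parsed.steps, start=1):
        lines.append(f"Step {i}. {step.explanation}")
        lines.append(f"Section Answer: {step.output}")
    lines.append(parsed.final_answer)
    return "\n".join(lines)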
I want to do the same thing with function calling, but as I understand it, when you use a function call, its output has to be pushed back to the LLM to complete the run and be added to the thread. The output is pushed using
openai.beta.threads.runs.submit_tool_outputs(...)
But I want to return the template message to the end user verbatim, as happens when structured output is used.
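For context, the flow I mean looks roughly like this (get_order_confirmation and build_template are hypothetical names for one of my tools and its formatting helper):

import json

run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
if run.status == "requires_action":
    tool_outputs = []
    for call in run.required_action.submit_tool_outputs.tool_calls:
        args = json.loads(call.function.arguments)
        if call.function.name == "get_order_confirmation":  # hypothetical tool
            # build the exact template text from the call arguments
            tool_outputs.append({"tool_call_id": call.id, "output": build_template(args)})
    # The output goes back to the LLM, which then rephrases it in its own reply
    # instead of my template reaching the user verbatim.
    run = client.beta.threads.runs.submit_tool_outputs(
        thread_id=thread_id, run_id=run_id, tool_outputs=tool_outputs
    )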
Solutions I have come up with:
- Cancel the run and then add the template message (built from the tool-call arguments) to the thread; but in this case we might lose the other tool calls, if there are any. A minimal sketch of this option follows right after this list.
- Let the assistant generate its response (do not cancel the run), but store the template message in a memory variable; when run.status == "completed", delete the generated response and append the one held in memory.
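Sketch of the first option, assuming the run is still in "requires_action" and memory_message already holds the template built from the tool-call arguments:

# Cancel the run instead of submitting tool outputs.
client.beta.threads.runs.cancel(thread_id=thread_id, run_id=run.id)
# The thread stays locked while a run is active, so wait until the run
# status actually reaches "cancelled" before appending the template.
client.beta.threads.messages.create(
    thread_id=thread_id,
    role="assistant",
    content=memory_message,
)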
Example code for the second option:
memory_message = ""
req_action = False
thread_id = ''
if run.statuts == "requires_action":
handle_tool_calls(run, client, thread_id) # also add template messages to memory_storage variable
req_action = True
elif run.status == "completed":
messages = client.beta.threads.messages.list(thread_id=thread_id)
message = messages.data[0]
if req_action == True:
if message['role'] == 'assistant':
client.beta.threads.messages.delete(thread_id=thread_id, message_id=message.id)
client.beta.threads.messages.create(thread_id=thread_id, role='assistant', content=memory_message)
req_action = False
return message.content[0].text.value
I have not tested it yet, but I think it will work this way.
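For completeness, the fragment above is meant to run inside a polling loop along these lines (assistant_id and handle_tool_calls are my own names; sketch only):

import time

run = client.beta.threads.runs.create(thread_id=thread_id, assistant_id=assistant_id)
while run.status not in ("completed", "failed", "cancelled", "expired"):
    # the if/elif fragment above runs here on every iteration,
    # so "requires_action" gets handled and the run can finish
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run.id)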
So in the end:
- Is there a better way to make this work? Right now I am literally wasting tokens and time.
- I also want to request a feature that allows less AI interception when there is a function call and its result is a template message.