Assistants API and function calling

The way the Assistants API expects feedback when there is a function call limits the ability to build useful features on top of Assistants. You have to provide the function output before it produces the textual response and finishes the run.

On my end, I would like to be able to get the information about which function(s) to call without it being a blocker for the textual output of the model.

Why?

  • Because it is way faster, and the way it is structured right now makes it unusable when near real-time answers are expected
  • Because I don’t want the LLM’s textual output to be influenced by the function feedback

I understand that in some cases it might be useful, but why not add a parameter to the POST request to avoid the need for function feedback and have the best of both worlds?

My use case is the following: users provide a transcript of a call they had with a customer. My assistant provides a summary of the call structured through a pre-designed framework. In addition, I use a function to recommend products depending on what is said during the conversation. Thanks to the function arguments, I can query my Elasticsearch service to surface products matching the customer’s needs.

The current compulsory workflow is the following (sketched in code right after the list):

  • Send the transcript; the run’s status goes to in_progress
  • Wait 10-13 seconds
  • The assistant tells me which function(s) to call and the run’s status goes to requires_action
  • I provide it with fake answers (true to all the function calls), which puts the run back to in_progress
  • Wait 10-13 seconds
  • The assistant provides the textual output and the status goes to completed
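
For reference, this is roughly what that compulsory loop looks like in code. It is only a sketch, assuming the v1 Python SDK and made-up thread/assistant IDs; the “fake answers” are the true values submitted for every tool call.

import json
import time

from openai import OpenAI

client = OpenAI()

# hypothetical IDs for illustration
thread_id = "thread_abc123"
assistant_id = "asst_abc123"

# add the transcript to the thread, then start the run (queued -> in_progress)
client.beta.threads.messages.create(
    thread_id=thread_id, role="user", content="<call transcript>"
)
run = client.beta.threads.runs.create(thread_id=thread_id, assistant_id=assistant_id)

while run.status not in ("completed", "failed", "cancelled", "expired"):
    time.sleep(1)
    run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run.id)

    if run.status == "requires_action":
        tool_calls = run.required_action.submit_tool_outputs.tool_calls
        # the "fake answers": acknowledge every call so the run can move on
        outputs = [{"tool_call_id": c.id, "output": json.dumps(True)} for c in tool_calls]
        run = client.beta.threads.runs.submit_tool_outputs(
            thread_id=thread_id, run_id=run.id, tool_outputs=outputs
        )

# the textual output is now the latest assistant message in the thread
messages = client.beta.threads.messages.list(thread_id=thread_id)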

What I would like to achieve is the following :

  • Send the transcript; the run’s status goes to in_progress
  • Wait 10-13 seconds
  • The assistant tells me which function(s) to call and provides the textual output of the LLM, with the status going to completed
  • I run the requested functions asynchronously on my server and enrich the textual answer accordingly, making the whole flow 10-13 seconds faster than what is currently possible

Maybe you can expand a bit on your use case? You can certainly instruct the Assistant to make calls where the result is irrelevant to the text output. And obviously, through the function description and the prompt, you have control over how and when you want the function to be called. Share a little about the challenge you are having with it, or the scenario you are trying to run.

I enriched my question, is it clearer?

So if I understand correctly, you actually want two things done with the same transcript: 1) create a summary in a specific format, and 2) create a product recommendation.
Why not create two assistants and start both, one for each?

Then present the results combined?
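
For what it’s worth, a minimal sketch of that two-assistant split, assuming the v1 Python SDK (assistant IDs and the transcript placeholder are made up, and requires_action handling is left out for brevity):

import time

from openai import OpenAI

client = OpenAI()

transcript = "<call transcript>"  # the transcript provided by the user


def start_run(assistant_id: str, text: str):
    # one thread per assistant so both runs can progress independently
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(thread_id=thread.id, role="user", content=text)
    run = client.beta.threads.runs.create(thread_id=thread.id, assistant_id=assistant_id)
    return thread.id, run.id


def wait_for(thread_id: str, run_id: str):
    # simple polling; function-call handling omitted for brevity
    run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
    while run.status in ("queued", "in_progress"):
        time.sleep(1)
        run = client.beta.threads.runs.retrieve(thread_id=thread_id, run_id=run_id)
    return client.beta.threads.messages.list(thread_id=thread_id)


# hypothetical assistant IDs, one per task
summary_ref = start_run("asst_summary", transcript)
reco_ref = start_run("asst_recommendation", transcript)

summary_messages = wait_for(*summary_ref)
reco_messages = wait_for(*reco_ref)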

I had similar issues with these limitations; linking to my post for good measure: Feedback on requirements for tool_call_id in messages

Cost efficiency, and the prompts would partially overlap (mostly, even); also because one assistant already does it perfectly, it just does it in sequence instead of parallelizing the steps.

Then philosophically, it should be in the same thread and run by the same model because the recommendation has the same source as the summary of the transcript.

So it needs to write the summary in order to write the recommendation. So that part is certainly not parallel. You running something on your server async is like running that second Assistant (parallel). What is the role of the ‘function’ then in your ‘ideal’ scenario: “the assistant tells me which function to call and provides the textual output of the LLM, passing the status to completed”?

Sorry, I expressed myself poorly and corrected my post: the recommendation is not derived from the summary; both the summary and the recommendation have the same source: the transcript of the conversation between the sales rep and the customer.

All in all, it comes back to cost: running two models for this purpose is not cost-efficient.

But if you think this is the only solution, I’ll probably have to do it this way.

I’m really curious and interested in understanding your exact workflow (I run a lot of assistants on different tasks that include database updates and round-trips so I am certainly interested in different approaches.)

I am sorry, I still don’t understand your flow: what is the problem with the assistant calling your function WITH the summary, triggering your background work, and then giving it back? You already have the summary at that point. Is what the assistant does AFTER the function call taking so much time?

The problem is that the assistant asks me for the function feedback before providing the textual answer. Is there a way to ask it to provide the summary first and then request the function feedback?

By the way, I completely forgot to thank you for taking the time to deep dive into my issue, it is really much appreciated :smiley:


Self-explanatory, not really needing “assistants”.

You are a helpful transcript summarizer.
Your output is to an API for further processing.
Create only valid json complying to schema, with AI summary and analysis.
AI response language will be extracted from output json.

// structured summary instructions

100 words: the user concerns, then
100 words: the support agent's solutions.
1 word: disposition, from [open, resolved]

// validated json output schema

{
  "$schema": "http://json-schema.org/draft-07/schema#",
  "type": "object",
  "properties": {
    "transcript_output": {
      "type": "object",
      "properties": {
        "structured_summary": {
          "type": "string",
          "description": "the response to the user",
          "example": "I am glad to help!"
        }
      },
      "required": [
        "structured_summary"
      ]
    },
    "transcript_structured_summary": {
      "type": "string",
       "description": "AI writes a summary of the support conversation using instructions"
    },
    "user_question_count": {
      "type": "number",
      "description": "number of conversation turns which were user questions"
    }
  },
  "required": [
    "transcript_structured_summary",
    "user_question_count",
  ]
}
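
If you go this route, you can check the model’s output against the schema before using it downstream. A minimal sketch, assuming the standard jsonschema package, a hypothetical file holding the schema above, and a placeholder model response:

import json

from jsonschema import ValidationError, validate

# hypothetical file containing the schema shown above
with open("transcript_schema.json") as f:
    schema = json.load(f)

# placeholder for the raw text the model returned
raw_output = '{"transcript_structured_summary": "Customer asked about billing.", "user_question_count": 3}'

try:
    data = json.loads(raw_output)
    validate(instance=data, schema=schema)
except (json.JSONDecodeError, ValidationError) as err:
    # retry the request or log the failure
    print(f"model output rejected: {err}")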

So YES, you can totally prompt the Assistant to call the function WITH the summary. You put that in the description of the function as well, and it will have one incoming field that is the summary you need to do your work.
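
As a rough sketch (the function name and fields here are invented for illustration), the summary is declared as an incoming parameter, so the model hands it to you at the same moment it requests the product lookup:

recommend_products_tool = {
    "type": "function",
    "function": {
        "name": "recommend_products",  # hypothetical name
        "description": (
            "Call this after writing the structured summary of the call. "
            "Pass the finished summary and a short search phrase describing "
            "the customer's needs so products can be looked up in Elasticsearch."
        ),
        "parameters": {
            "type": "object",
            "properties": {
                "summary": {
                    "type": "string",
                    "description": "The full structured summary of the call."
                },
                "product_query": {
                    "type": "string",
                    "description": "Search phrase describing what the customer needs."
                }
            },
            "required": ["summary", "product_query"]
        }
    }
}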

It is literally the key point of Assistants + Functions to work like that.
I have an Assistant that processes incoming emails where it needs to write a summary, check in Salesforce whether a company exists or not (and add it if not), and then save the result. That is a three-page-long prompt covering a lot of different tasks and using 10 different functions as well.

Here’s how I have the ‘engine’ running it: Building a scalable OpenAI Assistant Processor in Django with Celery | by Jean-Luc Vanhulst | Dec, 2023 | Medium (Django+Celery+Assistants)

I meant the other way around:
I want the assistant to give me the summary first and then ask me to execute the functions.

But I think I’ll go for two different assistants:

  • one in charge of providing product recommendations
  • the other in charge of providing a summary of the conversation

Something I’m struggling with when it comes to product recommendation is that it is very unreliable. The same user conversation will produce different get_product_recommendation calls, and I can’t manage to tweak the result through the function description; the documentation is pretty poor when it comes to function descriptions.

Please share some of the PROMPTING. The function description is only a tiny part of why and how your function gets picked; it’s really the prompt that will direct it. If there is one thing I have learned over the last few weeks, it is that it is HARD to write clear instructions. More detail and more specificity are better in general; I have realized my prompts were not elaborate enough.
And that is one of the huge advantages of the Assistants model: you can have 32k characters of instructions.
Lastly, the different models have different ways of handling your instructions. It does help to switch models and see how they react differently.

Hey Jean-Luc,

I managed to do it; here is what I did:

  • I’ve been very specific in the assistant’s instructions about the usage of each function, and I’ve asked GPT-4 to refine my prompt
  • I’ve also redesigned my functions with GPT-4 and managed to use array parameters, cutting the need to have each function executed several times; I now have consistency across the different runs of my assistant for the same user prompt, making it more reliable
  • I divided my function into two sub-functions with one fewer parameter in each
  • I’ve learned about the minItems keyword that can be used within a function parameter definition to ensure that at least one of the enum values will be cited (see the sketch right after this list)
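
Here is a rough sketch of that array + minItems + enum combination as a tool definition (the function name and category values are invented for illustration):

get_product_recommendation_tool = {
    "type": "function",
    "function": {
        "name": "get_product_recommendation",
        "description": "Return every product category relevant to the customer's needs.",
        "parameters": {
            "type": "object",
            "properties": {
                "categories": {
                    "type": "array",
                    "minItems": 1,  # guarantees at least one enum value is cited
                    "items": {
                        "type": "string",
                        "enum": ["internet", "mobile", "tv_bundle", "insurance"]
                    },
                    "description": "All product categories mentioned or implied in the call."
                }
            },
            "required": ["categories"]
        }
    }
}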

Key learnings:

  • If functions are your main focus within an assistant, describe them in the assistant’s instructions
  • Rework both instructions and functions with GPT-4, telling it what is not working when you test in the playground
  • Use several simple functions rather than one complex one with a lot of parameters

I love your learning points, especially 1) and 3)! Glad you got it working!
