Conversation history and function call answers

Hello,
I am exploring the possibility of transitioning away from the Assistants API (which is slow and still in beta), and this involves building a conversation history for the Completions API. Since I use function calling, I am uncertain about how to incorporate previous function calls into the conversation history (what is the appropriate message format?). Should I include only the returned function values, or also the calls themselves?

The scenario I wish to address is as follows:

USER: How many apples are required for the "Apple pie" recipe?
FUNCTION CALL: retrieve_recipe("Apple pie")
FUNCTION CALL RESULT: { "apples": 2, "carrots": 1 }
ASSISTANT: You need 2 apples.

Now, if I subsequently inquire about the number of carrots, I want the model to retrieve the data from the previous result without making another function call, for example:

USER: Ok, and how many carrots?
ā† NO FUNCTION CALL REQUIRED, RETRIEVE ANSWER FROM PREVIOUS JSON
ASSISTANT: You need 1 carrot.

Thanks!

You need the concept of hidden thoughts. Include the results from all function calls, then include those hidden thoughts in the prompt history you send to the LLM each time, but don't show them to the user (or hide them behind a toggle).
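
For example, the full history you resend on each turn could look roughly like this (a minimal sketch in Python, assuming the current Chat Completions tool-calling message shapes; the call id and argument names are made up). The UI only renders the plain user/assistant text turns, while the tool-call request and its result stay in the history as the "hidden thoughts":

# Full history sent to the model on every request.
# Only the plain user/assistant text messages are rendered in the UI.
messages = [
    {"role": "user", "content": "How many apples are required for the \"Apple pie\" recipe?"},
    {
        "role": "assistant",
        "content": None,
        "tool_calls": [{
            "id": "call_001",  # made-up id; must be echoed by the tool message below
            "type": "function",
            "function": {"name": "retrieve_recipe", "arguments": "{\"name\": \"Apple pie\"}"},
        }],
    },
    {
        "role": "tool",
        "tool_call_id": "call_001",
        "content": "{\"apples\": 2, \"carrots\": 1}",
    },
    {"role": "assistant", "content": "You need 2 apples."},
    {"role": "user", "content": "Ok, and how many carrots?"},
    # With the tool result still in context, the model can answer "1 carrot"
    # without issuing another function call.
]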


Thanks, sure, my idea is to include something hidden in the prompt history. But are the function call answers enough, or do I also need the function call requests? Are call_id values required or not?
How "minimal" can the messages be, and what format should they use to work?
Thanks!

Hi! This interests me. When you refer to hidden thoughts, what exactly needs to be done? At the moment I am migrating to Completions and I don't know how to handle the context; currently I just keep the last 10 messages in the messages array, the way OpenAI does.

thx!! :smile:

You need to maintain a distinction between prompts (i.e. text) shown to the user and all prompts shown to the bot.

You need an inner loop where the bot iterates through sending function calls and getting answers (these are not shown to the user).
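
Something along these lines (a sketch, assuming the OpenAI Python SDK and a tool-capable model; execute_tool is a hypothetical dispatcher for your own functions):

import json
from openai import OpenAI

client = OpenAI()

def run_turn(messages, tools):
    # Inner loop: keep calling the model until it stops requesting tools,
    # appending every tool-call request and result to the history.
    # None of these intermediate messages are shown to the user.
    while True:
        response = client.chat.completions.create(
            model="gpt-4o",  # assumption: any model that supports tools
            messages=messages,
            tools=tools,
        )
        msg = response.choices[0].message
        messages.append(msg)  # keep the assistant turn, including its tool calls
        if not msg.tool_calls:
            return msg.content  # final answer, the only thing shown to the user
        for call in msg.tool_calls:
            result = execute_tool(  # hypothetical: dispatch to your own functions
                call.function.name, json.loads(call.function.arguments))
            messages.append({
                "role": "tool",
                "tool_call_id": call.id,
                "content": json.dumps(result),
            })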

mm… you don't mean this?

    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "hi"
        }
      ]
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "{\n  \"responseText\": \"hi how to assist today?\"\n}"
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "aaaa"
        }
      ]
    },
    {
      "role": "assistant",
      "content": [
        {
          "type": "text",
          "text": "{\n  \"responseText\": \"aaaaaa\"\n}"
        }
      ]
    },
    {
      "role": "user",
      "content": [
        {
          "type": "text",
          "text": "aaaa"
        }
      ]
    },

That's one layer. The other layer is the "behind the scenes thinking" (which is what the Assistants API does, and there's an even more sophisticated version going on with "reasoning" models). This is what you need to replicate when you move to Completions.
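
One simple way to keep the two layers apart (a sketch; the filtering rule is just one possible convention) is to always send the full messages list to the model and derive the user-facing transcript from it:

def visible_transcript(messages):
    # The model sees everything; the user only sees plain text turns.
    # Assistant turns that merely request tool calls, and tool results,
    # are filtered out of the rendered conversation.
    return [
        (m["role"], m["content"])
        for m in messages
        if m["role"] in ("user", "assistant")
        and m.get("content")
        and not m.get("tool_calls")
    ]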

Yep. Creating a prompt machine that fills the context on the fly.

But it is not really carrots and apples in real-world applications. It can be thousands of pages of documents structured in chunks and labeled with millions of labels.

Those would probably be behind another layer via RAG and accessed via function calls.
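
For instance, the retrieval layer can be exposed to the model as just another tool; the name and parameters below are made up for illustration:

tools = [{
    "type": "function",
    "function": {
        "name": "search_documents",  # hypothetical retrieval function backed by your RAG index
        "description": "Search the document store and return the most relevant chunks.",
        "parameters": {
            "type": "object",
            "properties": {
                "query": {"type": "string", "description": "What to look for."},
                "labels": {"type": "array", "items": {"type": "string"},
                           "description": "Optional label filters."},
                "top_k": {"type": "integer", "description": "How many chunks to return."},
            },
            "required": ["query"],
        },
    },
}]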

So not incompatible with my proposal, and in fact aligned with it.

You might find this read useful:

It chains tool calls and tool results behind the scenes.
