How to make a function call and get a textual response at the same time?

I’d like to get text content as well as a function call back in the same response. Is this possible?

If it isn’t possible what is the best practice?

The most advanced example on git just has the assistant return the SQL information but it doesn’t respond with text that contains more than just the information which would be ideal.

For example rather than just getting back [(‘Iron Maiden’, 213), (‘U2’, 135), (‘Led Zeppelin’, 114), (‘Metallica’, 112), (‘Lost’, 92)] it would be nice if we could get back "The top 5 artists by number tracks are Iron Maiden with 213, Led Zeppelin with 114, etc…

Or it would be nice if the content could even just respond to a larger prompt like for example if I sent it a prompt + forced it to make a function call where the prompt is like “How are you doing today” and the forced function call is get_time I get back for content: “I’m doing great thanks for asking” + the function call of “10:50 AM” rather than just content: None

I think the latter is more important to me actually.

The desired response is how functions operate.

  • you ask something the AI thinks it can use a function to answer
  • the AI calls the function chart_hits_per_artist(genre, search_years)
  • you give another AI call back your raw retrieved data in a functions role
  • the AI writes a response augmented by the knowledge

It is possible to have the AI both say “In order to answer this question, I’ll search by alternative genre to find how many hits Nirvana had.” and also write the function, but it takes specific deliberate prompting language to get it to produce that preliminary text.

1 Like

So via a single well written prompt and a forced function call you can get both?
You have an example prompt that might facilitate this action?

But you aren’t “getting back” this data from the LLM, this is the data you are going to find locally or via another API and send to the LLM as the answer for it to incorporate into its knowledge.

The LLM is not responsible to know anything about the charts, you are going to supplement its knowledge to reduce the risk significantly of it hallucinating.

In this paradigm you are only using the LLM to manipulate the language, not determine facts: ie playing to its strengths.

Now you are correct. I guess I need to re-look at that example. Thought it was responding with that information but perhaps it was doing so only after being provided it and a secondary prompt?

That said, it should be even easier than to have an actual content response along with the function call response?

You can start with, “before you use API searches for music information, explain to the user how you will fulfill their requests with the functions provided”. Then customize that text to what needs to be presented for your app, or bring out longer thought processes the AI must discuss before it calls.

Then refine over and over to get it to do both a response and function call reliably, as it’s not tuned that way and must be instructed.

I think the first thing to get here is the overall concept, before messing with the LLM to force unintended behaviour.

This system is designed to provide a framework for call and response, “factual” knowledge enrichment then the final “money” step which is providing a well worded response to the user, usually having hidden all the interstitial steps and “cheatsheet provision” from the user.

If you use GPT 3.5 these multiple calls aren’t even that expensive or time consuming.

But what if one wants the function calls not for more information / knowledge enrichment but instead or simply directing events to unfold?

1 Like

Can I guess where you are going with this? :slight_smile: :mage:

That’s a valid use of a local function too, if provided with the right inputs that the LLM can decide upon.

You could not only provide an output of a local function but do some local processing too to change local state and return a representation of that state to the LLM as needed.

1 Like

Right but it just seems a bit silly to me. If I wanted the AI to say something and direct at the same time that I can’t, I have to get it to first direct and then get it to say something or vice versa. But I see no reason why we can’t do it at the same time.

I think I’ll manage to work around this limitation, but it also seems like it shouldn’t really be a limitation

For an “action prompt” (“post this tweet”), the AI should still see each conversation history turn that it got the question, made the request, got a result, and then responded about the success.

Eliminating steps can give the impression of incompletion or error, and then you get a loop.

From the user’s perspective, the AI simply answers the question.


Yep, this is the key point.

There’s more than one layer here.

  1. What the user sees
  2. The to-and-fro between your logic and the LLM

The user does not need to be aware of the to-and-fro part.

Making the to-and-fro part simple so each “message” serves one purpose is, imho a strength as it makes the logic easier and clearer to write.

1 Like

The to and fro meaning the message history. In the end the user will just see the last response.

I was framing this from the view that I should try and minimize the message history as much as possible. Or that maybe I didn’t really need that much of a history, but at least with the way functions are designed, I’ll need to include that command in the chain prior to the actual response I want the user to hear.

1 Like

Well, eventually, once everything is up and running, you might be able to minimise this.

But you don’t have to maintain everything in the prompt you send back to the LLM, only the critical stuff you need to move to next state … so this includes any knowledge which must be taken into account, including any new state.

So by “to and fro” i’m not just referring to an ever elongating prompt … I’m just refering to the looping exchange between the LLM and your logic. It doesn’t all have to be sent back again, necessarily.

So does the auto specifier for the function call take into account the message history to determine whether or not a specific function should get called?

Like if there’s a pool of functions that could potentially get called or even if one function should be called, I would like to leave that up to the discretion of GPT. It needs some information to have the context to know. Hey should I call this? Or hey I should call this one function out of five?

So I’m assuming it will use the message history to decide what should happen when it said to auto?

In my chatbot I send recent history, but sometimes you change subject and ask a very off-topic questions, so my hunch tells me the last message is probably the most significant in determining what function is called.

And let’s be careful here … by recent history I’m talking about the summary layer of the visible conversation to the user, not the internal thoughts of the agent piecing together the answer which will only be maintained for the current QnA step.

I see. For clarity my user never sees the chat history, it’s my game talking to GPT which then talks to the user.

Really? I think you underestimate the memory of your user. :wink:

But for sure, there’s an internal conversation they are not privy to.

But I think you can junk that after one cycle so long as you are maintaining state locally.

Yes, this is possible. I had the same problem and wracked my brain for ages on how to do it. I simply have another text parameter in the function call which is for the message that I want displayed to the user. I can therefore send parameters of the function I want called, plus the message all in the same function call.

This only works if you are maintaining two separate conversations, ie the one you want shown to the use and the one you’re having with the LLM.

Hope that helps.


@dan01962 can you show me an example of this? How did you achieve this? I would like to do this via streaming. The idea is to stream the output message and then later call the function on the inputs given once the streaming is done. Also what do you mean by having two conversations? Does this mean you pass the same input to LLMs twice?