Skipping the last API request in function calling

I’ve been developing a fairly time-sensitive app based on GPT function calling, and I find that one extra API call at the end noticeably slows down execution. I am basically using GPT as an NLU tool. Suppose my query contains multiple intents, and I want GPT to parse them so I can call the corresponding functions I have locally. The problem is that my program has to keep sending requests to the API until finish_reason becomes stop, yet it never uses the text returned by that final request. Is there an effective way to skip this step, or has anyone working on a similar project found a workaround? Thanks

I don’t know your exact use case, but you might look into the stream=true option; that way you can close the connection as soon as you have what you need. I’m not sure why you would call the API if you don’t need the reply…

I get that you need the finish reason to be stop, but I don’t see why you are not getting that in the normal course of a reply.

Thank you for the suggestion. That’s what I initially thought could be the solution.

But it turned out that finish_reason stays null until the last SSE event is streamed. That seems quite odd to me, because intuitively GPT’s decision to end the conversation rather than call another function should be made before it starts streaming the text that summarizes the data returned by my function calls. So I believe it is technically possible to set finish_reason to stop in the first streamed response.
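One possible workaround, sketched here under assumptions rather than as a confirmed fix: even though finish_reason is null until the end, the first non-empty delta in a streamed reply usually shows whether the model is producing a function call or plain text. The helper below works on already-parsed chunks shaped like the chat completions streaming format (choices[0].delta with content, function_call, or tool_calls). The name classify_stream is hypothetical, and closing the underlying connection after an early return is left to the caller because it depends on the client library. A model could in principle emit some text before a function call, so this is a heuristic.

```python
from typing import Iterable, Optional


def classify_stream(chunks: Iterable[dict]) -> Optional[str]:
    """Return "function_call" or "text" as soon as a delta reveals which.

    `chunks` is any iterable of parsed streaming chunks (plain dicts).
    Iteration stops at the first informative chunk, so later chunks are
    never read from the stream.
    """
    for chunk in chunks:
        choices = chunk.get("choices") or []
        if not choices:
            continue
        delta = choices[0].get("delta") or {}
        # A function/tool call shows up as a structured field on the delta.
        if delta.get("tool_calls") or delta.get("function_call"):
            return "function_call"
        # Non-empty content means the model has started writing plain text.
        if delta.get("content"):
            return "text"
    # The stream ended without any informative delta.
    return None
```

If the result is "text", the program can treat that as the signal it would otherwise get from finish_reason being stop and drop the rest of the stream.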

As for why I don’t need that reply: the local functions I need are quite dependent on each other, so I need to know which functions the query requires before I can run them all together. In this case GPT’s summary text is meaningless, because my program is not actually calling those functions during the exchange with the API.
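To make the flow concrete, here is a rough sketch of the loop as I understand it from the description above. Everything named here is an assumption for illustration: send is a hypothetical callable wrapping the chat API that returns one assistant message as a dict, the messages use the function_call / role "function" format, and the placeholder result stands in for real function output since nothing is executed yet. The sketch does not remove the final request; it only shows where that request happens.

```python
import json
from typing import Callable, List


def collect_planned_calls(
    send: Callable[[List[dict]], dict],
    messages: List[dict],
    max_rounds: int = 8,
) -> List[dict]:
    """Record every function call the model requests, without running any."""
    planned: List[dict] = []
    history = list(messages)  # copy so the caller's list is not mutated
    for _ in range(max_rounds):
        reply = send(history)
        call = reply.get("function_call")
        if not call:
            # Text-only reply: this is the extra request whose content
            # the program never uses.
            break
        planned.append({
            "name": call["name"],
            "arguments": json.loads(call.get("arguments") or "{}"),
        })
        # Feed back a placeholder result so the model moves on to the
        # next intent; the real functions run later, all together.
        history.append(reply)
        history.append({
            "role": "function",
            "name": call["name"],
            "content": json.dumps({"status": "ok"}),
        })
    return planned
```

For a query with N intents this makes N + 1 requests, and the last one only confirms that no further function is wanted.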
