I’m trying to set up a ChatGPT-like bot, but the cost of each API call adds up really fast, since I need to send the full chat history back to the API so the model has the context of its own conversation.
What are the best practices to make less expensive API calls?
Welcome to the OpenAI community @Gabriel.ZO
To achieve that, you can summarize the previous conversations, store them, and then send only the parts that are contextually relevant to the user’s message, using embeddings to find them.
This greatly reduces the token count, which otherwise grows with every message.
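Here’s a rough Python sketch of the retrieval half of that idea. The embedding vectors below are made up for illustration (real ones come back from the embeddings endpoint and have ~1536 dimensions); the point is just that you compare the user message’s embedding against your stored ones with cosine similarity and keep the closest matches:

```python
import math

def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def most_relevant(query_embedding, stored, top_k=2):
    """Return the top_k stored texts whose embeddings are closest to the query."""
    scored = sorted(
        stored,
        key=lambda item: cosine_similarity(query_embedding, item["embedding"]),
        reverse=True,
    )
    return [item["text"] for item in scored[:top_k]]

# Toy 3-dimensional embeddings, purely for illustration:
history = [
    {"text": "The butler mentioned the locked study.", "embedding": [0.9, 0.1, 0.0]},
    {"text": "We discussed the weather at length.",    "embedding": [0.0, 0.2, 0.9]},
    {"text": "A key was found under the study rug.",   "embedding": [0.8, 0.3, 0.1]},
]
query = [0.85, 0.2, 0.05]  # pretend this is the embedding of "tell me about the study"
print(most_relevant(query, history))  # the two study-related lines come back first
```

You then prepend only those retrieved snippets (or a summary of them) to the prompt instead of the whole transcript.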
Sorry to piggyback, but this is something I’m working on as well. Can you elaborate on when/how to summarize? And is there some sort of dummies’ guide to embeddings? They seem to be the go-to answer around here, but apparently their use is way over my head. All the examples are in Python (which I do know a bit), but I’m using the orhanerday PHP library, and I’m just not sure what to put in the input, what to do with all the numbers it gives me back, or how that helps in any way. I looked at the classification example, and it involves CSVs and comes back with a graph; how does that tell me whether the user is asking for a search warrant or has found out about the secret affair? (I’m making a murder mystery game.) I’ve currently got that part handled, probably not in the most cost-efficient way, but it will do for now.
Like the OP, though, I’m sending the whole conversation history in the messages array. I start with a system message that outlines the personality and constraints of the character, plus some backstory, and then it’s just user message, assistant message, alternating, so every call gets steadily more expensive as the history grows. The context is important, as it allows the character to respond appropriately based on things it has already said or things the user has previously asked.
I’m already storing every message in the database with the role and content.
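For what it’s worth, since the messages are already stored as role/content pairs, one stopgap I could imagine (before doing proper summarization) is trimming what gets sent to a token budget: always keep the system message, then walk backwards from the newest message until the budget is full. This is only a sketch; the token count here is a crude length-based approximation, and a real tokenizer would be more accurate:

```python
def approx_tokens(message):
    """Very rough token estimate: ~4 characters per token, plus per-message overhead."""
    return len(message["content"]) // 4 + 4

def trim_history(messages, budget=3000):
    """Keep the system message plus the most recent messages that fit the budget."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    used = sum(approx_tokens(m) for m in system)
    kept = []
    # Walk backwards so the newest messages survive.
    for m in reversed(rest):
        cost = approx_tokens(m)
        if used + cost > budget:
            break
        kept.append(m)
        used += cost
    return system + list(reversed(kept))

# Example: a system message plus five long user messages, with a tiny budget.
msgs = [{"role": "system", "content": "You are the butler."}]
msgs += [{"role": "user", "content": str(i) * 40} for i in range(5)]
print(trim_history(msgs, budget=40))  # system message + the newest messages that fit
```

It loses old context (which is exactly what the summarize-and-retrieve approach is meant to fix), but it does cap the per-call cost.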