How to reduce the cost of chat-like API calls

I’m trying to set up a ChatGPT-like bot, but the cost of each API call gets too expensive really fast, since I need to send the full chat back to the API so that it recognizes its own conversation.
What are the best practices for making API calls less expensive?


Welcome to the OpenAI community, @Gabriel.ZO.

To achieve that, you can summarize the previous conversations, store them, and then use embeddings to send only the parts that are contextually relevant to the user’s current message.

This will greatly reduce token count which goes up rapidly with every message.
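
As a rough illustration of the "send only what is relevant" step, here is a minimal Python sketch. It assumes each stored message (or summary) already has an embedding vector saved next to it; the tiny three-number vectors below are made-up stand-ins for what an embeddings endpoint would return, and `select_relevant` is a hypothetical helper name, not part of any library. The idea is simply to rank stored text by cosine similarity to the new message and keep the top few.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat an all-zero vector as unrelated to everything
    return dot / (norm_a * norm_b)

def select_relevant(query_vec, stored, top_k=2):
    """Pick the top_k stored texts whose vectors are closest to query_vec.

    `stored` is a list of (text, vector) pairs, e.g. rows from a database.
    """
    ranked = sorted(
        stored,
        key=lambda item: cosine_similarity(query_vec, item[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:top_k]]

# Toy vectors (assumption): real embeddings have hundreds or thousands of numbers.
stored_messages = [
    ("User asked about the search warrant.", [0.9, 0.1, 0.0]),
    ("Character described the weather.", [0.0, 0.2, 0.9]),
    ("User mentioned the secret affair.", [0.1, 0.9, 0.1]),
]

# Pretend this is the embedding of the new user message "Can I get a warrant?"
query_vector = [0.8, 0.2, 0.1]

relevant = select_relevant(query_vector, stored_messages, top_k=1)
print(relevant)  # ['User asked about the search warrant.']
```

Only the selected texts (plus the system message and the new user message) would then go into the next API call, instead of the whole history.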


Sorry to piggyback, but this is something I’m working on as well. Can you elaborate on when and how to summarize? And is there some sort of dummies’ guide to embeddings? They seem to be the go-to answer around here, but their use is apparently way over my head. All the examples are in Python (which I do know a bit), but I’m using the orhanerday PHP library, and I’m not sure what to put in the input, what to do with all the numbers it gives me back, or how that helps in any way. I looked at the classification example, and it involves CSVs and comes back with a graph. How does that tell me whether the user is asking for a search warrant or has found out about the secret affair? (I’m making a murder mystery game.) I’ve currently got that part handled, probably not in the most cost-efficient way, but it will do for now.

Like the OP, though, I’m sending the whole conversation history in the messages. I start with a system message that outlines the personality and constraints of the character, some backstory, etc.; after that it’s just user and assistant messages alternating, which obviously gets more expensive with every message, since each call resends everything before it. The context is important, as it allows the character to respond appropriately based on things it has already said or things the user has previously asked.
I’m already storing every message in the database with the role and content.
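
To put rough numbers on how resending the full history adds up, here is a small Python sketch. All figures are made-up assumptions (a 200-token system message, 50 tokens per message, a 300-token cap), and both function names are hypothetical. It compares the total prompt tokens sent over a conversation when every call includes the whole history against a version where the history portion is capped at a fixed budget, for example a stored summary plus a few relevant messages.

```python
def tokens_sent_full_history(system_tokens, tokens_per_message, turns):
    """Total prompt tokens over `turns` calls when each call resends everything."""
    total = 0
    history = 0  # tokens of user/assistant messages accumulated so far
    for _ in range(turns):
        history += tokens_per_message    # the new user message
        total += system_tokens + history  # prompt size of this call
        history += tokens_per_message    # the assistant reply joins the history
    return total

def tokens_sent_capped(system_tokens, tokens_per_message, turns, context_budget):
    """Same conversation, but the history part of each prompt is capped."""
    total = 0
    history = 0
    for _ in range(turns):
        history += tokens_per_message
        total += system_tokens + min(history, context_budget)
        history += tokens_per_message
    return total

# Assumed sizes: 200-token system message, 50 tokens per message.
print(tokens_sent_full_history(200, 50, 10))  # 7000
print(tokens_sent_full_history(200, 50, 20))  # 24000
print(tokens_sent_capped(200, 50, 10, 300))   # 4550
print(tokens_sent_capped(200, 50, 20, 300))   # 9550
```

With these assumed sizes, doubling the conversation length more than triples the total tokens sent when the full history is included, while the capped version grows roughly in step with the number of turns.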


I had an idea… What if you let the API generate stenography?

It takes far fewer letters to write words or even complete sentences.
I’ve been toying with this idea myself, but I still haven’t been able to get the API to correctly come up with the “chords” that make up words in phonetic stenography.

If it works, implementing this in your code could decrease the API usage costs immensely.

So instead of paying about 5 dollars for a completely written-out program in, for example, Python with GPT-4, the cost could be reduced to about a dollar when using stenography.

I don’t know if someone has already thought of this, but I couldn’t find anything online that tries to implement it.

Again, this is just an idea that popped into my head a day ago…