Exactly. And unlike with ChatGPT, OpenAI has no incentive here to minimize the conversation loaded into the model on every iteration. They limit gpt-4-turbo output to 4k tokens because generation is what actually costs them money and compute time, not the loading of a conversation that's billed per input token anyway.
OpenAI doesn’t describe any technique, such as an embedding database, that could extend the illusion of memory; they only say they’ll truncate the conversation once it no longer fits into the model’s context window.
You could pull down the thread occasionally, truncate it by token count, and send it back as a new thread, so you don’t spend 16k (or 128k) tokens on every question, but then what’s the point of their system anyway?
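If you wanted to do that by hand, it'd look roughly like the sketch below (openai Python client plus tiktoken). The 8k budget, the text-only flattening, and ignoring pagination are all my own simplifications, not anything OpenAI prescribes:

```python
# Minimal sketch: copy the newest messages of an Assistants API thread
# into a fresh thread, keeping only as many as fit in a token budget.
import tiktoken
from openai import OpenAI

client = OpenAI()
enc = tiktoken.get_encoding("cl100k_base")  # tokenizer family used by gpt-4-turbo

def truncated_copy(thread_id: str, budget: int = 8_000) -> str:
    # Newest first, so the most recent messages survive the cut.
    # (Only the first page is fetched here; a real version would paginate.)
    page = client.beta.threads.messages.list(thread_id=thread_id, order="desc")
    kept, used = [], 0
    for msg in page.data:
        # Flatten text content blocks; images are ignored in this sketch.
        text = "".join(
            block.text.value for block in msg.content if block.type == "text"
        )
        used += len(enc.encode(text))
        if used > budget:
            break
        kept.append({"role": msg.role, "content": text})
    kept.reverse()  # restore chronological order
    # Note: depending on the API version, thread creation may only accept
    # user-role messages, so assistant turns might need different handling.
    return client.beta.threads.create(messages=kept).id
```

Which works, but at that point you're re-implementing the context management the threads abstraction was supposed to handle for you.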