To my understanding, when you add messages to a thread and run it, the context sent with each inference grows until it hits the model's maximum context size, after which the window starts shifting and older messages in the thread are dropped.
Worst case, then, every new message you add means resending close to 128k tokens each time?
I was working on an awesome coding assistant and was so engaged that it did not take me long to burn through $30.

I think round-tripping code snippets and so on made the context quite large. I know the price is 6x lower now, but for hobby projects it's still quite expensive. (If input really costs roughly $0.01 per 1K tokens, a full 128k-token run is about $1.28, so $30 only buys around 23 exchanges.)
Is there a way to specify the token window when you run, so it does not send every message in the thread but only what you specify, like the last 10 messages?
That would be a useful feature. I could create a new thread each time, but since we can't list threads, I have no idea where all the threads are going unless I keep track of the IDs myself.
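In case it helps anyone else, here is a minimal sketch of how I could track them myself, assuming the openai Python SDK v1; the `threads.json` file name and the `label` argument are just my own invention:

```python
# Sketch: record each thread ID in a local file when creating it,
# since the API has no endpoint for listing threads.
import json
from pathlib import Path

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment
LEDGER = Path("threads.json")  # arbitrary local file for my own bookkeeping

def new_tracked_thread(label: str) -> str:
    """Create a thread and remember its ID under a human-readable label."""
    thread = client.beta.threads.create()
    ledger = json.loads(LEDGER.read_text()) if LEDGER.exists() else {}
    ledger[thread.id] = label
    LEDGER.write_text(json.dumps(ledger, indent=2))
    return thread.id
```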
Anyway, even though it's an expensive hobby, I am loving the Assistants API! Very cool. I wonder, though, if I should just be using the Chat Completions endpoint instead: the most impressive thing I've found is functions, and maybe I'd have better control if I handled function support myself with Chat Completions?
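For the record, the manual-window version I have in mind would look something like this (a minimal sketch, again assuming the openai Python SDK v1; the model name, window size, and system prompt are placeholders):

```python
# Sketch: keep the full conversation locally, but only send the system
# prompt plus the last WINDOW messages with each Chat Completions request.
from openai import OpenAI

client = OpenAI()

MODEL = "gpt-4-1106-preview"  # placeholder: any chat model should work
WINDOW = 10                   # only the last 10 messages get resent

system_prompt = {"role": "system", "content": "You are a coding assistant."}
history = []  # full local transcript; I decide how much of it gets sent

def ask(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    # Trim to the last WINDOW messages; the system prompt always rides along.
    window = [system_prompt] + history[-WINDOW:]
    response = client.chat.completions.create(model=MODEL, messages=window)
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
```

That way each call pays only for what I choose to send, and the simple slice could later be swapped for a token-count-based trim (e.g. with tiktoken).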