Assistants API context window?

To my understanding, as you add messages to a thread and run it, the context sent with each inference grows up to the model's maximum context size, at which point the context window starts shifting and older messages from the thread are dropped.

Worst-case scenario: with each new message you add, you're pretty much hitting 128k tokens per run?
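For concreteness, this is roughly the loop I mean, as a sketch using the Python SDK's beta Assistants endpoints (the assistant ID and message content are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# One thread holds the whole conversation
thread = client.beta.threads.create()

# Every message is appended to the same thread...
client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Refactor this function: ...",  # placeholder
)

# ...and each run re-sends the accumulated thread through the model,
# so the tokens billed per run grow with the conversation.
run = client.beta.threads.runs.create(
    thread_id=thread.id,
    assistant_id="asst_...",  # placeholder
)
```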

I was working on an awesome coding assistant and was so engaged that it did not take me long to burn through 30 dollars.

I think round-tripping code snippets and the like got quite large. I know the price is 6x lower now, but for hobby projects it's still quite expensive.

Is there a way to specify the token window when you create a run, so it does not send every message in the thread but only something you specify, like the last 10 messages?

That would be a useful feature. I could create a new thread each time, but since we can't list threads I have no idea where all the threads are going, as I am not keeping track of them 🤷

Anyway, despite it being an expensive hobby, I am loving the Assistants API! Very cool. I do wonder, though, whether I should just be using the chat completions endpoint instead: the most impressive thing I found is functions, and maybe I can get better control if I handle completions with function support myself?
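Something like this sketch is what I have in mind: keep the history myself and only send the last N messages (the cap, model, and system prompt here are my own choices, not API features; functions would be passed via the `tools` parameter in the same call):

```python
from openai import OpenAI

client = OpenAI()

MAX_MESSAGES = 10  # my own cap, not an API feature
history = []

def ask(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})

    # Manual context window: only the tail of the history is sent,
    # so the cost per call stays bounded no matter how long the chat runs.
    window = history[-MAX_MESSAGES:]

    response = client.chat.completions.create(
        model="gpt-4-1106-preview",  # the 128k model; swap as needed
        messages=[{"role": "system", "content": "You are a coding assistant."}]
        + window,
    )
    reply = response.choices[0].message.content
    history.append({"role": "assistant", "content": reply})
    return reply
```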


Nope. It's mind-boggling that this isn't a feature.

The only solution feels hacky: count the tokens yourself by retrieving all the messages (waste), counting the tokens (waste), summarizing the conversation (waste), then destroying the thread and re-creating it with the summary prefixed to the first user message or the instructions.
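Roughly, as a sketch (the token budget, summarization prompt, and model choice are all arbitrary, and pagination of the message list is omitted for brevity):

```python
import tiktoken
from openai import OpenAI

client = OpenAI()
enc = tiktoken.get_encoding("cl100k_base")

TOKEN_BUDGET = 100_000  # arbitrary threshold, tune to taste

def maybe_compact(thread_id: str) -> str:
    # 1. Retrieve all the messages (waste; pagination omitted)
    messages = client.beta.threads.messages.list(thread_id=thread_id, order="asc")
    texts = [
        part.text.value
        for m in messages.data
        for part in m.content
        if part.type == "text"
    ]

    # 2. Count the tokens yourself (waste)
    total = sum(len(enc.encode(t)) for t in texts)
    if total < TOKEN_BUDGET:
        return thread_id  # still under budget, keep the thread

    # 3. Summarize the conversation (waste) -- the model needs a context
    #    window large enough to read the whole transcript
    summary = client.chat.completions.create(
        model="gpt-4-1106-preview",
        messages=[{
            "role": "user",
            "content": "Summarize this conversation so it can be continued later:\n\n"
            + "\n".join(texts),
        }],
    ).choices[0].message.content

    # 4. Destroy the thread and re-create it with the summary prefixed
    client.beta.threads.delete(thread_id)
    new_thread = client.beta.threads.create(messages=[{
        "role": "user",
        "content": "Summary of the conversation so far:\n" + summary,
    }])
    return new_thread.id
```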

You cannot even truncate the thread yourself, since adding an assistant message isn't permitted. So realistically the options are to drain your bank account or fall back to an inferior model, which will still occasionally max out your tokens by falling into an infinite loop.

100%. I love the concept of Assistants and am building my tools around them as well, in the hope that they are improved. After the ousting, though, I'm not even sure how long it will be until this is addressed.


Hey RonaldGRuckus, although that workaround seems hacky, it is actually a good idea, thanks! I will consider it if the spend gets unbearable.

Also hope they get improved 🙏
