Assistant API Max input context size

Will it be possible with the assistant API to specify a max input context size?

With automatic truncation of messages to fit the maximum context size, this implies that once the conversation is long enough, every input to the GPT model will be at or just below the maximum context size.

I worry that if we can’t specify something smaller than the max context size, each interaction with the assistant will end up costing a lot.
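To make the concern concrete, here is a rough back-of-the-envelope sketch. The context window size and the per-token price below are assumptions for illustration only; check the current model pricing before relying on them.

```python
# Hypothetical cost sketch: once automatic truncation fills the context
# window, every run step sends roughly the full window as prompt tokens.
CONTEXT_WINDOW_TOKENS = 128_000        # assumed model context window
PRICE_PER_1K_PROMPT_TOKENS = 0.01      # assumed USD price per 1K prompt tokens

cost_per_step = CONTEXT_WINDOW_TOKENS / 1000 * PRICE_PER_1K_PROMPT_TOKENS
print(f"~${cost_per_step:.2f} in prompt tokens per run step")
```

Under these assumed numbers, a single run step costs about $1.28 in prompt tokens alone, and a run may take several steps.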


I have the same question. Since the Assistants API would be part of our production infrastructure, we need a clear expectation of the cost.

Same question here. The way it works right now, it looks like a playground for testing before implementing your own infrastructure using chat completions.

Hey all! Steve from the OpenAI dev team here. We’re working on designing usage controls for thread runs in the assistants API, and I want to provide a preview of the proposed change and get your feedback.

What we’re proposing is to add two new parameters to endpoints that create a run:

  • POST /v1/threads/{thread_id}/runs
  • POST /v1/threads/runs

We would add an optional field, token_control, to the payload that looks like this:

  token_control: {
    max_run_prompt_tokens: int;
    max_run_completion_tokens?: int;
  }
The idea is to internally limit the number of tokens used on each step of the run and make a best effort to keep overall token usage within the limits specified.

Let us know what you think of this idea and whether it will work for your use cases!


Those parameters would be wildly helpful! Do you have an estimated date for implementation?

This would alleviate all concerns around using GPT-4 with the Assistants API.

Hi @stevecoffey, this sounds like a promising addition for better managing token usage in the Assistants API.

Could you share the current implementation status for this feature in the API? Knowing how far along this development is and any anticipated timelines for testing or release would help plan integration efforts on our end.