Current OpenAI models have a fairly low limit on how much text they can output in a single response.
GPT-4 models accept up to 128k tokens as a context window (the total of input + output), but can only output up to 4,096 tokens per response. If you want more output, you have to send another request, which roughly doubles the tokens you’re using, since the follow-up counts as another generation and re-sends the context.
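Here’s a rough sketch of that manual continuation loop with the openai Python SDK (the model name and prompt are placeholders, not your exact setup). The API sets `finish_reason == "length"` when a response was cut off by the output cap, so you can loop until that stops happening:

```python
from openai import OpenAI

client = OpenAI()

messages = [
    # Your transcript goes in the prompt
    {"role": "user", "content": "Fix and organise this transcript: ..."},
]

chunks = []
while True:
    response = client.chat.completions.create(
        model="gpt-4-turbo",  # placeholder model name
        messages=messages,
        max_tokens=4096,      # the per-response output cap
    )
    choice = response.choices[0]
    chunks.append(choice.message.content)
    # finish_reason == "length" means the model hit the output cap mid-answer
    if choice.finish_reason != "length":
        break
    # Feed the partial answer back and ask for the rest
    messages.append({"role": "assistant", "content": choice.message.content})
    messages.append({"role": "user", "content": "Continue exactly where you left off."})

full_text = "".join(chunks)
print(full_text)
```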
One token equals roughly 4 characters of English text (non-English text often packs fewer characters into each token), so that fits your description - it just sounds like you’re giving the Assistant more text as input than it can output in one go.
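If you want to check the numbers for your own transcript before sending anything, you can count tokens locally with OpenAI’s tiktoken library. Quick sketch (the file name is just an example):

```python
import tiktoken

# Pick the tokenizer that matches the model you're calling
enc = tiktoken.encoding_for_model("gpt-4")

with open("transcript.txt") as f:  # example file name
    transcript = f.read()

tokens = enc.encode(transcript)
print(f"{len(transcript)} characters -> {len(tokens)} tokens")
print(f"~{len(transcript) / len(tokens):.1f} characters per token")
```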
One possible workaround to automate this would be to give the Assistant a custom function it can call at the end of a generation if it realises it hasn’t finished fixing/organising the transcript yet. Your code would then return a function output instructing it to keep going and pick up where it left off; a rough sketch is below.
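This is what that pattern could look like with the Assistants API in the openai Python SDK. The `continue_transcript` function name, the instructions, and the model are all illustrative choices on my end, not something built into the API:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical "I'm not done yet" function the Assistant can call
assistant = client.beta.assistants.create(
    model="gpt-4-turbo",  # placeholder model name
    instructions=(
        "Fix and organise the transcript the user sends. If you run out of "
        "room before you finish, call continue_transcript instead of stopping."
    ),
    tools=[{
        "type": "function",
        "function": {
            "name": "continue_transcript",
            "description": "Signal that the transcript is not fully processed yet.",
            "parameters": {"type": "object", "properties": {}},
        },
    }],
)

thread = client.beta.threads.create(
    messages=[{"role": "user", "content": "Here is the transcript: ..."}],
)

run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id, assistant_id=assistant.id
)

# Each time the Assistant calls the function, tell it to keep going
while run.status == "requires_action":
    calls = run.required_action.submit_tool_outputs.tool_calls
    run = client.beta.threads.runs.submit_tool_outputs_and_poll(
        thread_id=thread.id,
        run_id=run.id,
        tool_outputs=[
            {"tool_call_id": c.id, "output": "Not finished - continue where you left off."}
            for c in calls
        ],
    )

# The Assistant's messages on the thread together form the full output
for msg in client.beta.threads.messages.list(thread_id=thread.id, order="asc"):
    if msg.role == "assistant":
        print(msg.content[0].text.value)
```

You’d still pay for each continuation run, but at least you wouldn’t have to babysit the process and re-prompt by hand.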