What is the max output token length for Predicted Outputs?

Does the 16k limit count only the changed/rejected tokens, or does it include the accepted tokens as well? Curious to know because this would help with editing long documents.


Sending a prediction does not change what the model will produce.

First, you would have to confront the near-impossibility of getting the model to write output that long. The most I have gotten when specifically targeting a length, with justification, is around 12k tokens. The model will just cut off mid-sentence, or interject some other wrap-up message.

Then there's the fact that the AI must reproduce almost exactly the long text you send as the prediction - or you will be billed for the rejected prediction tokens. Rejected-token billing can nearly double the price even on the best applications you can think of: insertions, minimal modifications, cases that take careful hand-construction of exactly the prediction you expect to survive. All of them become more expensive.
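For anyone who wants to measure this on their own workload, here is a minimal sketch using the openai Python SDK, assuming a model that supports Predicted Outputs (gpt-4o here) and a hypothetical input file name. The usage object reports accepted_prediction_tokens and rejected_prediction_tokens, which is exactly what the 16k question and the billing concern above hinge on.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

original_doc = open("document.txt").read()  # hypothetical input file

response = client.chat.completions.create(
    model="gpt-4o",  # Predicted Outputs requires a supporting model
    messages=[
        {
            "role": "user",
            "content": "Change the title of this document to 'Final Report' "
                       "and return the full document:\n\n" + original_doc,
        },
    ],
    # The prediction: the text you expect to survive mostly unchanged.
    prediction={"type": "content", "content": original_doc},
)

details = response.usage.completion_tokens_details
print("accepted:", details.accepted_prediction_tokens)
print("rejected:", details.rejected_prediction_tokens)  # billed at completion rates
print("total completion tokens:", response.usage.completion_tokens)
```

Running the same edit with and without the prediction parameter is the quickest way to see whether the rejected-token surcharge eats the latency gain.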

So with costs climbing as the AI produces "accepted" and "rejected" totals significantly larger than the actual prediction length you sent, the only question left is whether max_completion_tokens still does its o1-preview job of limiting total cost - but you can't use that parameter with predictions. I'm done thinking about how to pay more and work harder to get the same model output, though, at even slower generation rates when the prediction doesn't match well. Less computation time for OpenAI shouldn't mean higher costs for me.
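To see how the near-doubling arithmetic plays out, a back-of-the-envelope sketch. The price is a placeholder, not a quoted rate, and the load-bearing assumption (per the billing behavior described above) is that rejected prediction tokens are charged at the completion rate on top of the tokens that actually appear in the output.

```python
# Hypothetical rate; substitute the real completion price for your model.
PRICE_PER_TOKEN = 10.00 / 1_000_000  # e.g. $10 per 1M completion tokens

def completion_cost(output_tokens: int, rejected_prediction_tokens: int) -> float:
    """Completion-side cost, assuming rejected prediction tokens are billed
    at the completion rate in addition to the visible output tokens."""
    return (output_tokens + rejected_prediction_tokens) * PRICE_PER_TOKEN

# Example: a 4,000-token rewrite where 4,000 prediction tokens are rejected
# costs roughly twice what the same output costs with no prediction at all.
print(f"${completion_cost(4_000, 4_000):.4f}")  # prediction poorly matched
print(f"${completion_cost(4_000, 0):.4f}")      # no prediction sent
```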
