What is the max output token length for Predicted Outputs?

Does the 16k limit count only the changed/rejected tokens, or does it include the accepted tokens as well? Curious to know because this would help with editing long documents.


Sending a prediction does not change what the model will produce.

First, you would have to confront the near-impossibility of getting the model to write output that long. The most I have gotten when specifically targeting a length, with justification, is around 12k tokens. The model will just cut off mid-sentence, or interject some other wrap-up message.

Then there's the fact that the AI must reproduce almost exactly the long text you send as the prediction - or you will be billed for the rejected prediction tokens. Rejected-token billing can nearly double the price even on the best applications you can think of: insertions, minimal modifications, cases that take careful hand-construction of exactly the prediction you expect to survive. All of them become more expensive.
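For anyone who wants to measure this on their own workload, here is a minimal sketch using the openai Python SDK, assuming a model that supports Predicted Outputs (gpt-4o here) and a hypothetical input file name. The usage object reports accepted_prediction_tokens and rejected_prediction_tokens, which is exactly what the 16k question and the billing concern above hinge on.

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

original_doc = open("document.txt").read()  # hypothetical input file

response = client.chat.completions.create(
    model="gpt-4o",  # Predicted Outputs requires a supporting model
    messages=[
        {
            "role": "user",
            "content": "Change the title of this document to 'Final Report' "
                       "and return the full document:\n\n" + original_doc,
        },
    ],
    # The prediction: the text you expect to survive mostly unchanged.
    prediction={"type": "content", "content": original_doc},
)

details = response.usage.completion_tokens_details
print("accepted:", details.accepted_prediction_tokens)
print("rejected:", details.rejected_prediction_tokens)  # billed at completion rates
print("total completion tokens:", response.usage.completion_tokens)
```

Running the same edit with and without the prediction parameter is the quickest way to see whether the rejected-token surcharge eats the latency gain.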

So with costs climbing as the AI produces "accepted" and "rejected" totals significantly larger than the actual prediction length you sent, the only question left is whether max_completion_tokens still does its o1-preview job of limiting total cost - but you can't use that parameter with predictions. I'm done thinking about how to pay more and work harder to get the same model output, though, at even slower generation rates when the prediction doesn't match well. Less computation time for OpenAI shouldn't mean higher costs for me.
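To see how the near-doubling arithmetic plays out, a back-of-the-envelope sketch. The price is a placeholder, not a quoted rate, and the load-bearing assumption (per the billing behavior described above) is that rejected prediction tokens are charged at the completion rate on top of the tokens that actually appear in the output.

```python
# Hypothetical rate; substitute the real completion price for your model.
PRICE_PER_TOKEN = 10.00 / 1_000_000  # e.g. $10 per 1M completion tokens

def completion_cost(output_tokens: int, rejected_prediction_tokens: int) -> float:
    """Completion-side cost, assuming rejected prediction tokens are billed
    at the completion rate in addition to the visible output tokens."""
    return (output_tokens + rejected_prediction_tokens) * PRICE_PER_TOKEN

# Example: a 4,000-token rewrite where 4,000 prediction tokens are rejected
# costs roughly twice what the same output costs with no prediction at all.
print(f"${completion_cost(4_000, 4_000):.4f}")  # prediction poorly matched
print(f"${completion_cost(4_000, 0):.4f}")      # no prediction sent
```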
