The fine-tuning documentation seems to imply that the fine-tuned model has the same token limits as the model upon which it’s trained:
For gpt-3.5-turbo-0125, the maximum context length is 16,385, so each training example is also limited to 16,385 tokens.
The documentation states that for gpt-3.5-turbo-0125:
Returns a maximum of 4,096 output tokens.
However, when I test my fine-tuned model in the playground, I cannot set the maximum output length to anything higher than 2,048.
Can someone confirm the maximum output tokens for a model fine-tuned from gpt-3.5-turbo-0125?
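In case it helps, this is roughly how I'd check it directly against the API rather than the playground slider. It's just a sketch; the fine-tuned model ID below is a placeholder, not a real model.

```python
# Rough check of the output-token ceiling for a fine-tuned model via the API,
# bypassing the playground's 2,048 slider limit.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="ft:gpt-3.5-turbo-0125:my-org::abc123",  # placeholder fine-tuned model ID
    messages=[{"role": "user", "content": "Write as long a story as you can."}],
    max_tokens=4096,  # request the documented base-model maximum
)

# If the request is accepted, the limit is at least 4,096 tokens;
# if max_tokens exceeds the model's cap, the API returns an error instead.
print(response.choices[0].finish_reason)
print(response.usage.completion_tokens)
```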