I’m using gpt-4o-2024-08-06 specifically because it supports 16384 completion tokens. However, when I fine-tune the model (based on gpt-4o-2024-08-06) and try to substitute the new one into my code, I get the error:
BadRequestError: Error code: 400 - {'error': {'message': 'max_tokens is too large: 16384. This model supports at most 4096 completion tokens, whereas you provided 16384.', 'type': 'invalid_request_error', 'param': 'max_tokens', 'code': None}}
Is this a known issue or am I doing something dumb?
However, when you fine-tune gpt-4o-0806 (currently the only gpt-4o model available for fine-tuning), the resulting model's max output tokens may be reduced back to 4k.
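If the fine-tuned model is indeed capped at 4096 completion tokens, one workaround is to clamp `max_tokens` based on which model you're calling before making the request. A minimal sketch, assuming fine-tuned model names start with the `ft:` prefix (as OpenAI fine-tuned model IDs do) and that the limits are 4096 for the fine-tuned model and 16384 for the base model; the helper name and constants here are my own, not part of the OpenAI SDK:

```python
# Assumed completion-token limits, based on the error message above.
FT_COMPLETION_LIMIT = 4096     # observed cap for the fine-tuned gpt-4o model
BASE_COMPLETION_LIMIT = 16384  # documented cap for base gpt-4o-2024-08-06

def safe_max_tokens(model: str, requested: int) -> int:
    """Clamp `requested` to the completion-token limit for `model`.

    Fine-tuned OpenAI model IDs look like 'ft:gpt-4o-2024-08-06:org::id',
    so the 'ft:' prefix distinguishes them from the base model.
    """
    limit = FT_COMPLETION_LIMIT if model.startswith("ft:") else BASE_COMPLETION_LIMIT
    return min(requested, limit)
```

You would then pass `max_tokens=safe_max_tokens(model, 16384)` in the chat-completions call instead of the hard-coded 16384, so the same code works against both the base and fine-tuned model.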