Issue with fine tuned model

paul_mo · June 24, 2024, 11:03pm

I have a fine-tuned model that has consistently provided good outputs, but today one specific input suddenly caused the output to be cut off in the middle.

Here’s an excerpt of two lines from the fine-tuning file:

{"messages": [{"role": "system", "content": "This bot translates food items from any language into English"}, {"role": "user", "content": "Rindfaschiertes"}, {"role": "assistant", "content": "{\"translation\": \"Ground Beef\"}"}]}
{"messages": [{"role": "system", "content": "This bot translates food items from any language into English"}, {"role": "user", "content": "Hähnchenschenkel"}, {"role": "assistant", "content": "{\"translation\": \"Chicken Leg\"}"}]}

The query that causes an issue is “mushroom soup”.

The result is {"translation": "m

It just cuts off for whatever reason.

I looked through the fine-tuning file (which has 1298 lines) and checked every line that contains the word “soup” or “mushroom”. None of them had any issue that would explain the wrong output.
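A manual check like this can also be scripted across the whole file. The following is a minimal sketch, assuming the training file is JSONL in the format of the excerpt above, with each assistant message holding a JSON object that has a "translation" key. The function name `find_suspect_lines` is made up for illustration.

```python
import json

def find_suspect_lines(jsonl_text):
    """Return (line_number, reason) pairs for training examples that look malformed."""
    problems = []
    for number, line in enumerate(jsonl_text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        try:
            example = json.loads(line)
        except json.JSONDecodeError as err:
            problems.append((number, f"line is not valid JSON: {err.msg}"))
            continue
        messages = example.get("messages", []) if isinstance(example, dict) else []
        replies = [m for m in messages
                   if isinstance(m, dict) and m.get("role") == "assistant"]
        if not replies:
            problems.append((number, "no assistant message"))
            continue
        for reply in replies:
            try:
                # The assistant content is itself expected to be a JSON string.
                payload = json.loads(reply.get("content", ""))
            except (json.JSONDecodeError, TypeError):
                problems.append((number, "assistant content is not valid JSON"))
                continue
            if not isinstance(payload, dict) or not payload.get("translation"):
                problems.append((number, "missing or empty 'translation' value"))
    return problems
```

To run it on a real file, read the text first, for example `find_suspect_lines(open("training.jsonl", encoding="utf-8").read())`, where the file name is a placeholder. An empty result only says that the examples are well formed. It does not rule out other causes of the cutoff.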

Also, the error is reproducible: any time I query “mushroom soup”, the same cut-off result is returned.

Would appreciate any help or pointers to what may be wrong!

Thanks!

lachie1 · June 25, 2024, 1:45am

Hah, looks like ChatGPT is not a fan of mushrooms.

_j · June 25, 2024, 1:58am

You can look at the finish reason to see why the output terminated. If it is “stop”, the AI produced one of chat completions’ stop tokens, or a stop sequence you specified in your code. You also might have a content filter reason that stops the AI dead.
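As a rough illustration of that check, here is a small sketch that reads `finish_reason` from a Chat Completions response body that has already been parsed into a dict. The helper name `describe_finish` and the sample response are made up for the example. In particular, the "stop" value in the sample is only a placeholder, not the result from this thread.

```python
import json

# Meanings of the documented Chat Completions finish_reason values.
FINISH_REASONS = {
    "stop": "the model emitted a stop token or hit a stop sequence you passed",
    "length": "max_tokens or the context limit was reached",
    "content_filter": "output was withheld by a content filter",
    "tool_calls": "the model called a tool instead of answering",
}

def describe_finish(response):
    """Summarize why the first choice ended and whether its text parses as JSON."""
    choice = response["choices"][0]
    reason = choice.get("finish_reason")
    text = choice.get("message", {}).get("content") or ""
    try:
        json.loads(text)
        complete = True
    except json.JSONDecodeError:
        complete = False
    return {
        "finish_reason": reason,
        "meaning": FINISH_REASONS.get(reason, "not a reason this sketch knows about"),
        "content": text,
        "parses_as_json": complete,
    }

# Hypothetical response body, shaped like a Chat Completions result.
sample = {
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": '{"translation": "m'},
            "finish_reason": "stop",  # placeholder value for illustration
        }
    ]
}

print(describe_finish(sample))
```

Logging `finish_reason` together with a JSON parse check makes truncated replies easy to spot. A truncated reply that still reports "stop" points toward a stop token or stop sequence and away from a length limit.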

The word mushroom, if not allowed a leading space, has to be written with multiple tokens, and the output above stops right after the first of them, “m”.

You can look at the logprobs at the “ush” position, but OpenAI has a special version of softmax for logprob that lies to you, the untrustworthy developer, not including special tokens that are part of the probability space. If a special token that closes the assistant message was actually sampled, you don’t get any logprob at that last position anyway.
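To inspect those positions, one option is to request `logprobs` with `top_logprobs` and convert the returned values to probabilities. The sketch below assumes the per-token list found under `choices[0].logprobs.content` in a Chat Completions response. The function name `visible_mass` and all numbers in the sample are invented for illustration.

```python
import math

def visible_mass(logprobs_content):
    """For each generated token, return (token, its probability,
    total probability of the alternatives the API reported)."""
    rows = []
    for position in logprobs_content:
        alternatives = position.get("top_logprobs", [])
        reported_total = sum(math.exp(alt["logprob"]) for alt in alternatives)
        rows.append((position["token"], math.exp(position["logprob"]), reported_total))
    return rows

# Hypothetical logprobs for a single position; the values are made up.
sample_logprobs = [
    {
        "token": "m",
        "logprob": math.log(0.5),
        "top_logprobs": [
            {"token": "m", "logprob": math.log(0.5)},
            {"token": "M", "logprob": math.log(0.3)},
        ],
    }
]

for token, chosen, reported in visible_mass(sample_logprobs):
    print(f"{token!r}: chosen={chosen:.3f}, reported alternatives sum to {reported:.3f}")
```

A reported total below 1 may only mean that the remaining probability belongs to tokens outside the returned alternatives, so on its own it does not show that a special token was involved.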
