In the API, omitting top_p gives better results than setting top_p=1 (the documented default)

With gpt-3.5-turbo-0125, my prompt returns better results when top_p is not set (left at its default) than when it is explicitly set to 1, which is supposed to be the default value. I tested this on around 200 examples (to average out the model's nondeterministic behavior), and leaving top_p unset showed around a 10% improvement on a categorization task. Has anyone had a similar experience?

The two setups are identical except for the top_p argument:

response = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    response_format={ "type": "json_object" },
    seed=42,
    temperature=0,
    max_tokens=250,
    frequency_penalty=0,
    presence_penalty=0,
    messages=msgs,
)

vs.

response = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    response_format={ "type": "json_object" },
    seed=42,
    temperature=0,
    top_p=1,
    max_tokens=250,
    frequency_penalty=0,
    presence_penalty=0,
    messages=msgs,
)
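
For anyone who wants to reproduce the comparison, here is a minimal sketch of how the two setups could be scored on the same labeled examples. The helper names (`build_params`, `accuracy`), the `(messages, expected_label)` example format, and the `"category"` JSON field are assumptions made for illustration; they are not taken from my actual prompt. The only API call used is `client.chat.completions.create`, with the same parameters as above.

```python
import json
from typing import Optional

# Parameters shared by both setups (copied from the calls above).
BASE_PARAMS = {
    "model": "gpt-3.5-turbo-0125",
    "response_format": {"type": "json_object"},
    "seed": 42,
    "temperature": 0,
    "max_tokens": 250,
    "frequency_penalty": 0,
    "presence_penalty": 0,
}


def build_params(messages, top_p: Optional[float] = None) -> dict:
    """Return request kwargs; top_p is left out entirely when it is None."""
    params = dict(BASE_PARAMS, messages=messages)
    if top_p is not None:
        params["top_p"] = top_p
    return params


def accuracy(client, examples, top_p: Optional[float] = None,
             label_key: str = "category") -> float:
    """Fraction of examples whose predicted label equals the expected one.

    examples:  list of (messages, expected_label) pairs (hypothetical format).
    label_key: JSON field holding the predicted label (an assumption).
    """
    if not examples:
        return 0.0
    correct = 0
    for messages, expected in examples:
        response = client.chat.completions.create(**build_params(messages, top_p))
        try:
            predicted = json.loads(response.choices[0].message.content).get(label_key)
        except (json.JSONDecodeError, AttributeError, TypeError):
            predicted = None  # unparsable or unexpected output counts as a miss
        correct += predicted == expected
    return correct / len(examples)
```

With `from openai import OpenAI` and `client = OpenAI()`, comparing `accuracy(client, examples)` against `accuracy(client, examples, top_p=1)` gives one number per setup. Since `seed` only makes outputs reproducible on a best-effort basis, it is probably worth running each setup a few times to see how much of the gap is run-to-run noise.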