When using gpt-3.5-turbo-0125, my prompt returns better results when top_p is left unset (i.e., it falls back to the default) than when it is explicitly set to 1, which is supposed to be the default value. I tested this on around 200 examples (to average out the nondeterministic behavior of GPT models), and leaving top_p unset gave roughly a 10% improvement on a categorization task. Has anyone had a similar experience?
The setups are:
response = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    response_format={"type": "json_object"},
    seed=42,
    temperature=0,
    max_tokens=250,
    frequency_penalty=0,
    presence_penalty=0,
    messages=msgs,
)
vs.
response = client.chat.completions.create(
    model="gpt-3.5-turbo-0125",
    response_format={"type": "json_object"},
    seed=42,
    temperature=0,
    top_p=1,
    max_tokens=250,
    frequency_penalty=0,
    presence_penalty=0,
    messages=msgs,
)
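For context, the comparison itself is just scoring each run against the same labeled set and taking the difference. A minimal sketch of that scoring step (the labels below are placeholders for illustration, not my actual 200-example data):

```python
def accuracy(predictions, gold):
    """Fraction of predictions that exactly match the gold labels."""
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)

# Placeholder outputs for illustration only -- not real results.
gold        = ["billing", "tech", "billing", "other"]
run_default = ["billing", "tech", "billing", "other"]  # top_p unset
run_top_p_1 = ["billing", "tech", "other",   "other"]  # top_p=1

delta = accuracy(run_default, gold) - accuracy(run_top_p_1, gold)
print(f"accuracy delta: {delta:+.0%}")
```

Each run uses the same msgs per example, differing only in whether top_p=1 is passed.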