I am trying to debug a prompt in my app that uses gpt-3.5-turbo. I was hoping to use the Playground to quickly iterate over multiple prompt ideas, but unfortunately, when using the Playground, the response is correct, while my app gets a different result. I don’t understand why the exact same prompt in the Playground, with the exact same hyperparameters including temperature 0, and the same system message, would result in different output.
I am capturing the API calls and copy-pasting the API input into the Playground, so I am lost as to what could be causing the discrepancy.
There is not much to go on except to assume something actually is different. Check each of the following (a minimal request that pins them all is sketched after the list):
model: gpt-3.5-turbo (or a pinned variant such as gpt-3.5-turbo-0301)
temperature or top_p: setting either to 0 should result in near-identical, unchanging output (passing 0.001 for both sends an actual number instead of relying on an internal default).
penalties: the default is 0 for both frequency_penalty and presence_penalty.
role messages: the system message, any past user/assistant exchanges, and the most recent user input must all be passed if you had a “conversation”.
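For reference, here is a minimal sketch of such a request, assuming the openai Python library’s pre-1.0 ChatCompletion interface and placeholder message contents:

```python
import openai

openai.api_key = "sk-..."  # assumption: key set inline for brevity

# Pin every parameter explicitly so the app and the Playground send the same thing.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",      # or a pinned snapshot such as "gpt-3.5-turbo-0301"
    temperature=0.001,          # explicit near-zero values instead of defaults
    top_p=0.001,
    frequency_penalty=0,
    presence_penalty=0,
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Earlier user turn from the conversation"},
        {"role": "assistant", "content": "Earlier assistant reply"},
        {"role": "user", "content": "The most recent user input"},
    ],
)
print(response["choices"][0]["message"]["content"])
```

If a request like this and the Playground still disagree, diff the captured JSON payload field by field against the Playground settings.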
Hi and welcome to the developer forum!
Can you please post your API calling code? Without it, it’s impossible to tell.
This can also happen when forgotten Custom Instructions added to ChatGPT affect that platform’s output, if one is comparing ChatGPT to the API (ChatGPT can be replicated by using the same system prompt).
Can’t disclose unfortunately :-(.
Thanks for trying to help. Anyway, I found a workaround using a different prompt that behaves better. I’ll ignore the issue for now.