I’ve found this to be a curious issue with all of the text generation APIs - davinci, 3.5 turbo, GPT-4, etc. The issue is this:
When the APIs are “born” (i.e. a fresh API key or a freshly trained model) and the temperature is set to, say, 1, they all start out answering similar prompts with pretty original and unique responses each time. However, the more often I use the APIs, the less original their responses become. It’s almost as if the APIs “remember” how they answered a similar prompt before and follow suit when asked again, substituting the memory of previous answers for actually coming up with a unique response. (No “assistant” messages are being used to cause this, by the way; it happens with my custom fine-tuned GPT-3 davinci model as well.)
I’ll give you an example:
Whether it’s a custom fine-tuned davinci model or a 3.5 or GPT-4 model, let’s say I prompt it to tell me a joke and run that prompt 10 separate times, one after another. The API will tell me 10 jokes, all unique except for, let’s say, two. For the sake of this example, say the two similar responses are both jokes about superheroes. I’ll then ask the API for 10 more jokes, and this time 3 are about superheroes. I’ll run the experiment again and 4 out of the 10 jokes are about superheroes. Run it again, and before you know it, all 10 jokes are about superheroes. I’ll then switch the prompt to “tell me a funny story” instead of “tell me a joke”, and guess what the funny story is about? Superheroes!
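For reference, this is roughly the kind of loop I’m describing (a minimal sketch, not my exact code; it assumes the openai Python package with the v1-style client, an OPENAI_API_KEY set in the environment, and gpt-3.5-turbo as an example model):

```python
# Minimal sketch of the repetition experiment described above.
# Assumes: `pip install openai` (v1-style client), OPENAI_API_KEY in the
# environment, and "gpt-3.5-turbo" as an illustrative model name.
from openai import OpenAI

client = OpenAI()  # picks up OPENAI_API_KEY from the environment


def ask_for_jokes(n=10, model="gpt-3.5-turbo", temperature=1.0):
    jokes = []
    for _ in range(n):
        # Each call is a fresh, stateless request -- no conversation
        # history or "assistant" messages are carried over between calls.
        resp = client.chat.completions.create(
            model=model,
            temperature=temperature,
            messages=[{"role": "user", "content": "Tell me a joke."}],
        )
        jokes.append(resp.choices[0].message.content.strip())
    return jokes


if __name__ == "__main__":
    for i, joke in enumerate(ask_for_jokes(), 1):
        print(f"{i}. {joke}\n")
```

Since every call here is a separate, stateless request with no shared conversation history, I’d expect the sampling at temperature 1 to keep the jokes varied, which is why the growing sameness surprises me.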
My only guess is that OpenAI saves on processing power by having the APIs “remember” responses they gave in the past and pick the most likely response for a similar prompt. But this cascading effect completely ruins the uniqueness of the responses over time, even when the temperature is set quite high.
Has anyone else had trouble with this??