I’ve been running prompts in ChatGPT (GPT-3.5) and was consistently getting pretty good results. I then coded the same prompts into API calls, and the results have been underwhelming. I’ve played with the frequency and presence penalties between 0 and 0.8 while leaving temperature at 0.7. My assumption was that the gpt-3.5-turbo model through the API would behave the same as the ChatGPT UI. Maybe I’m doing something wrong?
The scenario: In the ChatGPT UI, I set the role first, e.g. “You are taxGPT - an IRS tax assistant model trained by OpenAI. You are very familiar with answering tax questions about small businesses.” Then I send the prompt right below that. The completions are pretty good and include a little background explaining why it gave that specific answer.
Now in the API call, I set the system role, set the user prompt, and test a few variations of the frequency and presence penalties (roughly like the sketch below). The responses from the API seem less creative and provide less explanation for why it’s giving me its answer.
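For reference, here’s a minimal sketch of what my call looks like (using the openai Python package, 1.x client style; the user question and the exact penalty values shown are just placeholders for what I’ve been testing):

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[
        {
            "role": "system",
            "content": (
                "You are taxGPT - an IRS tax assistant model trained by OpenAI. "
                "You are very familiar with answering tax questions about small businesses."
            ),
        },
        # Placeholder question - my real prompts are the same ones I tested in the UI
        {"role": "user", "content": "How should a single-member LLC deduct a home office?"},
    ],
    temperature=0.7,
    frequency_penalty=0.4,  # I've tried values from 0 to 0.8
    presence_penalty=0.4,   # same range here
)

print(response.choices[0].message.content)
```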
Thoughts / suggestions?