Inconsistency in OpenAI's response with the same prompt

sebav · April 24, 2023, 12:03pm

Good morning.

I am testing OpenAI’s API to create suggestions for C# class unit tests. I have created a Python script that reads a .cs file and uses a prompt:

“role”: “system”, “content” : “You are an expert in .NET 6 and C#”
“role”: “user”, “content”: "Create a unit test with xUnit, Shouldly, and Moq for the following class: "

I am using the OpenAI library in python (openai.ChatCompletion.create) and the gpt-3.5-turbo model.

The issue is that sometimes it generates a test class, but the generated class may be different from one another. Other times, it provides totally random responses that are unrelated to the prompt. Sometimes, it says that it needs more information or that we haven’t provided the class (which is not true because the prompt is exactly the same). It even cites ethics for not being able to help me “cheat” sometimes.

Is there any way to improve this? I understand that it can generate different responses and not always do the same thing, but we do need some consistency in the generated results to have some logic with what we have asked (if we ask for tests, it should generate tests, even if they are different in each request).

Best regards.

udm17 · April 24, 2023, 12:59pm

It usually is a case of the temperature and top_p parameters, which are used to determines the randomness of the generated output. For code, using a lower temperature is recommended as it keeps the output a bit more deterministic.

Another point I would add is that GPT-4 is much much better at code generation that 3.5 is, so at lower temperatures, you can expect the quality of the code generated to be much better.

A small suggestion, you could change the system content to be “You are a senior test case developer in .NET 6 and C#”. Targeted system message are likely to produce better results, though this is more prominent in GPT 4 compared to 3.5 where the API pays less attention (in 3.5) to the system message

sebav · April 25, 2023, 9:03am

Thank you very much for the response, I will try those tips

Topic		Replies	Views
Inconsistencies in API response to same prompt and similar content API gpt-4 , gpt-35-turbo , api	3	4835	July 18, 2023
I get different answers to the same request API gpt-4 , gpt-35-turbo , chatgpt , api	2	4763	December 8, 2023
Consistency of responses from Vision (or GPT4-Turbo) API	6	912	May 15, 2024
API Completions not really matching with chat.openAI GPT-3.5 Completions API gpt-35-turbo , chatgpt , api	7	2802	December 17, 2023
Help please how do I do this? (enhance AI-generated diversity) API gpt-4	9	927	May 7, 2024

Inconsistency in OpenAI's response with the same prompt

Related topics