How to generate DIFFERENT responses?

I’m trying to get a diversity of results from the GPT-3 API.

Consider this simple call:

import os
import openai
import dotenv

dotenv.load_dotenv()  # read OPENAI_API_KEY from a local .env file
openai.api_key = os.environ.get('OPENAI_API_KEY')

response = openai.Completion.create(
  model="text-davinci-002",
  prompt="Tell me a joke.",
  temperature=1,
  max_tokens=20,
  top_p=1,
  n=4,          # return four completions
  best_of=5,    # sample five server-side, return the four with the highest log probability
  frequency_penalty=1,
  presence_penalty=1
)
print(response)
print(response)

Here is a typical output for it:

{
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "logprobs": null,
      "text": "\n\nWhy did the chicken cross the road?\n\nTo get to the other side."
    },
    {
      "finish_reason": "stop",
      "index": 1,
      "logprobs": null,
      "text": "\n\nWhy did the chicken cross the road?\n\nTo get to the other side."
    },
    {
      "finish_reason": "stop",
      "index": 2,
      "logprobs": null,
      "text": "\n\nWhy did the chicken cross the road?\n\nTo get to the other side!"
    },
    {
      "finish_reason": "stop",
      "index": 3,
      "logprobs": null,
      "text": "\n\nWhy did the chicken cross the road?\n\nTo get to the other side!"
    }
  ],
  "created": 1669369481,
  "id": "cmpl-6GPY96AQUyCul9vTmo93oN84Kh96Y",
  "model": "text-davinci-002",
  "object": "text_completion",
  "usage": {
    "completion_tokens": 95,
    "prompt_tokens": 5,
    "total_tokens": 100
  }
}

i.e. 4 times the same result, which (at least to me) defeats the purpose of the n setting.
(If I’m “lucky”, I’ll get a wee bit of diversity, with maybe 3 times the same joke and once a different one.)

Is it possible, through the API, to get DIFFERENT results? If not, are there any plans for it?

I understand that making multiple calls to the API (e.g. asking for a list of jokes + some prompt engineering) may lead to the desired result, but it would be so much more convenient and cleaner to have this from the get-go in the API.

I am having this exact same issue - even without using “n”. I have a conversational chatbot that gives the same response when I submit the same input.

This doesn’t happen when I use the playground/sandbox.

Anyone figure this out?

Temperature controls the randomness of the output (the API accepts values from 0 to 2). Increasing the temperature makes exact duplicates less likely; decreasing it makes them more likely.

Try this: immediately before your prompt, add a random number. So in your code, generate a random number and prepend it to the prompt.

Also make sure that the temperature is 1.
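
A minimal sketch of that idea, using the same legacy Completion endpoint as the original post (the random-number prefix is just the trick described above, not an official API feature):

import os
import random

import openai

openai.api_key = os.environ.get('OPENAI_API_KEY')

# Prepend a random number so every call sees a slightly different prompt,
# which nudges sampling away from the single most likely completion.
for _ in range(4):
    seed = random.randint(0, 99999)
    response = openai.Completion.create(
        model="text-davinci-002",
        prompt=f"{seed}\nTell me a joke.",
        temperature=1,
        max_tokens=20,
    )
    print(response["choices"][0]["text"].strip())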


Have you tried the new “text-davinci-003” model?

The issue you are encountering with the GPT-3 API is a common one when requesting multiple responses from the same prompt. The n parameter, which controls the number of responses returned by the API, does not guarantee a diverse set of responses.

One solution is to use the “prompt engineering” approach you mentioned, where you modify the prompt slightly for each API call to encourage diverse responses. For example, instead of asking for a joke with the same prompt each time, you could ask for a joke with a different keyword or context in each prompt. This may lead to more diverse responses.
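
A rough sketch of that approach (the topics list here is purely illustrative):

import os

import openai

openai.api_key = os.environ.get('OPENAI_API_KEY')

# Vary the topic per call instead of reusing an identical prompt.
topics = ["chickens", "programmers", "coffee", "the weather"]
for topic in topics:
    response = openai.Completion.create(
        model="text-davinci-002",
        prompt=f"Tell me a joke about {topic}.",
        temperature=1,
        max_tokens=30,
    )
    print(response["choices"][0]["text"].strip())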

Another option is to use the “temperature” parameter to control the creativity of the API. A higher temperature value can lead to more diverse, but potentially less coherent, responses. Experimenting with different temperature values can help you find a balance between diversity and coherence that suits your needs.
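
A quick sweep like this can show the effect (the API accepts values above 1, up to 2):

import os

import openai

openai.api_key = os.environ.get('OPENAI_API_KEY')

# Higher temperatures flatten the token distribution, so repeated calls
# are more likely to diverge, at some cost to coherence.
for temp in (0.5, 0.8, 1.0, 1.3):
    response = openai.Completion.create(
        model="text-davinci-002",
        prompt="Tell me a joke.",
        temperature=temp,
        max_tokens=20,
    )
    print(f"temperature={temp}: {response['choices'][0]['text'].strip()}")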

Additionally, you can try using different models or even combining models to get more diverse results. The GPT-3 API offers a variety of models with different capabilities, and combining their output may lead to more diverse and interesting results.
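
For instance, a sketch of pooling completions from two models (both model names were available around the time of this thread):

import os

import openai

openai.api_key = os.environ.get('OPENAI_API_KEY')

# Collect completions from more than one model and pool the results.
for model in ("text-davinci-002", "text-davinci-003"):
    response = openai.Completion.create(
        model=model,
        prompt="Tell me a joke.",
        temperature=1,
        max_tokens=20,
        n=2,
    )
    for choice in response["choices"]:
        print(f"{model}: {choice['text'].strip()}")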

It’s worth noting that while the GPT-3 API is a powerful tool, it is not perfect and may not always generate the exact results you are looking for. It’s important to keep in mind the limitations and biases of the model, and to use it responsibly and ethically.


This worked fine for me when I changed the prompt to:

Tell me five jokes.

It also worked OK for these prompts:

Tell me some jokes.

Tell me a handful of jokes.

My conclusion based on this quick test is that with the prompt “Tell me a joke.” the language model literally interprets “a” as “one”, so the single most likely joke dominates the sampling: n and best_of still return multiple completions, but they all converge on the same text.
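
For reference, here’s roughly what the batched-prompt version looks like; the line-splitting is a naive illustration that assumes the model returns one joke per line:

import os

import openai

openai.api_key = os.environ.get('OPENAI_API_KEY')

# Ask for several jokes in a single completion and split them client-side.
response = openai.Completion.create(
    model="text-davinci-002",
    prompt="Tell me five jokes.",
    temperature=1,
    max_tokens=200,
)
jokes = [line.strip() for line in response["choices"][0]["text"].splitlines() if line.strip()]
for joke in jokes:
    print(joke)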

HTH