Different results from the OpenAI API and the Playground

Hi,

I’m sending in a completion request for a Q&A. I’m providing an email that we received and asking what the revenue of the company mentioned is. I am using ‘text-babbage-001’.
In the Playground I get the correct answer, but when using the API it randomly adds 811,000 to the amount.
Also, if the revenue isn’t mentioned, it returns 811,000 as the amount.
Any ideas why?
When I use curie, it seems to be OK (for a specific example; I didn’t run it on the whole set).

Thanks!

12 Likes

@at-bay-ds-team I’m curious if you found a solution, since I have the same problem. I literally copy the prompt string, paste it into the Playground, and get a (consistent) answer. My local code gets a different, equally consistent answer.

I have gone so far as to use the Chrome developer tools to inspect the settings used in the HTTP POST and make sure they match, with no effect.

I understand that there is some inherent randomness and that answers aren’t always the same, but this seems like something else is going on.
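
For anyone else trying to compare the two, here is a minimal debugging sketch (assuming the legacy 0.x openai Python library, where setting openai.log = "debug" is supposed to print request details): dump the exact parameters your code sends so you can diff them field-by-field against the payload the Playground sends, which you can see in the browser dev tools.

import json
import openai

openai.api_key = "sk-..."   # your API key
openai.log = "debug"        # 0.x library: logs outgoing request details

params = dict(
    model="text-babbage-001",
    prompt="...",            # paste the exact prompt used in the Playground
    temperature=0,
    max_tokens=100,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
)
print(json.dumps(params, indent=2))   # compare with the dev-tools payload
response = openai.Completion.create(**params)
print(response["choices"][0]["text"])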

4 Likes

Babbage is not very good at getting answers from embeddings; I would stick with Davinci. A few other things:

  1. What prompt are you giving to the bot? You need to say something like “If the question can’t be answered based on the knowledge base, respond with ‘Unknown’.”
  2. What temperature are you using? Set it to zero.
  3. Make sure you are using the right engine to create embeddings for the knowledge base and the question respectively. These are two different engines (see the sketch below).
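
A rough sketch of what I mean by point 3 (I’m assuming the legacy search-embedding model family here; the passage text is just an example, so adjust to whatever you actually use):

import openai

openai.api_key = "sk-..."

# Legacy search-embedding models come in doc/query pairs; the knowledge-base
# passages and the user's question must each go to the matching engine.
doc_embedding = openai.Embedding.create(
    model="text-search-babbage-doc-001",      # embed knowledge-base passages
    input="Example passage: Acme Corp reported revenue of $12M last fiscal year.",
)["data"][0]["embedding"]

query_embedding = openai.Embedding.create(
    model="text-search-babbage-query-001",    # embed the user's question
    input="What was the revenue for the last fiscal year?",
)["data"][0]["embedding"]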
3 Likes

It doesn’t seem to me that randomness is the issue here. I am consistently getting the same answer in the Playground, and consistently getting the same wrong answer through the API.

3 Likes

Babbage is giving the right answer in the Playground, and I am using the code provided there.
At this point I will have to use a different model, but the question remains: why does the difference occur?

This is the code I am using (openai_model and text are set earlier in my script):

import openai

# openai_model is the engine name ("text-babbage-001" here); text is the email body
response = openai.Completion.create(
    model=openai_model,
    prompt=f"{text}\nQ: what was the revenue for the last fiscal year?\n\nA:",
    temperature=0,
    max_tokens=100,
    top_p=1,
    frequency_penalty=0,
    presence_penalty=0,
    stop=["\n"],
)

Temperature is 0.
I am new to OpenAI; how would you implement your suggestions (points 1 and 3)?

1 Like

Try doing this: whatever question the user is asking, append to it “If you can’t find the answer in the provided context, say ‘Unknown’.” Let me know how that goes.
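
Roughly like this (just a sketch of the idea; the exact wording and model are up to you):

import openai

openai.api_key = "sk-..."

text = "..."   # the email / context you are asking about
question = "What was the revenue for the last fiscal year?"

# Append an explicit fallback instruction so the model answers "Unknown"
# instead of inventing a number when the context doesn't contain the answer.
prompt = (
    f"{text}\n"
    f"Q: {question} "
    "If you can't find the answer in the text above, say \"Unknown\".\n\n"
    "A:"
)

response = openai.Completion.create(
    model="text-babbage-001",
    prompt=prompt,
    temperature=0,
    max_tokens=100,
    stop=["\n"],
)
print(response["choices"][0]["text"].strip())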

1 Like

I should have clarified what I was using! I am using the completion API from Python and these settings:

        model="text-davinci-002",
        prompt=prompt,
        temperature=0,
        max_tokens=256,
        top_p=1,
        frequency_penalty=0,
        presence_penalty=0,
        best_of=1,
        stop=["\n"]

It always gives an answer that is right-ish, just different, so it isn’t a problem of the model not knowing.

I wonder if the Playground gets “newer” or “test” builds before they roll out?

2 Likes

Just try changing the prompt as I suggested and let me know.

1 Like

I tried this, but then it just gave “Unknown” for everything.

1 Like

I tried this with a different prompt using davinci and it works. :smile:
This still doesn’t explain the discrepancy between the Playground and the API…

5 Likes

I have the exact same problem with text-davinci-003 now. I wanted to generate emails, and the API consistently makes up that I studied Computer Science even though I said I studied Math. The Playground got the Math part consistently right. Exact same prompt, and the temperature is 0 in both cases.

5 Likes

I am just seeing this thread. I had the same thing. I copied a prompt back and forth from the Playground to double-check, and consistently got the intended answer in the Playground and the “wrong” answer when I called the code provided by the Playground with literally the same prompt.

2 Likes

So odd. I don’t know why this happens. The biggest problem for me is that the API still makes stuff up with temperature 0. Somehow temperature 0 for the API is not the same as temperature 0 for the Playground.
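
One way to narrow it down (a debugging sketch only, assuming the legacy Completion API’s logprobs parameter): request token logprobs at temperature 0 and compare them with what the Playground shows under “Show probabilities”. If the top tokens differ for the exact same prompt, the two requests aren’t really hitting the same model and parameters.

import openai

openai.api_key = "sk-..."

response = openai.Completion.create(
    model="text-davinci-002",
    prompt="...",        # the exact prompt pasted from the Playground
    temperature=0,
    max_tokens=20,
    logprobs=5,          # return the top-5 token logprobs at each position
)

choice = response["choices"][0]
for token, top in zip(choice["logprobs"]["tokens"], choice["logprobs"]["top_logprobs"]):
    print(repr(token), dict(top))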

2 Likes

This is happening to me now too, trying to use text-davinci-002. The results are coming back a little sanitized, as if either the temperature is being turned down or it’s actually using text-davinci-003 instead.

2 Likes

I’m struggling with this exact thing. I’m trying to reverse-engineer prompts from poems, and the API and the Playground have the exact same prompt and settings. But no matter which poem I input, the API response is always: ‘Prompt: Write a poem about the beauty of nature and how it can bring peace and joy to our lives.’
When I use the Playground it generates prompts as I expect :\

5 Likes

I have a custom-trained model that is doing the same. In the Playground it’s perfect; via an API call I get weird, low-quality responses.

4 Likes

Same issue: using the exact same params, the Playground gives a different (and better) response than the API.

3 Likes

Same issue. I am using ada and babbage for classification. The Playground returns a result, but the SDK returns a blank string.

2 Likes

Same issue with text-davinci-003. My prompt has instructions to identify appointments in email threads for automated scheduling, followed by the actual thread. The Playground does great, but the API returns incorrect answers, many of them out of format. Both have the temperature set to 0.

3 Likes

Same problem here: prompts and parameters are the same, consistent results in the Playground, something else from the API. This stops me from actually building. Any workarounds? Has anyone tried the paid API?
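
One workaround worth trying (just a sketch, not a guaranteed fix): bypass the SDK and hit the completions endpoint directly with requests, copying the payload field-for-field from the Playground’s network request. If the raw call matches the Playground but the SDK doesn’t, the difference is in what the SDK sends; if it still differs, it’s something server-side.

import requests

API_KEY = "sk-..."   # your API key

payload = {
    "model": "text-davinci-003",
    "prompt": "...",              # the exact prompt copied from the Playground
    "temperature": 0,
    "max_tokens": 256,
    "top_p": 1,
    "frequency_penalty": 0,
    "presence_penalty": 0,
}

resp = requests.post(
    "https://api.openai.com/v1/completions",
    headers={
        "Authorization": f"Bearer {API_KEY}",
        "Content-Type": "application/json",
    },
    json=payload,
    timeout=60,
)
resp.raise_for_status()
print(resp.json()["choices"][0]["text"])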

4 Likes