Garbage characters returned in payload with multiple GPT-3 models

Using the OpenAI API, I tested with a couple of model versions. While the output is not identical across versions, there are garbage characters returned in the inference:

import openai  # assumes openai.api_key is set elsewhere

def classify_sentiment(text):
    model_engine = "text-davinci-002"
    cheaper_engine = "text-ada-001"
    prompt = f"label:Sentiment Analysis (Positive, Negative, Neutral) | {text}"

    completions = openai.Completion.create(
        engine=model_engine,  # cheaper_engine | model_engine
        prompt=prompt,
        max_tokens=1024,
        n=1,
        stop=None,
        temperature=0.5,
    )

    message = completions.choices[0].text
    print(completions.choices)
    sentiment = message.strip().split(" ")[-1]  # last space-separated chunk
    sentiment_debug = message
    total_tokens = completions["usage"]["total_tokens"]
    print(f"The total number of tokens in the request is: {total_tokens}")
    print(f"this is the debug:  {sentiment_debug}")
    return sentiment
1. Here is the call to my function with the utterance:

   classify_sentiment("I am unreasonably excited about the state of my phone today given I have only 1 bar")

2. Here is what is returned:

[<OpenAIObject at 0x15bc8d800> JSON: {
  "finish_reason": "stop",
  "index": 0,
  "logprobs": null,
  "text": " of battery life left\n\nPositive"
}]
The total number of tokens in the request is: 41
this is the debug:   of battery life left

Positive

And here is the function's return value:

'left\n\nPositive'

Why are there garbage characters in the returned JSON values that have nothing to do with the utterance I passed into the function? Example:

of battery life left Positive

and:

left\n\nPositive

Of particular interest is the function return value, because it is unpredictable, and I wonder why I should have to parse the returned value at all. Am I missing something (such as a function property) where I can set the option to ONLY return Positive, Negative, or Neutral? Thank you in advance.
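In the meantime I am parsing defensively. A rough sketch of my workaround (the extract_sentiment helper here is just illustrative, not part of the API): split on any whitespace and validate the last chunk against the allowed labels:

ALLOWED = {"positive", "negative", "neutral"}

def extract_sentiment(message: str) -> str:
    # split() with no argument splits on ALL whitespace (spaces and newlines),
    # then the last chunk is accepted only if it is one of the expected labels
    label = message.strip().split()[-1].strip(".").lower()
    return label.capitalize() if label in ALLOWED else "Unknown"

print(extract_sentiment(" of battery life left\n\nPositive"))  # -> Positive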


Hi @tdentry

The ‘\n’ is a newline character.

Being very specific with the prompt can help.

Also, if you look at the call you made, you’ll see that not only did it classify, it also completed your prompt with left\n\nPositive, where left is the completion of your sentence and the result Positive is printed after 2 new lines.
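That also explains the 'left\n\nPositive' return value. A quick illustration in plain Python: split(" ") splits on spaces only, so the newline-joined chunk stays together:

# str.split(" ") splits on spaces only; the newlines in "left\n\nPositive"
# never act as separators, so the last chunk keeps them attached
message = " of battery life left\n\nPositive"
print(repr(message.strip().split(" ")[-1]))  # -> 'left\n\nPositive'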

For more consistent results, you can alter your prompt so that the model only has to complete the 'Result:' line.

Something like:

def classify_sentiment(text):
    model_engine = "text-davinci-002"
    cheaper_engine = "text-ada-001"
    prompt = f"label:Sentiment Analysis (Positive, Negative, Neutral) | {text}\nResult:"

    completions = openai.Completion.create(
        engine=model_engine, # cheaper_engine | model_engine
        prompt=prompt,
        max_tokens=1024,
        n=1,
        stop=None,
        temperature=0.5,
    )

    message = completions.choices[0].text
    print(completions.choices)
    sentiment = message.strip().split(" ")[-1]
    sentiment_debug = message
    total_tokens = completions["usage"]["total_tokens"]
    print(f"The total number of tokens in the request is: {total_tokens}")
    print(f"this is the debug:  {sentiment_debug}")
    return sentiment

Thank you @sps for responding, especially on a Saturday! This makes sense, but I’m still wondering why the engine returns unnecessary tokens/characters at all. For example, running the original function on a line of real text in a dataframe, I receive an inordinate number of unrelated, irrelevant tokens. See below:

print(classify_sentiment(formatted.formatted.iloc[5]))
print("here is the formatted content: \n", formatted.formatted.iloc[5])

this is the debug (the data that came from OpenAI):

You can also use the keywords that are related to the product or service you are offering. A new survey from the American College of Allergy, Asthma and Immunology (ACAAI) shows that most people do not know that indoor and outdoor allergies can be active at the same time. Theres no need to install any software. You can make a lot of money if you can attract a large number of followers. You can also make money online with Swagbucks by watching short videos in whatever category you choose, like entertainment, news, or fitness. You can use the logo for your business cards, website, letterheads, etc. These could be in the form of a free e-book or a free report. You can also find the specific products and services that you offer, and you can also find an opportunity to promote the products and services of your business.

| The way to make money from your blog is to use it as a platform to advertise or promote products and services that relate to the theme of your blog. You can use a free website builder and earn money from the ads they place on your website. You can use a simple technique to help you remember your dreams. You can also choose to use a free web host and earn money from the ads they place on your website. The online marketing industry is huge and there are many ways to make money online. Some of the more popular ways are through CPA offers and affiliate marketing. I know, there are a lot of people who are skeptical about making money online.

The first step is to find the products that you want to sell. There are many ways to find products to sell online, but the best way is to join an affiliate program. There are many affiliate programs available online, but the best way to find them is to join a directory of affiliate programs. The second step is to promote the products that you have chosen. There are many ways to promote products, but the best way is to use pay per click advertising. The third step is to build a list of subscribers. The best way to build a list of subscribers is to use an autoresponder. The fourth step is to sell the products that you have chosen. The best way to sell products is to use an e-book or an ebook.

The fifth step is to promote the products that you have chosen. The best way to promote products is to use an affiliate program.
program.

here is the formatted content (the content that I am analyzing using the inference engine):  

 Account administrators can create an upload holiday greetings through the Admin Portal steps to enable recordings can be displayed in the Admin portal by selecting the black help box at the top right corner of the page and typing manage greetings in the search bar. Would you need me to send you an S, M, S, with the instructions no.

As you can see, the model gave me back a significant amount of useless data that is unrelated to my query.

They are being returned because the engine has been given the opportunity and the room to do so.

The max_tokens=1024 needs to be reduced to just enough to fit the result, leaving no room for additional tokens.

And your prompt needs to end at a point where the model understands that it has to give a result, not continue the prompt.

Also, reduce the temperature to 0 and try increments of 0.1 from there. 0.5 is too much for such a simple task.

Take a close look at the code I gave you in the last reply.
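Putting it all together, the call could look something like this (the exact max_tokens value is something to experiment with; the stop parameter you already pass can be used to cut the model off at the first newline):

completions = openai.Completion.create(
    engine=model_engine,
    prompt=prompt,      # ends with "\nResult:" so the model only fills in the label
    max_tokens=3,       # just enough room for Positive / Negative / Neutral
    n=1,
    stop=["\n"],        # stop at the first newline instead of rambling on
    temperature=0,      # deterministic for a simple classification task
)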


Thank you @sps. I used your function modification and it definitely gets more readable results, but ONLY for the davinci-002 model. I also followed your recommendation on temperature and am experimenting with the max_tokens parameter. I believe I am also not correctly measuring the number of tokens actually being evaluated, because the total seems to count what the model outputs rather than only what is being input. Is there any way to call the Tokenizer correctly in the current version? I get an error when trying to call that class. The reason I am asking is that when I use the davinci-002 model, I get:

The total number of tokens in the request is: 36

When I use the cheaper ada-001 model, I get:

The total number of tokens in the request is: 68

Also, on the ada-001 model, there appear to be some additional prompt settings I must look at, because when I use the same function parameters but choose the ada model (function below), I get some odd output:

def classify_sentiment(text):  
    model_engine = "text-davinci-002"
    cheaper_engine = "text-ada-001"
    prompt = f"label:Sentiment Analysis (Positive, Negative, Neutral) | {text}\nResult:"

    completions = openai.Completion.create(
        engine=cheaper_engine, # cheaper_engine | model_engine
        prompt=prompt,
        max_tokens=1024,
        n=1,
        stop=None,
        temperature=0.1, #modified temperature as per recommendation
    )

    message = completions.choices[0].text
    print(completions.choices)
    sentiment = message.strip().split(" ")[-1]
    sentiment_debug = message
    total_tokens = completions["usage"]["total_tokens"]
    print(f"The total number of tokens in the request is: {total_tokens}")
    print(f"this is the debug:  {sentiment_debug}")
    return sentiment

output:

[<OpenAIObject at 0x15f7c02c0> JSON: {
  "finish_reason": "stop",
  "index": 0,
  "logprobs": null,
  "text": "\n\nThe phone is not working\n\nThe phone is working but I am not getting text messages or calls through it\n\nThe phone is big and heavy\n\nThe phone is not the same when I get it from bed or when I wake up\n\nThe phone is small and not big enough\n\nThe phone is uncomfortable to hold\n\nThe phone is not loud or speaker quality\n\nThe phone is not looking or looking good\n\nThe phone is not worth the price I paid for it"
}]
The total number of tokens in the request is: 137
this is the debug:  

The phone is not working

The phone is working but I am not getting text messages or calls through it

The phone is big and heavy

The phone is not the same when I get it from bed or when I wake up

The phone is small and not big enough

The phone is uncomfortable to hold

The phone is not loud or speaker quality

The phone is not looking or looking good

The phone is not worth the price I paid for it

What I like about this is that the model clearly shows what text strings it is using in the comparison, but it did not actually include the sentiment. If there is a docs site that tells me how to avoid this output, I’d appreciate a link.

For example, when I comment out all of my print statements and ONLY include the return value, it outputs a text string (word) and NOT the sentiment. Here is the function return value with the ada model:

'calls'

So it’s clear that there is a difference in behavior, but how to tune the parameters per model is what I’m looking for.

I have done some experiments with ada, and I don’t believe ada can be reliably used for sentiment analysis.

This is because davinci has better understanding and delivers the required result (and reaches the stop sequence) in fewer tokens than ada, which doesn’t understand the prompt and goes haywire generating rubbish tokens until it hits a stop sequence or the max_tokens limit.

The total tokens = prompt tokens + generated tokens
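If you want to verify the prompt side of that sum yourself (instead of the Tokenizer class you had trouble calling), you can count tokens locally. A minimal sketch, assuming the tiktoken package is installed; I believe the davinci-002 family uses the p50k_base encoding:

import tiktoken

# encoding used by the text-davinci-002/003 models; older base models use r50k_base
enc = tiktoken.get_encoding("p50k_base")

prompt = "label:Sentiment Analysis (Positive, Negative, Neutral) | some text\nResult:"
print(len(enc.encode(prompt)))  # prompt tokens; the API adds generated tokens on top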

That is not a comparison you’re seeing; it’s irrelevant token generation.

Few-shot learning for this case works like roulette on text-ada-001. Sample: OpenAI API

I agree that it is the cheapest model, but its capabilities are limited.

Instead, I would recommend using text-babbage-001, as it has performed great and is still cheaper than curie and davinci.

Try this: OpenAI API

If you still run into problems, feel free to discuss. My calendar link is in the bio.
