text-davinci-003 - Completion API - choices[0].text begins with a sentence fragment

Hi,

I’m trying to use the Completions API to extract action items from a conversation/discussion, which I’m submitting via the OpenAI JavaScript library’s createCompletion().

    let transcript = "..."; // Transcript content here 
    let myPrompt = "Generate all action items from below discussion: \n"; 
    myPrompt += transcript; 

    const completion = await openai.createCompletion({
      model: "text-davinci-003",
      prompt: myPrompt,
      temperature: 0,
      max_tokens: 3200,
    });

The response I’m getting always includes one choice in the choices array. The value of that choice’s text property always begins with a sentence fragment, which is the problem I need to resolve:

blurbs that we have around that, unless we have a good resource that we’ve all pulled together

The rest of the completion consists of full sentences and is coherent, but I’m wondering why the first sentence is always a nonsensical fragment. The full transcript text I’m submitting is attached at the end of this question.

Below are the contents of the completion.data member of the response I’m getting for this API call:

{
  id: 'cmpl-7ahtGf9aE2V1JWV9N2CvpAXs2qAiR',
  object: 'text_completion',
  created: 1688982878,
  model: 'text-davinci-003',
  choices: [
    {
      text: " blurbs that we have around that, unless we have a good resource that we've all pulled together.\n" +
        'Action Items:\n' +
        '1. Work with GTM teams to support corporate events. \n' +
        '2. One person to pick the one that GTM teams needs to back. \n' +
        '3. Connect with GTM teams on Slack and Github issues. \n' +
        '4. Keep track of people signing up for KubeCon and for other events. \n' +
        '5. Add link to Rolls product announcement on Github. \n' +
        '6. Create a resource that list all major improvements since 130.',
      index: 0,
      logprobs: null,
      finish_reason: 'stop'
    }
  ],
  usage: { prompt_tokens: 783, completion_tokens: 118, total_tokens: 901 }
}

I’ve seen that for other prompts as well, completion.data.choices[0].text of the response I’m getting consistently begins with a sentence fragment:

e.g.

  1. When the prompt is:

Prepare a summary of below discussion: \n<long_text_of_conversation>

choices[0].text begins with:

use cases or things that we should put in there that are kind of consistent. Speaker A and the team are discussing initiatives for corporate events…

  2. When the prompt is:

Identify the full list of dependencies or prerequisites for any action items mentioned in following discussion: \n<long_text_of_conversation>

choices[0].text begins with:

projects, so maybe there’s people who remember Hahha" Dependencies or prerequisites: - Restructuring - Work with GTM teams - Regular sync - CICD on Google Next - GitOps on KubeCon - Commitment from campaign managers - Big improvements since 13.0

It gives the impression that the completion text is truncated at the front. Could this be due to the values of any parameters I’m sending (or not sending)? Do you have any ideas on how to resolve this? Thanks in advance.


Appendix:

Text of conversation/discussion:

SPEAKER A\nCan record and we don’t have a ton of items to get to, and I might be able to do one that might be fun if we have a little bit of time. So corporate events, I think I saw a little I put this in Slack and I saw a little bit of kind of noise around it, which was good. The nutshell here is, as we’ve kind of restructured and tried different things, the event support that we need isn’t as nailed down as it needs to be. So the current tactic that we’re going with is go to market, team signs up and kind of sponsors that event. So you support as a PMM, your campaign manager does the campaigns for that event, et cetera, et cetera, et cetera. I don’t see anyone in the maybe there are comments in the issue. I don’t see the header updated yet.\nSPEAKER B\nI thought we had in Slack sort of farmed each one of them out.\nSPEAKER A\nIt looks like Ty put in some folks. It looks like this looks good. Let’s see. Need support from GTM teams. So I guess the ask would be to work with your GTM team. Let me ask. I saw some Slack, I think, in our Slack, but were you all able to connect with your GTM teams on.\nSPEAKER C\nSlack only and on the issue? Actually, yeah.\nSPEAKER D\nSame not in real time, but source feedback. I got one person to respond so far, so it may end up Cindy being you, and I just picking the one that we want to do and then they can back us up.\nSPEAKER A\nIf.\nSPEAKER D\nWe don’t get any more feedback.\nSPEAKER A\nYeah. Does anybody have a regular sync still? Are those all been canceled or is there.\nSPEAKER E\nYeah, we’re on like the two week cadence.\nSPEAKER A\nOkay.\nSPEAKER C\nGithubs has been canceled after the enablement.\nSPEAKER A\nCool. So I’m just trying to catch up with the thread. So it looks like maybe platform on Reinvent, CICD on Google Next and GitOps on KubeCon. Does that sound right?\nSPEAKER D\nYeah, that’s where we were last I heard.\nSPEAKER A\nCool. So then I think we can help the Corp events team. They do a lot of cat herding and keep on tracking people down. So I think if this team can take the mission to try to help track that down so if you get the commitment specifically from your campaign managers, hey, we’re signing up for KubeCon, can you comment on the issue that, yes, I can commit to this, et cetera, et cetera, et cetera, just so that they can get that event support? But that looks good and I appreciate thanks for the link too, sonya on the Rolls product announcements. So I appreciate Brian for adding this. I probably should have added it, but.\nSPEAKER B\nI had two questions about that one. One is over what time frame are we looking at?\nSPEAKER A\nIn theory, this could be the same as the GitLab 14 launch, where we’re saying basically since 130, what kind of big improvements have we made? Candidly, this is always a little bit tough.

Just a word of warning here: text-davinci-003 is going to be deprecated soon, so please keep that in mind.

Other than that, I often encountered this behavior when the text being fed to the model as input stopped midway through a sentence or was cut off. This causes the model to generate what it assumes is the ending of the input text and then get on with what it is supposed to do.

Just check and see if that is the cause of your problem.
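A quick way to check is to look at how your transcript ends just before you send it. A rough sketch, reusing the transcript variable from your question (the regex is only illustrative):

    // Does the input end with sentence-ending punctuation, or is it cut off mid-sentence?
    const endsCleanly = /[.!?]["')\]]?\s*$/.test(transcript);
    console.log(endsCleanly
      ? "input ends on a full sentence"
      : "input looks cut off mid-sentence");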


Seconded on @udm17’s comment.

I would 100% shift to gpt-3.5-turbo for this kind of task; it’s cheaper, faster, and will respond to the request far better.


Why have you set max_tokens to 3200? Note the size of your input: 783 tokens, as reported in the response you show.

The context length of the model is about 4000 tokens, and you’ve reserved 3200 of that for the response. The usage shows 783 prompt tokens were received, but just the “text of the conversation” you show is 818 GPT-3 tokens.

I’m not familiar with the workings of that API module, but it may truncate your prompt input rather than let an error be returned for going over the context length.
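If you want to check the token count locally yourself, a rough sketch (assuming the gpt-3-encoder npm package; its encoding only approximates the tokenizer used by text-davinci-003):

    // Count tokens in the prompt locally (approximate for text-davinci-003)
    const { encode } = require("gpt-3-encoder");
    const promptTokenCount = encode(myPrompt).length;
    console.log(`prompt tokens (approx.): ${promptTokenCount}`);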

Thanks for the pointer on deprecation; I’m just doing some initial experiments with the OpenAI models at the moment and will be changing to a different model soon.

Your explanation makes sense, but I have not seen any indication that my API call fails to submit the full prompt. I’ll see if there’s a way to find this out for sure.

I actually tried with a lower max_tokens matching the expected completion length and only increased it to this value to see if the issue was somehow due to hitting that token limit. But still, with 3200, I’m within the 4000 context length. The 783 prompt tokens actually include the conversation text.

That’s the point: it doesn’t and can’t include your full conversation text in the prompt. I put the input text you cite into a tokenizer, and it alone is 818 tokens.

You need to set max_tokens to your expected response, more like 500.

Then you need to look at the prompting style when using a completion model. It fills in the text that should follow what you gave it, or just continues writing in the same style. A good prompt style would be:

An AI analyzes the provided text, and as output, summarizes the most important points.

Text:

blah blah

Analysis:

that then compels the AI to fill in the text that should be after the “analysis:” call-to-action.
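As a rough sketch of that style in your JS code (reusing the transcript and openai client from your question; the max_tokens value is just the smaller reservation suggested above):

    // Completion-style prompt: instruction, the text, then a call-to-action the model completes
    const styledPrompt =
      "An AI analyzes the provided text, and as output, summarizes the most important points.\n\n" +
      "Text:\n" + transcript + "\n\n" +
      "Analysis:";

    const completion = await openai.createCompletion({
      model: "text-davinci-003",
      prompt: styledPrompt,
      temperature: 0,
      max_tokens: 500, // roughly the expected response size
    });
    console.log(completion.data.choices[0].text);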


I just tried the same requests (with the same conversation text) with a far lower max_tokens (900). The issue is still present (the completion still begins with a sentence fragment).

Request configuration:

    const completion = await openai.createCompletion({
      model: "text-davinci-003",
      prompt: myPrompt,
      temperature: 0,
      max_tokens: 900,
    });

  1. When the prompt is:

Prepare a summary of below discussion: \n<long_text_of_conversation>

Full value of choices[0].text:

' big banner releases in this period, but Ryan and I can probably think of.\n' +
'\n' +
'Speaker A is discussing a tactic to better support corporate events. The current plan, which involves the go-to-market team signing up for events and their accompanying campaigns to be managed by the PMM and Campaign Manager, is looking good. However, there is an ask to connect with the GTM team and comment on the issues once people have committed to an event. Finally, Speaker A and Ryan will need to think of big banner releases since 13.0.'

  2. When the prompt is:

Identify the full list of dependencies or prerequisites for any action items mentioned in following discussion: \n<long_text_of_conversation>

choices[0].text begins with:

" static cover praise type places where we publish these types of things. But it is definitely worth reviewing. And as we move into the new year, I think, mark this as something that I’m working on and then checkeback around March or April time frame.\n" +
‘\n’ +
‘\n’ +
‘Action Items:\n’ +
‘1. Update issue header with information about corporate events. \n’ +
‘2. Work with GTM teams to support corporate events.\n’ +

So I don’t think max_tokens is the issue here.

And yes, it makes sense to change the prompt style too. Let me try that as well. Thanks.

In case it wasn’t clear, max_tokens sets aside a reservation of the model’s context length to be used for generating output. The context must hold both the prompt you load into it and the text that is generated and returned to you. That preset amount is subtracted from the space available for your prompt input.

You should dump out the text of your string right before the function call so you can understand what is actually being sent, and see if it is being truncated.
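For example, a minimal sketch (the 200-character tail is arbitrary):

    // Inspect what is actually about to be sent, right before the API call
    console.log(`prompt length: ${myPrompt.length} characters`);
    console.log("last 200 characters of prompt:");
    console.log(myPrompt.slice(-200));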

If that looks complete and not a problem with your JS environment, then, as the openai module is a bit of a black box, we must diagnose what it’s doing. Have the AI report what it sees, or try to fix the input problem:

  1. Prompt: "the AI Takes the text below, and only reports back the last 50 ASCII characters it received.

text:
blah…

AI output:"

  2. Shorten the amount of input provided until the effect (the end of your input being truncated and then rewritten by the AI) disappears, to discover how much input the JS platform will handle, if you can’t delve deeper into that code itself to diagnose and rewrite it.
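Here is a rough sketch of wiring up option 1 above, reusing the transcript and openai client from the question (the small max_tokens is deliberate, since only about 50 characters are expected back):

    // Ask the model to echo the tail of whatever input actually reached it
    const diagPrompt =
      "the AI takes the text below, and only reports back the last 50 ASCII characters it received.\n\n" +
      "text:\n" + transcript + "\n\n" +
      "AI output:";

    const diag = await openai.createCompletion({
      model: "text-davinci-003",
      prompt: diagPrompt,
      temperature: 0,
      max_tokens: 60,
    });
    console.log(diag.data.choices[0].text);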

You can validate that your inputs otherwise behave by using the Playground at platform.openai.com.

Thanks for that. Below is my understanding of max_tokens, which I believe is in line with what you explained.
context_length >= {token-count-of-prompt} + max_tokens

Basically, I would get the token length of my entire query/prompt (instruction + conversation text), and ensure that the prompt token count plus the max_tokens I specify don’t exceed the context length (4096 for this model). Hope this is how it should be done.
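A small sketch of that check (token count via the gpt-3-encoder npm package as an approximation; 4096 is the context length mentioned above):

    const { encode } = require("gpt-3-encoder");

    const CONTEXT_LENGTH = 4096;  // context length for this model, per the discussion above
    const maxTokens = 500;        // a bit more than the expected response size
    const promptTokenCount = encode(myPrompt).length;

    if (promptTokenCount + maxTokens > CONTEXT_LENGTH) {
      console.warn("prompt + max_tokens exceeds the model's context length");
    }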

And, yes, I will try the suggestions out including getting the AI to report how it’s handling this prompt. Thanks again.

The only issues with maximizing the max_tokens value, instead of just estimating a bit more than your desired response:

  • You must calculate the input tokens locally using an accurate library like tiktoken, which needs a large dictionary file to be downloaded from the internet;
  • The AI models may get caught in a repeating loop of output that fills the entire output reservation with garbage that you pay for and wait on;
  • This operation is backwards from a chatbot, where you instead maximize the amount of prior conversation you send while picking a reasonable response size.