The summary text that is returned is truncated at the beginning

model: gpt-3.5-turbo
endpoint: completion
result is truncated the the start :

<OpenAIObject at 0x12b7484f0> JSON: {
“content”: “ar case is particularly interesting, as he was quite blatantly expressing his political opinions and even urging people not to vote for a specific party leader, Jeremy Corbyn. Despite this, he has not faced any repercussions from the BBC, and continues to host The Apprentice. On the other hand, Gary Lineker’s tweet was a critique of a government policy, but he did not support one party over another or encourage people to vote a certain way. \n\nEmily Maitlis’ tweet also seemed to take a position by suggesting the idea of a coup against Corbyn, but as she has since moved on to Global, her situation seems less comparable to Lineker’s. Regardless, it is clear that different standards are being applied to different individuals within the BBC, which calls into question the consistency of their policies and guidelines.\n\nThe reaction to Lineker’s tweet and the subsequent events that unfolded raise questions about the role of political opinions within the BBC and the organization’s ability to maintain objectivity and impartiality. Many might argue that consistency is necessary if the BBC is to be seen as a genuine unbiased news source.\n\nAs for Match of the Day, it is unclear who will step in to fill Lineker’s and Ian Wright’s shoes for the time being. Wright’s decision to step down in solidarity with Lineker shows that the issue has resonated with those within the sports world as well. It will be interesting to see how the BBC proceeds with this situation and whether the backlash over Lineker’s tweet will lead to changes in their policies regarding political opinions and commentary.”,
“role”: “assistant”

1 Like

Hi, do you mind sharing the input used to generate this outcome?

def generate_summary(transcript):
completion= openai.ChatCompletion.create(
{“role”: “system”, “content”: “You are a news podcast presenter.”},
{“role”: “user”, “content”: given the transcript of the episode, generate a summary of the most important topic in this episode: + transcript}

summary = completion.choices[0].message["content"]
return summary
1 Like

Hi @bukikir,

tks for sharing. I have 3 ways to mitigate this issue for you:

  1. Work with a stop token (OpenAI API; “Up to 4 sequences where the API will stop generating further tokens.”) this signals to the model that the provided content is finished there
  2. Specify in your Request in plain english that you don’t want the text to be continued but only the summary. You can do this by appending the string to your request.
  3. Modify your input data so that you only provide input which ends with a “.” and a finished sentence. This also helps in a lot of the cases

Hope that helps :slight_smile:

Thank you Linus, the steps you have lsited are related to issues for truncated text at the “end” of the generated text. Our issue is related to text truncated at the “beginning” . So it looks like the API starts the response text from somewhere other than the first token.

Hi @bukikir,

tks for specifying, only to clarify: One way this might be caused is if the input data provided seems incomplete (e.g. ends not with a stop token or there is a unfinished sentence), GPT sometimes tries to complete the content first.

So by mitigating the issues with the (possibly, I have to guess since you did only share the promt w/o content…) truncated text at the end of your imput data. This might lead to trunc. text at the beginning of the output.

Oh I see. Okay that is a really good point. We have the list of inputs that loop in the function so one of them might be the way you described. Let me check the inputs. Many thanks!

1 Like

I had experienced a similar issue where I was also trying to put more Information in than would be manageble in a single API-Request.

I found a combination between embeddings and davinci lead to better results. But this would be a lot more effort to set up. But if you want an optimal solution with very large volumes of text this might be of interest to you.

1 Like