I’m working on a Python script that calls the OpenAI API with gpt-3.5-turbo-1106 to deliver condensed versions of 700-1,000-word text batches. I’ve tried different approaches, but GPT fails to deliver the version I’m looking for. I expect a condensed version of 350-500 words, but the version GPT usually delivers is significantly shorter, at about 10% of the length of the original text. I’ve tested multiple variables, including:
different prompts (e.g. “generate a shorter version”, “regenerate the following text in a shorter form”, “generate a summary of”, “regenerate a version of the following text that’s 300 to 500 words long”)
different token limits (anything from 100 to 4,000)
different temperatures (from 0.1 to 0.9)
It keeps failing to deliver a properly condensed version, while ChatGPT is more successful at these text manipulations.
Any advice on how to solve this?
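Here is roughly what the call looks like (a minimal sketch; the prompt wording and parameter values are representative, not my exact script):

```python
# Minimal sketch of the condensing call (prompt wording and parameter
# values are illustrative, not the exact production script).
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment


def condense(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        temperature=0.3,
        max_tokens=1000,
        messages=[
            {"role": "system", "content": "You condense long texts."},
            {
                "role": "user",
                "content": (
                    "Regenerate a version of the following text that is "
                    "350 to 500 words long:\n\n" + text
                ),
            },
        ],
    )
    return response.choices[0].message.content
```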
Because LLMs operate on tokens rather than whole words, it can be difficult for them to count words correctly. You might try asking for a number of sentences instead, or giving the model a one-shot example with the exact output length you want.
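For example, something along these lines (a hypothetical sketch; the sentence count, example texts, and prompt wording are placeholders, not a tested recipe):

```python
# Sketch: constrain the output by sentence count and give the model a
# one-shot demonstration of the target length. The example texts below
# are placeholders you would fill in with a real pair.
from openai import OpenAI

client = OpenAI()

EXAMPLE_INPUT = "..."    # a ~800-word passage you already have
EXAMPLE_OUTPUT = "..."   # a hand-written ~400-word condensed version


def condense(text: str) -> str:
    response = client.chat.completions.create(
        model="gpt-3.5-turbo-1106",
        temperature=0.3,
        messages=[
            {
                "role": "system",
                "content": (
                    "Condense the user's text into 20 to 25 sentences, "
                    "keeping all key points."
                ),
            },
            # One-shot demonstration of the desired output length.
            {"role": "user", "content": EXAMPLE_INPUT},
            {"role": "assistant", "content": EXAMPLE_OUTPUT},
            {"role": "user", "content": text},
        ],
    )
    return response.choices[0].message.content
```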
This seems to be the same issue I described in my post from earlier today: essentially, the new turbo models have gotten lazy, and their responses are too short and missing data in many cases. Interesting to see that this applies not only to extraction but also to summarization.
Can a mod confirm that the team is aware of this issue?
The fine-tuning of the AI model is very much a conscious decision by the OpenAI developers. Summarization is a somewhat unnatural language task that takes a lot of training, and that training includes the output length produced when a “summary” is requested.