Longer GPT 3.5-turbo Output

sanujat · October 24, 2023, 9:31am

Hi,

I am using gpt-3.5-turbo-16k through API and I want to summarise some content. The content has about 12k tokens and I want the output to have at least 3k tokens. But the response only gets around 200-500 tokens range. How can I increase the response size? I tried with different prompts mentioning to use specific number of words or tokens. But it didn’t work. Can someone please help me with this?

Thanks in advance

_j · October 24, 2023, 9:49am

You must not use the wording “summary”. You will get a trained summary length, a surefire recipe for getting output that is like all other short summaries the AI has been fine-tuned on writing.

If you put in 12k tokens, and merely ask for it to be “rewritten” in a new way, the inability for the AI to write long output (also by fine tuning) will also get you a shrunk output. Or you can say “based on the documentation above, write a new article”.

With that much input though to that model, I’ve found that it tends not to perform any kind of restructuring or rewriting on demand, and you get your exact text back for paying for max context.

sanujat · October 24, 2023, 10:17am

Thank you for the response. To give you more context on the task I do, I’m trying to summarise a meeting transcript and generate meeting minutes from that. Since a transcript is more than 16k tokens, what my idea was to chunk it and summarise each chunk and combine those summaries finally to generate the meeting minutes document. When summarising, I want a thorough summary keeping as much information as possible. I tried the command ‘Paraphrase’ and it improved the results a bit. Do you have any specific way or prompt for this?

_j · October 24, 2023, 11:29am

You have a good plan. The only thing you might do is to tell the AI that it is working on multipart files, to only write the summary of that one part without a happy ending conclusion so it can be reassembled, and even give it the previous summary upon which it should continue writing without ending.

Around 2k is where your instructions start becoming lost in importance among the other mass of text.

sanujat · October 25, 2023, 7:51am

Thank you again for the valuable reply. I applied some of the things you told and I could get a bit longer output. The word ‘Paraphrase’ seems to generate a bit longer outputs.

Another concern I have is that even if we have longer summaries, the final meeting minutes document is a bit short. Is there a way to make it longer?

_j · October 25, 2023, 8:09am

One “trick” that you can use: the older checkpoint AI models haven’t been so heavily trained on curtailing their output length.

You can try gpt-4-0314 for your final product, with its 8k context giving you something like 6000 → 1500+ - which is a lot of words.

sanujat · October 25, 2023, 9:38am

Thanks for that valuable point. I changed the model to gpt-3.5-turbo-16k-0613 and the response length was increased by a bit. Since I have to process large files, I have to use gpt 3.5 because of its high context window.

b0zal · October 25, 2023, 10:05am

gpt-3.5-turbo-16k-0613 is recommended for longer output

_j · October 25, 2023, 10:09am

gpt-3.5-turbo-16k-0613 and gpt-3.5-turbo-16k are the same thing.

b0zal · October 25, 2023, 10:12am

nah a bit different from response while embedding as format markdown
gpt-3.5-turbo-16k

gpt-3.5-turbo-16k-0613

sanujat · October 25, 2023, 10:17am

I had that suspicion too. But I need a higher context that 8k from gpt-4. Since gpt-4-32k isn’t available for everybody, gpt 3.5 16k is the best option.

sanujat · October 25, 2023, 10:18am

Thank you. Do you have any other tricks for getting a longer output?

b0zal · October 25, 2023, 10:20am

give me example of prompt for a test getting a longer output

also the best tricks for getting a longer output is LLM Method’s

_j · October 25, 2023, 10:24am

Let’s check that assertion that they are different. I use a top_p setting near zero for near deterministic output…

comparemodels1

b0zal · October 25, 2023, 10:25am

try this config in your playground

edited

ignore a max tokens

_j · October 25, 2023, 10:26am

If you leave the temperature or top_p set at 0.5 or higher, you’re going to get different generations every time you run the model.

b0zal · October 25, 2023, 10:27am

this the best for all models as default

_j · October 25, 2023, 10:37am

Are you sure not this the best for all models?

The models selectable vary greatly in perplexity, and the desired output of your application will also, and even the quality of a particular world language at a particular temperature will change, so there is no “best for all models”, you’re going to have some idea that will inform the choice better…

But the point is that - being the same model, and the one without the date being a “stable name” alias - setting the sampling parameters to where you don’t get random token outputs reveals the sameness of the models if you don’t simply trust the true model name returned from your API call.

b0zal · October 25, 2023, 10:39am

In the Wilds

edited
a config

sanujat · October 26, 2023, 10:17am

I used these settings. However, the output is still short. For 12k input, the output is around 6-7k maximum.

Topic		Replies	Views
Gpt-3.5-turbo-16k Maximum Response Length Prompting api	33	34241	December 13, 2023
GPT one paragraph reply? Condensation/Summary for core ideas (keep content depth) Prompting api	4	1660	February 10, 2024
Cannot, for the life of me, get a detailed enough response API gpt-4 , api	13	2568	February 22, 2024
Impossible to generate texts of more than 600 words API	5	3129	December 18, 2023
How to print the output over 10,000 tokens? API gpt-4o-mini	4	461	September 9, 2024

Longer GPT 3.5-turbo Output

Related topics