I am using gpt-3.5-turbo-16k through the API and I want to summarise some content. The content is about 12k tokens and I want the output to be at least 3k tokens, but the response only comes out in the 200–500 token range. How can I increase the response size? I have tried different prompts asking for a specific number of words or tokens, but it didn’t work. Can someone please help me with this?
You must not use the wording “summary”. It invokes the summary length the model was trained on, a surefire recipe for output that looks like every other short summary the AI has been fine-tuned to write.
If you put in 12k tokens and merely ask for it to be “rewritten” in a new way, the AI’s inability to write long output (also a product of fine-tuning) will still get you a shrunken result. Or you can say “based on the documentation above, write a new article”.
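As a rough sketch of that prompt framing, something like the following might help: the task is presented as writing a new article rather than summarising, and `max_tokens` is set high enough to leave room for a long reply. The helper names and the system prompt wording here are illustrative, not a known-good recipe, and the 0.x-style `openai` SDK is assumed.

```python
import os


def build_request(document: str, target_tokens: int = 3000) -> dict:
    """Illustrative helper: frame the task as writing, not summarising."""
    return {
        "model": "gpt-3.5-turbo-16k",
        "max_tokens": target_tokens,  # upper bound reserved for the reply
        "messages": [
            {
                "role": "system",
                "content": "You are a technical writer who produces "
                           "long-form, detailed articles.",
            },
            {
                "role": "user",
                # Note: the word "summary" is deliberately avoided.
                "content": "Based on the documentation below, write a new, "
                           "detailed article. Do not condense.\n\n" + document,
            },
        ],
    }


def run(document: str) -> str:
    # Imported lazily so build_request stays usable without the SDK installed.
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]
    response = openai.ChatCompletion.create(**build_request(document))
    return response["choices"][0]["message"]["content"]
```

Note that `max_tokens` is only a ceiling, not a target; it will not by itself make the model write more, which is why the prompt framing matters.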
With that much input to that model, though, I’ve found it tends not to perform any kind of restructuring or rewriting on demand; you pay for the max context and get your exact text back.
Thank you for the response. To give you more context on my task: I’m trying to summarise a meeting transcript and generate meeting minutes from it. Since a transcript is more than 16k tokens, my idea was to chunk it, summarise each chunk, and finally combine those summaries to generate the meeting minutes document. When summarising, I want a thorough summary that keeps as much information as possible. I tried the command ‘Paraphrase’ and it improved the results a bit. Do you have a specific method or prompt for this?
You have a good plan. The only things you might add: tell the AI that it is working on a multipart file, instruct it to write the summary of that one part only, without a happy-ending conclusion, so the parts can be reassembled, and even give it the previous summary so it continues writing from there without wrapping up.
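A minimal sketch of that multipart loop might look like the following. The chunk size, the ~4-characters-per-token rule of thumb, and the prompt wording are all assumptions to adapt, and the 0.x-style `openai` SDK is assumed.

```python
import os


def split_into_chunks(text: str, max_chars: int = 40000) -> list[str]:
    """Naive splitter: ~4 chars per token, so 40k chars is roughly 10k tokens.
    A real version would split on speaker turns or sentence boundaries."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]


def paraphrase_transcript(transcript: str) -> str:
    # Imported lazily so split_into_chunks stays usable without the SDK.
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]
    running = ""  # accumulated output, fed back so each part continues
    chunks = split_into_chunks(transcript)
    for n, chunk in enumerate(chunks, start=1):
        response = openai.ChatCompletion.create(
            model="gpt-3.5-turbo-16k",
            max_tokens=3000,
            messages=[
                {
                    "role": "system",
                    "content": f"You are processing part {n} of {len(chunks)} "
                               "of a meeting transcript. Paraphrase this part "
                               "in full detail. Do not write a conclusion; "
                               "the parts will be reassembled later.",
                },
                {
                    "role": "user",
                    "content": "Notes so far:\n" + running
                               + "\n\nTranscript part:\n" + chunk,
                },
            ],
        )
        running += response["choices"][0]["message"]["content"] + "\n"
    return running
```

Feeding the running notes back in each round is what lets part n continue from part n−1 instead of opening and closing a fresh document every time.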
Around 2k tokens of input is where your instructions start to lose importance among the mass of other text.
Thanks for that valuable point. I changed the model to gpt-3.5-turbo-16k-0613 and the response length increased a bit. Since I have to process large files, I have to use gpt-3.5 because of its large context window.
The selectable models vary greatly in perplexity, as will the desired output of your application, and even the quality of a particular world language at a particular temperature will change, so there is no “best for all models”; you’re going to have some idea of your own that informs the choice better…
But the point is that they are the same model (the one without the date is a “stable name” alias): set the sampling parameters so you don’t get random token outputs and the sameness becomes apparent, if you don’t simply trust the true model name returned from your API call.
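To illustrate that comparison, here is a sketch, assuming the 0.x-style `openai` SDK: with `temperature=0` the sampling is near-greedy, so two aliases of the same underlying model should return (near-)identical text on a short prompt, and the response’s `model` field reports the resolved name either way. The helper names are hypothetical.

```python
import os


def greedy_params(model: str, prompt: str) -> dict:
    """Request parameters with sampling randomness removed."""
    return {
        "model": model,
        "temperature": 0,   # near-greedy decoding for repeatable output
        "max_tokens": 50,   # a short sample is enough to compare
        "messages": [{"role": "user", "content": prompt}],
    }


def compare_aliases(alias_a: str, alias_b: str, prompt: str) -> None:
    # Imported lazily so greedy_params stays usable without the SDK.
    import openai

    openai.api_key = os.environ["OPENAI_API_KEY"]
    for alias in (alias_a, alias_b):
        response = openai.ChatCompletion.create(**greedy_params(alias, prompt))
        # The "model" field shows what the alias actually resolved to.
        print(response["model"], "->",
              response["choices"][0]["message"]["content"])
```

Calling `compare_aliases("gpt-3.5-turbo-16k", "gpt-3.5-turbo-16k-0613", "Name three primary colours.")` would show both the resolved model names and whether the deterministic outputs match.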