Reduce the number of tokens

kishorekumard01 · December 14, 2023, 11:12am

Hey guys I am using gpt-4-0613(8k). I will give You the outline of what I am doing.

I have an api that takes pdf file as input extracts text from that pdf file and after that the text extracted is sent to the gpt model for quiz generation based on the content present in the pdf file.

The problem here is that after text extraction the number of tokens that is to be sent to the model is more than 8k for some pdf files

So now I need a solution to reduce the number of tokens so that it can adapt to the gpt (8k) model

TonyAIChamp · December 14, 2023, 12:07pm

You can try to either summarize what you are sending, do that in chunks or use RAG.

foxicle · December 14, 2023, 1:32pm

Use gpt-3.5-turbo-16k for more token allowance.

jwatte · December 14, 2023, 4:15pm

The long-contex models aren’t that great, because they miss a bunch of the information in the context – they still don’t have more attention than the smaller-context variants, as far as I can tell.

It sounds like you’re generating quizzes. You could split the document into chunks, each of which are 4-7k tokens, and ask the model to generate a quiz question per chunk. You can also ask the model to summarize each chunk, and concatenate all the summaries, perhaps multiple times, to get to a smaller input size.

bruce.dambrosio · December 14, 2023, 5:36pm

maybe?

bruce.dambrosio · December 14, 2023, 5:38pm

It says 16k, but that’s in AND out.
It limits me to 8k in.
But maybe that’s just me. I am tier 3, though…

jwatte · December 15, 2023, 9:23pm

You could use 15k in and 1k out. Input and output tokens go into the same vector in the GPU. Once the model has generated one token, it immediately becomes a new input token. In fact, the model can’t tell the difference between tokens that you supplied as input, and tokens that it previously generated!

bruce.dambrosio · December 16, 2023, 2:19am

I wish I could. Maybe it’s just me, but if I use one token in over 8192, even if max_new_tokens is set to 100, api will reject request.

jwatte · December 16, 2023, 5:45pm

That’s surprising; are you sure you’re using the model with 16k context?

I’ve used the 16k model with 13k context, and it “worked” but the accuracy was so bad I preferred to engineer smaller context prompts.

bruce.dambrosio · December 17, 2023, 6:01pm

Sigh. I wish I would stop making stupid mistakes.
gpt-3.5-turbo-1106/-4-1106-preview
Thanks for motivating me to double-check.

foxicle · December 22, 2023, 2:28am

adjust the temperature, 0 - 0.1 more stable, more than that is ‘creative’

Topic		Replies	Views
How do I get gpt to throw out more tokens in API? API gpt-4	3	2265	December 16, 2023
Any idea how to input more than 8k token in GPT 4? Prompting gpt-4	4	2157	December 17, 2023
Longer GPT 3.5-turbo Output Prompting gpt-35-turbo , api	23	4637	December 8, 2023
Reducing token usage while hinting LLM as it generates API gpt-4 , gpt-35-turbo , chatgpt , fine-tuning , api	5	3690	October 25, 2023
Token Limitization Error when prompting Prompting chatgpt , api	8	3774	December 6, 2023

Reduce the number of tokens

Related topics