Token limits on prompting

andy.yao · June 4, 2023, 11:13am

Hi all,

I would like your advice on how to optimise token usage in API calls.

I am using a local vector DB to store PDF contents, and I load the most similar content via embeddings when calling the ChatGPT API. However, I realised there is a token limit on the question + retrieved content + response combined.
I wonder how you optimise the knowledge base for the API to improve the precision of the retrieved content.

Thanks

Andy

bruce.dambrosio · June 4, 2023, 4:06pm

Don’t forget chat history. Not needed for all applications, I guess.

Prompt layout frameworks like Promptrix can help quite a bit.
I suspect many people just roll their own.
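
For anyone rolling their own, here is a minimal sketch of one piece of prompt layout: trimming chat history so only the most recent messages that fit a token budget are kept. The names `rough_token_count` and `trim_history` are hypothetical, and the 4-characters-per-token estimate is only a rough assumption for English text; a real tokenizer could be passed in through the `count` argument instead.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def rough_token_count(text: str) -> int:
    """Crude estimate (assumption): about 4 characters per token."""
    return max(1, (len(text) + 3) // 4)


def trim_history(
    history: List[Message],
    budget: int,
    count: Callable[[str], int] = rough_token_count,
) -> List[Message]:
    """Keep the newest messages whose combined estimated size fits `budget`."""
    kept: List[Message] = []
    used = 0
    for message in reversed(history):  # walk from newest to oldest
        cost = count(message["content"])
        if used + cost > budget:
            break  # stop at the first message that no longer fits
        kept.append(message)
        used += cost
    kept.reverse()  # restore chronological order
    return kept
```

The budget passed in would be whatever is left of the model's context after setting aside room for the system prompt, the retrieved content, the question, and the expected response.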

andy.yao · June 5, 2023, 7:53am

That sounds like a bottleneck for building a good OpenAI system.

shatzakis · June 16, 2023, 4:15am

Perhaps one way is to identify the types of prompts that cause errors because the messages exceed the token limit, and then revise the prompt so that the fetched results come in slightly under that limit. In addition, you could upgrade to one of the newer models that support 32k tokens. This worked for me in a case where I had no control over the database. I realize it might not apply to your situation, but thought I'd share just in case.
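
As a rough illustration of keeping the fetched results slightly under the limit, the sketch below picks retrieved chunks (assumed to be already sorted best-first by similarity) until a budget is used up. The function names, the `safety_margin` default, and the 4-characters-per-token estimate are all assumptions for illustration, not part of any OpenAI API.

```python
from typing import Callable, List


def rough_token_count(text: str) -> int:
    """Crude estimate (assumption): about 4 characters per token."""
    return max(1, (len(text) + 3) // 4)


def select_chunks(
    ranked_chunks: List[str],
    question: str,
    context_limit: int,
    reserved_for_reply: int,
    safety_margin: int = 50,
    count: Callable[[str], int] = rough_token_count,
) -> List[str]:
    """Pick chunks, best first, that fit in the space left for context."""
    # Space left after the question, the reply allowance, and a small margin.
    budget = context_limit - reserved_for_reply - safety_margin - count(question)
    selected: List[str] = []
    for chunk in ranked_chunks:
        cost = count(chunk)
        if cost > budget:
            continue  # too big for the remaining space; a later chunk may fit
        selected.append(chunk)
        budget -= cost
    return selected
```

The margin is there because the estimate is approximate and the API adds a few tokens of overhead per message, so aiming a little below the limit is safer than aiming exactly at it.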

andy.yao · June 16, 2023, 5:25am

Thank you sir.
I am using the turbo-32k now. ：）
