createChatCompletion - Reduce token consumption from instructions

I am testing the OpenAI API with the gpt-3.5-turbo model. As far as I understand, every request must include instructions such as “You are an AI that will generate random questions about X topic, and if there are inappropriate words, take X action,” and so on. These instructions are what consume the most tokens in the long run. Is there a way to keep them from consuming tokens on every request, since they never change?

My code:

const { Configuration, OpenAIApi, ChatCompletionRequestMessageRoleEnum } = require("openai");

const configuration = new Configuration({ apiKey: process.env.OPENAI_API_KEY });
const openai = new OpenAIApi(configuration);
const messages = [
  {
    role: ChatCompletionRequestMessageRoleEnum.System,
    content: AIInstructions // ---> Instructions sent with every request
  }
];

Source: Response from GPT-4
It’s true that instructions consume tokens, and repeating the same instructions in each API call adds up. However, gpt-3.5-turbo uses the system message to set the model’s behavior, and there is no way to send it once and reuse it later, so it is included (and billed) in every request. You can minimize its cost with the following suggestions:
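
To see what that instruction actually costs, you can count its tokens locally before sending it. A minimal sketch, assuming the gpt-3-encoder npm package (a separate tokenizer package, only an approximation of gpt-3.5-turbo’s tokenizer) and an example instruction string:

// Count the tokens in the system instruction with a local tokenizer.
const { encode } = require("gpt-3-encoder");

const AIInstructions = "You are an AI that will generate random questions about X topic, and if there are inappropriate words, take X action.";

console.log(`Instruction costs roughly ${encode(AIInstructions).length} tokens per request`);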

Suggestions

  1. Make the instruction as concise as possible without losing important context.
  2. If a conversation involves multiple messages, send the whole conversation in a single API call instead of making a separate call (each with its own copy of the instruction) for every message (see the sketch after this list).
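
For suggestion 2, here is a minimal sketch of sending the accumulated conversation in one createChatCompletion call rather than one call per message. It assumes the same openai v3 setup as in the question (openai and ChatCompletionRequestMessageRoleEnum already in scope), and the user/assistant turns are placeholder examples:

// Inside an async function: one request carries the system instruction
// plus the whole history, instead of repeating the instruction per call.
const conversation = [
  {
    role: ChatCompletionRequestMessageRoleEnum.System,
    content: "Generate random questions about X topic, and avoid inappropriate words."
  },
  { role: ChatCompletionRequestMessageRoleEnum.User, content: "Give me a question about space." },
  { role: ChatCompletionRequestMessageRoleEnum.Assistant, content: "What is the largest planet in the Solar System?" },
  { role: ChatCompletionRequestMessageRoleEnum.User, content: "Another one, please." }
];

const response = await openai.createChatCompletion({
  model: "gpt-3.5-turbo",
  messages: conversation
});
console.log(response.data.choices[0].message.content);

The history itself still counts toward the token total, but the system instruction appears only once per request instead of once per separate call.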

Inspirational Questions

  1. How can we optimize the conversation structure to make the most of the token limit without sacrificing context or quality of the AI’s responses?
  2. Are there any other aspects of the API call that can be optimized to further reduce token usage?

Code Modification

To make your code more token-efficient, try using a concise instruction in your system message:


const openai = new OpenAIApi(configuration);
const messages = [
  {
    role: ChatCompletionRequestMessageRoleEnum.System,
    content: "Generate random questions about X topic, and avoid inappropriate words." // Concise instruction
  }
];

Remember that making the instruction too brief may result in less context for the AI, which could affect response quality. It’s essential to find a balance between token usage and providing enough context for accurate and relevant responses.