I am testing the OpenAI API using the gpt-3.5-turbo model. As far as I understand, every request must include instructions such as "You are an AI that will generate random questions about X topic, and if there are inappropriate words, take X action," and so on. In the long run, these instructions are what consume the most tokens. Since they never change, is there a way to keep them from consuming tokens on every request?
My code:
```javascript
const { Configuration, OpenAIApi, ChatCompletionRequestMessageRoleEnum } = require("openai");

const configuration = new Configuration({ apiKey: process.env.OPENAI_API_KEY });
const openai = new OpenAIApi(configuration);
const messages = [
  {
    role: ChatCompletionRequestMessageRoleEnum.System,
    content: AIInstructions // ---> instructions sent with every request
  }
];
```
Source: Response from GPT-4
It’s true that instructions consume tokens, and repeating the same instructions in each API call is inefficient. However, gpt-3.5-turbo relies on the system message to set its behavior, and the API offers no way to cache or pre-register it, so it must be sent with every request. You can minimize token usage with the following suggestions:
Suggestions
- Make the instruction as concise as possible without losing important context.
- If a conversation involves multiple messages, consider using a single API call for the entire conversation instead of making separate calls for each message.
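As a sketch of the second suggestion: you can carry the whole conversation history in one `messages` array, so the system instruction is paid for once per call rather than once per turn. The instruction text and user turns below are placeholders, and the commented-out call shows the v3 Node SDK shape:

```javascript
// One API call carrying the full conversation; the system instruction
// appears exactly once. All message contents here are illustrative.
const AIInstructions = "Generate random questions about X topic, and avoid inappropriate words.";

const conversation = [
  { role: "system", content: AIInstructions }, // sent once per call
  { role: "user", content: "Give me a question about space." },
  { role: "assistant", content: "What is the largest planet in our solar system?" },
  { role: "user", content: "Another one, please." }
];

// Requires a configured `openai` client (v3 Node SDK):
// const response = await openai.createChatCompletion({
//   model: "gpt-3.5-turbo",
//   messages: conversation
// });

console.log(conversation.length); // 4 messages, only one of them a system message
```

Note the trade-off: the history itself also consumes tokens, so this helps most when the alternative is re-sending the instruction for many small, separate calls.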
Inspirational Questions
- How can we optimize the conversation structure to make the most of the token limit without sacrificing context or quality of the AI’s responses?
- Are there any other aspects of the API call that can be optimized to further reduce token usage?
Code Modification
To make your code more token-efficient, try using a concise instruction in your system message:
```javascript
const openai = new OpenAIApi(configuration);
const messages = [
  {
    role: ChatCompletionRequestMessageRoleEnum.System,
    content: "Generate random questions about X topic, and avoid inappropriate words." // concise instruction
  }
];
```
Remember that making the instruction too brief may result in less context for the AI, which could affect response quality. It’s essential to find a balance between token usage and providing enough context for accurate and relevant responses.
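To see what an instruction actually costs per call, you can estimate its token count. The 4-characters-per-token ratio below is a rough rule of thumb for English text, not an exact count; a tokenizer library such as tiktoken gives exact numbers:

```javascript
// Rough estimate of prompt tokens: ~4 characters per token for English text.
// This is an approximation, not the tokenizer's exact count.
function estimateTokens(text) {
  return Math.ceil(text.length / 4);
}

// Example instructions (illustrative text, not from any real prompt):
const longInstruction =
  "You are an AI that will generate random questions about X topic. " +
  "If there are inappropriate words, refuse and ask for another topic.";
const shortInstruction =
  "Generate random questions about X topic; avoid inappropriate words.";

console.log(estimateTokens(longInstruction));  // paid on every request
console.log(estimateTokens(shortInstruction)); // a shorter instruction costs less each call
```

Multiplying the per-call estimate by your expected request volume shows how much a trimmed system message saves over time.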