Let’s say I have the following messages:
[
    {
        "role": "system",
        "content": [
            {"type": "text", "text": SYSTEM_PROMPT},
        ],
    },
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": "Please summarize and redraw the attached chart.",
            },
            {"type": "image_url", "image_url": {"url": user_image_data}},
        ],
    },
]
And I keep appending new user prompts and GPT responses to the end of this messages list. Will I be charged incrementally, i.e., billed for all history messages as the conversation gets longer (1 system + 1 user message the first time, 1 system + 2 user the second time, 1 system + 3 user the third, and so on)? Is there another practice that avoids this?
While it may sound like a tautology, the only way to NOT be charged for the history messages is to not send them to Chat Completions, because the Chat Completions API is stateless.
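To make the statelessness concrete, here is a minimal sketch. It does not call the API; `fake_count_tokens` is a rough illustrative stand-in for a real tokenizer (such as tiktoken), and the loop just shows that because the full history must be re-sent on every request, the billed prompt tokens grow each turn:

```python
# Why prompt-token cost grows with conversation length:
# Chat Completions is stateless, so every request carries the whole
# history, and you are billed for every token you send, every time.

def fake_count_tokens(messages):
    """Rough stand-in for a real tokenizer: ~1 token per 4 characters."""
    text = " ".join(m["content"] for m in messages)
    return max(1, len(text) // 4)

history = [{"role": "system", "content": "You are a helpful assistant."}]
billed_per_turn = []

for turn in range(1, 4):
    history.append({"role": "user", "content": f"User question number {turn}."})
    # In a real app this is where you would call
    # client.chat.completions.create(messages=history); the API would
    # bill roughly fake_count_tokens(history) prompt tokens.
    billed_per_turn.append(fake_count_tokens(history))
    history.append({"role": "assistant", "content": f"Answer to question {turn}."})

print(billed_per_turn)  # each entry is larger than the last
```

In a real integration you would read the exact figure from `response.usage.prompt_tokens` rather than estimating it.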
Here’s one way in which I have dealt with messages in threads not being sent to Chat Completion.
Thank you so much for your reply! Would using the Assistants API or something other than Chat Completions solve this issue? Or would that still be impossible?
Ideally, I am trying to be charged for 1 system + 1 user message the first time, only 1 new user message the second time, only 1 new user message the third time, and so on.
You can just create a new chat for each user, unless the aim is to store the results somewhere.
At any rate, this post here explains the logic and some implementation. Seeking the Best API Choice: Should I Use OpenAI's Assistant API or Chat Completion API? - #12 by icdev2dev
My aim is to generate output for the second user prompt without being charged for the system prompt and the first user prompt, while the second reply is still based on the system prompt and the first user prompt. Based on your reply, it seems this is not possible?
Right. This is NOT possible.
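Although you cannot keep the full context for free, a common cost mitigation is to trim (or summarize) older turns before each call, at the price of the model losing the dropped context. Here is a minimal sketch of budget-based trimming; `rough_tokens` is an illustrative stand-in for a real tokenizer, and the budget value is arbitrary:

```python
# Common mitigation: keep the system prompt plus only the most recent
# turns that fit under a token budget. Fewer tokens sent means less
# billed, but the model no longer sees the dropped messages.

def rough_tokens(message):
    """Illustrative stand-in for a real tokenizer: ~1 token per 4 characters."""
    return max(1, len(message["content"]) // 4)

def trim_history(messages, budget):
    """Keep messages[0] (the system prompt) plus the newest turns under budget."""
    system, rest = messages[0], messages[1:]
    kept = []
    total = rough_tokens(system)
    for msg in reversed(rest):  # walk from the newest message backwards
        cost = rough_tokens(msg)
        if total + cost > budget:
            break
        kept.append(msg)
        total += cost
    return [system] + list(reversed(kept))

# Build a long fake conversation, then trim it before the next API call.
history = [{"role": "system", "content": "S" * 40}]
for i in range(10):
    history.append({"role": "user", "content": "question " + "x" * 100})
    history.append({"role": "assistant", "content": "answer " + "y" * 100})

trimmed = trim_history(history, budget=120)
print(len(history), len(trimmed))  # the trimmed list is much shorter
```

An alternative with the same structure is to replace the dropped turns with a single assistant-written summary message, which preserves more context per billed token.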