API Remembering conversation

Hi guys…

I just want to make sure I understand this correctly.
I’m able to make the ChatGPT API remember the previous conversation just by adding more messages to this part:

$data = '{
  "model": "gpt-4-turbo",
  "messages": [
    {
      "role": "user",
      "content": "' . $question . '"
    }
  ]
}';

But this only has a limit of 4096 characters. Some answers cannot be shortened, and a full back-and-forth conversation will be more than 4096 characters.
Is there truly no other way?
So the only way to do this is to simply keep adding ChatGPT’s previous responses, and after 4096 it’s game over?
Thank you.


Your understanding that you must include the relevant chat history in every API call is correct. However, you have much more flexibility regarding the amount of input you can include.

4096 is the maximum number of output tokens that can be returned in an API call. As for the input tokens, you have a much larger count available, depending on the model you choose. With gpt-4-turbo you can include over 120,000 tokens.

Here in the model overview, you can see the so-called context window for each model. The sum of input and output tokens must stay within this limit.
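To make the split concrete, here is a minimal sketch of the budget arithmetic, assuming the 128,000-token context window and 4,096-token output cap mentioned in this thread (check the model overview for current values):

```python
# Rough input-budget check: the context window must hold input + output,
# so reserving the maximum output leaves the rest for the prompt/history.
CONTEXT_WINDOW = 128_000   # gpt-4-turbo context window (per this thread)
MAX_OUTPUT_TOKENS = 4_096  # maximum output tokens per call

def input_budget(context_window: int = CONTEXT_WINDOW,
                 max_output: int = MAX_OUTPUT_TOKENS) -> int:
    """Tokens left for the input after reserving room for the reply."""
    return context_window - max_output

print(input_budget())  # 123904
```

In other words, even after reserving the full 4096-token output, well over 120,000 tokens remain for the conversation history.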


You’re so amazing thank you so much for answering.
Sorry, but tokens are something I still haven’t figured out the meaning of.

So are you telling me that the text content below can contain A LOT, like over 100 messages, for example?

[
  {
    "role": "user",
    "content": "what is 5+5?"
  },
  {
    "role": "assistant",
    "content": "5 + 5 = 10"
  },
  {
    "role": "user",
    "content": "what did I just ask you?"
  },
  {
    "role": "assistant",
    "content": "You asked me \"what is 5+5?\""
  }
]

Correct. The exact amount depends on the size of the individual messages.

Additionally, you will want to bear costs in mind and therefore be selective about what you include.
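One common way to be selective (an illustration, not something prescribed in this thread) is to keep only the most recent messages under some size budget:

```python
def trim_history(messages, max_chars=8000):
    """Keep the most recent messages whose combined content length stays
    under max_chars; the newest message is always kept. Assumes the list
    is ordered oldest-first, as in the messages array above."""
    kept = []
    total = 0
    for msg in reversed(messages):  # walk newest-first
        total += len(msg["content"])
        if kept and total > max_chars:
            break
        kept.append(msg)
    return list(reversed(kept))    # restore oldest-first order
```

A token-based budget (rather than characters) would be more precise, but the sliding-window idea is the same.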


OK, I think I’m starting to understand. I pasted the role/content example above into the tokenizer and it said:
tokens: 90
characters: 225

So it seems it’s around 0.4 tokens per character.

Last question. When analyzing LARGE CSV, I simply have to add the entire content of csv file in the “content” I assume (and “user” in role)?


Both are correct.
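For the CSV case, that could look like this sketch (the file contents and question here are made up for illustration):

```python
import json

def csv_message(csv_text: str, question: str) -> list:
    """Build a messages list that embeds the raw CSV text in the user turn."""
    return [{
        "role": "user",
        "content": f"{question}\n\nCSV data:\n{csv_text}",
    }]

# Example payload, analogous to the $data string earlier in the thread.
payload = json.dumps({
    "model": "gpt-4-turbo",
    "messages": csv_message("a,b\n1,2\n3,4\n", "What is the sum of column a?"),
})
```

Building the payload with a JSON serializer (rather than string concatenation, as in the PHP snippet above) also avoids broken JSON when the content itself contains quotes.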

The exact token-to-character ratio is somewhat dependent on the language, but for English in particular that is a fair proxy.
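Going by the ~0.4 tokens per character figure measured above, a quick estimate can be sketched like this (a heuristic for English text only; a real tokenizer such as OpenAI’s tiktoken gives exact counts):

```python
def estimate_tokens(text: str, tokens_per_char: float = 0.4) -> int:
    """Very rough token estimate using the ~0.4 tokens/character
    ratio observed above for English text."""
    return round(len(text) * tokens_per_char)

print(estimate_tokens("what is 5+5?"))  # 12 characters -> about 5 tokens
```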


You have no idea how many hours I spent every day for weeks, and now it all makes sense thanks to you.
Thank you… truly! I really appreciate your help! 🙂


Hi @starsun

The 4096 token limit is on output. The model gpt-4-turbo has a context length (input + output) of 128K tokens.

If you ever get the model to generate a response where it reaches the max output limit, you’ll get a finish_reason with the value length in the response object. In that case, you can simply append the partial assistant message you received to the existing messages list and have the model continue from where it left off.
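That continuation loop can be sketched as follows. `call_api` here is a hypothetical stand-in for the actual chat-completions request (kept abstract so the logic is self-contained); it is assumed to return the assistant text plus the `finish_reason`:

```python
def complete_with_continuation(messages, call_api, max_rounds=5):
    """Call the API repeatedly until finish_reason is no longer "length".
    `call_api(messages)` is a placeholder that must return a
    (text, finish_reason) tuple; max_rounds caps the total calls."""
    parts = []
    for _ in range(max_rounds):
        text, finish_reason = call_api(messages)
        parts.append(text)
        if finish_reason != "length":
            break
        # Append the partial assistant message so the model
        # continues from where it left off.
        messages = messages + [{"role": "assistant", "content": text}]
    return "".join(parts)
```

With the real API, `call_api` would issue the chat-completions request and read the text and `finish_reason` out of the first choice in the response object.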
