How to maintain context with gpt-3.5-turbo API?

I thought the user parameter was doing this job, but it doesn’t work.


If you read the docs you just posted, they clearly state that the user param is only for abuse monitoring by OpenAI.

Also, if you want to maintain state, you need to write code to store messages and resend them (feed them back) to the API. You can search this site for how to do this, as this topic has been discussed many times here.

The OpenAI API currently does not manage user sessions; this code must be written by the application developer.
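A minimal sketch of that store-and-resend pattern: keep every message in a list and send the whole list on each turn. The `call_model` function here is a stub standing in for the real `openai.ChatCompletion.create(...)` call, so the history logic can be shown on its own:

```python
# Store-and-resend sketch: the full message history is passed to the
# model on every call. `call_model` is a placeholder for the actual
# openai.ChatCompletion.create(...) request.

def call_model(messages):
    # Stand-in for the API: just reports how many messages it received.
    return f"(assistant reply to {len(messages)} messages)"

class Conversation:
    def __init__(self, system_prompt):
        self.messages = [{"role": "system", "content": system_prompt}]

    def ask(self, user_text):
        self.messages.append({"role": "user", "content": user_text})
        reply = call_model(self.messages)  # resend the full history
        self.messages.append({"role": "assistant", "content": reply})
        return reply

conv = Conversation("You are a helpful assistant.")
conv.ask("Who won the world series in 2020?")
conv.ask("Where was it played?")  # the model also sees the earlier Q&A
```

In a real application, `call_model` would return `response["choices"][0]["message"]["content"]` from the API.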

HTH

:slight_smile:


You can also look at the documentation; it has a link to a page that explains how to do this. What is not explained is how to manage the previous prompts so that you minimize the number of tokens you use, and how to optimize the length of the messages to achieve that.

ChatGPT Completions

import openai

openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)

Note: you need to be using OpenAI Python v0.27.0 for the code above to work

The system message helps set the behavior of the assistant. In the example above, the assistant was instructed with “You are a helpful assistant.”

gpt-3.5-turbo-0301 does not always pay strong attention to system messages. Future models will be trained to pay stronger attention to system messages.


The best method I found to maintain a semblance of context for my GPT calls was to chain a few of them together.

def gpt_call_2():
    messages = [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": f"{gpt_call_1()}"},
        {"role": "user", "content": "Where was it played?"}
    ]
By using instructions in the prompt, you can tailor the output of the first call so that it is direct and concise, and chain the calls together. (This relies on a Python f-string, by the way.)

Hello,

For my part, I compress the historical data with zlib to save tokens. It can be very efficient depending on the kind of text.
At each new prompt, I send the last n prompts/answers, compressed with zlib, in the assistant field; they are interpreted as if they were uncompressed, and my bot behaves like the ChatGPT application. The history can cover 10 to 15 prompts depending on the requests.

EDIT following the discussion with @ruby_coder: I was never able to finish this process. Zlib-compressed data contains unauthorized characters, and the JSON request is malformed. My last idea is to UrlEncode() the compressed string, but I am not sure it would work (and my capacity to develop code is very limited), nor whether it would be efficient at saving tokens. If I find a new solution, I will add it to this topic. Sorry, my bad.

Interesting, but I have tried this before and cannot get compression to work in a simple test case:

This works fine of course:

client = OpenAI::Client.new(access_token: ENV['OPENAI_API_KEY'])
text = "Are you an AI?"
response = client.completions(
    parameters: {
        model: "text-davinci-001",
        prompt: text,
        max_tokens: 100
    })

This fails with error when converting to JSON:

require 'zlib'
client = OpenAI::Client.new(access_token: ENV['OPENAI_API_KEY'])
text = "Are you an AI?"
compressed_text = Zlib.deflate(text).force_encoding(text.encoding)
response = client.completions(
    parameters: {
        model: "text-davinci-001",
        prompt: compressed_text,
        max_tokens: 100
    })

Error:

.../lib/active_support/core_ext/object/json.rb:39:in `to_json': source sequence is illegal/malformed utf-8 (JSON::GeneratorError)

I have tried many encodings and all fail to validate as JSON data.

Do you have working Python code in a small “hello world” test case for compression @mattg ?

:slight_smile:


Sorry, but at the moment all my workflow runs on the made.com platform, and I can’t find this part anymore, so perhaps I had the same problem as you and left it unfinished. I am migrating this workflow to an on-premise platform (n8n) to save costs, so I will retry and get back to you if it works.

In the meantime, perhaps it would be possible to Base64-encode the text to obtain a JSON-compatible string?
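Base64 does make compressed bytes JSON-safe, but for short texts the combined zlib + Base64 output is typically longer than the original string, so it may not save tokens even before considering whether the model can read it. A quick standard-library sketch to check the round trip and the size:

```python
import base64
import zlib

def compress_b64(text: str) -> str:
    """zlib-compress a string, then Base64-encode it so it is JSON-safe."""
    return base64.b64encode(zlib.compress(text.encode("utf-8"))).decode("ascii")

def decompress_b64(encoded: str) -> str:
    """Invert compress_b64: Base64-decode, then zlib-decompress."""
    return zlib.decompress(base64.b64decode(encoded)).decode("utf-8")

text = "Are you an AI?"
encoded = compress_b64(text)
assert decompress_b64(encoded) == text  # round-trips safely as JSON text
print(len(text), len(encoded))          # the encoded form is longer here
```

The round trip itself works fine as JSON payload, but note the model receives the encoded gibberish as plain text, which is consistent with the nonsense completions @ruby_coder reported.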

Hi Matt,

Your logic is hard to follow and your replies are not helpful, sorry.

You made a strong, definitive statement in this topic for developers:

And I asked you to provide the code, since I am a software developer, have tried on numerous occasions, and cannot get a compression encoding to return a correct completion when sending it to the OpenAI API.

If I send it compressed data in base64 format as a prompt to the API, the API has always returned completions with blah, blah nonsense.

You replied that you do not actually have working code, and referred me to a website selling furniture?

Either you have working code or you do not. Please back up your claim or please retract it.

Furthermore, I have searched the net, and I cannot find a single reference to any developer compressing OpenAI prompt data, sending that compressed, encoded data to a completion endpoint, and receiving a valid completion.

Then, you reply:

Sorry, this makes little sense. So you have “no idea” whether it works, and you just “made up” that you were sending compressed data to the API?

@mattg, sorry again, but you either have a working algorithm to compress an OpenAI API prompt and send that compressed data to the API for a valid completion, or you do not. You told us all, in very definitive terms, that you can do this; but you provide no code, and you keep deflecting instead of providing a valid, testable reply.

Now it seems you want me to come up with a solution that you claimed you already had?

So I can only conclude, based on your posts in this topic @mattg, that you just “made up” the idea that you could compress data, send it to the API, and get a valid completion back; you cannot back it up with working code, and you are now asking me to come up with a solution.

Hmmmm

:slight_smile:

FYI: Google Searches Turn up Nothing.

Have tried many searches; none bear any fruit.

OBTW: Asking ChatGPT just returns chatbot hallucination nonsense.


Hi @ruby_coder,

You are perfectly right, and I am so sorry.

I was honest when answering you, but I had to manage many things and I simply “forgot” that my many attempts to make this work were unsuccessful.

I tried again last night without any more success, but I am not a developer. Make.com (and not made, sorry for that too) is a workflow manager that uses APIs for many services, or lets you make JSON calls when the service is not directly supported by the platform. I only add little pieces of code to transform some data.
My last attempt was to add a urlEncode() after the zlib compression, but I couldn’t manage it due to my poor coding knowledge. Perhaps you could try, but I am not sure we would save many tokens with this method.
So it is possibly just not possible to do this, and again, I am very sorry I made mistakes and let you think I was able to do it.

Matt.


To answer the first topic question, here is my working solution to add context (without compression):

{
  "model": "gpt-3.5-turbo",
  "messages": [
    {"role": "system", "content": "You must behave like a friendly pal called MattGPT. You must not write your name at the beginning of your replies. You must address your friends informally. You must always reply in HTML"},
    {"role": "assistant", "content": "Previous exchanges: {{ $json.concatenated_text }}"},
    {"role": "user", "content": "{{ $node[\"Telegram\"].parameter[\"text\"] }}"}
  ]
}

Where $json.concatenated_text is a concatenation of the last n Google Sheets rows (one created for each prompt and each response),

and $node["Telegram"].parameter["text"] is the message coming from Telegram. You can URL-encode it as well; ChatGPT (at least gpt-3.5-turbo) will understand it.

Link to project topic MattGPT : multifunction telegram bot


Dang it, I spent two days trying to get ChatGPT to produce this solution after reading that post. Oh well! I’m currently able to get my chatbot to remember its place in a conversation by appending prior responses to the messages array, but the problem is it keeps running out of tokens after about 10 back-and-forths. Compression seemed like a potentially good stopgap, but if that’s not something OpenAI presently supports, I’ll have to try something else.
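One common workaround for running out of tokens is a sliding window: before each call, drop the oldest non-system messages until the history fits a budget. A rough sketch, assuming a crude 4-characters-per-token estimate (a real implementation would count with a proper tokenizer such as tiktoken):

```python
# Sliding-window trim: keep the system message, drop the oldest
# user/assistant turns until the history fits a token budget.
# Token counts are approximated as len(content) // 4.

def approx_tokens(message):
    return max(1, len(message["content"]) // 4)

def trim_history(messages, budget):
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    while rest and sum(map(approx_tokens, system + rest)) > budget:
        rest.pop(0)  # discard the oldest turn first
    return system + rest
```

`trim_history` would be applied to the messages list just before each ChatCompletion request, so the system prompt always survives while old turns fall away.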

Thing is, if they lifted the token limit, I and other developers would likely spend a lot more money, because then we could deploy our apps into the wild for general public use! Is there a place to post that kind of feedback? More tokens = more money!


I would love to see whether there is a way to use a cheaper model to summarize the previous messages in a consistent manner, such that the cheaper model still reduces overall cost compared with resending the full-length history. The summary could also be attributed to the assistant. I am also curious what difference it makes, when the model reads the array, whether something was said by the assistant or by the user. I haven’t experimented with any of this so far, but I am curious what you all think.
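That summarize-and-replace idea can be sketched like this, assuming a `summarize` helper that would call a cheaper completion model (here it is just a stub so the structure is visible):

```python
# Summarize-and-replace sketch: once the history grows past a threshold,
# collapse the older turns into one short assistant message. `summarize`
# is a stub standing in for a call to a cheaper model.

def summarize(messages):
    # Placeholder: a real version would send these messages to a cheap
    # model and return its summary text.
    return "Summary of %d earlier messages." % len(messages)

def compact_history(messages, keep_recent=4):
    """Replace all but the last keep_recent non-system turns with a summary."""
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    if len(rest) <= keep_recent:
        return messages  # nothing to compact yet
    older, recent = rest[:-keep_recent], rest[-keep_recent:]
    summary = {"role": "assistant", "content": summarize(older)}
    return system + [summary] + recent
```

Putting the summary in an assistant message (rather than a user one) matches the trick used earlier in this thread of feeding prior context through the assistant field; whether the role affects how the model weighs the summary is exactly the open question above.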

I think we can all agree that @ruby_coder is a de facto moderator here… Maybe he doesn’t know it, or maybe OpenAI doesn’t, but I think it’s obvious to every one of us.

I love how he promotes constructive discussion and how rigorous he can be in his replies; it is amazing to see how much he brings to our community!

It is refreshing to see someone so conscientious and uncompromising, always ready to help others in a respectful manner…

In many other online communities, a user would simply have been schooled for making something up, but @ruby_coder made it feel methodical and diligent… He was polite, made his point clear, and engaged @mattg rather than repelling him…

Well, thanks, fellow, for making this forum such a great place to share our thoughts and knowledge…

Also, thanks to @mattg for the clarification. I hope someone finds a solution to avoid having 4,000+ tokens per interaction, and hypothetically even more with the prospect of 8,000+ or 32,000+ tokens per interaction as GPT-4 arrives…