How to keep a session with the gpt-3.5-turbo API?

I just want to use the gpt-3.5-turbo API to hold a conversation the way I do in ChatGPT, but there seems to be no easy way to keep a session going through the API.
I know this is an old question, but I haven’t found a good answer for it.
I searched related topics in this forum, and it seems there is no way to continue a conversation within the completion API itself, such as by sending a session ID as a parameter. The alternatives are resending the previous conversation, or a summary of the previous context, with each request.
And when I ask ChatGPT how it holds the session, it answers that it uses an ‘attention mechanism’ to focus on specific parts of the input sequence, and also a “memory” to store important information from the conversation history.
So I want to know: is there any easy way to maintain a session with the gpt-3.5-turbo API? Or will OpenAI at some point provide an additional parameter to support sessions?

3 Likes

I also hope the official team will offer a more convenient way to keep a chat session going. The current approach really isn’t sensible. To our Chinese way of thinking, we would never allow a design like this; it’s far too user-unfriendly.

Yeah, from what I’ve seen, the other approaches either feed the previous conversation back in as the prompt, or summarize the previous conversation and use the summary as the prompt.
My guess is that the current endpoint is just an update of the original stateless interface, and the features that back ChatGPT itself haven’t been exposed as an API.

Hahaha, yes. No worries, we’ll keep digging into it. It’s fine, it’s fine. Haha.

Hi @ylc

Welcome to the OpenAI community.

You’ll have to implement session management on your end by passing the existing conversation to the chat completions endpoint.

How you store the conversation on your end depends on what’s convenient for you.

Here’s a very basic implementation
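A minimal sketch using the openai Python library’s ChatCompletion interface (an in-memory list stands in for whatever storage you choose):

import openai

openai.api_key = "sk-..."  # placeholder; use your own key

# The "session" lives entirely on the client: a growing list of
# messages that is sent back in full with every request.
messages = [{"role": "system", "content": "You are a helpful assistant."}]

def ask(user_input):
    messages.append({"role": "user", "content": user_input})
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
    )
    reply = response["choices"][0]["message"]["content"]
    # Store the answer so the next request includes it as context.
    messages.append({"role": "assistant", "content": reply})
    return reply

print(ask("Who won the World Series in 2020?"))
print(ask("Where was it played?"))  # earlier turns provide the context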

1 Like

Like the OP, I also wish to see some form of conversation_id and parent_id parameters, corresponding to the state of the last prompt and the state of the conversation at any point. In the web UI we can recall a conversation session, adding new prompts to the already existing context, and even branch off from a previous prompt into a new timeline. This is already achieved with gpt-3.5-turbo in the context of ChatGPT. If we could get that feature through the API for gpt-3.5-turbo, we could avoid resending the previous messages with every prompt.

3 Likes

Yes, because OpenAI developed a retail, web-based ChatGPT application, and that app feeds the prior messages back in with each prompt to give the model the earlier context.

Developers using the OpenAI API are expected to write their own code to do this when they develop their own applications.

HTH

🙂

5 Likes

I cannot agree with that. Do you have a reference? In some cases I have 20+ long prior prompts setting specific context about an application’s manual, configuration, and code, and when I send a new prompt I get an instant reply. From my experience, the whole history is not being resent every time.

1 Like

You can search these forums.

This has been discussed many times here.

You, @nikko, joined our community 11 hours ago. It is a good idea to search the site and review the prior discussions before rejecting replies here.

Welcome to our community, but please search the site before asking questions.

🙂

5 Likes

I did not say “the whole history is resent every time”. Those are your words, not mine, @nikko.

I said:

“…that app feeds the prior messages back in with each prompt to give the model the earlier context.”

Of course there are pruning and summarization strategies, all of which are the responsibility of the developer to create, and which have also been discussed here in many posts.
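For illustration (nothing official), the simplest pruning strategy just keeps the system message and the most recent turns, dropping the oldest ones once the list grows past a budget:

def prune_history(messages, max_turns=20):
    # Keep the system message(s) plus the most recent turns.
    # A message-count budget keeps the sketch short; production code
    # would count tokens instead.
    system = [m for m in messages if m["role"] == "system"]
    rest = [m for m in messages if m["role"] != "system"]
    return system + rest[-max_turns:]

A summarization strategy would go further and replace the dropped turns with a single model-generated summary message, so some of their content survives.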

That is all part of the developer’s responsibility to manage. OpenAI provides an API; developers use the API to create a full application.

I kindly recommend you search these forums first.

Thanks!

🙂

Example Prior Topic(s)

There are more; you can search using the magnifying glass in the upper right-hand corner.

4 Likes

Thank you for the details and the references you provided, @ruby_coder. I didn’t think of this as pruning and summarization, but as some light form of transfer learning / fine-tuning which adds a small overhead to the original model.

2 Likes

Yeah, that’s not what happens, sorry. There is no “transfer learning” and no “fine-tuning on the fly” happening in the current series of these OpenAI pre-trained large language models.

Just think of these models as what they are: powerful text auto-completion engines. They take in your input and predict the next sequence of text. These OpenAI LLMs do not “learn on the fly” at all, though many people mistakenly imagine they do.

Also, depending on the temperature selected, these LLMs generate randomized output. Some people get random text which happens to match a prior chat session, then mistakenly take that coincidental match as “proof of learning”, but it is just that: a coincidence.
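You can see the effect of temperature directly; low values make the sampling close to deterministic, while higher values make it more varied (a quick sketch):

import openai

openai.api_key = "sk-..."  # placeholder; use your own key

prompt = [{"role": "user", "content": "Name a color."}]

# temperature=0: the same input yields (near-)deterministic output.
low = openai.ChatCompletion.create(
    model="gpt-3.5-turbo", messages=prompt, temperature=0
)

# temperature=1.5: output varies noticeably from call to call.
high = openai.ChatCompletion.create(
    model="gpt-3.5-turbo", messages=prompt, temperature=1.5
)

print(low["choices"][0]["message"]["content"])
print(high["choices"][0]["message"]["content"])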

Hope this helps.

🙂

5 Likes

@sps this would be a good approach if ChatGPT had no limit on the amount of text you can send at a time. Try sending a three-page text and it will refuse to interact. There must be another way to keep the session alive.

2 Likes

Please read:

1 Like

As @mustafa.salahuldin says, there MUST be another way to keep the session alive. I would have expected the ChatGPT API to work exactly like the ChatGPT web app. For instance, the python API could work like this:

import openai

# (Hypothetical interface -- nothing below exists in the openai library today.)

# Create a named session and get an id we can use to resume it later.
session = openai.ChatCompletion.Session(model="gpt-3.5-turbo", title="example session")
chat_id = session.chat_id

messages = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "Who won the world series in 2020?"},
    {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
    {"role": "user", "content": "Where was it played?"},
]

# The session would hold the conversation state server-side,
# so each call only needs the new messages.
response = session.appendMessages(messages)

# do something
userMessage = [
    {"role": "user", "content": "Say it again, but in spanish"}
]

response = session.appendMessages(userMessage)

If we want to continue the conversation later, we could just:

session = openai.ChatCompletion.Session(model="gpt-3.5-turbo", chat_id=chat_id)
response = session.appendMessages(userMessage)
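In the meantime, a thin wrapper over the real endpoint can approximate that interface on the client side (a rough sketch; the Session class, chat_id, and appendMessages are just the hypothetical names from above, backed by openai.ChatCompletion.create and an in-memory store):

import uuid
import openai

class Session:
    # Client-side stand-in for the hypothetical session API above.
    _store = {}  # chat_id -> message list; swap for a database in real use

    def __init__(self, model, chat_id=None, title=None):
        self.model = model
        self.title = title
        self.chat_id = chat_id or uuid.uuid4().hex
        self.messages = Session._store.setdefault(self.chat_id, [])

    def appendMessages(self, new_messages):
        # Everything still gets resent under the hood; the wrapper
        # merely hides the bookkeeping.
        self.messages.extend(new_messages)
        response = openai.ChatCompletion.create(
            model=self.model, messages=self.messages
        )
        reply = response["choices"][0]["message"]
        self.messages.append(
            {"role": reply["role"], "content": reply["content"]}
        )
        return response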
2 Likes

This post was flagged by the community and is temporarily hidden.

1 Like

Hi @gone

Welcome to the community.

Please don’t write hypothetical code that will not run.

Instead, read the docs on the chat completions API to learn how it works. The Managing tokens section says:

If a conversation has too many tokens to fit within a model’s maximum limit (e.g., more than 4096 tokens for gpt-3.5-turbo), you will have to truncate, omit, or otherwise shrink your text until it fits. Beware that if a message is removed from the messages input, the model will lose all knowledge of it.

Note how it says YOU will have to truncate, omit, or otherwise shrink your text until it fits.
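A rough sketch of one way to do that, counting tokens with the tiktoken library (the per-message overhead used here is an approximation):

import tiktoken

enc = tiktoken.encoding_for_model("gpt-3.5-turbo")

def count_tokens(messages):
    # Content tokens plus ~4 tokens of framing per message (approximate).
    return sum(len(enc.encode(m["content"])) + 4 for m in messages)

def fit_to_limit(messages, limit=4096):
    # Drop the oldest non-system messages until the conversation fits.
    # Anything removed is gone: the model loses all knowledge of it.
    trimmed = list(messages)
    while count_tokens(trimmed) > limit and len(trimmed) > 1:
        del trimmed[1]  # assumes index 0 is the system message
    return trimmed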

Hope this helps.

2 Likes

Please don’t be like that. I shall not visit this forum again. You are unkind.

It seems everyone ignored the point of this thread. Also, right before the pseudocode there is a very clear “python API COULD work like this”. Suddenly we are not entitled to opinions or ideas?

You all seem to love being right. I can respect that, but not the unkindness.

Bye.

Hi @gone

If any part of my message felt unwelcoming, please let me know so that I can address it. I want to ensure that our communication is always respectful and productive.

I would like to take this opportunity to welcome you to our community.

1 Like