Hi team,
I am using the Chat Completions API and I want to maintain a user session, but it is not working.
I am passing the "user" property in the body of the API request, but the session is still not maintained as expected. It works in ChatGPT but not in the API.
Can anyone please help resolve this issue?
API body:
{
  "model": "gpt-3.5-turbo",
  "messages": [{"role": "user", "content": "Hi"}],
  "user": "1"
}
You need to pass the entire conversation history on each request. There is no built-in conversation history storage/persistence. You can search these forums for other threads offering advice on how to handle this.
Hi @novaphil
Thank you for the quick response.
But this solution will use more tokens when calling the API.
Yes it does, but it is the only option. The API has no persistence; it is stateless. It needs the entire history (or as much of it as fits within the token limit, or a summary of it) on each request. The "user" parameter does not provide any such functionality; it is used only for abuse tracking.
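For example, here is a minimal sketch of keeping the history yourself and resending it on every call, using the v1-style openai Python SDK (the model and messages are just placeholders):

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# Your application, not the API, owns the conversation state
history = [{"role": "system", "content": "You are a helpful assistant."}]

def ask(user_message):
    history.append({"role": "user", "content": user_message})
    response = client.chat.completions.create(
        model="gpt-3.5-turbo",
        messages=history,  # the full history is sent on every request
    )
    reply = response.choices[0].message.content
    # Store the assistant's reply so the next request includes it too
    history.append({"role": "assistant", "content": reply})
    return reply

print(ask("Hi, I am testing session persistence."))
print(ask("What did I just say?"))  # works only because the history was resent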
Based on the previous response, you can pick and choose which parts of the conversation you pass on to the next request if you are worried about token length. You can also limit the length of the output by restricting max_tokens; see the sketch below.
Other than that, try using gpt-3.5-turbo-16k, which allows more information to be passed in the context.
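Continuing the sketch above (reusing its client and history), a crude but simple approach is to keep only the most recent turns and cap the output; the cutoff of 10 messages is arbitrary:

MAX_TURNS = 10  # arbitrary cutoff; tune this to your token budget

def trimmed(history):
    # Always keep the system message, then only the most recent turns
    system, rest = history[0], history[1:]
    return [system] + rest[-MAX_TURNS:]

response = client.chat.completions.create(
    model="gpt-3.5-turbo-16k",  # larger context window
    messages=trimmed(history),
    max_tokens=256,  # cap the generated output as well
)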
Welcome to the OpenAI community @giddpoojab
@novaphil is correct. You will have to pass the messages you want the model to “remember” when responding to the latest user message.
In your case, you can use embeddings. Here’s how you can Use embeddings to retrieve relevant context for AI assistant
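As a rough sketch of that idea (the embedding model and helper names are my own placeholders, not from the linked guide; client and history are reused from the sketch earlier in the thread): embed past messages, then retrieve only the most similar ones to send as context, keeping token usage low:

import numpy as np

def embed(text):
    # Model choice is just an example
    result = client.embeddings.create(model="text-embedding-3-small", input=text)
    return np.array(result.data[0].embedding)

def most_relevant(past_messages, query, top_k=3):
    # Rank past messages by cosine similarity to the query
    query_vec = embed(query)
    scored = []
    for text in past_messages:
        vec = embed(text)  # in practice, cache embeddings instead of recomputing
        score = float(np.dot(query_vec, vec) /
                      (np.linalg.norm(query_vec) * np.linalg.norm(vec)))
        scored.append((score, text))
    scored.sort(reverse=True)
    return [text for _, text in scored[:top_k]]

past_messages = [m["content"] for m in history[1:]]  # from the earlier sketch
context = "\n".join(most_relevant(past_messages, "What did I just say?"))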
Thank you for the solution @sps.
I have another query, mentioned below.
If I use a subscribed (paid) version of OpenAI, will the Chat Completions API still have this user state management issue?
There's no subscription for the API. It's a pay-as-you-go service, where you pay based on the tokens you consume.
This doesn’t exist.
ChatGPT: website with a talking robot giving advice at chat.openai.com
- payment model: free, with more features at $20/month
- conversation management: handled by user interface and backend database
OpenAI API: access backend AI models for application development, management at platform.openai.com
- pay-per-use, measured by data-in, data-out (in AI “tokens”)
- stateless, all information to generate a response must be passed each API call
Connection between the two? Almost zero. You even need to put the payment method in again in a different place for API use.
Thank you for clearing up my doubts @sps.
Is it mentioned in any OpenAI document that the Chat Completions API does not maintain user state?
If so, please share the document (not a community-raised query) so that I can share it with my teammates.
Including conversation history is important when user instructions refer to prior messages. In the example above, the user’s final question of “Where was it played?” only makes sense in the context of the prior messages about the World Series of 2020. Because the models have no memory of past requests, all relevant information must be supplied as part of the conversation history in each request. If a conversation cannot fit within the model’s token limit, it will need to be shortened in some way.
– Docs
Thank you @sps.
The doc will help me understand this better.
@novaphil @sps Hey friends, can you help me? I'm creating a chatbot using OpenAI and LlamaIndex, and I have successfully implemented the flow. But in production, if one user asks a question it responds correctly, and when another user asks a question at the same time that is also answered fine. However, I pass the history with each OpenAI request, and all users' chat histories get merged when the history is inserted into the request. What is the solution?
from flask import Flask, request, jsonify, Response
from llama_index.core import VectorStoreIndex, SimpleDirectoryReader
from llama_index.core.memory import ChatMemoryBuffer
from llama_index.llms.openai import OpenAI
from llama_index.llms.openai.base import ChatResponse
import os
import time
import openai
import json
from dotenv import load_dotenv

load_dotenv()
os.environ["OPENAI_API_KEY"] = os.getenv('OPENAI_API_KEY')
openai.api_key = os.environ["OPENAI_API_KEY"]

app = Flask(__name__)

# A single module-level memory buffer, shared by every request
memory = ChatMemoryBuffer.from_defaults(token_limit=8000)

llm = OpenAI(model="gpt-3.5-turbo", temperature=0)
data = SimpleDirectoryReader(input_dir="data").load_data()
index = VectorStoreIndex.from_documents(data)

chat_engine = index.as_chat_engine(
    chat_mode="context",
    llm=llm,
    memory=memory,
    system_prompt=(
        "Your name is Clarus. You are a chatbot, able to have normal interactions, as well as talk "
        "about Clarus money project related queries. "
        "If the user's query is related to the provided context, you need to answer with the proper response message in the context in the response message JSON. "
        '''Give all the response in proper JSON format like { "response": "response message of the query", "intent": "intention of the query" }'''
    ),
)

def generate_chat_responses(query):
    res = ''
    response = chat_engine.stream_chat(query)
    for token in response.response_gen:
        res += token
    return res

@app.route("/voice-backend", methods=['POST'])
def voiceAssistant():
    data = request.get_json()
    query = data.get('prompt')
    try:
        return jsonify({"status": 200, "data": generate_chat_responses(query)})
    except Exception as e:
        print("Exception:", e)
        return jsonify({"status": 500, "error": str(e)}), 500

if __name__ == '__main__':
    app.run(debug=True, port=8080)
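The merging happens because a single module-level ChatMemoryBuffer (and chat engine) is shared by every request, so all users write into the same memory. One fix is to create a chat engine, with its own memory buffer, per session. A sketch, assuming the client sends a session identifier (the "session_id" field is a hypothetical addition to your request payload); this route replaces the one above:

SYSTEM_PROMPT = "..."  # the same system prompt string used above

# One chat engine (and memory buffer) per session, created on demand.
# A plain dict is per-process and unbounded; use a real session store
# with eviction in production.
chat_engines = {}

def get_chat_engine(session_id):
    if session_id not in chat_engines:
        chat_engines[session_id] = index.as_chat_engine(
            chat_mode="context",
            llm=llm,
            memory=ChatMemoryBuffer.from_defaults(token_limit=8000),
            system_prompt=SYSTEM_PROMPT,
        )
    return chat_engines[session_id]

@app.route("/voice-backend", methods=['POST'])
def voiceAssistant():
    data = request.get_json()
    query = data.get('prompt')
    session_id = data.get('session_id', 'anonymous')  # hypothetical field
    try:
        engine = get_chat_engine(session_id)
        res = ''
        for token in engine.stream_chat(query).response_gen:
            res += token
        return jsonify({"status": 200, "data": res})
    except Exception as e:
        print("Exception:", e)
        return jsonify({"status": 500, "error": str(e)}), 500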
I ran into this issue last night and solved it: as others are saying, you need to store some user identifier on the server side where you send your requests. One solution is to keep a dictionary mapping users (by sessionId) to their message history. Here is how you could do it in React and NodeJS:
React
import { v4 as uuidv4 } from "uuid";

// Generate a session id once per client and reuse it on every request
const sessionId = uuidv4();

const data = {
  userSession: sessionId,
  prompt: prompt, // the user's current message (e.g. from component state)
};

const handleChatCompletion = async () => {
  try {
    setIsLoading(true);
    const apiResponse = await fetch("api-endpoint/chat/completions", {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
      },
      body: JSON.stringify(data),
    });
    const result = await apiResponse.text();
    setResponse(result);
  } catch (error) {
    console.error("Error:", error);
  } finally {
    setIsLoading(false);
  }
};
NodeJS
const express = require("express");
const axios = require("axios");

const app = express();
app.use(express.json());

const openaiApiKey = process.env.OPENAI_API_KEY;

// In-memory store: sessionId -> array of that user's past prompts.
// This lives in one process and is lost on restart; use a real
// session store (e.g. Redis) in production.
let userMemories = {};

app.post("/api/chat-completions", async (req, res) => {
  try {
    const { userSession, prompt } = req.body;
    console.log("User prompt: ", prompt, userSession);
    let currentUserMessages = userMemories[userSession] || [];
    const messages = [
      { role: "system", content: "You are a helpful assistant." },
      ...currentUserMessages.map((msg) => ({ role: "user", content: msg })),
      { role: "user", content: prompt }, // Add the current prompt to the end
    ];
    currentUserMessages.push(prompt); // Push the current prompt to memory
    userMemories[userSession] = currentUserMessages; // Update the memory for this session
    // Note: only user prompts are stored here; pushing the assistant's
    // replies as well (with role "assistant") would give the model
    // fuller context on later turns.
    const response = await axios.post(
      "https://api.openai.com/v1/chat/completions",
      {
        model: "gpt-3.5-turbo",
        messages: messages,
      },
      {
        headers: {
          "Content-Type": "application/json",
          Authorization: `Bearer ${openaiApiKey}`,
        },
      }
    );
    console.log("Response:\n", response.data.choices[0].message.content);
    // formatResponse is this poster's own helper (definition not shown)
    const formattedResponse = formatResponse(
      response.data.choices[0].message.content
    );
    res.send(formattedResponse);
  } catch (error) {
    console.error("Error:", error.response ? error.response.data : error.message);
    res.status(500).json({ error: "Internal Server Error" });
  }
});

// Route to clear the history
app.post("/api/clear-history", (req, res) => {
  const userSession = req.body.sessionId;
  userMemories[userSession] = []; // Reset the memory array to an empty array
  console.log("Conversation cleared for user: ", userSession);
  res.send("Conversation history cleared.");
});