joao.b
1
Hello, novice here. Does the gpt-3.5-turbo model have built-in long-term memory capabilities? Or do I need to glue the previous messages together like I would with davinci? Thank you for your answer.
joao.b
3
Thank you very much for your answer, Raul. What I mean is not tailoring to specific users, but having long-term memory, i.e. remembering previous messages.
joao.b
5
Raul, which model do you use? davinci or gpt-3.5-turbo?
dliden
7
These docs will probably be helpful: OpenAI API.
You send the whole conversation history with each request, e.g.:
# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai

# Every earlier turn is included, so the model can resolve "it" in the last question.
response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"},
    ],
)
# Print the assistant's reply text.
print(response["choices"][0]["message"]["content"])
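To make the "send the whole history" idea concrete, here is a minimal sketch of a wrapper that keeps appending turns to the message list. The `Conversation` class and the `send` callable are hypothetical names for illustration, not part of the OpenAI library; `fake_send` stands in for the real API call so the sketch runs offline.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


class Conversation:
    """Keeps the running message list so every request carries the full history."""

    def __init__(self, system_prompt: str, send: Callable[[List[Message]], str]):
        # `send` is a hypothetical callable: it takes the message list and
        # returns the assistant's reply text.
        self.messages: List[Message] = [{"role": "system", "content": system_prompt}]
        self._send = send

    def ask(self, user_text: str) -> str:
        self.messages.append({"role": "user", "content": user_text})
        reply = self._send(self.messages)
        # Store the reply too, otherwise the next request lacks the model's own answers.
        self.messages.append({"role": "assistant", "content": reply})
        return reply


def fake_send(messages: List[Message]) -> str:
    # Stand-in for the real API call so the sketch runs without a key.
    return f"(reply to: {messages[-1]['content']})"


chat = Conversation("You are a helpful assistant.", fake_send)
chat.ask("Who won the world series in 2020?")
chat.ask("Where was it played?")
print(len(chat.messages))  # 5: system + 2 user + 2 assistant
```

With the v0.27.0 library, `send` could be a thin function that calls `openai.ChatCompletion.create(model="gpt-3.5-turbo", messages=messages)` and returns `response["choices"][0]["message"]["content"]`.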
joao.b
8
@dliden thanks for the reference. This raises the problem of how to maintain long conversations, given the limits of the context window and the per-token pricing.
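One common way to cope with the context window is to drop the oldest turns once the history gets too large. Below is a rough sketch of that idea, not an official recipe: `trim_history` and `estimate_tokens` are hypothetical helpers, and the "about 4 characters per token" estimate is only an assumption (a real tokenizer would give accurate counts).

```python
from typing import Dict, List

Message = Dict[str, str]


def estimate_tokens(message: Message) -> int:
    # Crude assumption: about 4 characters per token, plus a small
    # per-message overhead. Use a real tokenizer for accurate counts.
    return len(message["content"]) // 4 + 4


def trim_history(messages: List[Message], max_tokens: int) -> List[Message]:
    """Keep the system message(s) and the newest turns that fit in max_tokens."""
    system = [m for m in messages if m["role"] == "system"]
    turns = [m for m in messages if m["role"] != "system"]

    budget = max_tokens - sum(estimate_tokens(m) for m in system)
    kept: List[Message] = []
    # Walk backwards so the most recent messages are kept first.
    for message in reversed(turns):
        cost = estimate_tokens(message)
        if cost > budget:
            break
        kept.append(message)
        budget -= cost
    return system + list(reversed(kept))


history = [
    {"role": "system", "content": "You are a helpful assistant."},
    {"role": "user", "content": "a" * 400},
    {"role": "assistant", "content": "b" * 400},
    {"role": "user", "content": "Where was it played?"},
]
trimmed = trim_history(history, max_tokens=130)
print([m["role"] for m in trimmed])  # ['system', 'assistant', 'user']
```

Trimming also caps the cost per request, since you pay for every token you resend. The trade-off is that the model no longer sees the dropped turns; summarizing old turns into a single message is another option.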
dliden
9
joao.b
10
thanks dliden, didn’t know about gptcache!
We recently built a simple (exact-match) cache and a semantic cache for our tool: ⭐ Reducing LLM Costs & Latency with Semantic Cache
It's very easy to use in production: you just change the base URL to Portkey (our platform) and pass a cache header in your request.
We are seeing a consistent 20% cache hit rate for Q&A and RAG use cases with 99% accuracy.
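For anyone curious what the "simple" variant of such a cache looks like in principle, here is a generic sketch of an exact-match cache keyed on the model name and message list. It is not how Portkey or GPTCache implement it; `ExactMatchCache`, `send`, and `fake_send` are hypothetical names, and a semantic cache would compare embeddings of the prompt instead of hashing it.

```python
import hashlib
import json
from typing import Callable, Dict, List

Message = Dict[str, str]


class ExactMatchCache:
    """Returns a stored reply when the exact same request was seen before."""

    def __init__(self, send: Callable[[str, List[Message]], str]):
        # `send` is a hypothetical callable that performs the real API request.
        self._send = send
        self._store: Dict[str, str] = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def _key(model: str, messages: List[Message]) -> str:
        # Serialize deterministically, then hash to get a compact cache key.
        payload = json.dumps({"model": model, "messages": messages}, sort_keys=True)
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()

    def complete(self, model: str, messages: List[Message]) -> str:
        key = self._key(model, messages)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        self.misses += 1
        reply = self._send(model, messages)
        self._store[key] = reply
        return reply


calls = []


def fake_send(model: str, messages: List[Message]) -> str:
    # Stand-in for the real API call; records each request it receives.
    calls.append(messages[-1]["content"])
    return f"(reply to: {messages[-1]['content']})"


cache = ExactMatchCache(fake_send)
question = [{"role": "user", "content": "Who won the world series in 2020?"}]
first = cache.complete("gpt-3.5-turbo", question)
second = cache.complete("gpt-3.5-turbo", question)  # served from the cache
print(cache.hits, cache.misses, len(calls))  # 1 1 1
```

An exact-match cache only helps when requests repeat verbatim, which is why semantic matching matters for Q&A-style traffic where users phrase the same question differently.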