Text-davinci-003 vs gpt-3.5-turbo: memory

Hello, novice here. Does the gpt-3.5-turbo model have built-in long-term memory capabilities, or do I need to append the previous messages myself, like I would with davinci? Thank you for your answer.


Thank you very much for your answer, Raul. What I mean is not tailoring to specific users, but having long-term memory, i.e. remembering previous messages.

Raul, which model do you use? davinci or gpt-3.5-turbo?

These docs will probably be helpful: OpenAI API.

You send the whole conversation history with every request, e.g.

# Note: you need to be using OpenAI Python v0.27.0 for the code below to work
import openai

openai.ChatCompletion.create(
  model="gpt-3.5-turbo",
  messages=[
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Who won the world series in 2020?"},
        {"role": "assistant", "content": "The Los Angeles Dodgers won the World Series in 2020."},
        {"role": "user", "content": "Where was it played?"}
    ]
)
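
For a multi-turn conversation you then append each reply to the list and re-send the whole thing. A minimal sketch using the same openai v0.27 ChatCompletion API as above (the loop and variable names are just illustrative):

# Multi-turn loop: append each assistant reply to the history and re-send it.
import openai

messages = [{"role": "system", "content": "You are a helpful assistant."}]

for user_input in ["Who won the world series in 2020?", "Where was it played?"]:
    messages.append({"role": "user", "content": user_input})
    response = openai.ChatCompletion.create(
        model="gpt-3.5-turbo",
        messages=messages,
    )
    reply = response["choices"][0]["message"]
    # Keep the assistant's answer in the history so the next turn has context
    messages.append({"role": reply["role"], "content": reply["content"]})
    print(reply["content"])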

@dliden thanks for the reference. This raises the problem of how to maintain long conversations, given the context-window limit and per-token pricing. One common workaround is to truncate (or summarize) older turns so the history fits, as in the sketch below.
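
A rough sketch of simple truncation. Note the token counting here is a crude character-based estimate, not the real tokenizer; swap in tiktoken for accurate counts:

# Keep the system message, drop the oldest turns until the history fits a budget.
# NOTE: len(content) // 4 is a crude chars-per-token estimate, not a real tokenizer.
def truncate_history(messages, max_tokens=3000):
    system, rest = messages[0], messages[1:]

    def approx_tokens(msgs):
        return sum(len(m["content"]) // 4 for m in msgs)

    while rest and approx_tokens([system] + rest) > max_tokens:
        rest.pop(0)  # drop the oldest user/assistant message first
    return [system] + rest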


Those are open questions a lot of people are working on. You might want to check out, e.g., GitHub - zilliztech/GPTCache: GPTCache is a library for creating semantic cache to store responses from LLM queries. and Memory — 🦜🔗 LangChain 0.0.139.
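
For reference, GPTCache's basic usage (per its README at the time of writing; the exact API may have changed since) wraps the openai module so repeated or similar queries are served from the cache:

# Basic GPTCache usage, per the project README; check the repo for the current API.
from gptcache import cache
from gptcache.adapter import openai  # drop-in replacement for the openai module

cache.init()            # exact-match cache by default
cache.set_openai_key()  # reads OPENAI_API_KEY from the environment

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Who won the world series in 2020?"}],
)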


Thanks @dliden, didn’t know about GPTCache!


We recently built simple & semantic caching for our tool - ⭐ Reducing LLM Costs & Latency with Semantic Cache

It’s very easy to use in production - you just change the base URL to Portkey (our platform) and pass a cache header in your request.

We are seeing a consistent 20% cache hit rate for Q&A and RAG use cases with 99% accuracy.
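
To illustrate the pattern described above (the base URL and header name below are placeholders, not Portkey's documented values; check their docs for the real ones):

# Illustrative only: the base URL and header name are hypothetical placeholders.
import openai

openai.api_base = "https://api.portkey.example/v1"  # hypothetical proxy base URL

response = openai.ChatCompletion.create(
    model="gpt-3.5-turbo",
    messages=[{"role": "user", "content": "Who won the world series in 2020?"}],
    headers={"x-cache-mode": "semantic"},  # hypothetical cache header
)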