Strategies to feed a GPT-4 bot with a product database to maintain context

Hello,
the title says it all :slight_smile:
what would you recommend as a strategy for using GPT-4 as a “sales” recommendation bot?

  • feed it small chunks of the product data based on keyword pre-processing of the user request
  • pre-process with a fine-tuned davinci model and use GPT-4 just for the conversational side?
  • embeddings-search pre-processing? (rough sketch below)
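
For the third option, I mean something like this (a rough sketch only; the catalog entries and helper names are invented, and the search is plain cosine similarity over OpenAI embeddings):

```python
import numpy as np
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def embed(text: str) -> np.ndarray:
    """Embed a piece of text with OpenAI's embeddings endpoint."""
    resp = client.embeddings.create(model="text-embedding-ada-002", input=text)
    return np.array(resp.data[0].embedding)

# Hypothetical catalog; in practice the vectors would be computed once
# and stored in a vector database or on disk.
products = ["Trail running shoes, waterproof", "Leather office shoes", "Kids sneakers"]
product_vectors = [embed(p) for p in products]

def top_matches(user_request: str, k: int = 2) -> list[str]:
    """Return the k catalog entries most similar to the user's request."""
    q = embed(user_request)
    scores = [float(np.dot(q, v) / (np.linalg.norm(q) * np.linalg.norm(v)))
              for v in product_vectors]
    return [p for _, p in sorted(zip(scores, products), reverse=True)[:k]]

# Only the matches get pasted into the GPT-4 prompt, not the whole catalog.
print(top_matches("I need something for rainy hikes"))
```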

in general, anyone with experience in this?

thanks!

As far as I know, there is no limit to the content you give to the system.

I would create a markdown document with your instructions and catalog (a markdown table) and let the chatbot do all the work. Set up the document so that it is easy to update; whenever you have a change, push the changes. Considering that most of the decisions/recommendations will be based on the data you have provided, I would use a cheaper, older model.
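
To illustrate (the instructions and catalog rows here are invented; the message layout is just the standard chat format):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# One easy-to-update markdown document: instructions plus catalog table.
SYSTEM_DOC = """\
You are a sales assistant. Recommend products only from the catalog below.

| SKU  | Product              | Price | Notes      |
|------|----------------------|-------|------------|
| A100 | Trail running shoes  | $89   | waterproof |
| A200 | Leather office shoes | $129  | formal     |
"""

resp = client.chat.completions.create(
    model="gpt-3.5-turbo",  # a cheaper, older model, as suggested above
    messages=[
        {"role": "system", "content": SYSTEM_DOC},
        {"role": "user", "content": "Something for rainy trail runs?"},
    ],
)
print(resp.choices[0].message.content)
```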

The content size is limited by the context window of the model used. In tokens, that would be 4K for GPT-3.5, 8K for GPT-4, and 16K for GPT-3.5-16K. To go from tokens to approximate English words, multiply by 0.75 (so an 8K context is roughly 6,000 words).

1 Like

That is only the user prompt and answer, right?
I am referring to the initial system prompt.

It’s for everything: the system prompt, user role messages, and the response from the model.

1 Like

Then how is it learning, if from the standpoint of tokens you are submitting everything at every turn?
Let me check the docs. Hope I am right :slight_smile:

It does not learn from user messages. ChatGPT messages, when you have not opted out, may be used for training subsequent models, but that is not a process that happens at that moment.

The model is stateless, i.e. it needs to be given all of the context every time you make an API call. The illusion of memory is created by including the past conversation messages with each call. The list grows longer and longer with each message until the information at the start begins to be cut off and discarded; it’s remarkable how well that works even over very long conversations.
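
A minimal sketch of that pattern (assuming the openai Python library; the point is that the whole list is re-sent on every call):

```python
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

# The growing conversation; the model never stores this, the caller does.
history = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_text: str) -> str:
    history.append({"role": "user", "content": user_text})
    # Every call re-sends the entire history -- that is the "memory".
    resp = client.chat.completions.create(model="gpt-4", messages=history)
    answer = resp.choices[0].message.content
    # Keep the model's reply so the next call can see it too.
    history.append({"role": "assistant", "content": answer})
    return answer

chat("My name is Ana and I run a shoe shop.")
print(chat("What is my name?"))  # works only because the history was re-sent
```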

2 Likes

How can I use the ChatGPT API? | OpenAI Help Center

Did I misunderstand it? Open to your interpretation.

The tokens are counted on a per-minute basis.
When I said that, I was not referring to the base model, but to the model for your chat or API.

Yes, the tokens-per-minute limits are there to prevent the system from becoming overloaded. This is called “rate limiting”, and it has no bearing on what the model remembers over time.

Each API call, or indeed each press of the enter key when using ChatGPT, sends everything you have talked about so far to the model for processing. When that list gets too long, the items at the start get removed and the system continues (for ChatGPT); for the API you have to manage that aspect yourself.
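
Managing it yourself can be as simple as this (a sketch: the token budget is arbitrary, the count ignores the small per-message overhead, and cl100k_base is the encoding used by the chat models):

```python
import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def trim_history(history: list[dict], budget: int = 6000) -> list[dict]:
    """Drop the oldest non-system messages until the conversation fits the budget."""
    def total_tokens(msgs):
        return sum(len(enc.encode(m["content"])) for m in msgs)

    trimmed = list(history)
    # Keep index 0 (the system prompt); discard from the front of the rest.
    while total_tokens(trimmed) > budget and len(trimmed) > 1:
        trimmed.pop(1)
    return trimmed
```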

1 Like

thanks for the info.

Is it different when using Azure OpenAI?

1 Like

No, all of the current models, by OpenAI, Google, Anthropic, Meta… require all of the information and context to be included in the prompt with each interaction with the API.

The best way I can describe it is this: imagine you are walking down the street and the AI is a random person you tap on the shoulder and ask a question. At that moment in time they have no idea who you are or what your question will be about; when they give you the answer, that’s it, they walk off and you never see them again. To continue your conversation you must pick a new person, tell them everything you chatted about with the first person, and then add your next question on to the end of that. As before, when they give you an answer they walk off. Every interaction has to be viewed in that light.

I love that analogy. Mine is giving an amnesiac an open-book test. It knows what it knows, plus what’s in the material you gave it. Give it the test again, and it will have roughly the same quality of answer, but it will forget immediately. This is why larger “context window” models have such appeal - you can give it a bigger book for the test, including maybe giving it the answers it came up with last time! A lot of what seems like “memory” in ChatGPT is filling the context window with your recent questions. The 3.5-16K model has the widest window for you to work with at OpenAI, and Anthropic has a model that boasts 100K tokens - 6 times the size. (Whether it uses all that “memory” well is a different question.)

Hope that helps some! Even if the LLM forgets you, we won’t - keep coming back with questions and we all get smarter!

1 Like

Got it. I had previously worked with 16K.
This token limitation/race reminds me of downsizing GIF banners to the smallest file size while keeping acceptable quality.

1 Like