I’m a SWE with very little knowledge of AI language models.
I’m having fun and mostly success building an API client using the chat completions endpoint. One thing I want to avoid is a snowball of prompt text as the chat conversation grows.
Question: I have a set of domain facts that are not part of ChatGPT’s language models. I would like ChatGPT to weave these domain facts into its generated text. As an example, these facts may be a pricing catalog in the shape of: product id, name, stock, price.
When I exclude the catalog from a call to ChatGPT, it makes something up, such as when asked a very narrow and specific question like: “What is the price of XYZ?”.
I want to avoid having to include a system prompt with this catalog on every call to the chat completions API. I see that there is an endpoint to upload a file — can something like that be leveraged for this? Do I need fine-tuning, embeddings?
Typically one would use a vector database (such as Weaviate or Pinecone) with embeddings of your catalog items to return the specific information the question needs. You may also want to consider sparse embeddings to prioritize keyword matches over semantic relevance.
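To make the retrieval step concrete, here is a minimal sketch. It assumes each catalog row already has an embedding vector; the hand-written 3-dimensional vectors below are toy stand-ins for real embeddings (which you'd get from an embeddings model, e.g. the OpenAI embeddings endpoint), and the catalog contents are invented for illustration:

```python
import math

# Toy catalog: each row carries a (hypothetical) precomputed embedding vector.
# In practice the vectors come from an embeddings model, not hand-writing.
catalog = [
    {"id": "XYZ", "name": "Widget XYZ", "price": 9.99, "vec": [0.9, 0.1, 0.0]},
    {"id": "ABC", "name": "Gadget ABC", "price": 4.50, "vec": [0.1, 0.9, 0.0]},
]

def cosine(a, b):
    # Cosine similarity: dot product over the product of vector norms.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

def retrieve(query_vec, k=1):
    # Rank catalog rows by similarity to the query embedding, return the top k.
    ranked = sorted(catalog, key=lambda row: cosine(query_vec, row["vec"]),
                    reverse=True)
    return ranked[:k]

# A query vector close to "Widget XYZ" (again, hand-written for the sketch).
hits = retrieve([1.0, 0.0, 0.1])
print(hits[0]["id"], hits[0]["price"])
```

A vector database does exactly this ranking for you at scale (plus indexing so you don't scan every row), but the principle is the same: embed the question, find the nearest catalog rows, and pass only those rows to the chat call.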
For testing (and fun) purposes a hybrid would be great, and more adaptable! (For example, a 10/90 weight split between dense and sparse embeddings, because… who knows, maybe it’s better??)
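A sketch of that hybrid idea, under loose assumptions: the "sparse" side is stood in for by a simple keyword-overlap score rather than real sparse embeddings, the dense vectors are hand-written toys, and the 10/90 split is just the weighting floated above — all of it tunable:

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) *
                  math.sqrt(sum(x * x for x in b)))

def keyword_score(query, text):
    # Sparse stand-in: fraction of query terms that appear in the row text.
    q_terms = set(query.lower().split())
    t_terms = set(text.lower().split())
    return len(q_terms & t_terms) / len(q_terms) if q_terms else 0.0

def hybrid_score(query, query_vec, row, dense_w=0.10, sparse_w=0.90):
    # The 10/90 dense/sparse split from the post; both weights are guesses.
    return (dense_w * cosine(query_vec, row["vec"]) +
            sparse_w * keyword_score(query, row["name"]))

catalog = [
    {"id": "XYZ", "name": "Widget XYZ", "price": 9.99, "vec": [0.9, 0.1]},
    {"id": "ABC", "name": "Gadget ABC", "price": 4.50, "vec": [0.1, 0.9]},
]

query = "price of widget XYZ"
query_vec = [0.8, 0.2]  # hand-written stand-in for a real query embedding
best = max(catalog, key=lambda r: hybrid_score(query, query_vec, r))
print(best["id"])
```

With a heavy sparse weight, exact product names and ids dominate the ranking, which is often what you want for narrow questions like "What is the price of XYZ?"; shifting weight toward dense scores helps vaguer questions.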
Once you have a robust retrieval system, your next question (mainly because it was mine) will be: how do I optimize and reduce the amount of data? Fortunately, there’s an answer for (almost) everything.
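One common answer to the data-reduction question is simply to inject only the few retrieved rows into the system message, never the whole catalog. A sketch, assuming whatever retrieval step you built returns a list of row dicts (the field names and message layout here are illustrative; the `messages` structure is the standard chat completions format):

```python
def rows_to_context(rows):
    # Compact one-line-per-row context; far smaller than the full catalog.
    return "\n".join(
        f"{r['id']} | {r['name']} | stock: {r['stock']} | ${r['price']}"
        for r in rows
    )

def build_messages(question, retrieved_rows):
    # Only the retrieved rows ride along in the system prompt.
    context = rows_to_context(retrieved_rows)
    return [
        {"role": "system",
         "content": "Answer using only these catalog facts:\n" + context},
        {"role": "user", "content": question},
    ]

# Hypothetical retrieval result for the question below.
rows = [{"id": "XYZ", "name": "Widget XYZ", "stock": 12, "price": 9.99}]
messages = build_messages("What is the price of XYZ?", rows)
print(messages[0]["content"])
```

This keeps the per-call token cost roughly constant no matter how large the catalog grows, which also addresses the "snowball of prompts" worry from the original question.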