Feeding data, then asking questions about it

Hi,

I’ve been trying to figure out how I can use the API to feed a few lines of data then ask questions about it.

After a few hours of research, I realized embeddings are the way to go instead of fine-tuning.

Right now I’m creating embeddings from this:

"Title: XXXXXX Information; Content: " + df['Content.1'].str.strip()

Now I have a CSV with all the embedding arrays, but the documentation doesn’t explain how to actually ask questions using those embeddings.
The most relevant part I found in the docs (/docs/guides/embeddings/use-cases) was “Text search using embeddings”, but I’m not trying to do text search. I want to ask questions about the data, which feels like a different approach.

The only way I found to do this was langchain, for example, this project:
git/techleadhd/chatgpt-retrieval/tree/main

I wonder what other approaches I could take without using langchain. I want to keep it simple and easy, that’s it. I feel like adding langchain will add complexity and might be slightly off from my goal. That’s why I’m asking the experts here. Many thanks.

Hi there and welcome.

Here is a worked example from the OpenAI cookbook that might be a good initial resource.

If you intend to use a vector database, such as Pinecone, to store your embeddings, these providers typically publish OpenAI-specific guidance on their websites, including code snippets that detail the approach for common use cases such as Q&A. Many of these newer vendors offer free trials or some initial credit. If you are new to embeddings and vector databases, that might be a good way to get started and see whether this is useful for your use case.
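For only a few lines of data you may not need langchain or a vector database at all. The usual pattern is: embed the question with the same model you used for the rows, pick the rows whose embeddings are closest to it, and paste those rows into the prompt as context. So Q&A is really the text search step plus one extra prompt. Here is a rough sketch of that idea in plain Python. The function names (parse_embedding, top_k_rows, build_prompt) are made up for illustration, and I am assuming each CSV cell holds the embedding as a string like "[0.1, 0.2, ...]". The actual API calls to embed the question and to get the answer are left out.

```python
import json
import math


def parse_embedding(cell):
    """Turn a CSV cell such as "[0.1, 0.2]" back into a list of floats.

    Assumes the embedding was saved as a JSON-style list string.
    """
    return [float(x) for x in json.loads(cell)]


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_rows(question_embedding, rows, k=3):
    """Return the texts of the k rows most similar to the question.

    rows is a list of (text, embedding) pairs loaded from the CSV.
    """
    ranked = sorted(
        rows,
        key=lambda row: cosine_similarity(question_embedding, row[1]),
        reverse=True,
    )
    return [text for text, _ in ranked[:k]]


def build_prompt(question, context_texts):
    """Put the retrieved rows and the question into a single prompt string."""
    context = "\n---\n".join(context_texts)
    return (
        "Answer the question using only the context below. "
        "If the answer is not in the context, say you don't know.\n\n"
        f"Context:\n{context}\n\n"
        f"Question: {question}\nAnswer:"
    )
```

You would then send the string from build_prompt to whichever completion or chat endpoint you are using. The search is a linear scan over every row, which is fine for small data; a vector database only starts to pay off once the number of rows gets large.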

I hope it helps!