Hi, I’m using the OpenAI API and Pinecone as my vector database. Here is the scenario: my Pinecone database will eventually contain 30,000+ records, where each record contains details about a vehicle spare part. I then use the OpenAI API to query both generic and specific information from that database.
Say my initial prompt is: “Give me all brake parts from all types of vehicles”. Currently, my process is:
(1) The API converts my prompt to an embedding and sends it to Pinecone.
(2) Pinecone returns the records it considers relevant to the prompt; it may return 5,000 records, for example.
(3) Now that I have the records on hand, I submit another prompt asking for more specific information, such as “Based on the records below, which vehicles have the most requested brake parts? [all 5,000 records are then appended to the prompt here …]”.
(4) My app then processes the response from the OpenAI API.
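For reference, here is a rough sketch of that flow in code. The index name, model names, and metadata fields are just placeholders for illustration, not my actual setup:

```python
from openai import OpenAI
from pinecone import Pinecone

client = OpenAI()
pc = Pinecone(api_key="YOUR_PINECONE_API_KEY")
index = pc.Index("spare-parts")  # placeholder index name

# (1) Convert the prompt to an embedding.
question = "Give me all brake parts from all types of vehicles"
emb = client.embeddings.create(model="text-embedding-3-small", input=question)
query_vector = emb.data[0].embedding

# (2) Ask Pinecone for the most similar records.
results = index.query(vector=query_vector, top_k=100, include_metadata=True)

# (3) Append the retrieved records to a follow-up prompt.
records_text = "\n".join(str(m.metadata) for m in results.matches)
completion = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system", "content": "Answer using only the records provided."},
        {"role": "user", "content": "Based on the records below, which vehicles have "
                                    "the most requested brake parts?\n\n" + records_text},
    ],
)

# (4) Process the response.
print(completion.choices[0].message.content)
```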
a. My 1st question: are the above steps fine or normal for this kind of setup?
b. My 2nd question: for step (3) and any further prompts, is appending all 5,000 records to the prompt each time the only way to gain more insights into the data? A request that size consumes too many tokens, and I’m wondering if there is a way to improve this process or make it more cost-efficient.
Make sure you set up your prompts so the AI model understands the dataset it will get from Pinecone.
Also make sure you use the same embedding model for the query embedding that you used to upsert the Pinecone content. That way you don’t have to specifically prompt the model for sorting; you can rely on similarity search for this.
Also, use a framework like LangGraph to get better results and set up nodes.
This is the approach I used for a similar project.
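As a minimal sketch of keeping the embedding model consistent between upsert and query (the index name, model, and record fields below are assumptions):

```python
from openai import OpenAI
from pinecone import Pinecone

EMBED_MODEL = "text-embedding-3-small"  # must be identical for upsert and query

client = OpenAI()
index = Pinecone(api_key="YOUR_PINECONE_API_KEY").Index("spare-parts")

def embed(text: str) -> list[float]:
    return client.embeddings.create(model=EMBED_MODEL, input=text).data[0].embedding

# Upsert: each part description is embedded with EMBED_MODEL.
index.upsert(vectors=[{
    "id": "part-001",
    "values": embed("Front brake pad set for Toyota Corolla"),
    "metadata": {"category": "brakes", "vehicle": "Toyota Corolla"},
}])

# Query: the question is embedded with the SAME model, so similarity search
# already returns results ranked by relevance without prompting for sorting.
matches = index.query(vector=embed("brake parts"), top_k=50, include_metadata=True).matches
```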
Generally, yes. It makes sense to be CAPABLE of traversing your database at different layers of granularity.
That said, you can combine your queries into one instead of iterating over them.
It’s important to remember that embeddings are for unstructured text. If you have the benefit of structured information, then using an LLM to form database queries would be better.
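For example, a rough sketch of having the LLM form the query instead of relying on embeddings; the schema and model here are assumptions, not your actual tables:

```python
from openai import OpenAI

client = OpenAI()

# Hypothetical schema for the spare-parts data.
schema = """
CREATE TABLE parts (
    id INTEGER PRIMARY KEY,
    name TEXT,
    category TEXT,           -- e.g. 'brakes', 'suspension'
    vehicle_make TEXT,
    vehicle_model TEXT,
    times_requested INTEGER
);
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system",
         "content": "You write a single read-only SQL SELECT statement for the schema "
                    "below. Return only the SQL, no explanation.\n" + schema},
        {"role": "user",
         "content": "Which vehicles have the most requested brake parts?"},
    ],
)

sql = response.choices[0].message.content
# Validate `sql` before running it against the database.
```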
Hi johnnonso090, thanks for the valuable response. Yeah, that’s eventually the end goal of the app I’m working on. It looks like there is currently no way of preserving context in RAG, at least in my setup, other than appending the data to the prompt each time. So it seems that determining the most relevant data beforehand and appending only that to the prompt is the way to go.
Just be careful to safeguard the resulting SQL to prevent data manipulation (or deletion).
Also, you will have to provide the table structure, and it might help to include some example SQL queries to guide the AI.
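A minimal guard might look something like this (assuming SQLite purely for illustration; adapt the check to your own database and driver):

```python
import sqlite3

# Crude allowlist: only a single SELECT statement, run on a read-only connection.
FORBIDDEN = ("insert", "update", "delete", "drop", "alter", "create", "truncate", ";")

def run_readonly(sql: str, db_path: str = "parts.db"):
    lowered = sql.strip().rstrip(";").lower()
    if not lowered.startswith("select") or any(word in lowered for word in FORBIDDEN):
        raise ValueError("Only plain SELECT statements are allowed")
    # Even if a bad statement slips past the check, a read-only connection
    # prevents it from modifying data.
    conn = sqlite3.connect(f"file:{db_path}?mode=ro", uri=True)
    try:
        return conn.execute(lowered).fetchall()
    finally:
        conn.close()
```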
Referencing this: “It looks like there is currently no way of preserving context in RAG”. The model that does the processing is not trained on any knowledge of your data, which is why context is so important: it only knows what to work with based on the PROMPT + CONTEXT. Think of it as lightweight database info you send along for it to work with.
Also, on cost: you might want to look at prompt compression tools, which can significantly reduce prompt cost.
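If you don’t want to depend on a dedicated compression tool, one simple approach is to summarize the retrieved records in batches with a cheaper model before building the final prompt. The batch size and model below are just assumptions:

```python
from openai import OpenAI

client = OpenAI()

def summarize_batch(records: list[str]) -> str:
    joined = "\n".join(records)
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": "Summarize these spare-part records, keeping vehicle, part "
                       "category, and request counts:\n" + joined,
        }],
    )
    return resp.choices[0].message.content

def compress(records: list[str], batch_size: int = 200) -> str:
    # Summarize in batches, then join the summaries; the final prompt sees a few
    # short summaries instead of all 5,000 raw records.
    summaries = [summarize_batch(records[i:i + batch_size])
                 for i in range(0, len(records), batch_size)]
    return "\n".join(summaries)
```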
Using database wrappers like Supabase or Directus might also help, as they provide a RESTful API that is easier for the model to understand, and you can control which endpoints and operations are allowed to avoid the risk of the model breaking your database with errors in the SQL.
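As a sketch of that idea with tool calling, you could expose only a single read-only endpoint to the model; the URL, tool name, and parameters here are placeholders:

```python
import json
import requests
from openai import OpenAI

client = OpenAI()

# The only operation exposed to the model: a read-only search.
tools = [{
    "type": "function",
    "function": {
        "name": "search_parts",
        "description": "Read-only search of the spare-parts table",
        "parameters": {
            "type": "object",
            "properties": {"category": {"type": "string"}},
            "required": ["category"],
        },
    },
}]

def search_parts(category: str) -> list[dict]:
    # GET against the REST wrapper; no write endpoints are exposed at all.
    resp = requests.get("https://example.com/api/parts",
                        params={"category": category}, timeout=10)
    return resp.json()

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user", "content": "Show me all brake parts"}],
    tools=tools,
)

# If the model decides to call the tool, run only the whitelisted function.
for call in response.choices[0].message.tool_calls or []:
    if call.function.name == "search_parts":
        print(search_parts(**json.loads(call.function.arguments)))
```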