Did the Assistants API kill manual RAG with vector databases?

Mansy · December 18, 2023, 10:50am

I'm building a RAG app and I'm unsure whether to use the OpenAI Assistants API or the older vector DB + LangChain approach. I expect PDF files to be uploaded for my bot. Can someone tell me, based on trial and error, which is better and why?

vb · December 18, 2023, 11:02am

The advantage of using the assistants API is that you take a file, provide it to the assistant and can immediately start your RAG application.

The advantage of building your own RAG from the ground up is that you have control over the quality of the retrieval. You decide the size of the chunks, how the best matches are selected for the user query, how many results you provide to the model, in which order, etc. You are also in control of the costs: you can manage how often the database is queried and how many input tokens are supplied to the model.

This means there is no one-size-fits-all answer and you should experiment with what matches your use case.
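To make those knobs concrete, here is a minimal sketch of a hand-rolled retrieval step with an explicit chunk size, overlap, and number of results. It is not tied to any particular library: `chunk_text`, `embed`, and `retrieve` are hypothetical names, and `embed` is a toy bag-of-words stand-in for a real embedding model, so the scores only illustrate the mechanics.

```python
import math
import re
from collections import Counter


def chunk_text(text, chunk_size=40, overlap=10):
    """Split text into windows of `chunk_size` words that overlap by `overlap` words."""
    if overlap >= chunk_size:
        raise ValueError("overlap must be smaller than chunk_size")
    words = text.split()
    if not words:
        return []
    step = chunk_size - overlap
    last_start = max(len(words) - overlap, 1)
    return [" ".join(words[i:i + chunk_size]) for i in range(0, last_start, step)]


def embed(text):
    """Toy stand-in for an embedding model (assumption): word-count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(query, chunks, top_k=3):
    """Return the `top_k` (score, chunk) pairs, best match first."""
    q = embed(query)
    scored = [(cosine(q, embed(c)), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


docs = ["the invoice total is 40", "weather is sunny", "total eclipse"]
best = retrieve("invoice total", docs, top_k=2)
```

Changing `chunk_size`, `overlap`, or `top_k` directly changes how much text reaches the model on each query, which is the kind of control the stock retrieval does not expose.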

Mansy · December 18, 2023, 11:07am

Hmm, I see. Can I hear your opinion about the Assistants API? I’m not sure I understood the pricing correctly: do I pay daily for files that are already stored?

vb · December 18, 2023, 11:17am

From the docs:

If you enable retrieval for a specific Assistant, all the files attached will be automatically indexed and you will be charged the $0.20/GB per assistant per day.

But I believe the issue will be that if the model starts querying the knowledge files and adding additional context to the user query, you will pay a lot more for input tokens on each turn of the conversation.

I recently had a conversation where things with the assistant retrieval got very costly very quickly, as the model added thousands of tokens to each turn of the conversation regardless.

So my best advice for using the assistants API is to focus your attention on both: the quality of the answers based on the retrieval and the overall costs for using the stock solution.

Then you can make an informed decision which way to go. If you are just entering this area then it’s generally good advice to build your own RAG anyways so that you get a better grip on what to expect at what costs.
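A rough back-of-the-envelope sketch of the two cost components discussed above. The storage rate is the $0.20/GB per assistant per day quoted from the docs. Everything else is an assumption for illustration: the function names are hypothetical, the token price is a parameter you must fill in from the current price list, and conversation history accumulating across turns is ignored.

```python
# Storage rate quoted from the docs in this thread.
STORAGE_USD_PER_GB_PER_ASSISTANT_PER_DAY = 0.20


def daily_storage_cost(gb_stored, assistants=1):
    """Daily fee for files indexed by retrieval-enabled assistants."""
    return gb_stored * assistants * STORAGE_USD_PER_GB_PER_ASSISTANT_PER_DAY


def conversation_input_cost(turns, base_tokens_per_turn,
                            retrieved_tokens_per_turn, usd_per_1k_input_tokens):
    """Input-token cost when retrieval adds extra context on every turn.

    Simplification (assumption): each turn is billed independently and the
    growing conversation history is not counted.
    """
    total_tokens = turns * (base_tokens_per_turn + retrieved_tokens_per_turn)
    return total_tokens / 1000 * usd_per_1k_input_tokens


# Hypothetical numbers: compare a turn with and without 3000 retrieved tokens.
with_retrieval = conversation_input_cost(10, 200, 3000, usd_per_1k_input_tokens=0.01)
without_retrieval = conversation_input_cost(10, 200, 0, usd_per_1k_input_tokens=0.01)
```

With numbers like these, the retrieved context dominates the bill far more than the storage fee, which matches the experience described above.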

merefield · December 18, 2023, 11:21am

More in this conversation:

Having now implemented a similarity threshold for my search results, my RAG search is even more efficient and the gap is even wider.

Consider Assistants API to be a beta product with some way to go I suspect … (but packaging up functionality in that way is a totally reasonable goal).
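For readers unfamiliar with the similarity threshold mentioned above, a minimal sketch of the idea: drop any search result whose score falls below a cutoff before it is sent to the model. The function name and the 0.75 cutoff are assumptions; this is not the poster's actual implementation, and a sensible cutoff depends on the embedding model.

```python
def apply_threshold(scored_chunks, min_score=0.75, top_k=5):
    """Keep at most `top_k` (score, chunk) pairs with score >= `min_score`, best first."""
    kept = [(score, chunk) for score, chunk in scored_chunks if score >= min_score]
    kept.sort(key=lambda pair: pair[0], reverse=True)
    return kept[:top_k]


results = [(0.91, "x"), (0.42, "y"), (0.80, "z")]
print(apply_threshold(results))  # [(0.91, 'x'), (0.8, 'z')]
```

If nothing clears the cutoff, the list is empty and no extra context tokens are added for that turn.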

shawnharris963 · December 18, 2023, 11:37am

The Assistants API has a file limit and can quickly stop working if you have a good number of uploads.

smit · December 18, 2023, 1:29pm

At the moment I prefer using a vector database myself: you have more control and it is cheaper, in my view. For instance, with a vector database you can prefilter what you want it to look at, whereas that is difficult with the Assistants API.
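A small sketch of what prefiltering means in practice: restrict the candidates by metadata first, then rank only what is left by vector similarity. The in-memory list, the record layout, and the `filtered_search` name are assumptions for illustration. Real vector databases expose metadata filters through their own query syntax.

```python
import math


def cosine(a, b):
    """Cosine similarity between two dense vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def filtered_search(query_vec, records, where, top_k=3):
    """Keep records whose metadata matches every key in `where`, then rank by similarity."""
    candidates = [
        r for r in records
        if all(r["meta"].get(key) == value for key, value in where.items())
    ]
    candidates.sort(key=lambda r: cosine(query_vec, r["vec"]), reverse=True)
    return candidates[:top_k]


# Hypothetical records: tiny 2-d vectors stand in for real embeddings.
records = [
    {"id": "a", "vec": [1.0, 0.0], "meta": {"customer": "acme", "year": 2023}},
    {"id": "b", "vec": [0.9, 0.1], "meta": {"customer": "globex", "year": 2023}},
    {"id": "c", "vec": [0.0, 1.0], "meta": {"customer": "acme", "year": 2022}},
]
hits = filtered_search([1.0, 0.0], records, where={"customer": "acme"})
print([h["id"] for h in hits])  # ['a', 'c']
```

Record "b" is a close match by similarity but never reaches the ranking step, because the filter excludes it up front.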

jlvanhulst · December 18, 2023, 3:20pm

There’s probably a place for both.

I am using Assistants to process incoming emails, which have attachments that need to be evaluated as well. Being able to create a thread - add the attachment and ‘go’ is great. The document is then uploaded somewhere else (or not, depending on the assistant outcome) - and can then be discarded in OpenAI as well.
I mostly have that type of ‘incoming’ non-static document. So for that purpose the Assistant model is great.

eslof.github · December 18, 2023, 3:54pm

One thing that is missing for me here:

Re-ranker. If OpenAI would create and give an option to use a cheap and fast reranker together with top-N limits for their RAG, I would not have to work on a personal implementation.
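For context, a minimal sketch of the two-stage pattern being asked for: a first retrieval pass returns a broad candidate list, a reranker rescores each candidate against the query, and only the top N survive. The `rerank_score` function here is a toy term-overlap stand-in (an assumption); in a real setup it would be replaced by a call to a reranking model.

```python
import re


def tokens(text):
    """Lowercased set of alphanumeric terms in `text`."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def rerank_score(query, passage):
    """Toy stand-in for a reranker model: fraction of query terms found in the passage."""
    q = tokens(query)
    return len(q & tokens(passage)) / len(q) if q else 0.0


def rerank(query, candidates, top_n=2):
    """Rescore first-stage candidates and keep only the `top_n` best."""
    ranked = sorted(candidates, key=lambda p: rerank_score(query, p), reverse=True)
    return ranked[:top_n]


first_stage = [
    "Shipping takes 3 days",
    "Our refund policy covers damaged items",
    "Refund requests",
]
print(rerank("refund policy for damaged items", first_stage, top_n=2))
# ['Our refund policy covers damaged items', 'Refund requests']
```

The `top_n` limit caps how many passages, and therefore how many input tokens, are passed on to the model.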
