Minimizing Costs in RAG Application

rajibdeb76 · March 16, 2023, 4:53am

Hi, I am planning to use a RAG approach to develop a Q&A solution with GPT. In this approach, I will convert the document corpus to embeddings and store them in a vector DB. At prompt time, I will retrieve the most similar documents and pass them into the prompt as additional context. The problem I see is that my documents change almost every week, which means I need to regenerate the embeddings every week, and that is an additional cost. Are there any best practices for reducing cost in such scenarios?

wfhbrian · March 16, 2023, 12:54pm

Embeddings are pretty cheap. As long as the embedding strategy is sound, it’s the completions that will likely be your biggest cost.

In one of my projects, embeddings need to be updated when files change, so I keep the modified time, file size, and content hash in the metadata for each embedding. This way I can check for changes before re-embedding.
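
A minimal sketch of that kind of change check, in Python. This is not the poster's actual code: the names `FileFingerprint`, `fingerprint`, and `needs_reembedding` are hypothetical, and it assumes the fingerprint is stored next to each embedding in whatever metadata field your vector DB offers.

```python
import hashlib
import os
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class FileFingerprint:
    """Hypothetical metadata record kept alongside each embedding."""
    mtime_ns: int  # last-modified time in nanoseconds
    size: int      # file size in bytes
    sha256: str    # hex digest of the file contents


def content_hash(path: str) -> str:
    """Hash the file in 1 MiB chunks so large files are not loaded at once."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def fingerprint(path: str) -> FileFingerprint:
    """Build the record to store after a file has been embedded."""
    st = os.stat(path)
    return FileFingerprint(st.st_mtime_ns, st.st_size, content_hash(path))


def needs_reembedding(path: str, stored: Optional[FileFingerprint]) -> bool:
    """Return True if the file is new or its contents differ from `stored`."""
    if stored is None:
        return True  # never embedded before
    st = os.stat(path)
    if st.st_mtime_ns == stored.mtime_ns and st.st_size == stored.size:
        return False  # cheap check passed: assume the file is unchanged
    # Modified time or size differs: confirm with the content hash, since a
    # file can be touched or rewritten without its contents changing.
    return content_hash(path) != stored.sha256
```

In a weekly refresh you would call `needs_reembedding` for each file, send only the files that return `True` to the embedding API, and then save a fresh `fingerprint` for them. The sketch treats a matching modified time and size as "unchanged" to avoid hashing every file; if that shortcut is too risky for your data, always compare the hash instead.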

lukaesch · June 23, 2023, 5:11pm

You can host and sell your embeddings on EmbedElite, which is a new marketplace platform for AI assets.

This way you can reduce your costs and even make extra money if buyers start using your embeddings/RAG chains.
