Looks like for longer docs it will do a vector search…
> **How it works**
> The model then decides when to retrieve content based on the user Messages. The Assistants API automatically chooses between two retrieval techniques:
> - it either passes the file content in the prompt for short documents, or
> - performs a vector search for longer documents
>
> Retrieval currently optimizes for quality by adding all relevant content to the context of model calls. We plan to introduce other retrieval strategies to enable developers to choose a different tradeoff between retrieval quality and model usage cost.
So it does indeed embed the docs. I'm still digging, but the only hang-up I'm seeing is the 20-file limit; honestly, though, you can do a lot with 20 files at up to 512 MB each. I'm not sure yet how it charges for the embedding functionality (if it does at all!). If it doesn't charge directly for embedding via this method, it will be an insane game changer for my particular use case.
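Since the practical constraint here is the per-assistant limits rather than the retrieval mechanics, here's a minimal sketch of a local pre-flight check before uploading anything. The limits assumed (20 files per assistant, 512 MB per file) come from the OpenAI docs at the time of writing, and `check_retrieval_limits` is a hypothetical helper name, not part of any SDK:

```python
import os

# Assumed Assistants API retrieval limits (per OpenAI docs at time of writing):
# up to 20 files per assistant, up to 512 MB per file.
MAX_FILES = 20
MAX_FILE_BYTES = 512 * 1024 * 1024

def check_retrieval_limits(paths):
    """Return a list of problems that would block attaching these files.

    Hypothetical pre-flight helper; all checks are local, no upload happens.
    """
    problems = []
    if len(paths) > MAX_FILES:
        problems.append(f"{len(paths)} files exceeds the {MAX_FILES}-file limit")
    for p in paths:
        size = os.path.getsize(p)
        if size > MAX_FILE_BYTES:
            problems.append(f"{p} is {size} bytes, over the 512 MB per-file cap")
    return problems
```

The actual upload would then go through the SDK (upload each file with `purpose="assistants"` and attach the resulting file IDs to the assistant), but that part depends on your client setup, so it's omitted here.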