Why is OpenAI Assistants retrieval so "file" oriented? How do you work around this?

From the docs:

import fs from "fs";
import OpenAI from "openai";

const openai = new OpenAI();

// Upload a file with an "assistants" purpose
const file = await openai.files.create({
  file: fs.createReadStream("knowledge.pdf"),
  purpose: "assistants",
});

// Add the file to the assistant
const assistant = await openai.beta.assistants.create({
  instructions: "You are a customer support chatbot. Use your knowledge base to best respond to customer queries.",
  model: "gpt-4-turbo-preview",
  tools: [{"type": "retrieval"}],
  file_ids: [file.id]
});

It’s not very intuitive to me. For example, I’m used to working with Supabase or a similar db, which is usually a list of objects, not a “file”.

Of course we can work around this, but I was wondering if something like this could have made sense:

const documents = [
  { data: 'This is a document' },
  { data: 'This is another document' },
  { data: 'This is a third document' },
  { data: 'This is a fourth document' },
  { data: 'This is a fifth document' },
  { data: 'This is a sixth document' },
  { data: 'This is a seventh document' },
  { data: 'This is an eighth document' },
  { data: 'This is a ninth document' },
  { data: 'This is a tenth document', metadata: { path: 'https://google.com/abcd' }}
]
const files = await Promise.all(documents.map((document) => openai.files.create({ data: document, purpose: "assistants" })));

// later, create the assistant with the mapped file ids
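For what it’s worth, you can get reasonably close to this today by serializing each object into its own small file before uploading. A rough sketch, assuming the Node SDK’s toFile helper and the same file_ids wiring as the docs snippet above (knowledgeBase and the file names are just illustrative):

import OpenAI, { toFile } from "openai";

const openai = new OpenAI();

// Illustrative stand-in for your list of objects / database rows
const knowledgeBase = [
  { data: "This is a document" },
  { data: "This is a tenth document", metadata: { path: "https://google.com/abcd" } },
];

// Each object becomes its own uploaded file with an "assistants" purpose
const files = await Promise.all(
  knowledgeBase.map(async (doc, i) =>
    openai.files.create({
      file: await toFile(Buffer.from(JSON.stringify(doc)), `doc-${i}.txt`),
      purpose: "assistants",
    })
  )
);

// Attach the uploaded file ids when creating the assistant
const assistant = await openai.beta.assistants.create({
  instructions: "Use the attached knowledge files to answer customer queries.",
  model: "gpt-4-turbo-preview",
  tools: [{ type: "retrieval" }],
  file_ids: files.map((f) => f.id),
});

Keep in mind the retrieval tool caps how many files one assistant can have attached, so very granular per-row files only go so far.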

How do you guys do retrieval over non-files using the Assistants API?

You have the option of implementing your own search locally, or at least via your own server.

Implement a function locally (on your server) and tell the assistant about it.
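As a rough illustration of that route (plain function calling, not the built-in retrieval), a sketch where search_documents is a hypothetical function you implement on your own server:

// Register a function tool; the assistant can then ask your server to search.
// "search_documents" and its parameters are hypothetical; you run the actual
// lookup (e.g. a Supabase query) yourself when the run requires action.
const assistant = await openai.beta.assistants.create({
  model: "gpt-4-turbo-preview",
  instructions: "Call search_documents when you need knowledge-base facts.",
  tools: [
    {
      type: "function",
      function: {
        name: "search_documents",
        description: "Search the knowledge base and return matching snippets",
        parameters: {
          type: "object",
          properties: {
            query: { type: "string", description: "What to look for" },
          },
          required: ["query"],
        },
      },
    },
  ],
});

// When a run reaches status "requires_action", execute the search yourself and
// return the results via submit_tool_outputs.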

I should have added:

WITHOUT function calling


Assistants’ built-in retrieval is based on attaching uploaded files to an assistant. It both inserts some of the document contents into context (always) and also gives the model an internal search function.

What you actually can “add” to inject your vector database knowledge:

additional_instructions, as a run parameter, which might take up to 32k characters like some other inputs,

containing a preface such as: “Here’s automatic knowledge retrieval for the AI to use, based on the user’s latest input:”
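
A minimal sketch of that injection, assuming you already have a thread and an assistant, and that retrieveRelevantChunks is your own vector-database lookup (the name is illustrative):

// Your own semantic search (e.g. Supabase/pgvector); returns an array of strings
const chunks = await retrieveRelevantChunks(latestUserMessage);

// Inject the results for this run only via the additional_instructions parameter
const run = await openai.beta.threads.runs.create(thread.id, {
  assistant_id: assistant.id,
  additional_instructions:
    "Here's automatic knowledge retrieval for the AI to use, based on the user's latest input:\n\n" +
    chunks.join("\n---\n"),
});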


… wow, that’s restrictive!

That’s a prompt, not retrieval; it doesn’t scale.

Okay, my guess is that one should read the database rows of each table and just “throw” them at the assistant as a txt file.

Pretty dirty, but it works.

Also, you need to re-sync manually every time your data updates.
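
That re-sync can at least be scripted. A rough sketch, reusing toFile from the earlier snippet and assuming fetchAllRows is your own database query and previousFileId is the id of the last upload you kept track of (all illustrative names):

// Dump the current table contents into one text blob
const rows = await fetchAllRows();
const dump = rows.map((r) => JSON.stringify(r)).join("\n");

// Remove the stale knowledge file, if any, then upload the fresh dump
if (previousFileId) await openai.files.del(previousFileId);
const file = await openai.files.create({
  file: await toFile(Buffer.from(dump), "knowledge.txt"),
  purpose: "assistants",
});

// Point the assistant at the new file
await openai.beta.assistants.update(assistant.id, { file_ids: [file.id] });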

No, that’s how you would inject automated semantic search results based on the user’s input, retrieved from a top-n, threshold-cutoff, embeddings-based vector database.

It scales because you are doing embeddings math on demand, not slow AI inference at the AI’s whim with its ability to iterate.

No need to waste full-context tokens letting the AI function-call to search for the same thing.

The additional “prompt” (a prompt is actually the ending of the context that denotes it is the AI’s turn to write as its own entity) lets the AI know why the text is there and that it applies to this chat turn.
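
Concretely, that on-demand embeddings math can be as simple as the sketch below, assuming a list of chunks you embedded ahead of time (an in-memory array here; a real setup would use pgvector or similar) and text-embedding-3-small as the embedding model; all names are illustrative:

// chunks: [{ text, embedding }] that you pre-computed and stored yourself
const cosine = (a, b) => {
  let dot = 0, na = 0, nb = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    na += a[i] * a[i];
    nb += b[i] * b[i];
  }
  return dot / (Math.sqrt(na) * Math.sqrt(nb));
};

// Embed only the user's latest input, then rank the pre-embedded chunks
async function topChunks(query, chunks, n = 5, threshold = 0.3) {
  const res = await openai.embeddings.create({
    model: "text-embedding-3-small",
    input: query,
  });
  const q = res.data[0].embedding;
  return chunks
    .map((c) => ({ ...c, score: cosine(q, c.embedding) }))
    .filter((c) => c.score >= threshold) // threshold cutoff
    .sort((a, b) => b.score - a.score)
    .slice(0, n) // top-n
    .map((c) => c.text);
}

// The joined results are what get passed as additional_instructions above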

I think you’re completely missing the point of the Assistants API.

If you are doing semantic search on your own, why are you even using Assistants? It’s supposed to abstract this away.
Otherwise, sure, I can just use raw LLM calls and do semantic search, function calling, code interpreter, etc. on my own infra.


You can just rewrite that to an all-purpose madlib:
If you are doing ____ why are you even using Assistants??
