Design approach to using assistants, rag, and files

I am looking for a better understanding and approach to the following.

I have a set of corporate PDF files that contain product coding information. For example, the catalog contains information on water pumps with details on uses, flow rate, size, and meantime between failures. I want to use these PDF files as RAG for the Open AI assistant. I assume that I just upload the files into the assistant. I can then ask questions and it pulls the information from the RAG, so that seems correct.

Next, I want to take a client PDF that contains a project they are working on and have the assistant take the PDF and then code it using the RAG files. Since all my client provide projects as PDF files, I need to be able to upload the file, process it, and then remove it so the next one can be loaded.

The openai assistant creates a thread when communicating with it. The thread has files. Should I upload the client file to the thread and not the assistant directly? How do I ask the AI to take the thread loaded file and enhance it with the codes found in the assistant RAG documents?

  1. Is this a good approach to the problem?
  2. I don’t know how to tell the AI to use a file for the input and use the RAG to generate an output. I’d prefer to do this in .net but can use python as well.

I have seen many videos and google articles on loading RAG documents and querying them with text, but none where the input is coming from a file to be processed by the RAG documents.

Any feedback would be appreciated.

Very interesting problem.

I would guess something like splitting the app in 5 distinct modules:

  1. Stock document consumption and conversion of item raw data into structured data models (entities in your DBs and vectorized objects).
  2. Client files consumption and conversion to structured data.
  3. Parsing structured client data and converting it into requirements standard descriptions.
  4. Storage / Retrieval / Processing engine that will be a combo of relational DB, vector DB, retrieval API responsible to handle requests in both storage engines, and processing API responsible to handle data operations requests from AI agent (you don’t want AI tell you how many valves of what size are needed to build a refinery, give it an API to use your code to calculate stuff like that).
  5. AI assistants team… You read me right, team of specialized agents that will work on specific tasks, a coordinator to be in the front line to talk with “user” and most likely a manager that will supervise the coordinator and the user to direct the thread into the right direction (toward the goal).

As for the tech: whatever you feel comfortable with and in your budget. The “brain” is accessible via a REST API request, so it doesn’t really matter on the tech side.

The biggest issue you will have is to describe the business processes so detailed that it can be converted into meaningful app capable of doing the job (and not just the POC draft) in the field. And weirdly the weakest point in here is humans on the field who cannot clearly describe what they do, how they do it and, the most important: why they do what they do to get the job done. If you nail that one: you’ll get a great business.

Hope that helps, feel free to reach out to me on LinkedIn if you want to discuss, this thing looks really interesting. Especially when you combine it with product/norm requirements data retrieval for the stock and processes.

Assistants does not have true RAG.

Files cannot exactly be “uploaded to a thread”.

Assistants’ AI is provided a single internal tool to which it can emit search queries, and then a vector store (or multiple combined vector stores) is searched for semantically similar chunks of documents.

A vector store can be created permanently and attached to “assistant”, or can be created automatically with a message file attachment which expires in a week (still leaving the file in storage, though).

The AI cannot distinguish between documents from a user and documents that are assistant-based in this system. It only receives search results.

A guideline would be, if you must use this:

  • disallow user uploads if the files are all curated knowledge documents.

An implementation guideline would be to:

  • since the file search tool is just presented to AI with no information about its contents or usefulness, you must
  • provide intensive system instructions telling the AI what it CANNOT answer without having received direct results by writing a file search.
1 Like