My goal is to build something that can reference a dynamic text file changed by a user sort of like a text editor and have a GPT4 powered chatbot that can automatically reference the most updated document. Initially I was going to provide the full text content for each query but this is very inefficient. What are possible solutions to build this relatively simple? Thanks.
Welcome @airex
I would recommend getting started with the Assistants API
I tried a simple implementation using the Assistant V2 and file_search with vectors. It seems to be working but have a few questions. The responses seem to take a little long, and I don’t understand how to adapt it to dynamic data once the original file is uploaded to reference.
@airex - If you like to get a bit more technical this is what I would advise. Use chat completions with tool_choice as required or Assistants with a good system prompt so it picks the tool when user asks a question related to the docs. Have a tool that performs RAG on the most recent documents uploaded by the user. You can pass in the results as context to submit a run. In this way you have more control over the retrievers, chunking and embeddings. Hope this helps. Cheers!
I see, but the thing is I’m working with a document text editor for example similar to google docs where the users can edit the content of the document freely. I’m stuck at how I can consistently have the AI reference this dynamic document. In my implementation with a static json document file, it worked fine and was able to reference the document using a vector store so i am not sure how to implement this when a user edits the document so the json is constantly being changed which will need to be passed to the chatbot somehow.
@airex - Have a pipeline which get invoked when user makes changes to the document. Once the user saves it or click on process. Have some UI play which shows the progress bar and in the background you could process the document, create embeddings and attach it to the vs and then update the progress bar with each step.
I see. Would you recommend to just re-embed the document as a whole again? Or is there a way to only update the embeddings that were changed. Would it be too complex to have something set up where it does this every so often so a user wouldn’t need to manually click process?
Also, I would have different AI “section” that would reference the same document text, for example one would be a chatbot, one would be a review section, etc. so how could I split it up and still reference the same document content without needing multiple assistants? Thanks.