What else did you tell it about the catalogue? What did you put in the Assistant instructions? The questions you list are only the ‘user part’. In the Assistant instructions you could start with something like: 'You are a warehouse manager that answers questions about our products. You have access to all the catalogues of the company. Each product has the following characteristics: year engineered and dimensions. Please note that a single product can be available in more than one dimension.
When answering questions make sure to always check all the catalogue documents and double check to list all products found.'
Hi @jlvanhulst thanks a lot for your reply.
I tried with these instructions:
You are a helpful assistant that uses only the files as source of knowledge. Questions not inherent to the documents have to be declined.
You answer questions about our products. You have access to all the catalogues of the company.
When answering questions make sure to always check all the catalogue documents and double check to list all products found.
But this is the chat:
User
which model is 180x34?
Bot-234
I’m sorry, but I couldn’t find the information about the specific model “180x34” in the document. If you have any other documents or relevant details, please feel free to provide them, and I will be happy to assist you further.
User
are you sure?
Bot-234
I will perform another check to ensure that I didn’t miss any relevant information about the model “180x34.”
Bot-234
The document does not provide any information about the model “180x34.” If there are additional documents or specific details related to the model, please provide them so that I can assist you further.
User
What about the dimensions of MODELLOUNO?
Bot-234
The dimensions available for MODELLOUNO are 100x50, 120x70, and 180x34【23†source】.
Just to double check: you have both documents added to the Assistant (not the thread)? So when you look in https://platform.openai.com/assistants and open the Assistant, you see two files under Files?
Now also note that your question is "Which MODEL is xxxx". You don’t use the word MODEL in your text file or in your prompt, so you would have to do a better job explaining that. What is a model? A product? How could it know?
Also, I would probably consider formatting the document differently (more like a CSV), especially if you are expecting more rows. I’m also curious whether adding quotes around the dimension will make a difference or not.
Just make sure to expand on the prompt, explaining explicitly what ‘MODELS’ are. If I have learned anything in the last few months, it is that my prompts are usually not explicit and detailed enough before I get them right.
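To make the CSV suggestion concrete, here is a minimal sketch of what one row per model/dimension pair with quoted fields could look like. The column names `model` and `dimension` are my assumption, and the rows contain only the values quoted in this thread (MODELLODUE may well have more sizes in the real catalogue).

```python
import csv
import io

# Only the models and dimensions quoted in this thread; the column names
# "model" and "dimension" are assumptions, not the real catalogue schema.
CATALOGUE = {
    "MODELLOUNO": ["100x50", "120x70", "180x34"],
    "MODELLODUE": ["180x34"],
}


def catalogue_to_csv(catalogue):
    """Return CSV text with one row per (model, dimension) pair, every field quoted."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, quoting=csv.QUOTE_ALL, lineterminator="\n")
    writer.writerow(["model", "dimension"])
    for model, dimensions in catalogue.items():
        for dimension in dimensions:
            writer.writerow([model, dimension])
    return buffer.getvalue()


if __name__ == "__main__":
    print(catalogue_to_csv(CATALOGUE))
```

Repeating the model name on every row means each line carries the full model/dimension pair, so a question like "which model is 180x34?" can be answered from a single line. Whether that actually improves retrieval would need testing.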
@factory.factory my experience so far is that the retrieval is just not good enough yet. What I’m doing instead is adding the relevant information to the thread in a message and then asking my questions about the data. So a combination of OpenAI Assistants + RAG.
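A rough sketch of that idea, under my own assumptions: the relevant rows are looked up locally and pasted into the message text before the question. The naive substring matching below is only a stand-in for a real retrieval step (for example an embedding search), and `CATALOGUE_ROWS`, `find_relevant_rows` and `build_message` are hypothetical names. The rows hold only values quoted in this thread.

```python
# Hypothetical in-memory catalogue; only values quoted in this thread are used.
CATALOGUE_ROWS = [
    ("MODELLOUNO", "100x50"),
    ("MODELLOUNO", "120x70"),
    ("MODELLOUNO", "180x34"),
    ("MODELLODUE", "180x34"),
]


def find_relevant_rows(question, rows):
    """Keep rows whose model name or dimension appears in the question (case-insensitive)."""
    lowered = question.lower()
    return [(m, d) for m, d in rows if m.lower() in lowered or d.lower() in lowered]


def build_message(question, rows):
    """Build the text of a user message: retrieved catalogue rows first, then the question."""
    relevant = find_relevant_rows(question, rows)
    if relevant:
        context = "\n".join(f"{model}: {dimension}" for model, dimension in relevant)
    else:
        context = "(no matching catalogue rows found)"
    return f"Catalogue data:\n{context}\n\nQuestion: {question}"


if __name__ == "__main__":
    print(build_message("which model is 180x34?", CATALOGUE_ROWS))
```

The returned string would then be sent as the content of the user message in the thread, so the model sees the data directly instead of depending on the built-in retrieval.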
Thank you @timpietrusky , I’m going to check the resources you posted!
Just an update: I noticed that if the assistant has already talked about both products, then it responds correctly.
It looks like the Assistant loads the info lazily but, once it is loaded, it uses both documents.
This is my last thread:
User
tell me about MODELLOUNO and MODELLODUE
Bot-234
…it reports all the content of the pdfs…
User
Which models are 180?
Bot-234
The models that have a size of 180 are: - MODELUNO: available sizes 180x34【16†source】. - MODELDUE: available sizes 180x34【19†source】. Both models offer a variant with a width of 180 cm.
…
If I open a new Thread without asking “tell me about MODELLOUNO and MODELLODUE”, then it does not respond correctly. It forgets about MODELLODUE every time.
But if I force it to load both models with “tell me about MODELLOUNO and MODELLODUE”, then it works well.
If I load a third PDF with a new model, I get the same result. I have to ask explicitly about MODEL1, MODEL2, MODEL3.
This behaviour is not acceptable for a production environment. I hope it can be fixed in future versions of GPT-4.
It would be awesome to inform the OpenAI dev team of these results.
From what I observed, the API does not check the knowledge base during periods of high load or during incidents (which occurred mostly due to model request load).
Are you specifying both file_ids in your Message? I don’t know for sure, but I think I got better results when I listed the files in the message and didn’t depend on the assistant using all the files it’s associated with.
I have a vector store with more than 1000 PDFs, and when I asked for information about a specific PDF I noticed that the assistant was retrieving the correct information only for the first 20-50 PDFs. So I followed the tip by @jlvanhulst to instruct the assistant to search in all files, and then it started working fine.
This is my instruction:
You are an assistant […]. To answer questions related to […] you must first search for the specific data in the uploaded files. If you do not find it in the first iteration, you must continue searching until you have read all the uploaded files. If you do not find something specific, you must say that there is nothing specific in the knowledge base, but that you are going to try to infer the information.