I have asked Deep Research a few research questions to date. Today, after I uploaded two new docs for it to use in a research question, I was surprised to see (while monitoring the Making Progress modal) that it started to reference a doc I had uploaded a few days ago for an unrelated question. It made a point of saying that it was looking at “user_files”. Does this mean that it keeps all uploaded files and references them for the new query? The end result showed a number of links that referenced this older file, so it does appear to do this. Is there any way to restrict the files a DR question can use? This behavior will throw all subsequent answers off. Any insights on this would help.
I am pretty sure that even if it uses older files, it will find the relevant information in them, and if there is none it will just ignore them.
Imagine a huge desk with all the files on it: you sort all the documents, and your brain tries to categorize everything while you search. It should work about the same way.
Documents are most probably chunked, then semantically grouped (maybe by creating vectors, or maybe even by multiple agents with specialized analysis tasks that save their results in a vector DB or a graph DB, or by fine-tuning or training small models on them). The next time a search runs, the same algorithm is used, so they can compare the vectors and see which chunks of the files are semantically close. They won’t just stack them all into a prompt and let the GPT decide (I hope)…
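To make the chunk-and-compare idea concrete, here is a toy sketch in Python. It is only a guess at the general technique, not how Deep Research actually works: the file names, the helper functions (`chunk`, `embed`, `cosine`, `build_index`, `search`) and the word-count "embedding" are all made up for illustration, and a real system would presumably use a learned embedding model and a proper vector store.

```python
import math
import re
from collections import Counter


def chunk(text, size=8):
    """Split text into chunks of at most `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]


def embed(text):
    """Toy stand-in for an embedding: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(files):
    """files: dict of file name -> text. Returns (name, chunk, vector) tuples."""
    return [(name, c, embed(c)) for name, text in files.items() for c in chunk(text)]


def search(index, query, top_k=2):
    """Return up to top_k chunks with non-zero similarity to the query."""
    q = embed(query)
    scored = [(cosine(q, vec), name, c) for name, c, vec in index]
    scored.sort(key=lambda item: item[0], reverse=True)
    return [item for item in scored[:top_k] if item[0] > 0]


# Hypothetical uploads: one on-topic, one unrelated to the query.
files = {
    "old_upload.txt": "Sea ice extent in the Arctic has declined over recent decades",
    "new_upload.txt": "Recipe notes: knead the dough and bake the bread for an hour",
}
index = build_index(files)
hits = search(index, "Arctic sea ice decline")
for score, name, text in hits:
    print(f"{score:.2f}  {name}  {text}")
```

Note that this kind of search only filters by similarity to the query. It has no notion of which upload belongs to which question, so every stored file whose chunks look similar enough would be a candidate.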
Thanks for the response. I, too, would hope that the older files would be filtered out, but unfortunately they are about the same topic (climate science). Is there a way to purge our “stored” user_files?