Assistants Retrieval Configuration

Hey everybody,

The following questions are for anyone who has already gained some experience with this topic.

How does it work under the hood? It's kind of tricky to work on such a delicate topic without any control over what happens behind the scenes. I would like to know whether there are any “preferences” for the data to upload, such as formats that perform better than others, ways to structure the data, use of paragraphs, or similar.

Finally, how does the model actually retrieve it? Just semantic similarity? They said it uses their “experience” to provide quality results, but you have to know how to use it as well, and right now it is just not documented.

Thank you!


Same here - knowing/controlling how the indexing works, and at minimum how to best structure the dataset (files), is very important.
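
Since the internals are not documented, the snippet below is only a generic illustration of what "semantic similarity" retrieval over paragraph-sized chunks usually looks like. It is not the Assistants implementation. It assumes files are split on blank lines, and it swaps real embeddings for a toy word-count vector so it runs with the standard library alone. All function names (`split_paragraphs`, `vectorize`, `cosine`, `retrieve`) and the sample text are made up for the example.

```python
import math
import re
from collections import Counter


def split_paragraphs(text):
    """Split a document into paragraph chunks on blank lines (assumed chunking rule)."""
    return [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]


def vectorize(text):
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def retrieve(query, chunks, top_k=1):
    """Return the top_k chunks ranked by similarity to the query."""
    q = vectorize(query)
    ranked = sorted(chunks, key=lambda c: cosine(q, vectorize(c)), reverse=True)
    return ranked[:top_k]


# Hypothetical file content, one topic per paragraph.
document = """Refunds are processed within five business days.

Shipping is free for orders above fifty euros.

Support is available by email on weekdays."""

chunks = split_paragraphs(document)
best = retrieve("how long do refunds take", chunks)
print(best[0])
```

If retrieval works anything like this, one self-contained topic per paragraph should make each chunk easier to match, but that is a guess until the actual behaviour is documented.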