Looks like for longer docs it will do a vector search…
> **How it works**
> The model then decides when to retrieve content based on the user Messages. The Assistants API automatically chooses between two retrieval techniques:
> - it either passes the file content in the prompt for short documents, or
> - performs a vector search for longer documents
>
> Retrieval currently optimizes for quality by adding all relevant content to the context of model calls. We plan to introduce other retrieval strategies to enable developers to choose a different tradeoff between retrieval quality and model usage cost.
So it does indeed embed the docs. I'm still digging, but the only hang-up I'm seeing is the 20-file limit; honestly, though, you can do a lot with 20 files at up to 512 MB each. I'm not sure yet how it charges for the embedding functionality (if it does at all!). If it doesn't charge directly for embedding via this method, it will be an insane game changer for my particular use case.
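Since the practical constraint here is the per-assistant limits rather than the retrieval mechanics, here's a minimal sketch of a local pre-flight check before uploading anything. The limits assumed (20 files per assistant, 512 MB per file) come from the OpenAI docs at the time of writing, and `check_retrieval_limits` is a hypothetical helper name, not part of any SDK:

```python
import os

# Assumed Assistants API retrieval limits (per OpenAI docs at time of writing):
# up to 20 files per assistant, up to 512 MB per file.
MAX_FILES = 20
MAX_FILE_BYTES = 512 * 1024 * 1024

def check_retrieval_limits(paths):
    """Return a list of problems that would block attaching these files.

    Hypothetical pre-flight helper; all checks are local, no upload happens.
    """
    problems = []
    if len(paths) > MAX_FILES:
        problems.append(f"{len(paths)} files exceeds the {MAX_FILES}-file limit")
    for p in paths:
        size = os.path.getsize(p)
        if size > MAX_FILE_BYTES:
            problems.append(f"{p} is {size} bytes, over the 512 MB per-file cap")
    return problems
```

The actual upload would then go through the SDK (upload each file with `purpose="assistants"` and attach the resulting file IDs to the assistant), but that part depends on your client setup, so it's omitted here.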