Is there a way for the API to fetch the text chunks that were retrieved by the file search tool for a given thread message? I’ve tried the Playground and the Python API, but they seem only able to display the file name(s), not the specific chunks.
I’d like to see the actual chunks because it would let users diagnose the RAG’s output, and it would also let us compare performance (e.g. answer faithfulness) across different RAG systems.
Vector stores cannot be used outside of Assistants.
You can set the chunk size and overlap yourself when creating the vector store, and you can set the maximum number of chunk results to return (the default is 20).
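As a minimal sketch, assuming the Assistants v2 parameter shapes from the public API reference, these are the payloads that control chunking and result count (the helper names here are my own; actually sending them requires the OpenAI SDK and an API key):

```python
# Sketch of the request payloads, not a definitive recipe. Field names follow
# the published Assistants v2 reference: a "static" chunking_strategy on the
# vector store, and max_num_results on the file_search tool spec.

def chunking_strategy(max_chunk_size_tokens=800, chunk_overlap_tokens=400):
    """Static chunking config passed when creating a vector store (defaults shown)."""
    return {
        "type": "static",
        "static": {
            "max_chunk_size_tokens": max_chunk_size_tokens,
            "chunk_overlap_tokens": chunk_overlap_tokens,
        },
    }

def file_search_tool(max_num_results=20):
    """file_search tool spec; max_num_results caps how many chunks are returned."""
    return {"type": "file_search", "file_search": {"max_num_results": max_num_results}}

print(chunking_strategy(512, 128))
print(file_search_tool(10))
```

You would pass the first dict as `chunking_strategy=` when creating the vector store (or attaching files), and the second in the assistant's `tools=` list.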
Then it’s not really RAG; it’s a search function. The search is a 256-dimension text-embedding-3-large embeddings search, using a query the AI wrote. There is no similarity threshold, just a maximum chunk count.
A typical search return is larger than what the AI can repeat back. You can ask a typical question, or ask it to send a specific search query to myfiles_browser, and then ask for a short report from which you can infer where in the documents the results came from.
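A hypothetical sketch of that diagnostic prompt, just building the user-message payload you would add to the thread (the wording and helper name are my own illustration, not an official technique):

```python
# Hypothetical helper: build a thread message that steers the assistant into
# running a specific file search and reporting back what it was shown.

def diagnostic_message(query):
    """User message asking for a specific search plus a short source report."""
    return {
        "role": "user",
        "content": (
            f"Search your files for: {query!r}. "
            "Then write a short report quoting each passage the search returned, "
            "so I can tell where in the documents the results came from."
        ),
    }

msg = diagnostic_message("text to speech endpoint")
print(msg["content"])
```

From the quoted passages in the reply, you can usually infer which chunks were retrieved, even though the API does not hand them to you directly.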
This may be more obfuscated if you didn’t provide plain text; in this case the file was the 1 MB OpenAI API YAML specification, and I asked how to generate speech. The actual assistant may be less helpful in dumping out the knowledge without some prompt engineering.