Inspecting File Search Chunks

compassjeff · October 29, 2024, 4:20pm

According to the documentation available here: https://platform.openai.com/docs/assistants/tools/file-search, it is possible to retrieve and inspect the file-search chunks that were used by the Assistants API by using the include parameter when retrieving the appropriate run step:

run_step = client.beta.threads.runs.steps.retrieve(
    thread_id="thread_abc123",
    run_id="run_abc123",
    step_id="step_abc123",
    include=["step_details.tool_calls[*].file_search.results[*].content"]
)

However, I am not getting any extra information back when using include. I have checked every run step for the run, and I know that at least 15k tokens' worth of context were used in the last thread I tested this with (not to mention the numerous FileCitations included in the Assistant’s response).

Has anyone actually seen this work? I can’t find any examples on the web of what the response should even look like.

If it is working for other people, is there anyone who is using AzureOpenAI who has this working properly? I know they don’t necessarily always maintain feature parity with OpenAI directly.

pingmeonavinash · November 18, 2024, 3:42pm

I am facing the same issue: I need to know the chunk or index of the text that was used to generate the response.
Did you get it to work, @compassjeff?

imihailov · December 11, 2024, 1:39pm

@compassjeff I got it to work by modifying the code a bit (I used extra_query= to pass the include parameter):

run_step_details = sync_client.beta.threads.runs.steps.retrieve(
    thread_id="thread_abc123",
    run_id="run_abc123",
    step_id="step_abc123",
    extra_query={
        "include": ["step_details.tool_calls[*].file_search.results[*].content"]
    }
)

Then I looped through the steps, excluded steps where there were no chunks, and retrieved the chunks from the step that had them. Hopefully this helps!
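A minimal sketch of that loop, for anyone looking for what the response might look like. It assumes each retrieved run step has been converted to a plain dict, and that the chunks sit under step_details.tool_calls[*].file_search.results[*].content as the include path suggests. The helper name collect_file_search_chunks and the exact field names (file_id, file_name, score, text) are assumptions for illustration, so check them against your own responses:

```python
from typing import Any, Dict, Iterable, List


def collect_file_search_chunks(steps: Iterable[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Gather file-search chunks from run steps given as plain dicts.

    Hypothetical helper: the field names follow the include path
    step_details.tool_calls[*].file_search.results[*].content and are
    assumptions, not a confirmed schema.
    """
    chunks: List[Dict[str, Any]] = []
    for step in steps:
        details = step.get("step_details") or {}
        # Skip steps that are not tool calls (e.g. message creation).
        if details.get("type") != "tool_calls":
            continue
        for call in details.get("tool_calls") or []:
            if call.get("type") != "file_search":
                continue
            file_search = call.get("file_search") or {}
            for result in file_search.get("results") or []:
                # "content" is missing or empty when include was not applied.
                texts = [
                    part.get("text", "")
                    for part in result.get("content") or []
                    if part.get("type") == "text"
                ]
                if not texts:
                    continue
                chunks.append(
                    {
                        "step_id": step.get("id"),
                        "file_id": result.get("file_id"),
                        "file_name": result.get("file_name"),
                        "score": result.get("score"),
                        "text": "\n".join(texts),
                    }
                )
    return chunks
```

With the Python SDK, the retrieved step objects should be convertible with something like run_step_details.model_dump() before being passed in, assuming they are pydantic models in your SDK version.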
