OpenAI file search gives low consistency for specific prompt

gunjanmimo · January 20, 2025, 12:01pm

I am developing a file search system where data is stored in Markdown format within a vector store and follows a specific structure. To improve efficiency, I have implemented a pre-filtering mechanism to narrow down the search scope.

For example, out of 1000 documents stored, the pre-filtering process selects the 30 most relevant documents to run the search on. From this subset, the system retrieves the top 5 results closest to the search query.

While the system achieves 100% precision in ensuring that all final search results come from the pre-filtered list, it suffers from low consistency. Running the same query multiple times produces different results from the same pre-filtered set of 30 documents.

How can I address this issue to ensure more consistent search output?

My current setup

    const run = await this.openai.beta.threads.runs.createAndPoll(thread.id, {
      assistant_id: searchAssistantId,
      additional_instructions: 'My prompt here',
      max_prompt_tokens: 20000,
      max_completion_tokens: 2000,
      // Force the model to call file_search on every run.
      tool_choice: { type: 'file_search' },
      tools: [
        {
          type: 'file_search',
          file_search: {
            max_num_results: limit,
            ranking_options: {
              score_threshold: 0.5,
              ranker: 'default_2024_08_21',
            },
          },
        },
      ],
      // Ask for the retrieved chunk contents in the run step details.
      include: ['step_details.tool_calls[*].file_search.results[*].content'],
    });
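
Since the `include` option above requests the raw search results, one option is to order those results deterministically on the client instead of relying on the order the model reports. The following is only a sketch: the interfaces are a minimal assumed shape of a run step (`file_id`, `file_name`, `score`), and `collectResults` is a hypothetical helper, not part of the OpenAI SDK.

```typescript
// Assumed, minimal shapes for the parts of a run step used here.
interface FileSearchResult {
  file_id: string;
  file_name: string;
  score: number;
}

interface ToolCallLike {
  type: string;
  file_search?: { results?: FileSearchResult[] };
}

interface RunStepLike {
  step_details: { type: string; tool_calls?: ToolCallLike[] };
}

// Hypothetical helper: gather all file_search results, keep the best score
// per file, then sort by score (descending) with file_id as a tie-break so
// equal scores always come out in the same order.
function collectResults(steps: RunStepLike[], topK: number): FileSearchResult[] {
  const best = new Map<string, FileSearchResult>();
  for (const step of steps) {
    if (step.step_details.type !== "tool_calls") continue;
    for (const call of step.step_details.tool_calls ?? []) {
      if (call.type !== "file_search") continue;
      for (const result of call.file_search?.results ?? []) {
        const previous = best.get(result.file_id);
        if (!previous || result.score > previous.score) {
          best.set(result.file_id, result);
        }
      }
    }
  }
  return Array.from(best.values())
    .sort((a, b) => {
      if (b.score !== a.score) return b.score - a.score;
      return a.file_id < b.file_id ? -1 : a.file_id > b.file_id ? 1 : 0;
    })
    .slice(0, topK);
}
```

The `steps` argument would be the run's steps, fetched with the same `include` value. This only stabilizes ordering and tie-breaking; it does not help if the set of retrieved chunks itself changes between runs.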

sathishthiru94 · January 20, 2025, 12:36pm

Check this link; it might help you out.

Enhancing Data Analysis with OpenAI’s File Search and Code Interpreter Assistants - Documentation - OpenAI Developer Forum

hugebelts · January 20, 2025, 1:21pm

Hey, @gunjanmimo. Welcome to the forum, by the way, and happy New Year 2025.

You show everything but the thing that seems to be most relevant, if I understood correctly.

The prompt seems to be inconsistent.

But it’s nowhere to be seen.

That’s like asking someone to repair your car while only describing what you think is wrong with it.
They may well ask: where’s your car, so I can have a look?

Or did I get something wrong? That is possible as well.

P.S. So the “moving” parts are:
the vector database AND the output of the prompt that handles the search query/queries.
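
To see how much the output actually moves, one could run the same query several times and compare the returned file IDs. The sketch below assumes each run has already been reduced to a list of file IDs; `jaccard` and `meanPairwiseOverlap` are hypothetical helper names, and Jaccard overlap is just one possible consistency measure.

```typescript
// Jaccard overlap of two ID lists: |A ∩ B| / |A ∪ B|, from 0 to 1.
function jaccard(a: string[], b: string[]): number {
  const setA = new Set(a);
  const setB = new Set(b);
  let shared = 0;
  setA.forEach((id) => {
    if (setB.has(id)) shared += 1;
  });
  const union = setA.size + setB.size - shared;
  // Two empty result lists are treated as identical.
  return union === 0 ? 1 : shared / union;
}

// Average overlap across all pairs of runs; 1 means every run returned
// the same set of files, lower values mean less consistent retrieval.
function meanPairwiseOverlap(runs: string[][]): number {
  let total = 0;
  let pairs = 0;
  for (let i = 0; i < runs.length; i++) {
    for (let j = i + 1; j < runs.length; j++) {
      total += jaccard(runs[i], runs[j]);
      pairs += 1;
    }
  }
  return pairs === 0 ? 1 : total / pairs;
}
```

Tracking this number while changing one thing at a time (the prompt, `score_threshold`, or the pre-filtered file set) would show which part is responsible for the variation.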
