File search disregards max_num_results

Even after specifying "file_search": { "max_num_results": 2 } in an assistants createRun request, the response sometimes fetches more than 2 annotations from the vector store. Is this a bug, or is there another way to do this?
CODE:

const axios = require("axios");

const createRunAndPollStatus = async (threadId, tools = [{
    "type": "file_search",
    "file_search": { "max_num_results": 2 }
}]) => {
    const delay = ms => new Promise(res => setTimeout(res, ms));
    const maxAttempts = 10; // Adjust based on your needs
    const headers = {
        "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
        "OpenAI-Beta": "assistants=v2"
    };

    try {
        // Create the run, overriding the file_search tool options
        const createResponse = await axios.post(`https://api.openai.com/v1/threads/${threadId}/runs`, {
            assistant_id: assistantId, // assistantId defined elsewhere
            tools: tools
        }, { headers });

        // Poll the run until it reaches a terminal status or attempts run out
        const runId = createResponse.data.id;
        for (let attempt = 0; attempt < maxAttempts; attempt++) {
            const { data: run } = await axios.get(
                `https://api.openai.com/v1/threads/${threadId}/runs/${runId}`, { headers });
            if (["completed", "failed", "cancelled", "expired"].includes(run.status)) return run;
            await delay(1000);
        }
        throw new Error("Run did not complete in time");
    } catch (error) {
        console.error(error.response?.data ?? error);
        throw error;
    }
};
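For reference, this is roughly how I'm checking the result afterwards: list the latest message on the thread and count its file_citation annotations. A minimal sketch; countCitations is just an illustrative helper name, and it assumes the run above has already completed:

// Illustrative helper: count file_citation annotations on the latest message in the thread
const countCitations = async (threadId) => {
    const { data } = await axios.get(`https://api.openai.com/v1/threads/${threadId}/messages?limit=1`, {
        headers: {
            "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
            "OpenAI-Beta": "assistants=v2"
        }
    });
    const latest = data.data[0]; // most recent message, i.e. the assistant's reply after the run
    const citations = (latest?.content ?? [])
        .filter(block => block.type === "text")
        .flatMap(block => block.text.annotations)
        .filter(annotation => annotation.type === "file_citation");
    console.log(`Got ${citations.length} file citations`); // frequently more than max_num_results
    return citations.length;
};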

Thanks

I'm using Azure OpenAI (API version 2024-07-01-preview) and have encountered the same problem. Even if I set this parameter to 3, the file search prompt tokens for the GPT-4 assistant still come to 16k, which equals 800 tokens * 20 chunks (the default chunk size and the default max_num_results), so the override appears to be ignored.

I think there might be some bugs on OpenAI’s side causing this parameter to malfunction.

from openai import AsyncAzureOpenAI

# Assumes the async Azure client; endpoint and key are read from the
# AZURE_OPENAI_ENDPOINT / AZURE_OPENAI_API_KEY environment variables
client = AsyncAzureOpenAI(api_version="2024-07-01-preview")

assistant = await client.beta.assistants.create(
    model="<PLACEHOLDER>",
    name="<PLACEHOLDER>",
    instructions="<PLACEHOLDER>",
    tools=[
        {
            "type": "file_search",
            "file_search": {
                # Expected to cap retrieval at 3 chunks, but the prompt still contains ~20
                "max_num_results": 3
            }
        }
    ],
    tool_resources={"file_search": {"vector_store_ids": ["<PLACEHOLDER>"]}},
)

Yes, I think it's a bug too. I also found that I had to specify it as a tool_choice to make sure it gets called every time:

    const createResponse = await axios.post(`https://api.openai.com/v1/threads/${threadId}/runs`, {
        assistant_id: assistantId,
        tools: tools,
        // Force the run to call file_search on every request
        tool_choice: { type: "file_search" }
    }, {
        headers: {
            "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
            "OpenAI-Beta": "assistants=v2"
        }
    });
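To double-check what the tool actually did on a given run, listing the run steps shows the file_search tool call and how many chunks it returned. A minimal sketch, assuming the run has completed; inspectFileSearchStep is just an illustrative name, and the include[] query parameter is the one the file search docs use to surface result contents (without it the results array may come back empty):

    // Illustrative helper: count the chunks the file_search tool call actually returned
    const inspectFileSearchStep = async (threadId, runId) => {
        const url = `https://api.openai.com/v1/threads/${threadId}/runs/${runId}/steps` +
            `?include[]=step_details.tool_calls[*].file_search.results[*].content`;
        const { data } = await axios.get(url, {
            headers: {
                "Authorization": `Bearer ${process.env.OPENAI_API_KEY}`,
                "OpenAI-Beta": "assistants=v2"
            }
        });
        for (const step of data.data) {
            if (step.type !== "tool_calls") continue;
            for (const call of step.step_details.tool_calls) {
                if (call.type === "file_search") {
                    console.log(`file_search returned ${call.file_search.results.length} chunks`);
                }
            }
        }
    };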

Definitely a bug. A few experiments led me to believe that max_num_results (set on the assistant or run object) is doubled. This workaround may help:

import math

# The API appears to return double the configured limit, so halve what we ask for
max_results = assistant.tools[0].file_search.max_num_results
max_results = math.ceil(max_results / 2)
user_prompt = f"{user_prompt}\n - retrieve a maximum of {max_results} items"