How to use multiple files in Assistants API

Hey dear friends,

I have a problem using multiple files in the Assistants API. A single file works fine, but with two I get this error: "I have been provided with two files, but unfortunately, neither of these files is accessible with the myfiles_browser tool. The file IDs are:

  1. file-tbi...
  2. file-Cxr...

For further assistance with these files, you would need to provide them in a format compatible with the myfiles_browser tool or request a different type of support that doesn’t require direct file interaction. If you could convert these files into a readable format or re-upload them, I would be able to assist you further."

This is my code:
[…]
files_to_upload = ["CSRD_compressed.pdf", "XHTMLSustainabilityTemplate.docx"]
file_ids = []

# Upload each file and store the returned file IDs
for file_name in files_to_upload:
    with open(file_name, "rb") as file_data:
        file_response = client.files.create(file=file_data, purpose='assistants')
        # The v1 Python client returns an object, so read the ID as an attribute
        file_ids.append(file_response.id)

assistant = client.beta.assistants.create(
    instructions="""
      Kim is ...
      """,
    model="gpt-4-1106-preview",
    tools=[{"type": "retrieval"}],
    file_ids=file_ids,
)

[…]

Do you see any obvious errors?

Thanks.

Cheers,
Artur

If assistants has an undocumented “myfiles_browser” tool function that lets the model iterate, at your expense, on reading parts of documents instead of performing the augmentation described, then it is likely that those files are still in their original form and cannot be parsed. A PDF does not necessarily contain text that an AI can read, for example when its pages are scanned images.

Try OCR-enhancing the PDF, then extract the text to a txt file, and place that into the retrieval system to reduce uncertainty about what the AI can actually read.
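
The same extract-to-txt idea can be sketched for the .docx file with only the Python standard library, since a .docx is a zip archive whose main text lives in word/document.xml. This is a rough sketch, not a full converter: the helper names docx_to_text and write_txt_copy are made up for illustration, and tables, headers, and footnotes are not handled specially. The PDF and OCR step would need a third-party tool and is not shown here.

```python
import zipfile
import xml.etree.ElementTree as ET
from pathlib import Path

# XML namespace used by WordprocessingML elements such as w:p and w:t
W_NS = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def docx_to_text(docx_path):
    """Return the paragraph text of a .docx file, one paragraph per line."""
    with zipfile.ZipFile(docx_path) as archive:
        root = ET.fromstring(archive.read("word/document.xml"))
    paragraphs = []
    for para in root.iter(W_NS + "p"):
        # Join every text run (w:t) found inside this paragraph
        text = "".join(node.text or "" for node in para.iter(W_NS + "t"))
        paragraphs.append(text)
    return "\n".join(paragraphs)


def write_txt_copy(docx_path):
    """Write the extracted text next to the source file; return the new path."""
    txt_path = Path(docx_path).with_suffix(".txt")
    txt_path.write_text(docx_to_text(docx_path), encoding="utf-8")
    return txt_path
```

The returned .txt path could then go into the files_to_upload list from the original post in place of the .docx name.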

Thanks very much for the answer.
So basically just converting PDF to txt? And what about .docx?

But I have to say that PDF and .docx each worked great as a single file; I only get this error since trying to use multiple files.

I cannot explain why the assistants feature works only occasionally or poorly, except to say that the more you investigate, the more often it turns out to perform poorly.

It may be that one document is placed completely into the model context window by one type of document parser, but with two documents of longer length, the AI is told to “go looking” by a tool that works poorly.
