The Assistants API cannot access files for some reason

For some reason, every time I attach a file_id to a message, the assistant cannot retrieve it. The purpose of the file is for retrieval. I would always get a message along the lines of this:

I’m sorry, but as an AI, I do not have the capability to download or view files. However, you can tell me about the contents of the document or ask questions, and I will do my best to provide relevant and accurate information based on the data and knowledge I have been trained on up until April 2023.

And some other variations of this message.

Here’s some sample code:

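import time

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

thread = client.beta.threads.create()

# Attach the uploaded file to the user message (v1-style `file_ids`).
message = client.beta.threads.messages.create(
    thread_id=thread.id,
    role="user",
    content="Tell me about this document.",
    file_ids=["file-xxx"],
)

run = client.beta.threads.runs.create(
    thread_id=thread.id, assistant_id="asst_xxx"
)

# Poll until the run reaches a terminal state, not only "completed",
# so a failed or expired run does not loop forever.
while True:
    time.sleep(5)
    retrieve_run = client.beta.threads.runs.retrieve(thread_id=thread.id, run_id=run.id)

    if retrieve_run.status in ("completed", "failed", "cancelled", "expired"):
        break

messages = client.beta.threads.messages.list(thread_id=thread.id)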

Has anyone encountered the same issue? I would like to know how you solved it. Thanks!


I had the same issue earlier today - and then a few hours later it worked again as expected. I suspect this is just the consequence of work being done behind the curtains on the API.

Tried it out again, and somehow it now works. Previously it would fail 100% of the time.

Whenever I mention the word “document” in my prompt, it seems to be able to find the file. However, if I don’t mention the word “document”, it’s hit or miss.

Also, sometimes when it does detect the file, it will mention things like:

I’m unable to access the entire document, but based on the sections I was able to review, it appears to be related to…

or something like

The file seems to be an … However, the text appears to be corrupted or garbled, making it difficult to comprehend.

Would you mind telling me the instructions that you gave to the assistant?


The file seems to be an … However, the text appears to be corrupted or garbled, making it difficult to comprehend.

Yeah I’m seeing this exact language for many of my requests as well. Has anyone found a solution here or gotten the attention of the OAI team?

Still experiencing this for PDFs on December 11th with both GPT-4 preview and 3.5-turbo.

I am using the following code with gpt-4-turbo-preview, which is copied directly from the OpenAI documentation:

thread = client.beta.threads.create(
    messages=[
        {
            "role": "user",
            "content": "my prompt",
            "file_ids": [file1.id],
        }
    ]
)

Every time I run it, I get some version of this message:
I’m sorry, but I can’t directly access or analyze the content of files uploaded. However, if you could provide some details or excerpts from the documents, I would be happy to help analyze those based on the instructions you’ve provided.

I read from others in this chain that it may be an intermittent problem. However, it has not worked for me no matter how many times I run it over the past few days.

I am assuming the problem is on OpenAI’s end and not with the code. If on their end, has anyone come up with a workaround?

FIX: Rookie mistake. I misunderstood the documentation: I thought that because I was not storing the docs with the Assistant, I did not need to turn “Retrieval” on for the assistant. Once I turned it on, everything worked fine.
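
For anyone who wants to check this from code rather than the dashboard, here is a minimal sketch. It assumes `assistant` is whatever `client.beta.assistants.retrieve(...)` returns, and that the tool is named `retrieval` (v1) or `file_search` (its v2 name). The helper names `tool_types` and `has_file_tool` are hypothetical, not part of the OpenAI SDK.

```python
# Assumption: `assistant` exposes a `.tools` list, as the object returned by
# client.beta.assistants.retrieve(...) does. Helper names are hypothetical.

FILE_TOOL_TYPES = {"retrieval", "file_search"}  # v1 name and its v2 rename


def tool_types(assistant):
    """Collect the `type` of every tool enabled on an assistant."""
    types = set()
    for tool in assistant.tools:
        # SDK objects expose `.type`; raw JSON dicts use the "type" key.
        types.add(tool["type"] if isinstance(tool, dict) else tool.type)
    return types


def has_file_tool(assistant):
    """True if the assistant has a retrieval / file_search tool enabled."""
    return bool(tool_types(assistant) & FILE_TOOL_TYPES)
```

Something like `has_file_tool(client.beta.assistants.retrieve("asst_xxx"))` returning `False` would point to the same mistake described above: the file is attached to the message, but the assistant has no tool that can read it.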

Hi, thanks for this clue!

Where do you turn on “Retrieval” as you mentioned?

I don’t see retrieval as an option on the assistants object.

I do see that you can make tools and resources (files) available.

But even with my assistant showing access to the files (when I do a GET Assistant call), I am still getting the same error that the assistant cannot access the file.

Very frustrating! :slight_smile:

Retrieval has been renamed to file search in the latest upgrade to V2 of the assistants api.

See this migration guide for more details.
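
As a rough illustration of what changes for messages, v1 took a flat `file_ids` list while v2 expects `attachments`, where each file names the tools allowed to use it. The sketch below converts one shape to the other on plain dicts. The function names are hypothetical, and it assumes every file should go to `file_search` unless told otherwise.

```python
def file_ids_to_attachments(file_ids, tool_type="file_search"):
    """Convert a v1-style list of file IDs into v2-style message attachments."""
    return [
        {"file_id": file_id, "tools": [{"type": tool_type}]}
        for file_id in file_ids
    ]


def migrate_message(v1_message, tool_type="file_search"):
    """Return a copy of a v1 message dict with `file_ids` replaced by `attachments`."""
    # Copy everything except the v1-only key; the input dict is not modified.
    message = {k: v for k, v in v1_message.items() if k != "file_ids"}
    file_ids = v1_message.get("file_ids", [])
    if file_ids:
        message["attachments"] = file_ids_to_attachments(file_ids, tool_type)
    return message
```

The resulting dict can be passed in the `messages` list of `client.beta.threads.create(...)` on a v2 client. The assistant itself still needs the `file_search` tool enabled for the attachment to be searched.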

Thanks. I’m using v2 and file_search is a ‘tool’, it’s not a switch you turn on.

From my understanding, file_search and code_interpreter are tools that do different things. file_search acts on vector stores. So you can run code_interpreter without file_search. They can be independent.

Basically, what I’m saying is: the ‘turn on retrieval’ idea is not the solution. I still get the ‘can’t access file’ error even though I have activated both tools for my assistant.

Interestingly, I did notice a big difference with the temperature setting when looking for files. Perhaps a low setting like 0.1 requires the prompt to state the file name exactly, whereas a setting of 1.0 seems to find it (in the playground).
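
One way to tell whether a given run actually called the file tool (for example, when comparing temperature settings) might be to look at its run steps instead of guessing from the reply text. This is only a sketch: `tools_called_in_run` is a hypothetical helper, `client` is assumed to be an `OpenAI()` client, and it assumes each step's `step_details.type` is either `"message_creation"` or `"tool_calls"`.

```python
def tools_called_in_run(client, thread_id, run_id):
    """Return the types of tools a run invoked, in the order its steps are listed."""
    called = []
    steps = client.beta.threads.runs.steps.list(thread_id=thread_id, run_id=run_id)
    for step in steps:
        details = step.step_details
        if details.type != "tool_calls":
            continue  # "message_creation" steps carry no tool calls
        for call in details.tool_calls:
            called.append(call.type)
    return called
```

If the returned list has no `file_search` (or `retrieval` on v1) entry, the model answered without ever searching the file, which would match the “I can’t access files” replies.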


I created gpt-3.5-turbo and gpt-4o assistants so that I could test which gave the better response. I was working on a tour package chatbot for a travel agency that is a potential client.
I used her package brochure sheets to create a JSON template and populated it with a few sample tours. I wrote explicit instructions. It seemed to work, but after the first few prompts it always says that it could not retrieve my data, which renders it useless.
I cloned it on gpt-4o, but got the same issues. Is this not doable, and should we look for a non-OpenAI solution?

To help the assistant reference the correct file in its answer, I explicitly add the file_id to the message text:

message_data = {
    "role": "user",
    "content": f"Using the document (file-id: {file_id}): {base_message}",
    "attachments": [{
        "file_id": file_id,
        "tools": [{"type": "file_search"}]
    }]
}

Here was my thought process:

  • First document in a thread becomes somewhat “sticky” as primary context
  • Subsequent documents are accessible but not prioritized
  • System can access all documents but tends to default to the first one
  • The file_search tool seems to maintain access to all files but lacks clear context switching

So - the solution is a crutch, but it works (almost always, anyways :sweat_smile:)!