Assistants API: retrieval file API is not working

This is still not working. I am using the API. To overcome the “I cannot access your files” error, I added the file IDs to the run instructions as follows:

run = Client.beta.threads.runs.create(
    thread_id=Thread.id,
    assistant_id=AssistantID,
    instructions=Assistant.instructions + "You have access to the following files:\n\n" +
                 "\nfile_ids: %s" % (threadfiles()) + "\n\nUse the files to answer the question.")

This time I do not get the “I cannot access your files” error, but the response to the query is “I don’t know”, even though I know the information is in the files.

I can get around this bug by doing the following:

  1. Attach the files to the thread when you create the thread.
  2. In subsequent queries, add messages with the same files attached; this time they are recognized. Here is my code, which generally works:
# First question in this topic
def startthread(query):
    global Thread
    Thread = Client.beta.threads.create(messages=[
        {
            "role": "user",
            "content": query,
            "file_ids": threadfiles()
        }
    ])
    return
  
# Subsequent questions in the same topic
def qtothread(query):
    global Thread

    thread_message = Client.beta.threads.messages.create(
      Thread.id,
      role="user",
      content=query,
      file_ids=threadfiles()
    )
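
For completeness, the run itself is created and polled in the usual way. A minimal sketch, reusing the same globals (Client, Thread, Assistant, AssistantID) as above and reading the newest message back as the answer (text content assumed):

import time

def runthread():
    # Start a run with the assistant's own instructions, then poll until it finishes.
    run = Client.beta.threads.runs.create(
        thread_id=Thread.id,
        assistant_id=AssistantID,
        instructions=Assistant.instructions)
    while run.status not in ("completed", "failed", "cancelled", "expired"):
        time.sleep(1)
        run = Client.beta.threads.runs.retrieve(thread_id=Thread.id, run_id=run.id)
    # The newest message in the thread holds the assistant's reply.
    messages = Client.beta.threads.messages.list(thread_id=Thread.id)
    return messages.data[0].content[0].text.value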

You still get the random BadRequestError. For example, in my last run, I had the following Q&A chain:

Q1. My first question.
A1. I don’t know

Q2: The answer is in the attached text files. Please read them and answer the question. (First question repeated.)
A2: Correct answer.

Q3: Follow-up question (this time only the question, without the preamble about the attached text files)
A3: Correct answer.

Q4: Another follow-up question

It bombed out with the following error:

BadRequestError                           Traceback (most recent call last)
...
 run = Client.beta.threads.runs.create(
     20             thread_id=Thread.id,
     21             assistant_id=AssistantID,
     22             instructions=Assistant.instructions)
...
...
raise self._make_status_error_from_response(err.response) from None
    886 except httpx.TimeoutException as err:
    887     if retries > 0:

BadRequestError: Error code: 400 - {'error': {'message': 'Failed to index file', 'type': 'invalid_request_error', 'param': None, 'code': None}}
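
I have not found a reliable fix for “Failed to index file”, but since the uploaded file object carries a status field, a simple pre-flight check before creating the run at least shows whether indexing ever finished. A sketch (it does not guarantee the error goes away; Client is the same global as above):

def files_ready(file_ids):
    # Check that every attached file reports status "processed" before running.
    for fid in file_ids:
        f = Client.files.retrieve(fid)
        if f.status != "processed":
            print("file %s is still %s" % (fid, f.status))
            return False
    return True

I call files_ready(threadfiles()) right before runs.create and skip the run if it returns False.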

The API is also very slow in giving answers; it was not like this before.

In conclusion, the above method is useful when you have the same set of files for every interaction in one thread. There is still the random BadRequestError, but I have a feeling that its cause is different.

I found a less-than-ideal solution but it seems to be working okay for me for now. Here’s what I’m doing:

  1. Upload the file, get the file id.
  2. Delete any previous thread that might exist and create a brand new one.
  3. Create a message on the new thread with the file id attached.
  4. Create a run with additional instructions: "Do not use myfiles_browser to access the file. Instead, write some python code to obtain the file contents as a string using this file path: /mnt/data/" + fileId.


It seems like, if it’s a new thread, it will read the file (for me, at least). Subsequent runs on the same thread always fail, so I am creating a new one every time. Of course, this is no good if thread continuity is important for your use case; luckily, in my case it is not. Adding this here, along with a rough sketch of the flow below, in case it helps anyone else!
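
A rough Python sketch of the four steps above (function and variable names are mine, and I am assuming the additional_instructions parameter on run creation; appending the extra prompt to the run’s instructions would work just as well):

from openai import OpenAI

client = OpenAI()
last_thread_id = None   # remember the previous thread so it can be deleted

def ask_about_file(path, question, assistant_id):
    global last_thread_id

    # 1. Upload the file and get its ID.
    file_id = client.files.create(file=open(path, "rb"), purpose="assistants").id

    # 2. Delete any previous thread and create a brand new one.
    if last_thread_id:
        client.beta.threads.delete(last_thread_id)
    thread = client.beta.threads.create()
    last_thread_id = thread.id

    # 3. Create a message on the new thread with the file ID attached.
    client.beta.threads.messages.create(
        thread.id, role="user", content=question, file_ids=[file_id])

    # 4. Create a run that tells the model to read the file with code
    #    instead of myfiles_browser.
    run = client.beta.threads.runs.create(
        thread_id=thread.id,
        assistant_id=assistant_id,
        additional_instructions=(
            "Do not use myfiles_browser to access the file. Instead, write some "
            "python code to obtain the file contents as a string using this file "
            "path: /mnt/data/" + file_id))
    return run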

Same problem here, but it does not happen all the time. Some days my OpenAI assistant bot uploads files, I query the content, and I receive textual and graphical answers. Other days I receive a standard negative answer such as “I cannot find any csv file, did you upload it?”. It is unpredictable.
According to the official documentation, after uploading a file it is necessary to attach it to the thread; that is probably the only way to “inform” the assistant that there is a file to take into consideration. Both tools, code interpreter and retrieval, are activated for the assistant, and the model is gpt-3.5-turbo-1106. After uploading the file, the OpenAI server sends the following (printed with PHP print_r):

OpenAI\Responses\Files\CreateResponse Object
(
    [id] => file-0KD1uVuJMNHaC2Fd7mZw3CBQ
    [object] => file
    [bytes] => 189220
    [createdAt] => 1706604908
    [filename] => myname.csv
    [purpose] => assistants
    [status] => processed
    [statusDetails] =>
    [meta:OpenAI\Responses\Files\CreateResponse:private] => OpenAI\Responses\Meta\MetaInformation Object
        (
            [requestId] => f058fec0f51234281410e4b3d7vn80ed
            [openai] => OpenAI\Responses\Meta\MetaInformationOpenAI Object
                (
                    [model] =>
                    [organization] => user-f5qx9xvfkopf71xboa67dsrn
                    [version] => 2020-10-01
                    [processingMs] => 395
                )
            [requestLimit] =>
            [tokenLimit] =>
        )
)

You can see that there is no information about the specific assistant.

In order to avoid any misunderstanding with the assistant, after uploading I send a specific message embedded in the code and then attach the file ID to the thread (PHP code):

$userMessage = 'Next instruction will refer to the file now uploaded';

$res = $clientOpenai->threads()->messages()->create(
    $threadId,
    ['role' => 'user', 'content' => $userMessage, 'file_ids' => [$fileId]]
);
$run = $clientOpenai->threads()->runs()->create(threadId: $threadId, parameters: ['assistant_id' => $botOpenaiId]);
$runId = $run->id;
do {
    sleep(1);
    $res = $clientOpenai->threads()->runs()->retrieve(threadId: $threadId, runId: $runId);
} while ($res->status !== 'completed');

Then I get back an object that contains arrays and other objects.
By carefully analysing the object received as the answer, I do not detect any anomaly. Therefore, I cannot attribute this to anything other than “beta” problems. I would prefer to find the kinds of problems that I can solve myself; beta problems can be solved exclusively on OpenAI’s side.
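
The object received back is the run; the actual answer still has to be read from the thread’s messages once the status is “completed”. A minimal sketch of that last step, shown with the Python SDK for brevity:

from openai import OpenAI

client = OpenAI()

def latest_answer(thread_id):
    # After the run completes, the newest message in the thread is the reply.
    messages = client.beta.threads.messages.list(thread_id=thread_id)
    for part in messages.data[0].content:
        if part.type == "text":
            return part.text.value
    return None   # e.g. an image-only reply from the code interpreter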


This general approach of keeping track of the OpenAI unique file ID helped me a lot to understand how to make the Assistant access files in a more consistent manner. Thank you for posting!