400 error when connecting files to an assistant

After uploading a file (.docx), connecting it to an assistant results in a 400 error with the following message:
{
  "error": {
    "message": "Files with extensions [.docx?api_token=xxxxxx] are not supported for retrieval. See https://platform.openai.com/docs/assistants/tools/supported-files",
    "type": "invalid_request_error",
    "param": null,
    "code": "unsupported_file"
  }
}

The file is a genuine .docx, but somehow it is not recognized as one.

This has happened occasionally before; normally there is no issue.
The file’s status is “processed”.

Is this a bug or did something change?

Short update: I tried rerouting the file via a non-public S3 bucket with a pre-signed URL, and now I receive the following error:

{
  "error": {
    "message": "Files with extensions [none] are not supported for retrieval. See https://platform.openai.com/docs/assistants/tools/supported-files",
    "type": "invalid_request_error",
    "param": null,
    "code": "unsupported_file"
  }
}

It seems some sort of primary recognition method is in place that looks at the plain extension before the MIME type?

This makes it very difficult to keep using assistant retrieval when the shared files are non-public.
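
For illustration only: the bracketed part of both error messages is consistent with an extension being read straight off a stored name that still carries the query string. The sketch below reproduces that pattern with Python's `os.path.splitext`. That the server works this way is an assumption, and the sample names are made up.

```python
import os.path

def naive_extension(stored_name: str) -> str:
    """Return the text from the last dot onward, query string and all."""
    _, ext = os.path.splitext(stored_name)
    # Fall back to "none" when the name contains no dot at all.
    return ext or "none"

print(naive_extension("report.docx?api_token=xxxxxx"))
print(naive_extension("signed-url-without-a-dot"))
```

If this guess is right, any name that ends in something other than a clean extension would be rejected regardless of the file's actual content.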

Facing the same issue, my files are just PDFs

Same here with rather routine PDF files. Nothing special about the file and it has a .pdf extension, etc.

Try toFile() with name parameter.

toFile(yourfile, 'xxxx.ext')

https://github.com/openai/openai-node/blob/master/src/uploads.ts#L102

Thanks for looking into this!

The way I’m consuming the API, I don’t know how to pass any parameter other than the file’s location, neither in the post-file step nor in the connect-file step.

Is there anything I’m overlooking?

I’m also trying to preconfigure the Content-Disposition in S3 (I don’t know if this works). Maybe this will return the filename instead of the pre-signed URL when the file access is authenticated. But I’m a bit afraid the name is already saved based on the initial path submitted in the post request.

Also facing the same issue:

        assistant_file = client.beta.assistants.files.create(
            assistant_id=assistant_id, file_id=file_id
        )

with a regular .pdf file gives

openai.BadRequestError: Error code: 400 - {'error': {'message': 'Files with extensions [none] are not supported for retrieval. See https://platform.openai.com/docs/assistants/tools/supported-files', 'type': 'invalid_request_error', 'param': None, 'code': 'unsupported_file'}}

I believe the issue is with the file name. If you upload a file, ‘example_file.pdf’ on the web via the assistants tab, it uploads with the correct file name. When I upload a file binary using the API it uploads the file with the default name ‘upload’, having no file extension. Currently trying to figure out how to specify the file name/extension… If anyone knows how to do that, it’ll probably solve this

I think I might have come to a solution here. When I upload the bytes directly:

response = self.client.files.create(
    file=filebytes,
    purpose="assistants"
)

And I try to create an assistant:

my_assistant = self.client.beta.assistants.create(
    model=self.engine,
    name=assistant_name,
    file_ids=[document_id],
    tools=[{"type": "retrieval"}],
    instructions=CREATE_ASSISTANT_INSTRUCTIONS
)

It returns error 400: Files with extensions [none] are not supported for retrieval.

But if I open a file with “rb” as in the documentation:

response = self.client.files.create(
    file=open("/path/test.pdf", "rb"),  # instead of filebytes
    purpose="assistants"
)

Then it works. It does look like the problem is that in the first upload the file is uploaded with the name “upload” and no extension, hence the error. In the second example the filename is test.pdf, and everything works smoothly.

So, if you can use open(), it will work because the file object returned by open() carries the name as an attribute (.name), in addition to the bytes. But in my case, I had to retrieve the file from Google Cloud Storage:

blob = self.bucket.blob(blob_name)
return blob.download_as_bytes()

Which returns the bytes, but it doesn’t return the name as metadata. So, I added it manually:

from io import BytesIO
filebytes = Cloud_Storege_Util().get_file_bytes(blob_name)
file_like_object = BytesIO(filebytes)
file_like_object.name = "nometest.pdf"

So first, I read the bytes from Cloud Storage. This would result in error 400, if I uploaded it directly to OpenAI. Then I use BytesIO to create an object, and finally add the name property to that object, in this case nometest.pdf. This uploads the file with the correct name, and while I haven’t tested this extensively, it seems to get rid of the error.

DISCLAIMER: I am just a junior dev so take everything I say with a grain of salt. :joy:
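
A small variation on the snippet above, as a sketch: instead of hard-coding the name, derive it from the blob name so the extension follows the stored object. `named_buffer` is a hypothetical helper, not part of any SDK, and it assumes the blob name itself ends in the right extension.

```python
from io import BytesIO
import posixpath

def named_buffer(blob_name: str, data: bytes) -> BytesIO:
    # Hypothetical helper: wrap raw bytes in a file-like object and give it
    # a .name attribute, since BytesIO has none by default.
    buffer = BytesIO(data)
    # Keep only the last path segment, e.g. "docs/2024/report.pdf" -> "report.pdf".
    buffer.name = posixpath.basename(blob_name)
    return buffer

file_like_object = named_buffer("docs/2024/report.pdf", b"%PDF-1.4 ...")
print(file_like_object.name)
```

The resulting object can then be passed as `file=` in the same `files.create` call as before.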

Thanks for your response, appreciated!
Unfortunately this doesn’t help me. I’m consuming the API via a no-code environment, so there aren’t many variables I can append…

I think for now there is not much I can do until the API allows sending a filetype, content_type or filename separate from the path, or checks the MIME type on the OpenAI side (regardless of the file path).

You can also pass a tuple for the ‘file’ arg in the create method to specify a file name. In the tuple you need [bytes, filename] and it will upload the bytes along with a specified file name that is not ‘upload’.

Worth mentioning that the correct order for the tuple to be passed in is (filename, filebytes) like this:

file = client.files.create(
    file=(filename, filebytes),
    purpose='assistants',
)

See this link for additional details.

Hey. This worked! Thanks!
It used to work for me without this but it suddenly broke. This fixed it.

I’ve just added “uploaded_file.txt” as filename.

Has anyone here managed to make assistant file uploads/attachments work through Laravel/PHP? The file attachment was working fine a week ago and then it suddenly broke. My function looks something like this:

    public function uploadFileToAssistant(string $assistantId, UploadedFile $file): AssistantFileResponse
    {
        // Upload the file to OpenAI
        $uploadResponse = OpenAI::files()->upload([
            'file' => fopen($file->getPathname(), 'r'),
            'purpose' => 'assistants',
        ]);

        $fileId = $uploadResponse->id;

        // Attach the uploaded file to the assistant
        $attachResponse = OpenAI::assistants()->files()->create($assistantId, [
            'file_id' => $fileId,
        ]);

        return $attachResponse;
    }

Yet I always get the dreaded

“message”: “Files with extensions [none] are not supported for retrieval. See https://platform.openai.com/docs/assistants/tools/supported-files”

As far as I can tell from scanning the PHP documentation, the “rb” parameter for file bytes that some solutions have mentioned is not available, so I don’t know how I could solve this in PHP.

Looks like there’s a Python fix above ( @Bucciamarcia ). Does anyone have a fix for JavaScript? This was working until recently.

let myfile = await openai.files.create({
  file: fs.createReadStream(req.files[key][0].path),
  purpose: "assistants",
});

This is STILL not working for PHP/Laravel.

Here’s a fix for JavaScript from a thread here:

const multer = require("multer");
const path = require("path");

// Set up storage engine
const storage = multer.diskStorage({
  destination: function (req, file, cb) {
    cb(null, "uploads/"); // Specify the directory where files should be saved
  },
  filename: function (req, file, cb) {
    // Build a unique name from the field name and keep the original extension,
    // so the stored file (and the later upload) ends in e.g. ".pdf"
    const uniqueSuffix = Date.now() + "-" + Math.round(Math.random() * 1e9);
    cb(
      null,
      file.fieldname + "-" + uniqueSuffix + path.extname(file.originalname)
    );
  },
});

const upload = multer({ storage: storage });

Adds the file extension to the stored file - works for me.

I’m not a code writer, so unfortunately this is all a bit abracadabra to me, but perhaps understanding some of the steps a bit better will help me figure out whether it is feasible to fix something in my situation at all (connecting to the OpenAI API from a no-code platform with my files stored in a non-public AWS S3 bucket).

When a file is uploaded, what happens? Am I sending the file over (so basically I am, or at least the running no-code app executing the request is, using the AWS pre-signed URL to fetch the file from S3 and send it over), or am I sending the file path to OpenAI so that OpenAI accesses and downloads the file?

I think this has something to do with the filename, filebytes mentioned before.

I just tried whether “Create vector store file” would perhaps bypass the extension/file-type check, using an old file whose name is a pure AWS signed URL, but unfortunately I also got:

Raw response for the API
Status code 400
{
  "error": {
    "message": "Files with extensions [none] are not supported for retrieval. See https://platform.openai.com/docs/assistants/tools/supported-files",
    "type": "invalid_request_error",
    "param": "file_id",
    "code": "unsupported_file"
  }
}

For my older files, the name was extracted from the URL as whatever was between the last ‘/’ and the ‘?’ (which also led to recognised files from AWS S3 pre-signed URLs), but now it is looking at what is behind the last #.

So an AWS pre-signed URL that looks like this (manipulated to remove some actual data):

https://abcde.s3.eu-central-1.amazonaws.com/FILE.docx?response-content-disposition=inline&X-Amz-Security-Token=IQoJb3JpZ2luX2VjEMz%2F%2F%2F%2F%2F%2F%2F%2F%2F%2FwEaCmV1LW5vcnRoLTEiRzBFAiEAv0kFoXcAvjqsDoPXHiNjBvvA2sJi7hLQzuPbajWp16cCIFo5aYhMZNXOHfTMR..........52%2Fn96DaR%2BuIXGlc2nr7r0vzgiBNZEQ%2BoW09%2FL1i%2FAxB7CllqQMdqVPo9Q85mpsc%2FlKqHDwWuHXNZmWtKufRPI2&X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Date=20240419T115336Z&X-Amz-SignedHeaders=host&X-Amz-Expires=7200&X-Amz-Credential=ASIAYS2NQ5SXOSU.....u-central-1%2Fs3%2Faws4_request&X-Amz-Signature=cb578be3116869869342b4f227e3dd18db2c871536a89cde4e55d55abcde

leads to a file within OpenAI named:

aws4_request&X-Amz-Signature=cb578be3116869869342b4f227e3dd18db2c871536a89cde4e55d55abcde

It would be really great if the filename extraction were reconfigured to follow standard URI rules (or perhaps there could be some optional info to pass in the API so that the URI is used for name recognition). Or does what I’m saying not make any sense?
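
To make the request concrete, here is a sketch of the two behaviours using only Python's standard library. It assumes (a guess from the observed file name, not confirmed behaviour) that the name is currently taken after the last `/` of the decoded URL, and contrasts that with taking the last segment of the URL path. The URL is a shortened, made-up stand-in for the pre-signed URL above, and both function names are hypothetical.

```python
import posixpath
from urllib.parse import unquote, urlsplit

def name_after_last_slash(url: str) -> str:
    # Assumed current behaviour: decode the whole URL (%2F becomes "/"),
    # then keep whatever follows the last "/".
    return unquote(url).rsplit("/", 1)[-1]

def name_from_path(url: str) -> str:
    # URI-aware alternative: ignore query and fragment and use the
    # last segment of the path component.
    return unquote(posixpath.basename(urlsplit(url).path))

url = (
    "https://abcde.s3.eu-central-1.amazonaws.com/FILE.docx"
    "?response-content-disposition=inline"
    "&X-Amz-Credential=EXAMPLE%2F20240419%2Feu-central-1%2Fs3%2Faws4_request"
    "&X-Amz-Signature=abc123"
)
print(name_after_last_slash(url))  # aws4_request&X-Amz-Signature=abc123
print(name_from_path(url))         # FILE.docx
```

The first result has no extension at all, which would match the [none] in the error; the second recovers the name as it is stored in S3.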