I think I might have come to a solution here. When I upload the bytes directly:
response = self.client.files.create(
    file=filebytes,
    purpose="assistants"
)
And I try to create an assistant:
my_assistant = self.client.beta.assistants.create(
    model=self.engine,
    name=assistant_name,
    file_ids=[document_id],
    tools=[{"type": "retrieval"}],
    instructions=CREATE_ASSISTANT_INSTRUCTIONS
)
It returns error 400: Files with extensions [none] are not supported for retrieval.
But if I open a file with “rb” as in the documentation:
response = self.client.files.create(
    file=open("/path/test.pdf", "rb"),  # filebytes,
    purpose="assistants"
)
Then it works. The problem seems to be that in the first upload the file is sent with the name “upload” and no extension, which triggers the error. In the second example the filename is test.pdf, and everything works smoothly.
So, if you can use open(), it will work because the file object that open() returns carries the filename (its name attribute) in addition to the bytes. But in my case, I had to retrieve the file from Google Cloud Storage:
blob = self.bucket.blob(blob_name)
return blob.download_as_bytes()
This returns the bytes, but not the filename. So, I added it manually:
from io import BytesIO
filebytes = Cloud_Storege_Util().get_file_bytes(blob_name)
file_like_object = BytesIO(filebytes)
file_like_object.name = "nometest.pdf"
So first, I read the bytes from Cloud Storage. Uploading them directly to OpenAI would result in error 400. Then I use BytesIO to create a file-like object, and finally set the name attribute on that object, in this case nometest.pdf. This uploads the file with the correct name, and while I haven’t tested this extensively, it seems to get rid of the error.
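Here is a rough sketch of the same workaround wrapped in two small helpers, in case it is useful. The names as_named_file and upload_blob are made up for illustration, and client is assumed to be an OpenAI client like the self.client above. Instead of hardcoding the filename, it reuses the base name of the blob so the extension carries over. I have not run this against the real API, so treat it as a starting point.

```python
from io import BytesIO
from pathlib import PurePosixPath


def as_named_file(data: bytes, filename: str) -> BytesIO:
    """Wrap raw bytes in a file-like object that carries a filename."""
    # Fail early if there is no extension, since that is what caused the 400.
    if not PurePosixPath(filename).suffix:
        raise ValueError(f"filename {filename!r} has no extension")
    file_like = BytesIO(data)
    # A plain BytesIO has no name attribute, so set one explicitly.
    file_like.name = filename
    return file_like


def upload_blob(client, data: bytes, blob_name: str):
    """Upload bytes under the blob's base name (hypothetical helper)."""
    # "folder/test.pdf" -> "test.pdf"
    filename = PurePosixPath(blob_name).name
    return client.files.create(
        file=as_named_file(data, filename),
        purpose="assistants",
    )
```

With the snippet above, the call would look something like upload_blob(self.client, filebytes, blob_name). I believe the openai Python client also accepts a (filename, bytes) tuple for the file argument, which would avoid BytesIO entirely, but I have not verified that here.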
DISCLAIMER: I am just a junior dev so take everything I say with a grain of salt.