I’m struggling with how to even handle this particular bug.
Since this condition does not raise an exception, does one have to parse the messages in every thread for certain keywords such as “myfiles_browser”?
And in some cases, it seems like the same keywords aren’t even being used but the end result is the same: we could not read the files associated with this request.
if ('re-upload' in last_content.lower()
        or last_content.startswith("I apologize")
        or last_content.startswith("Apologies")
        or last_content.startswith("I'm sorry for the misunderstanding")
        or "The file you've uploaded is not accessible with the tool that allows me to view" in last_content
        or 'I’m sorry for the confusion, but it seems there was an issue with the file accessing tool.' in last_content
        or 'myfiles_browser' in last_content):
    ...  # the reply indicates the files could not be read; handle it here (e.g. retry)
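For anyone who wants to reuse the check above, here is a minimal sketch of the same idea as a standalone function. The name `looks_like_file_access_failure` and the marker lists are hypothetical. They are collected from the replies quoted in this thread, not from any documented error contract, so expect both false positives and misses. Note that it matches any reply starting with "I'm sorry", which is broader than the original condition.

```python
# Hypothetical helper: the markers come from replies quoted in this thread,
# not from any official error contract, so treat this as a heuristic only.
FAILURE_SUBSTRINGS = (
    "re-upload",
    "myfiles_browser",
    "not accessible with the tool",
    "issue with the file accessing tool",
)
FAILURE_PREFIXES = ("i apologize", "apologies", "i'm sorry")


def looks_like_file_access_failure(reply: str) -> bool:
    """Return True when the reply resembles the file-access apology."""
    # Normalize case and the curly apostrophe so "I’m" and "I'm" match alike.
    text = reply.strip().lower().replace("\u2019", "'")
    if text.startswith(FAILURE_PREFIXES):
        return True
    return any(marker in text for marker in FAILURE_SUBSTRINGS)
```

Keeping the markers in one place makes it easier to add new phrasings as they show up, which seems to happen often with this bug.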
Same error here: it works in the playground but not with the API. It always mentions myfiles_browser, even though the file has been uploaded and I can see it within the files tab…
The fact that you have to re-upload a file to a thread in the playground, rather than reuse an existing file, is suspicious. Combined with the fact that we are billed the file's full token count for every message that uses it, this makes me believe that under the hood the whole file_ids mechanism is not working yet.
I really think that they use the uploaded file, take the content and make it part of their internal messages… Or am I on the wrong track here?
I see we are all in the same situation: we have tried adding a delay between requests and making the calls through PHP (curl) and Python, and nothing helps. In about 50% of responses the system returns the file access error. Hopefully it will be resolved soon, or they will make some kind of statement.
Do you get working annotations? When I try this, I don’t get the “myfiles_browser” error message from the model anymore. It replies and cites its sources with a syntax like 【13†source】, but without providing any annotations alongside.
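If the annotations really are missing, one fallback is to locate the raw markers in the text yourself. This is only a sketch: the pattern is inferred from the single example 【13†source】 quoted above, and the function names are made up for illustration. It does not recover which file or passage was cited, only where the markers sit.

```python
import re

# Matches markers such as 【13†source】: fullwidth brackets, digits, a dagger, a label.
# The pattern is inferred from one example in this thread, not from documentation.
CITATION_MARKER = re.compile(r"【(\d+)†([^】]*)】")


def find_citation_markers(text: str) -> list[tuple[int, str]]:
    """Return (number, label) pairs for every marker found in the text."""
    return [(int(m.group(1)), m.group(2)) for m in CITATION_MARKER.finditer(text)]


def strip_citation_markers(text: str) -> str:
    """Remove the markers so the text can be shown to a user as-is."""
    return CITATION_MARKER.sub("", text)
```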
You have to retrieve the message object and parse the annotation substrings to see them. From the OpenAI documentation:
# Retrieve the message object
message = client.beta.threads.messages.retrieve(
    thread_id="...",
    message_id="..."
)
# Extract the message content
message_content = message.content[0].text
annotations = message_content.annotations
citations = []
# Iterate over the annotations and add footnotes
for index, annotation in enumerate(annotations):
    # Replace the text with a footnote
    message_content.value = message_content.value.replace(annotation.text, f' [{index}]')
    # Gather citations based on annotation attributes
    if (file_citation := getattr(annotation, 'file_citation', None)):
        cited_file = client.files.retrieve(file_citation.file_id)
        citations.append(f'[{index}] {file_citation.quote} from {cited_file.filename}')
    elif (file_path := getattr(annotation, 'file_path', None)):
        cited_file = client.files.retrieve(file_path.file_id)
        citations.append(f'[{index}] Click <here> to download {cited_file.filename}')
        # Note: File download functionality not implemented above for brevity
# Add footnotes to the end of the message before displaying to user
message_content.value += '\n' + '\n'.join(citations)
I’ve noticed that including “add annotations” in your user message helps substantially, the same way that confirming the assistant has the files in question helps with retrieval. I’d suggest beefing up the assistant’s description/instructions with these additions. Also, hopefully these issues are just beta bugs and will be fixed soon.
@abhinavgujjar FYI I was able to get around this error by introducing a timeout after uploading the files and before creating the assistant, e.g.
// Upload a file with an "assistants" purpose
const file = await openai.files.create({
  file: fs.createReadStream('test.pdf'),
  purpose: 'assistants',
})
// wait for the file to be processed
log('Waiting for file to process...')
await wait(2000)
// Add the file to the assistant
const assistant = await openai.beta.assistants.create({
A better solution may be to poll the files endpoint to check that the file is available there.
EDIT: This only worked for so long. Even after waiting for 10 s and polling the Files API, I still ran into sporadic issues.
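For reference, the polling idea can be sketched roughly as below, in Python rather than the JavaScript above. `wait_until_processed` is a hypothetical helper, not part of any SDK, and the `"processed"` status value is an assumption about what the file object reports. As the EDIT above says, polling did not make the sporadic failures go away, so treat this as a mitigation at best.

```python
import time
from typing import Callable


def wait_until_processed(
    get_status: Callable[[], str],
    timeout_s: float = 30.0,
    interval_s: float = 1.0,
    ready: str = "processed",  # assumed status value, check it against the API reference
) -> bool:
    """Poll get_status() until it returns `ready` or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while True:
        if get_status() == ready:
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval_s)
```

With the Python client this might be called as `wait_until_processed(lambda: client.files.retrieve(file_id).status)`, assuming the retrieved file object exposes a `status` field. Passing the status lookup in as a callable keeps the loop independent of the SDK and easy to test.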
I have seen similar stuff. If you submit an empty message to a thread you can get back results like this. Data leakage? Security problem? Hallucination?
Beta is one thing…this API needs serious attention.