Hello community,
I’ve encountered a critical issue in a workflow I built around the GPT-4o model and the Assistants API, specifically when using file_search. Previously everything worked seamlessly: the model correctly identified the relevant requests and performed the appropriate file searches. However, since November 20, coinciding with the release of the latest version of the model, I’ve noticed an unexpected regression in this behavior.
The Issue:
After the update, the workflow no longer reliably identifies requests that should trigger file_search. The model often responds that the file has not been uploaded properly, even though the upload completes successfully, as verified by our logs. Notably, no changes were made to the workflow’s code before this issue started occurring.
File Type in Use:
The files being used in this workflow are .pcap files that are converted into .txt format. Previously, GPT-4o worked well with these converted files, correctly analyzing and referencing them. However, since the update, the model struggles to identify and process these files consistently, even when they are successfully uploaded.
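To clarify what “converted into .txt format” means in practice, here is a purely illustrative sketch; our actual conversion tooling may differ, and tshark (Wireshark’s CLI) is just one common way to dump a capture as text:

```python
import subprocess

# Purely illustrative: the real conversion step in our workflow may differ.
# Assumes tshark is installed and on PATH; -V prints the full packet detail
# as plain text, which is what gets uploaded instead of the raw .pcap.
with open("capture_01.txt", "w") as out:
    subprocess.run(["tshark", "-r", "capture_01.pcap", "-V"], stdout=out, check=True)
```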
Scenarios:
1. Before the Update:
- Files were automatically attached to messages and processed correctly.
- The model could perform searches and provide accurate responses based on the attached files.
2. After the Update:
- Even when files are successfully uploaded, the model often insists that files need to be reattached.
- Logs confirm that the files are properly stored in the vector_store, but the model remains unable to access them.
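To make the symptom concrete: after a run finishes, the run steps show whether the file_search tool was invoked at all. A minimal sketch of that check, assuming the Python SDK (the exact paths, e.g. the beta prefix, may differ between SDK versions; the IDs are placeholders):

```python
from openai import OpenAI

client = OpenAI()

# Placeholders: substitute the IDs of an existing thread and a finished run.
THREAD_ID = "thread_XXXX"
RUN_ID = "run_XXXX"

steps = client.beta.threads.runs.steps.list(thread_id=THREAD_ID, run_id=RUN_ID)

for step in steps.data:
    details = step.step_details
    if details.type == "tool_calls":
        for call in details.tool_calls:
            # A healthy run shows a tool call of type "file_search" here.
            print("tool call:", call.type)
    else:
        # "message_creation" steps are the model answering without any tool.
        print("step:", details.type)
```

In the failing runs described above, this is where one would expect to see no file_search call at all, only the message in which the model claims the file was not uploaded.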
Investigated Hypotheses:
1. File Upload Issues:
- Initially, files were uploaded and attached automatically with the messages. To address the issue, we adjusted the workflow to create the vector_store manually and upload the files to it (see the sketch after this list).
- Result: This adjustment did not resolve the issue.
2. Conflicts Related to Asynchronous Processing:
- We suspected that the run was being created before the file upload and processing had finished.
- Test: Even after waiting a significant amount of time and confirming that the files were available in the vector_store, the model still failed to recognize them.
3. Difficulty Recognizing Specific File Types:
- We initially suspected that the .txt files generated from .pcap data might be problematic, but further testing revealed that the model does generally work with these file types. However, it inconsistently fails to process them in certain workflows.
- Test: We confirmed that the files are not empty and conform to accepted formats for upload, yet the model still fails in some cases.
4. Workflow Order Issues:
- We investigated whether the sequence of creating the vector_store and attaching it to the thread might be causing the issue (the sequence is illustrated in the sketch after this list).
- Result: No consistent improvements were observed.
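For reference, the adjusted flow from hypotheses 1 through 4 looks roughly like the sketch below: a quick sanity check of the converted file, manual creation of the vector_store, a blocking upload that waits for processing, and only then attaching the store to the thread and starting the run. This is a sketch assuming the Python SDK; helpers such as upload_and_poll and create_and_poll exist in the current SDK, but the beta prefix and exact signatures may vary between versions, and the IDs and file names are placeholders:

```python
import os
from openai import OpenAI

client = OpenAI()

ASSISTANT_ID = "asst_XXXX"          # placeholder: assistant with the file_search tool enabled
CONVERTED_FILE = "capture_01.txt"   # illustrative name for a .pcap dumped to text

# Hypothesis 3: sanity-check the converted file before upload (non-empty).
if os.path.getsize(CONVERTED_FILE) == 0:
    raise ValueError("converted file is empty")

# Hypothesis 1: create the vector store manually instead of relying on
# message-level attachments.
vector_store = client.beta.vector_stores.create(name="pcap-analysis")

# Hypothesis 2: upload and block until processing finishes, so the run cannot
# start before the file has been indexed.
with open(CONVERTED_FILE, "rb") as f:
    batch = client.beta.vector_stores.file_batches.upload_and_poll(
        vector_store_id=vector_store.id,
        files=[f],
    )
print("batch status:", batch.status, "| file counts:", batch.file_counts)

# Hypothesis 4: attach the vector store to the thread only after the batch
# reports "completed", then create and poll the run.
thread = client.beta.threads.create(
    messages=[{"role": "user", "content": "Please analyze the attached capture file."}],
    tool_resources={"file_search": {"vector_store_ids": [vector_store.id]}},
)
run = client.beta.threads.runs.create_and_poll(
    thread_id=thread.id,
    assistant_id=ASSISTANT_ID,
)
print("run status:", run.status)
```

Even with this sequence, the behavior described above persists in most runs.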
Inconsistent Behavior:
One intriguing case: during a second execution of the same workflow (with no changes to the code), the model successfully identified and analyzed the uploaded files. However, in subsequent runs with different files the issue reappeared, which suggests an inconsistency in how the model handles these requests.
Logs from the vector_store clearly show that files are uploaded successfully, with the correct number of files stored. Despite this, the model does not reliably recognize the files in all attempts.
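For context, this can be verified from the API side by listing the vector store’s files and their processing status, plus the store’s file counts. A sketch of such a check, again assuming the Python SDK (the beta prefix may differ by version; the ID is a placeholder):

```python
from openai import OpenAI

client = OpenAI()

VECTOR_STORE_ID = "vs_XXXX"  # placeholder: the store created for the failing run

store = client.beta.vector_stores.retrieve(VECTOR_STORE_ID)
# file_counts reports completed / in_progress / failed / cancelled / total.
print("file counts:", store.file_counts)

for vs_file in client.beta.vector_stores.files.list(vector_store_id=VECTOR_STORE_ID):
    # A healthy file shows status "completed" and last_error None.
    print(vs_file.id, vs_file.status, vs_file.last_error)
```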
Questions:
Have other community members experienced similar issues with file_search after the GPT-4o update? Are there any documented changes in the model’s or API’s behavior that could explain this? Lastly, are there any recommendations for mitigating this issue while we await a potential fix or clarification?
Thank you in advance for your attention. Any insights or suggestions would be greatly appreciated!