I am developing a CRM application for insurance brokers that includes a module for extracting relevant information from insurance proposals in PDF format. For this functionality, I have integrated OpenAI's Assistants API, and the call flow is as follows:
- Creation of a `Vector_Store`;
- Upload of the PDF file;
- Attachment of the file to the vector store, specifying the IDs of both;
- Creation of a new `Thread`, specifying the ID of the `Vector_Store`;
- Execution of the `Thread`, providing the IDs of the `Vector_Store` and the PDF file;
- API call to `messages` to obtain the response.
The system is working very well in the application's beta version, which is currently used by 15 users daily. However, I am concerned about scaling to a larger user base after the official launch. My question is whether OpenAI's system can handle a high volume of concurrent requests without mixing up the data, as it is crucial that responses are never delivered to the wrong user, which could expose sensitive information.
I would like to know if there is a real risk of data intermingling between different requests, and if so, what best practices should be followed to prevent this issue.