File upload and acting on it in an assistant (v2) conversation

Context
I am developing a client/server system. The client is a Flutter-based app that provides a user interface to the functionality on the server. The server, in turn, handles the communication with, among other things, the OpenAI platform. I am using the Assistants API (v2) with a custom Google search function, code_interpreter, and file_search.

Goal
The user should be able to drag a file into the chat interface to make it available to the assistant and ask questions about it (e.g. upload the image of a circle and ask the assistant what it sees in the image).

Approach

  1. The file dropped by the user is uploaded to https://api.openai.com/v1/files
  2. The file-id is extracted from the reply after a successful upload
  3. The file-id is passed to the code that sends the user query and is added as an attachment (see implementation code below)
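
Condensed into a single server-side Python sketch, the intended flow looks like this (for illustration only; in the real setup step 1 happens in the Flutter client, and the file name and assistant ID below are placeholders):

    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    # 1. Upload the file (purpose "assistants" for tool use)
    with open("circle.png", "rb") as f:            # placeholder file name
        uploaded = client.files.create(file=f, purpose="assistants")

    # 2. Extract the file ID from the reply
    file_id = uploaded.id

    # 3. Attach the file ID to the user message and run the assistant
    thread = client.beta.threads.create()
    client.beta.threads.messages.create(
        thread_id=thread.id,
        role="user",
        content="What do you see in this file?",
        attachments=[{"file_id": file_id, "tools": [{"type": "code_interpreter"}]}],
    )
    run = client.beta.threads.runs.create_and_poll(
        thread_id=thread.id,
        assistant_id="asst_placeholder",           # placeholder assistant ID
    )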

Expected behavior
I drop the file into the chat, write the query, and send it to the server. The assistant then analyzes the file and answers the query about the content of the file.

Problem
I have tried various setups and approaches. The current one (see β€˜implementation’ below) throws an error:

Error code: 404 - {'error': {'message': 'Files ["file-nVZ2RiUPqj20fgfl7SJi8eqf"] were not found', 'type': 'invalid_request_error', 'param': None, 'code': None}}

Notes:

  • The files do exist (I checked via curl) and the file IDs are correct.
  • I have made sure that the same API key (a project key) is used both by the client and the server.

Questions

  1. Is this approach feasible / correct? If not, how to achieve the goal set out above?
  2. What is the reason for the error and how can it be fixed?
    2.1 Could it be a scoping issue (i.e. are files uploaded to https://api.openai.com/v1/files not accessible to assistants)? If so, what is the correct approach to reach the goal?

Implementation

Agent creation logic:

self.client = OpenAI(
    api_key=os.environ.get("OPENAI_API_KEY"),
    organization=os.environ.get("OPENAI_ORGANIZATION"),
)
# Create a thread
self.thread = self.client.beta.threads.create()
# Create the assistant
self.assistant = self.client.beta.assistants.create(
    name="Eve",
    instructions=Prompts.get_entity_instructions(),
    model="gpt-4o",
    tools=[
        {
            "type": "function",
            "function": {
                "name": "google_search",
                "description": "Search the internet for up-to-date information using google search.",
                "parameters": {
                    "type": "object",
                    "properties": {
                        "query": {
                            "type": "string",
                            "description": "Define the query to be used.",
                        }
                    },
                    "required": ["query"],
                },
            },
        },
        {"type": "code_interpreter"},
        {"type": "file_search"},
    ],
)

Here’s how I add/reference the file-id to the user query:

        # Add the user input (the actual query)
        self.client.beta.threads.messages.create(
            thread_id=self.thread.id,
            role="user",
            content=userInput['message'],
            attachments=[
                {
                    "file_id": fileID,
                    "tools": [
                        {"type": "code_interpreter"}
                    ]
                }
            ]
        )
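
(Aside for the image use case: as far as I can tell, a code_interpreter attachment does not let the model actually see an image; for "what do you see in the image", the file could instead be sent as an image_file content block. A sketch, assuming the file was uploaded with purpose "vision" rather than "assistants":)

        # Alternative sketch for images: pass the file as message content so
        # the model can see it (assumes upload with purpose="vision").
        self.client.beta.threads.messages.create(
            thread_id=self.thread.id,
            role="user",
            content=[
                {"type": "text", "text": userInput['message']},
                {"type": "image_file", "image_file": {"file_id": fileID}},
            ],
        )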

Here’s how I upload the file:

  Future<dynamic> openAIFileUpload(String? droppedFileData) async {
    String apiKey = dotenv.env['OPENAI_API_KEY'] ?? '';
    final MessageController messageController = Get.find();

    try {
      final uri = Uri.parse('https://api.openai.com/v1/files');
      final request = http.MultipartRequest('POST', uri)
        ..headers['Authorization'] = 'Bearer $apiKey'
        ..fields['purpose'] = 'assistants';

      if (droppedFileData != null) {
        request.files.add(
          http.MultipartFile.fromBytes(
            'file',
            base64Decode(droppedFileData),
            filename: 'uploaded_file',
          ),
        );
      }

      final response = await request.send();
      final responseBody = await response.stream.bytesToString();

      if (response.statusCode == 200) {
        LoggerService.logger.info('File uploaded successfully: $responseBody');
        return json.decode(responseBody);
      } else {
        LoggerService.logger.severe('Failed to upload file: $responseBody');
        messageController.showMessage('Failed to upload file: $responseBody', Severity.error);
        return json.decode(responseBody);
      }
    } catch (e) {
      LoggerService.logger.severe('Exception occurred while uploading file: $e');
      messageController.showMessage('Exception occurred while uploading file: $e', Severity.error);
      return {'error': e.toString()};
    }
  }
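
As a sanity check, the equivalent upload through the Python SDK on the server side would be something like this (a sketch; the file name is a placeholder). Doing the upload with the very same client that later calls messages.create would rule out any key or project mismatch between client and server:

    # Upload via the same OpenAI client the server later uses for
    # messages.create, so the file lives in the same project scope.
    with open("circle.png", "rb") as f:            # placeholder file name
        uploaded = self.client.files.create(file=f, purpose="assistants")
    logger.trace(f"Uploaded file ID: {uploaded.id}")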

The issue you have is likely related to project scoping.

Ensure that every request you make uses the same API key. Unless you are using an organization ID that is not your own, or one you cannot set as a default, do not send organization or project in headers, and do not have them set in environment variables either, since the SDK will scrape them from there anyway.

I removed the org ID from the agent creation (and also from the .env file) and double-checked that the same API key is used. Still the same error.

One thing I noticed: Flask logs a 500 error after the 404 (presumably the unhandled NotFoundError propagating out of the route handler):

INFO - httpx -  HTTP Request: POST https://api.openai.com/v1/threads/thread_E5iY1r5u7cZqsJ34Dt509SJF/messages "HTTP/1.1 404 Not Found"
INFO - werkzeug -  127.0.0.1 - - [06/Jul/2024 21:41:54] "POST /api/message HTTP/1.1" 500 -

curl-ing to check for a successful upload:

{
  "object": "file",
  "id": "file-0fa3WOWpk1R7rrLaiBOQVlR1",
  "purpose": "assistants",
  "filename": "uploaded_file",
  "bytes": 29568,
  "created_at": 1720294914,
  "status": "processed",
  "status_details": null
}

Here’s the full stack trace:

2024-07-06T22:09:43.696420+0200 [ERROR] [_receive_message] Error in message handler: Error code: 404 - {'error': {'message': 'Files ["file-cMmIH6kBJcxbEPp59ZLnn0KM"] were not found', 'type': 'invalid_request_error', 'param': None, 'code': None}}
Traceback (most recent call last):

  File "D:\dev\eve\run.py", line 6, in <module>
    main()
    β”” <Command main>

  File "D:\dev\eve\eveenv\Lib\site-packages\click\core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
           β”‚    β”‚     β”‚       β”” {}
           β”‚    β”‚     β”” ()
           β”‚    β”” <function BaseCommand.main at 0x00000293794B5940>
           β”” <Command main>
  File "D:\dev\eve\eveenv\Lib\site-packages\click\core.py", line 1078, in main
    rv = self.invoke(ctx)
         β”‚    β”‚      β”” <click.core.Context object at 0x0000029377600810>
         β”‚    β”” <function Command.invoke at 0x00000293794B6520>
         β”” <Command main>
  File "D:\dev\eve\eveenv\Lib\site-packages\click\core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
           β”‚   β”‚      β”‚    β”‚           β”‚   β”” {'server': True, 'adduser': None, 'deleteuser': None, 'listusers': False, 'interactive': False, 'ingest': None, 'textingest':...
           β”‚   β”‚      β”‚    β”‚           β”” <click.core.Context object at 0x0000029377600810>
           β”‚   β”‚      β”‚    β”” <function main at 0x00000293399F1A80>
           β”‚   β”‚      β”” <Command main>
           β”‚   β”” <function Context.invoke at 0x00000293794B4EA0>
           β”” <click.core.Context object at 0x0000029377600810>
  File "D:\dev\eve\eveenv\Lib\site-packages\click\core.py", line 783, in invoke
    return __callback(*args, **kwargs)
                       β”‚       β”” {'server': True, 'adduser': None, 'deleteuser': None, 'listusers': False, 'interactive': False, 'ingest': None, 'textingest':...
                       β”” ()
  File "D:\dev\eve\eveenv\Lib\site-packages\click\decorators.py", line 33, in new_func
    return f(get_current_context(), *args, **kwargs)
           β”‚ β”‚                       β”‚       β”” {'server': True, 'adduser': None, 'deleteuser': None, 'listusers': False, 'interactive': False, 'ingest': None, 'textingest':...
           β”‚ β”‚                       β”” ()
           β”‚ β”” <function get_current_context at 0x000002937948C860>
           β”” <function main at 0x00000293399F1760>

  File "D:\dev\eve\eve\eve.py", line 245, in main
    server.run()
    β”‚      β”” <function Server.run at 0x000002933B3889A0>
    β”” <eve.server.server.Server object at 0x0000029338A97210>

  File "D:\dev\eve\eve\server\server.py", line 326, in run
    self.server.serve_forever()
    β”‚    β”‚      β”” <function BaseWSGIServer.serve_forever at 0x00000293391A2FC0>
    β”‚    β”” <werkzeug.serving.BaseWSGIServer object at 0x00000293397ABC10>
    β”” <eve.server.server.Server object at 0x0000029338A97210>

  File "D:\dev\eve\eveenv\Lib\site-packages\werkzeug\serving.py", line 810, in serve_forever
    super().serve_forever(poll_interval=poll_interval)
                                        β”” 0.5
  File "C:\Users\mbo\AppData\Local\Programs\Python\Python311\Lib\socketserver.py", line 238, in serve_forever
    self._handle_request_noblock()
    β”‚    β”” <function BaseServer._handle_request_noblock at 0x000002937A25E8E0>
    β”” <werkzeug.serving.BaseWSGIServer object at 0x00000293397ABC10>
  File "C:\Users\mbo\AppData\Local\Programs\Python\Python311\Lib\socketserver.py", line 317, in _handle_request_noblock
    self.process_request(request, client_address)
    β”‚    β”‚               β”‚        β”” ('127.0.0.1', 14632)
    β”‚    β”‚               β”” <socket.socket fd=3300, family=2, type=1, proto=0, laddr=('127.0.0.1', 7228), raddr=('127.0.0.1', 14632)>
    β”‚    β”” <function BaseServer.process_request at 0x000002937A25EAC0>
    β”” <werkzeug.serving.BaseWSGIServer object at 0x00000293397ABC10>
  File "C:\Users\mbo\AppData\Local\Programs\Python\Python311\Lib\socketserver.py", line 348, in process_request
    self.finish_request(request, client_address)
    β”‚    β”‚              β”‚        β”” ('127.0.0.1', 14632)
    β”‚    β”‚              β”” <socket.socket fd=3300, family=2, type=1, proto=0, laddr=('127.0.0.1', 7228), raddr=('127.0.0.1', 14632)>
    β”‚    β”” <function BaseServer.finish_request at 0x000002937A25EC00>
    β”” <werkzeug.serving.BaseWSGIServer object at 0x00000293397ABC10>
  File "C:\Users\mbo\AppData\Local\Programs\Python\Python311\Lib\socketserver.py", line 361, in finish_request
    self.RequestHandlerClass(request, client_address, self)
    β”‚    β”‚                   β”‚        β”‚               β”” <werkzeug.serving.BaseWSGIServer object at 0x00000293397ABC10>
    β”‚    β”‚                   β”‚        β”” ('127.0.0.1', 14632)
    β”‚    β”‚                   β”” <socket.socket fd=3300, family=2, type=1, proto=0, laddr=('127.0.0.1', 7228), raddr=('127.0.0.1', 14632)>
    β”‚    β”” <class 'werkzeug.serving.WSGIRequestHandler'>
    β”” <werkzeug.serving.BaseWSGIServer object at 0x00000293397ABC10>
  File "C:\Users\mbo\AppData\Local\Programs\Python\Python311\Lib\socketserver.py", line 755, in __init__
    self.handle()
    β”‚    β”” <function WSGIRequestHandler.handle at 0x00000293391A2340>
    β”” <werkzeug.serving.WSGIRequestHandler object at 0x0000029339873990>
  File "D:\dev\eve\eveenv\Lib\site-packages\werkzeug\serving.py", line 391, in handle
    super().handle()
  File "C:\Users\mbo\AppData\Local\Programs\Python\Python311\Lib\http\server.py", line 432, in handle
    self.handle_one_request()
    β”‚    β”” <function BaseHTTPRequestHandler.handle_one_request at 0x0000029339239120>
    β”” <werkzeug.serving.WSGIRequestHandler object at 0x0000029339873990>
  File "C:\Users\mbo\AppData\Local\Programs\Python\Python311\Lib\http\server.py", line 420, in handle_one_request
    method()
    β”” <bound method WSGIRequestHandler.run_wsgi of <werkzeug.serving.WSGIRequestHandler object at 0x0000029339873990>>
  File "D:\dev\eve\eveenv\Lib\site-packages\werkzeug\serving.py", line 363, in run_wsgi
    execute(self.server.app)
    β”‚       β”‚    β”‚      β”” <Flask 'Eve Server'>
    β”‚       β”‚    β”” <werkzeug.serving.BaseWSGIServer object at 0x00000293397ABC10>
    β”‚       β”” <werkzeug.serving.WSGIRequestHandler object at 0x0000029339873990>
    β”” <function WSGIRequestHandler.run_wsgi.<locals>.execute at 0x00000293399E00E0>
  File "D:\dev\eve\eveenv\Lib\site-packages\werkzeug\serving.py", line 324, in execute
    application_iter = app(environ, start_response)
                       β”‚   β”‚        β”” <function WSGIRequestHandler.run_wsgi.<locals>.start_response at 0x00000293399E0040>
                       β”‚   β”” {'wsgi.version': (1, 0), 'wsgi.url_scheme': 'http', 'wsgi.input': <_io.BufferedReader name=3300>, 'wsgi.errors': <_io.TextIOW...
                       β”” <Flask 'Eve Server'>
  File "D:\dev\eve\eveenv\Lib\site-packages\flask\app.py", line 1498, in __call__
    return self.wsgi_app(environ, start_response)
           β”‚    β”‚        β”‚        β”” <function WSGIRequestHandler.run_wsgi.<locals>.start_response at 0x00000293399E0040>
           β”‚    β”‚        β”” {'wsgi.version': (1, 0), 'wsgi.url_scheme': 'http', 'wsgi.input': <_io.BufferedReader name=3300>, 'wsgi.errors': <_io.TextIOW...
           β”‚    β”” <function Flask.wsgi_app at 0x000002933B30AAC0>
           β”” <Flask 'Eve Server'>
  File "D:\dev\eve\eveenv\Lib\site-packages\flask\app.py", line 1473, in wsgi_app
    response = self.full_dispatch_request()
               β”‚    β”” <function Flask.full_dispatch_request at 0x000002933B30A200>
               β”” <Flask 'Eve Server'>
  File "D:\dev\eve\eveenv\Lib\site-packages\flask\app.py", line 880, in full_dispatch_request
    rv = self.dispatch_request()
         β”‚    β”” <function Flask.dispatch_request at 0x000002933B30A160>
         β”” <Flask 'Eve Server'>
  File "D:\dev\eve\eveenv\Lib\site-packages\flask\app.py", line 865, in dispatch_request
    return self.ensure_sync(self.view_functions[rule.endpoint])(**view_args)  # type: ignore[no-any-return]
           β”‚    β”‚           β”‚    β”‚              β”‚    β”‚            β”” {}
           β”‚    β”‚           β”‚    β”‚              β”‚    β”” '_receive_message'
           β”‚    β”‚           β”‚    β”‚              β”” <Rule '/api/message' (POST, OPTIONS) -> _receive_message>
           β”‚    β”‚           β”‚    β”” {'static': <function Flask.__init__.<locals>.<lambda> at 0x00000293399F1300>, '_hello_world': <function Server._setup_routes....
           β”‚    β”‚           β”” <Flask 'Eve Server'>
           β”‚    β”” <function Flask.ensure_sync at 0x000002933B30A3E0>
           β”” <Flask 'Eve Server'>
  File "D:\dev\eve\eveenv\Lib\site-packages\flask_jwt_extended\view_decorators.py", line 170, in decorator
    return current_app.ensure_sync(fn)(*args, **kwargs)
           β”‚                       β”‚    β”‚       β”” {}
           β”‚                       β”‚    β”” ()
           β”‚                       β”” <function Server._setup_routes.<locals>._receive_message at 0x00000293399F2A20>
           β”” <Flask 'Eve Server'>

> File "D:\dev\eve\eve\server\server.py", line 122, in _receive_message
    handler_response = self.message_handler({
                       β”‚    β”” <function server_message_handler at 0x00000293399F11C0>
                       β”” <eve.server.server.Server object at 0x0000029338A97210>

  File "D:\dev\eve\eve\eve.py", line 389, in server_message_handler
    reply = process_message(message, False)
            β”‚               β”” {'message': 'what is in the image?', 'plaintext_attachment': None, 'file_ids': '["file-cMmIH6kBJcxbEPp59ZLnn0KM"]'}
            β”” <function process_message at 0x00000293399F1120>

  File "D:\dev\eve\eve\eve.py", line 353, in process_message
    raw_result = fc_oai.run(userInput, systemInput)
                 β”‚      β”‚   β”‚          β”” ["SYSTEM: The user input's detected emotion is: curiosity. Keep this in mind when creating your response."]
                 β”‚      β”‚   β”” {'message': 'what is in the image?', 'plaintext_attachment': None, 'file_ids': '["file-cMmIH6kBJcxbEPp59ZLnn0KM"]'}
                 β”‚      β”” <function OpenAIFlowControl.run at 0x00000293398C6FC0>
                 β”” <eve.control.control_openai.OpenAIFlowControl object at 0x00000293397AB950>

  File "D:\dev\eve\eve\control\control_openai.py", line 96, in run
    self.client.beta.threads.messages.create(
    β”‚    β”‚      β”‚    β”‚       β”‚        β”” <function Messages.create at 0x0000029335CED3A0>
    β”‚    β”‚      β”‚    β”‚       β”” <openai.resources.beta.threads.messages.Messages object at 0x000002933B8B7B90>
    β”‚    β”‚      β”‚    β”” <openai.resources.beta.threads.threads.Threads object at 0x00000293397E5D10>
    β”‚    β”‚      β”” <openai.resources.beta.beta.Beta object at 0x00000293397E4290>
    β”‚    β”” <openai.OpenAI object at 0x00000293397C8FD0>
    β”” <eve.control.control_openai.OpenAIFlowControl object at 0x00000293397AB950>

  File "D:\dev\eve\eveenv\Lib\site-packages\openai\resources\beta\threads\messages.py", line 87, in create
    return self._post(
           β”‚    β”” <bound method SyncAPIClient.post of <openai.OpenAI object at 0x00000293397C8FD0>>
           β”” <openai.resources.beta.threads.messages.Messages object at 0x000002933B8B7B90>
  File "D:\dev\eve\eveenv\Lib\site-packages\openai\_base_client.py", line 1240, in post
    return cast(ResponseT, self.request(cast_to, opts, stream=stream, stream_cls=stream_cls))
           β”‚    β”‚          β”‚    β”‚       β”‚        β”‚            β”‚                  β”” None
           β”‚    β”‚          β”‚    β”‚       β”‚        β”‚            β”” False
           β”‚    β”‚          β”‚    β”‚       β”‚        β”” FinalRequestOptions(method='post', url='/threads/thread_uN6Pf3dUPoQ4KX9144zDZ8VU/messages', params={}, headers={'OpenAI-Beta'...
           β”‚    β”‚          β”‚    β”‚       β”” <class 'openai.types.beta.threads.message.Message'>
           β”‚    β”‚          β”‚    β”” <function SyncAPIClient.request at 0x0000029335ACB420>
           β”‚    β”‚          β”” <openai.OpenAI object at 0x00000293397C8FD0>
           β”‚    β”” ~ResponseT
           β”” <function cast at 0x0000029377908720>
  File "D:\dev\eve\eveenv\Lib\site-packages\openai\_base_client.py", line 921, in request
    return self._request(
           β”‚    β”” <function SyncAPIClient._request at 0x0000029335ACB4C0>
           β”” <openai.OpenAI object at 0x00000293397C8FD0>
  File "D:\dev\eve\eveenv\Lib\site-packages\openai\_base_client.py", line 1020, in _request
    raise self._make_status_error_from_response(err.response) from None
          β”‚    β”” <function BaseClient._make_status_error_from_response at 0x0000029335AC9BC0>
          β”” <openai.OpenAI object at 0x00000293397C8FD0>

openai.NotFoundError: Error code: 404 - {'error': {'message': 'Files ["file-cMmIH6kBJcxbEPp59ZLnn0KM"] were not found', 'type': 'invalid_request_error', 'param': None, 'code': None}}

When you created your assistant, you included the ‘file_search’ tool, but I don’t see a ‘tool_resources’ entry. Something like:

              tool_resources={
                  "file_search": {
                      "vector_store_ids": [theVectorIdContainingFileIdsUsed]
                  }
              }

Did a bit more debugging:

This is the uploaded file (file-id):

2024-07-06T22:54:37.632118+0200 [TRACE] [run] File IDs: "file-CDlwWHavbyT0HMgXoFiKnDKM"

And this is an excerpt of the list of files. So the file IS there (the first in the list):

File list: 
SyncPage[FileObject](data=[FileObject(id='file-CDlwWHavbyT0HMgXoFiKnDKM', bytes=29568, created_at=1720299278, filename='uploaded_file', object='file', purpose='assistants', status='processed', status_details=None), FileObject(id='file-xB8Rr9sioXbfv0bB91Crq9Q2', bytes=29568, created_at=1720299104, filename='uploaded_file', object='file', purpose='assistants', status='processed', status_details=None), FileObject(id='file-6zwgCDX8EAOwuk1KQIcIn8k9', bytes=29568, created_at=1720298992, filename='uploaded_file', object='file', purpose='assistants', status='processed', status_details=None), FileObject(id='file-Rt2z4Uq6hri5tBOAkgBJMcnI', bytes=29568, created_at=1720298889, filename='uploaded_file', object='file', purpose='assistants', status='processed', status_details=None), ... 
], object='list', has_more=False)

but checking for existence with:

    def check_file_exists(self, file_id):
        try:
            file_list = self.client.files.list()
            logger.trace(f"File list: {file_list}")
            if any(file.id == file_id for file in file_list):
                return True
            else:
                logger.error(f"File with ID {file_id} does not exist.")
                return False
        except Exception as e:
            logger.exception(f"Error checking file existence: {e}")
            return False

says it is NOT there…confusing…

Is my existence testing wrong?

I have had a look at a few examples but was under the impression that this is just used for the initial upload of files at creation time (which I don’t need).

In my context (uploading to https://api.openai.com/v1/files), what would be the required tool_resources setting?

See step 3 in the quickstart guide to update your assistant’s tool_resources

This looks more like a RAG setup, which is not what I’m doing here (I am using my own RAG setup).

BTW: adding “file_search” was one of the many options I tried to get it working, but I think I mixed concepts there. So I removed it for now (it doesn’t change anything in regard to the issue).

The only reason you wouldn’t get back an uploaded file in the list is if you had more than 10,000 already uploaded, or the changing implementation of projects and their effects on endpoints. You’d know if you were over 10k by how long your code took.

I suspect you can go the opposite direction for scoping:

  • reuse the same openai client for every API interaction
  • do not set parameters on the client instantiation; let it pick up its own environment variables.
  • Set all environment variables so that the project matches the API key’s project (see the sketch after this list). The client automatically infers the following arguments from their corresponding environment variables if they are not provided:
    - api_key from OPENAI_API_KEY
    - organization from OPENAI_ORG_ID
    - project from OPENAI_PROJECT_ID
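
A minimal sketch of what I mean (the file name is a placeholder):

    from openai import OpenAI

    # No api_key / organization / project arguments: the SDK reads
    # OPENAI_API_KEY, OPENAI_ORG_ID and OPENAI_PROJECT_ID on its own.
    client = OpenAI()

    # Reuse this single client for every call (uploads, threads, messages,
    # runs), so everything ends up in the same project scope.
    with open("circle.png", "rb") as f:            # placeholder file name
        uploaded = client.files.create(file=f, purpose="assistants")
    thread = client.beta.threads.create()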

I was typing this up to show the nesting needed for turning on code interpreter along with other tools in the same object for tools, then got annoyed…


Create assistant

POST https://api.openai.com/v1/assistants

Create an assistant with a model and instructions.

Request body

  • model - string [required] - ID of the model to use. You can use the List models API to see all of your available models, or see our Model overview for descriptions of them.
  • name - string or null [optional] - The name of the assistant. The maximum length is 256 characters.
  • description - string or null [optional] - The description of the assistant. The maximum length is 512 characters.
  • instructions - string or null [optional] - The system instructions that the assistant uses. The maximum length is 256,000 characters.
  • tools - array [optional] - A list of tools enabled on the assistant. There can be a maximum of 128 tools per assistant. Tools can be of types code_interpreter, file_search, or function.
    • type - string [required] - The type of tool being defined: code_interpreter.
    • type - string [required] - The type of tool being defined: file_search.
    • file_search - object [optional] - Overrides for the file search tool.
      • max_num_results - integer [optional] - The maximum number of results the file search tool should output. The default is 20 for gpt-4* models and 5 for gpt-3.5-turbo. This number should be between 1 and 50 inclusive.
    • type - string [required] - The type of tool being defined: function.
    • function - object [required]
      • description - string [optional] - A description of what the function does, used by the model to choose when and how to call the function.
      • name - string [required] - The name of the function to be called. Must be a-z, A-Z, 0-9, or contain underscores and dashes, with a maximum length of 64.
      • parameters - object [optional] - The parameters the functions accepts, described as a JSON Schema object. See the guide for examples, and the JSON Schema reference for documentation about the format.
  • tool_resources - object or null [optional] - A set of resources that are used by the assistant’s tools. The resources are specific to the type of tool.
    • code_interpreter - object [optional]
      • file_ids - array [optional] - A list of file IDs made available to the code_interpreter tool. There can be a maximum of 20 files associated with the tool.
    • file_search - object [optional]
      • vector_store_ids - array [optional] - The vector store attached to this assistant. There can be a maximum of 1 vector store attached to the assistant.
      • vector_stores - array [optional] - A helper to create a vector store with file_ids and attach it to this assistant. There can be a maximum of 1 vector store attached to the assistant.
        • file_ids - array [optional] - A list of file IDs to add to the vector store. There can be a maximum of 10000 files in a vector store.
        • chunking_strategy - object [optional] - The chunking strategy used to chunk the file(s). If not set, will use the auto strategy.
          • type - string [required] - Always auto. OR
          • type - string [required] - Always static.
            • static - object [required]
              • max_chunk_size_tokens - integer [required] - The maximum number of tokens in each chunk. The default value is 800. The minimum value is 100 and the maximum value is 4096.
              • chunk_overlap_tokens - integer [required] - The number of tokens that overlap between chunks. The default value is 400. Note that the overlap must not exceed half of max_chunk_size_tokens.
        • metadata - map [optional] - Set of 16 key-value pairs that can be attached to a vector store. This can be useful for storing additional information about the vector store in a structured format. Keys can be a maximum of 64 characters long and values can be a maximum of 512 characters long.

OK, let’s try something different and more basic. In your dashboard, does the file id show up in the files section?

Yes, it does (screenshot of the dashboard omitted).

So, the next step is to figure out why your file-existence check is failing. The files.list() call should have shown the entry with that file ID.

I tried to check for existence using a different approach:

    def check_file_exists(self, file_id):
        try:
            file = self.client.files.retrieve(file_id)
            if file:
                return True
            else:
                logger.error(f"File with ID {file_id} does not exist.")
                return False
        except Exception as e:
            logger.exception(f"Error checking file existence: {e}")
            return False

which always fails - all while the file with the corresponding ID is in the files list (checked on the dashboard).
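
A variant of the check that separates a real 404 from other failures and logs a fingerprint of the key the client is actually using (a sketch; never log the full key):

    import openai

    def check_file_exists(self, file_id):
        try:
            self.client.files.retrieve(file_id)
            return True
        except openai.NotFoundError:
            # A genuine 404: the file is not visible to THIS client's
            # key/project. Compare the fingerprint with the key used by curl.
            key = self.client.api_key or ""
            logger.error(f"File {file_id} not found with key {key[:7]}...{key[-4:]}")
            return False
        except Exception as e:
            logger.exception(f"Error checking file existence: {e}")
            return False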

Does curl also fail to find the file?

curl https://api.openai.com/v1/files/FILE_ID_HERE -H "Authorization: Bearer YOUR_OPENAI_API_KEY"

curl DOES find the file (screen capture omitted).

Crap, is that your KEY in the screen capture?

No, it’s not the full key, don’t worry… (my heart just skipped a beat though :stuck_out_tongue_winking_eye:)

Phew, ok. Otherwise you would need to delete that and generate another.

So, that means you are either pre-processing your file ID such that it is not exactly the same (like adding extra characters or spaces, or changing the case of letters), or the key you used with curl does not match the key used when you created the client.
