Upload files using actions

Background:

Recently, while discussing and testing the new model, I noticed that it is capable of building long-term strategies and correcting its own mistakes, because it can chain various actions (such as the code interpreter and Internet search) almost indefinitely without human intervention. This inspired me to continue my experiments, but I ran into artificial limitations, namely those of the built-in interpreter: it has no network access, which is needed for many useful operations, so it cannot install new Python packages, download or upload files to external resources, and so on. All of these restrictions stem from the absence, or rather the limitation, of networking in the container of that virtual environment. This prompted me to create my own assistant free of these shortcomings. I implemented my own interpreter, launched it on my server, and created actions so the assistant could use it, and it works great. But I decided that an unrestricted code interpreter alone was not enough, so I wanted to add the ability for the bot to upload files to the server so that it could process them with the new interpreter, and this is where the problem arose.

The main part and essence of the problem:

I immediately went to the documentation to learn how to upload files, and I found an article that describes exactly this: OpenAI documentation article for upload files

But after I implemented the API on the server, updated my prompt for the assistant, and added the new action as described in the documentation, I ran into a serious problem: in 99% of cases the assistant simply refuses to send files to the server, claiming that they are not present in the conversation and asking me to upload the files again. Even after re-uploading them, there is no guarantee it will decide to send them to the server. The only case where uploading files from the dialog works reliably is when you click the test button in the assistant editor, and even "reliably" is an overstatement: the assistant still fails to see the files in the conversation most of the time. I could not identify the conditions under which the model picks up the files and generates a correct request, so I am inclined to think it happens by chance, only "at the whim" of the model.

Please, if anyone has encountered this problem while implementing their own solution, or knows how to solve it, support this topic and, if possible, describe the steps to fix it.

I am also attaching additional information that may be needed:

The prompt I use:

You’re a bot for doing smart things, so you should try to do everything yourself with minimal user intervention. Always upload user-submitted files to the server using the uploadFiles action. Always send files to the server immediately after receiving them. Run Python code exclusively through the executeCode action, don’t even try to use the built-in interpreter. To achieve the best results, think step by step, formulate an action strategy, and then execute it. If it doesn’t work, analyze the data and try again. Repeat until you get the desired result.

The action schema for calling the API (I removed some data so that you don't break my server =)):

openapi: 3.0.0
info:
  title: Code Execution API
  description: API for uploading files and executing Python code
  version: 1.0.0
servers:
  - url: url to my server (removed for security)
    description: Main server
paths:
  /execute:
    post:
      operationId: executeCode
      x-openai-isConsequential: false
      summary: Execute Python code and return the result
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                code:
                  type: string
                  example: |
                    print("Hello from the executed code!")
      responses:
        "200":
          description: Code executed successfully
          content:
            application/json:
              schema:
                type: object
                properties:
                  result:
                    type: string
                    example: Hello from the executed code!
        "400":
          description: Error if no code is provided
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    type: string
                    example: No code provided
  /upload:
    post:
      operationId: uploadFiles
      x-openai-isConsequential: false
      summary: Upload files and return the file paths
      description: Uploads files using their file IDs. All file types are supported for uploading. A minimum of one file is required, and all files that might be needed for further processing should be uploaded.
      requestBody:
        required: true
        content:
          application/json:
            schema:
              type: object
              properties:
                openaiFileIdRefs:
                  type: array
                  items:
                    type: string
      responses:
        "200":
          description: Successfully uploaded files
          content:
            application/json:
              schema:
                type: object
                properties:
                  file_paths:
                    type: array
                    items:
                      type: string
        "400":
          description: Error if no files are provided
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    type: string
                    example: No files provided
        "500":
          description: Error on the server side while processing files
          content:
            application/json:
              schema:
                type: object
                properties:
                  error:
                    type: string

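For what it's worth, the executeCode half of this schema works fine for me. As a reference point, a manual request like the one below is enough to check that contract from outside the assistant; the server URL here is a placeholder, not my real one:

import requests

# Placeholder URL; the real server URL is removed from the schema above.
SERVER_URL = "https://my-server.example.com"

# Matches the executeCode action: a JSON body with a single "code" field.
resp = requests.post(
    f"{SERVER_URL}/execute",
    json={"code": 'print("Hello from the executed code!")'},
)
# Per the schema example, a successful run should return
# 200 with {"result": "Hello from the executed code!"}
print(resp.status_code, resp.json())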
The specific endpoint code that is used to upload files to the server:

import os

import requests
from flask import Flask, jsonify, request

app = Flask(__name__)
os.makedirs('files', exist_ok=True)  # make sure the target directory exists


@app.route('/upload', methods=['POST'])
def upload_file():
    data = request.json

    print(data)  # log the raw request body for debugging

    # ChatGPT is expected to populate openaiFileIdRefs with the attached files
    if not data or 'openaiFileIdRefs' not in data or len(data['openaiFileIdRefs']) == 0:
        return jsonify({"error": "No files to upload"}), 400

    file_paths = []
    try:
        for file_ref in data['openaiFileIdRefs']:
            # each reference carries, among other things, a file name and a download link
            response = requests.get(file_ref['download_link'])
            if response.status_code == 200:
                file_path = os.path.join('files', file_ref['name'])
                with open(file_path, 'wb') as f:
                    f.write(response.content)
                file_paths.append(file_path)
            else:
                return jsonify({"error": f"Failed to download file {file_ref['name']}"}), 500

        return jsonify({"file_paths": file_paths}), 200
    except Exception as e:
        return jsonify({"error": str(e)}), 500

The only error that ever occurs is the 400: the assistant simply does not include the list of files in the request, so my code responds with an error.
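When the action does fire (for example via the test button), the body matches what the documentation describes for openaiFileIdRefs: even though the schema declares an array of strings, ChatGPT fills it with objects that carry at least a name and a download_link, the two fields the endpoint above relies on. To test the endpoint independently of the assistant, a request like the following can be sent manually; the URL and file details are placeholders, and the download link would need to point at something actually downloadable for the end-to-end test to succeed:

import requests

# Simulated request body in the shape the endpoint expects. In a real request
# the download_link is a temporary URL generated by ChatGPT, and other fields
# (such as an id and mime type) may also be present.
payload = {
    "openaiFileIdRefs": [
        {
            "name": "example.csv",
            "download_link": "https://example.com/path/to/example.csv",
        }
    ]
}

# Replace with the real server URL (removed from the schema above).
resp = requests.post("https://my-server.example.com/upload", json=payload)
print(resp.status_code, resp.json())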

Thanks in advance to everyone who will participate in this topic!

Indeed, the built-in interpreter is impossible to control, and I wanted to build an interpreter with my own context.

I struggled with this too, trying openaiFileResponse (didn't work: I was able to send files I had uploaded, but not the generated code), then sending files as base64 (GPT makes mistakes in the code, forgetting parentheses), and what finally worked for me is simply sending the file as text. Here's the relevant action description:
post:
  operationId: testCode
  description: Gets code, run it, and get the results of the code run
  x-openai-isConsequential: false
  requestBody:
    required: true
    description: code
    content:
      application/json:
        schema:
          type: object
          properties:
            code:
              type: string
              format: byte
              description: Code that is using pippete.

For the server, I used AWS Lambda.
The result is nice: GPT generates code, sends it to the Lambda service, and if there are errors, it can fix them on the fly and resend.
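For anyone wanting to try this route, here is a minimal sketch of what such a Lambda handler could look like, assuming an API Gateway proxy event with a JSON body containing a single code field; it is an illustration of the idea, not the exact implementation described above:

import contextlib
import io
import json


def lambda_handler(event, context):
    # API Gateway proxy integration delivers the request body as a JSON string.
    body = json.loads(event.get("body") or "{}")
    code = body.get("code")
    if not code:
        return {"statusCode": 400, "body": json.dumps({"error": "No code provided"})}

    # Capture anything the code prints so it can be returned to the model.
    stdout = io.StringIO()
    try:
        with contextlib.redirect_stdout(stdout):
            exec(code, {})  # NOTE: executes arbitrary code; isolate this function accordingly
    except Exception as exc:
        # Return the error text with a 200 so GPT can read it, fix the code, and resend.
        return {
            "statusCode": 200,
            "body": json.dumps({"error": f"{type(exc).__name__}: {exc}", "output": stdout.getvalue()}),
        }

    return {"statusCode": 200, "body": json.dumps({"result": stdout.getvalue()})}

Returning errors in the 200 response body (rather than as an HTTP error) is a deliberate choice here, so the model always gets the traceback text to work with on the next attempt.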