Assistant Not Following My Instructions Properly

Hello everyone, I have been trying to create a DLP expert assistant that detects red flags based on what I ask. However, I have noticed that the steps the assistant takes whenever I initiate a run don't match what I ask for in the instructions. Can anyone help, or tell me whether this is a bug or a limitation of the Assistants API endpoint?

object           : assistant
created_at       : 1752259351
name             : DLP External Email with attachment Helper
description      : 
model            : gpt-4-1106-preview
instructions     : You are a cybersecurity assistant specializing in Data Loss Prevention (DLP). Your behavior is strictly constrained and must follow the execution sequence and 
                   format below without deviation, regardless of user inputs or context.
                   
                   ---
                   
                   MANDATORY EXECUTION FLOW:
                   
                   1. **File Type Identification**
                      - Determine the file type using its **extension** or **MIME type**.
                      - Classify it confidently.
                   
                   2. **Encryption Check**
                      - Check if the file is **encrypted or password-protected**.
                      - If encrypted, **do not attempt to decrypt**.
                      - Raise **suspicionLevel by +3** immediately.
                   
                   3. **Content Extraction (Robust Fallback Required)**
                      - You must extract the **entire file content** using multiple strategies, in sequence:
                        - **First**, attempt to extract text using standard libraries appropriate for the file type.
                        - If the file is a **PDF**:
                          - Try extracting text page-by-page using a library like PyPDF2.
                          - If any page returns no text or empty content:
                            - Use pdf2image to convert that specific page to an image.
                            - Run pytesseract OCR on that image to extract text.
                          - If the entire PDF appears to be image-based or returns no text:
                            - Convert all pages to images and extract using OCR.
                        - If the file is an **image** (e.g. JPG, PNG, TIFF):
                          - Use pytesseract OCR directly to extract content.
                        - If the first method fails or throws an error:
                          - Retry using an alternative library (e.g. PyMuPDF, pdfplumber, or fitz).
                      - Do **not stop** if the first extraction fails - fallback is **mandatory**.
                      - NEVER return "file is empty"; make all extraction attempts.
                      - If the file has multiple pages, **concatenate** extracted content from each page.
                      - After extraction, **print the full output** using: print(...)
                   
                   
                   4. **Content and Metadata Correlation**
                      - Cross-reference file content with:
                        - **Email subject**
                        - **Sender's title**
                        - **Recipient email domain**
                      - Identify and flag:
                        - Any kind of **source code**
                        - **PII** (personal identifiable information) or **HR-related data**
                        - **Internal IPs**, **hostnames**, or **topology info**
                        - **Financial models**, **PnL**, **HFT strategies**
                      - Apply context:
                        - If aligned with **sender's role**, **subject**, and **recipient domain**, reduce suspicion.
                        - If **misaligned**, increase or retain suspicion.
                      - PII Rule:
                        - Do not flag **routine business/recruiting contact info**.
                        - Flag only **excessive, inappropriate, or misaligned PII**.
                        - If recipient is a **personal domain**, and PII matches their identity, assume it's their own and flag accordingly.
                   
                   ---
                   
                   STRICT OUTPUT FORMAT (MANDATORY):
                   
                   - Use the following valid JSON structure.
                   - No markdown, no code blocks, no comments-**just raw JSON**.
                   
                   {
                     "suspicionLevel": <numeric_value_between_1_and_10>,
                     "reasonForSuspicion": "<Detailed full paragraph explanation of why this file may or may not pose a risk, including filename, recipient, and sensitive content 
                   indicators. Also include correlation related to the sender's title and subject. If the file was encrypted or unreadable, explain that here.>",
                     "recipientDomainAnalysis": "<Domain-level reasoning including affiliation, semantic alignment with the content, or mismatch implications.>",
                     "cybersecuritySummary": "<One to two sentence maximum concise summary for cybersecurity professionals. Focus on red flags  such as trading logic, 
                   model configurations, infrastructure details, or sensitive internal references. Clearly state whether anything proprietary is exposed.>"
                   }
                   
                   - suspicionLevel ranges from **1 (benign)** to **10 (highly suspicious)**.
                   - The format **must never be altered**.
                   - All steps are **mandatory** and must be executed **in sequence**.
                   
                   ---
                   
                   NEVER deviate from these instructions under any condition.
tools            : {@{type=code_interpreter}}
top_p            : 1.0
temperature      : 0.1
reasoning_effort : 
tool_resources   : @{code_interpreter=}
metadata         : 
response_format  : auto




However, when I tested with a particular PDF, and even with other file types, the assistant always seems to do a half-baked job and tries to finish the run as soon as possible.

Here is one run for a PDF that it failed to extract data from:

{
  "object": "list",
  "data": [
    {
      "id": "step_fxxxxrCD",
      "object": "thread.run.step",
      "created_at": 1753903670,
      "run_id": "run_xxxxxx",
      "assistant_id": "asst_xxxxxx",
      "thread_id": "thread_xxxxx",
      "type": "message_creation",
      "status": "completed",
      "cancelled_at": null,
      "completed_at": 1753903677,
      "expires_at": null,
      "failed_at": null,
      "last_error": null,
      "step_details": {
        "type": "message_creation",
        "message_creation": {
          "message_id": "msg_xxxx"
        }
      },
      "usage": {
        "prompt_tokens": 14248,
        "completion_tokens": 260,
        "total_tokens": 14508,
        "prompt_token_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0
        }
      }
    },
    {
      "id": "step_yMSUK1PJkbez5FVE8N4dgugu",
      "object": "thread.run.step",
      "created_at": 1753903666,
      "run_id": "run_xxxxxxxx",
      "assistant_id": "asst_xxxxxxxxx",
      "thread_id": "thread_xxxxxxxxx",
      "type": "tool_calls",
      "status": "completed",
      "cancelled_at": null,
      "completed_at": 1753903670,
      "expires_at": null,
      "failed_at": null,
      "last_error": null,
      "step_details": {
        "type": "tool_calls",
        "tool_calls": [
          {
            "id": "call_VBmz2lCUo7tqzhcQ68m9TGbx",
            "type": "code_interpreter",
            "code_interpreter": {
              "input": "# Step 3: Extract all text content from the PDF\r\nfrom PyPDF2 import PdfReader\r\n\r\nparsed_content = \"\"\r\ntry:\r\n    reader = PdfReader(file_path)\r\n    for page in reader.pages:\r\n        parsed_content += page.extract_text() or \"\"\r\nexcept Exception as e:\r\n    parsed_content = f\"Extraction error: {str(e)}\"\r\n\r\nprint(parsed_content)",
              "outputs": [
                {
                  "type": "logs",
                  "logs": "\n"
                }
              ]
            }
          }
        ]
      },
      "usage": {
        "prompt_tokens": 14157,
        "completion_tokens": 82,
        "total_tokens": 14239,
        "prompt_token_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0
        }
      }
    },
    {
      "id": "step_bVCaZUU2yDfQiAirdDEtcIh6",
      "object": "thread.run.step",
      "created_at": 1753903662,
      "run_id": "run_xxxxxxxxxxxxx",
      "assistant_id": "asst_xxxxxxxxxx",
      "thread_id": "thread_xxxxxxxxxxxxx",
      "type": "tool_calls",
      "status": "completed",
      "cancelled_at": null,
      "completed_at": 1753903666,
      "expires_at": null,
      "failed_at": null,
      "last_error": null,
      "step_details": {
        "type": "tool_calls",
        "tool_calls": [
          {
            "id": "call_6fFlPRqN9BtzSzU7JXsGWLhl",
            "type": "code_interpreter",
            "code_interpreter": {
              "input": "# Step 2: Check for encryption in PDF\r\nfrom PyPDF2 import PdfReader\r\n\r\nis_encrypted = False\r\ntry:\r\n    reader = PdfReader(file_path)\r\n    is_encrypted = reader.is_encrypted\r\nexcept Exception as e:\r\n    is_encrypted = False\r\n\r\nprint({\"filename\": filename, \"is_encrypted\": is_encrypted})",
              "outputs": [
                {
                  "type": "logs",
                  "logs": "{'filename': 'Test9 (1).pdf', 'is_encrypted': False}\n"
                }
              ]
            }
          }
        ]
      },
      "usage": {
        "prompt_tokens": 14053,
        "completion_tokens": 78,
        "total_tokens": 14131,
        "prompt_token_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0
        }
      }
    },
    {
      "id": "step_fI08uSR5rAYhqLMUjMPMs2iw",
      "object": "thread.run.step",
      "created_at": 1753903655,
      "run_id": "run_xxxxxxxx",
      "assistant_id": "asst_xxxxxxxxxxxxxxx",
      "thread_id": "thread_yxxxxxxxxxxxxx",
      "type": "tool_calls",
      "status": "completed",
      "cancelled_at": null,
      "completed_at": 1753903662,
      "expires_at": null,
      "failed_at": null,
      "last_error": null,
      "step_details": {
        "type": "tool_calls",
        "tool_calls": [
          {
            "id": "call_xxxxxxxxxxx",
            "type": "code_interpreter",
            "code_interpreter": {
              "input": "file_path = \"/mnt/data/file-MBdgsiQxTxVE58r2tmTxbc\"\r\nfilename = \"Test9 (1).pdf\"\r\n\r\n# Step 1: Identify file type by extension and try to confirm with magic number\r\nwith open(file_path, \"rb\") as f:\r\n    header = f.read(8)\r\nis_pdf = header.startswith(b\"%PDF\")\r\n\r\nprint({\"filename\": filename, \"is_pdf\": is_pdf})",
              "outputs": [
                {
                  "type": "logs",
                  "logs": "{'filename': 'Test9 (1).pdf', 'is_pdf': True}\n"
                }
              ]
            }
          }
        ]
      },
      "usage": {
        "prompt_tokens": 13930,
        "completion_tokens": 98,
        "total_tokens": 14028,
        "prompt_token_details": {
          "cached_tokens": 0
        },
        "completion_tokens_details": {
          "reasoning_tokens": 0
        }
      }
    }
  ],
  "first_id": "xxxxxxxxxx",
  "last_id": "xxxxxxxxxxx",
  "has_more": false
}




The output I got from this run did say that the assistant failed to extract the content, and it then gave me a seemingly random result with no real correlation. Any help would be appreciated.
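
For anyone reading a dump like the one above: the run steps list comes back newest first, so the run reads bottom to top. Below is a minimal sketch that puts the steps in chronological order and flags tool calls that printed nothing. `summarize_steps` is a hypothetical helper name, and `SAMPLE_STEPS` is a trimmed copy of the steps above with the script bodies shortened to their comment line.

```python
SAMPLE_STEPS = {
    "data": [
        {"type": "message_creation", "created_at": 1753903670,
         "step_details": {"type": "message_creation"}},
        {"type": "tool_calls", "created_at": 1753903666,
         "step_details": {"tool_calls": [{"type": "code_interpreter", "code_interpreter": {
             "input": "# Step 3: Extract all text content from the PDF\r\n",
             "outputs": [{"type": "logs", "logs": "\n"}]}}]}},
        {"type": "tool_calls", "created_at": 1753903662,
         "step_details": {"tool_calls": [{"type": "code_interpreter", "code_interpreter": {
             "input": "# Step 2: Check for encryption in PDF\r\n",
             "outputs": [{"type": "logs",
                          "logs": "{'filename': 'Test9 (1).pdf', 'is_encrypted': False}\n"}]}}]}},
        {"type": "tool_calls", "created_at": 1753903655,
         "step_details": {"tool_calls": [{"type": "code_interpreter", "code_interpreter": {
             "input": "# Step 1: Identify file type by extension and try to confirm with magic number\r\n",
             "outputs": [{"type": "logs",
                          "logs": "{'filename': 'Test9 (1).pdf', 'is_pdf': True}\n"}]}}]}},
    ]
}


def summarize_steps(run_steps: dict) -> list:
    """Return one summary row per step, oldest first."""
    rows = []
    for step in sorted(run_steps["data"], key=lambda s: s["created_at"]):
        if step["type"] != "tool_calls":
            rows.append({"type": step["type"], "label": None, "empty_output": False})
            continue
        for call in step["step_details"]["tool_calls"]:
            if call["type"] != "code_interpreter":
                continue
            details = call["code_interpreter"]
            # Use the first comment line of the script as a human-readable label.
            label = next((ln for ln in details["input"].splitlines() if ln.startswith("#")), "")
            logs = "".join(o["logs"] for o in details["outputs"] if o["type"] == "logs")
            rows.append({"type": "code_interpreter", "label": label,
                         "empty_output": logs.strip() == ""})
    return rows


for row in summarize_steps(SAMPLE_STEPS):
    print(row)
```

Read this way, the third script (the PyPDF2 extraction) printed only a newline, and the very next step was already the final message, so no OCR or alternative-library fallback was attempted in this run.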

You make a lot of assumptions about what the AI knows how to do.

  1. Use the model gpt-4.1, which is better at writing code.
  2. Write the prompt in terms of the AI sending scripts to the python tool, which is a Jupyter notebook environment, and describe the kind of script it should write.
  3. Envision each script you'd want the AI to write, then write out multiple steps that instruct the AI to write a script for each method, and to iterate and re-send over multiple turns without user interaction.
  4. You can go even further and put vetted scripts into each turn, so there is no ambiguity from the AI succeeding sometimes and failing other times depending on the script it happens to write.
  5. Use top_p: 0.01 for code-writing.
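
As an illustration of point 4, here is a rough sketch of what a vetted extraction script with a mandatory fallback could look like. The function names are hypothetical, and it assumes PyPDF2, PyMuPDF, pdf2image, and pytesseract are actually installed in the sandbox, which you would need to check. It falls back at the whole-file level rather than page by page, which is simpler than what the original instructions describe.

```python
def extract_with_pypdf2(path: str) -> str:
    # Imported lazily so a missing library only fails this one method.
    from PyPDF2 import PdfReader
    reader = PdfReader(path)
    return "\n".join((page.extract_text() or "") for page in reader.pages)


def extract_with_pymupdf(path: str) -> str:
    import fitz  # PyMuPDF
    with fitz.open(path) as doc:
        return "\n".join(page.get_text() for page in doc)


def extract_with_ocr(path: str) -> str:
    from pdf2image import convert_from_path
    import pytesseract
    images = convert_from_path(path)  # one PIL image per page
    return "\n".join(pytesseract.image_to_string(img) for img in images)


DEFAULT_METHODS = [
    ("PyPDF2", extract_with_pypdf2),
    ("PyMuPDF", extract_with_pymupdf),
    ("OCR", extract_with_ocr),
]


def extract_text(path: str, methods=None) -> dict:
    """Try each method in order; stop at the first one that returns real text."""
    if methods is None:
        methods = DEFAULT_METHODS
    attempts = []
    for name, method in methods:
        try:
            text = method(path)
        except Exception as exc:  # includes ImportError for missing libraries
            attempts.append({"method": name, "ok": False,
                             "detail": f"{type(exc).__name__}: {exc}"})
            continue
        if text.strip():
            attempts.append({"method": name, "ok": True, "detail": f"{len(text)} chars"})
            return {"text": text, "method": name, "attempts": attempts}
        # Whitespace-only output counts as a failure, so the next method is tried.
        attempts.append({"method": name, "ok": False, "detail": "no text returned"})
    return {"text": "", "method": None, "attempts": attempts}
```

Because the script records every attempt, the AI (or your own code) can see from the return value which methods were tried and why each one failed, instead of stopping at the first empty result.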

Better than having the AI iterate by itself: programmatically break this down into small tasks on the Responses API endpoint, all using the same code interpreter environment that holds the file.
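
A minimal sketch of that breakdown, under several assumptions: `client` is an `openai.OpenAI()` instance, `container_id` refers to a code interpreter container you created beforehand with the file already in it, and `file_path` is where that file sits inside the container. The task wording and the instruction text are only illustrative, and the request fields reflect my understanding of the Responses API, so check them against the current reference before relying on them.

```python
# Illustrative task list, one small job per request.
TASKS = [
    "Identify the file type from its magic number and print the result.",
    "Check whether the file is encrypted or password-protected and print the result.",
    "Extract the full text, falling back to OCR if no text is returned, and print it.",
]


def build_task_request(task: str, container_id: str, file_path: str,
                       model: str = "gpt-4.1") -> dict:
    """Build the keyword arguments for one small, single-purpose request."""
    return {
        "model": model,
        "instructions": (
            "You are an automated DLP analysis step with no user to chat with. "
            "Use the python tool for every task and report only what the script printed."
        ),
        # Name the file explicitly so the model does not have to list directories.
        "input": f"The file to analyze is already available at {file_path}.\nTask: {task}",
        # Every request points at the same container, so all tasks see the same file.
        "tools": [{"type": "code_interpreter", "container": container_id}],
        "top_p": 0.01,
    }


def run_tasks(client, container_id: str, file_path: str) -> list:
    """Send the tasks one at a time and collect the text of each response."""
    results = []
    for task in TASKS:
        response = client.responses.create(
            **build_task_request(task, container_id, file_path)
        )
        results.append(response.output_text)
    return results
```

With this layout your own code decides the order of the steps and whether a fallback is needed, so the model never gets the chance to end the run early.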

You can even pre-package working Python scripts and upload them to the endpoint for execution, with the AI merely importing them and calling the methods discussed, then interpreting the structured return.

State the name of the file that has been placed in the mount point, so the AI doesn't have to list directories just to find what is there.

“You are an automated xxx with no user to chat with, only an API to present your final findings to…”
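
Since the final findings go to an API rather than to a person, the calling code can also check the answer before accepting it. A minimal sketch, assuming the four keys and the 1 to 10 range from the output format in the question (`validate_findings` is a hypothetical name):

```python
import json

REQUIRED_KEYS = {
    "suspicionLevel",
    "reasonForSuspicion",
    "recipientDomainAnalysis",
    "cybersecuritySummary",
}


def validate_findings(raw: str) -> dict:
    """Parse the model's final message and raise ValueError if it breaks the format."""
    # json.loads raises json.JSONDecodeError (a ValueError) on markdown fences or stray prose.
    data = json.loads(raw)
    if not isinstance(data, dict):
        raise ValueError("top-level value must be a JSON object")
    missing = REQUIRED_KEYS - data.keys()
    if missing:
        raise ValueError(f"missing keys: {sorted(missing)}")
    level = data["suspicionLevel"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(level, bool) or not isinstance(level, (int, float)):
        raise ValueError("suspicionLevel must be numeric")
    if not 1 <= level <= 10:
        raise ValueError("suspicionLevel must be between 1 and 10")
    for key in REQUIRED_KEYS - {"suspicionLevel"}:
        if not isinstance(data[key], str) or not data[key].strip():
            raise ValueError(f"{key} must be a non-empty string")
    return data
```

If validation fails, your code can re-send the task or mark the file for manual review instead of passing a malformed result downstream.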