Responses API stops responding with multiple file attachments

Issue Description

I am experiencing an issue where streaming responses from the Responses API get stuck when the request includes multiple file attachments. The stream stops after emitting a few events and never completes.

I also tried it without streaming, hoping to get a response with some errors, or at least useful headers to check. But with streaming disabled I get no response at all; the HTTP client just waits until it times out.

Sometimes a retry succeeds after a couple of attempts, so it fails roughly 4 out of 5 times. The more attachments involved, the higher the failure rate; with no attachments, or only one or two, it always works.

During the stream I do receive the response ID (resp_xxx) in the response.created event, but when I look that ID up in the OpenAI logs, or try to fetch it via the API, it doesn't exist.

Request structure

{
  "input": [
    {
      "role": "user",
      "type": "message",
      "content": "Process the uploaded attachment(s) and generate content based on the following description:"
    },
    {
      "role": "user",
      "type": "message",
      "content": [
        {
          "type": "input_text",
          "text": "worksheet attachments with the file extension `pdf`."
        },
        {
          "type": "input_file",
          "file_id": "file-ABC123..."
        },
        {
          "type": "input_file",
          "file_id": "file-DEF456..."
        },
        {
          "type": "input_file",
          "file_id": "file-GHI789..."
        },
        {
          "type": "input_file",
          "file_id": "file-JKL012..."
        }
      ]
    },
    {
      "role": "user",
      "type": "message",
      "content": [
        {
          "type": "input_text",
          "text": "curriculum document attachments with the file extension `pdf`."
        },
        {
          "type": "input_file",
          "file_id": "file-MNO345..."
        },
        {
          "type": "input_file",
          "file_id": "file-PQR678..."
        }
      ]
    }
  ],
  "model": "gpt-4o",
  "instructions": "# Sample Instructions\n\nGenerate content based on uploaded documents...",
  "truncation": "auto",
  "tool_choice": "required",
  "tools": [
    {
      "type": "function",
      "name": "submit_assessment",
      "description": "Submit generated content for review",
      "parameters": {
        "type": "object",
        "properties": {
          "assessment": {
            "type": "object",
            "properties": {
              "title": {"type": "string"},
              "description": {"type": "string"},
              "learningObjectives": {
                "type": "array",
                "items": {"type": "string"}
              },
              "markingScheme": {"type": "string"},
              "gradingType": {"type": "string"},
              "maximumMarks": {"type": "integer", "nullable": true}
            },
            "required": ["title", "description", "learningObjectives", "markingScheme", "gradingType", "maximumMarks"],
            "additionalProperties": false
          }
        },
        "required": ["assessment"],
        "additionalProperties": false
      },
      "strict": true
    },
    {
      "type": "file_search",
      "vector_store_ids": ["vs_1234567890abcdef..."]
    },
    {
      "type": "code_interpreter",
      "container": {
        "type": "auto",
        "file_ids": [
          "file-ABC123...",
          "file-DEF456...",
          "file-GHI789..."
        ]
      }
    }
  ],
  "stream": true,
  "conversation": "conv_1234567890abcdef..."
}
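Since a retry sometimes succeeds, my current stopgap is to wrap the streaming call in a retry loop with exponential backoff. A minimal sketch; `start_stream` here is a stand-in for whatever callable performs the actual SDK streaming request and consumes the events (it is not a real SDK function):

```python
import time

def stream_with_retries(start_stream, max_attempts=3, base_delay=2.0):
    """Retry a flaky streaming call with exponential backoff.

    start_stream: a zero-argument callable that performs the streaming
    request and returns the final result, raising on failure or stall.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return start_stream()
        except Exception as exc:  # in practice, catch only timeout/stream errors
            last_error = exc
            time.sleep(base_delay * (2 ** attempt))  # 2s, 4s, 8s, ...
    raise RuntimeError(f"stream failed after {max_attempts} attempts") from last_error
```

In my case I pass a lambda that opens the stream and raises if no event arrives within a per-event deadline, so a stalled stream triggers the retry instead of hanging forever.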

This is really frustrating, as I have clients waiting for me to release features for them. I cannot proceed with the erratic behavior of the Responses API.

Environment Details

  • API: Responses API v1
  • Models: gpt-4o, gpt-5
  • File Count: 3+ attachments (PDFs, DOCX, PPTX)
  • Tools: file_search + code_interpreter + custom function
  • Success Rate: ~20% (1 out of 5 attempts)
  • Environment: Staging (Kubernetes)

Note that if non-PDF, non-image files are present, they are attached to the vector store (searched via file_search) and also added to code_interpreter.

`input_file` is only for PDFs. PDFs can also be password-protected or locked against extraction of searchable text, which causes further issues.

Attaching a PDF to a user message is not a method for bulk ingestion of data. It does not load documents into a vector store.

Instead, text extraction is performed on each PDF, and in addition each page is rendered as an image. All of this is placed directly, in full, as content of the user message.

https://platform.openai.com/docs/guides/pdf-files?api-mode=responses

It is full, unmitigated context loading. Beyond the documented file-size and page limits, you must do your own estimation and guessing of how many tokens the attachment will produce as an input message (which is limited to 1 MB when you send text yourself), because you never get to observe the resulting PDF extraction except as an input-usage bill. You can imagine that doing this on every new request is computationally intensive and ripe for triggering problems.
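As a rough illustration of that guessing, here is a pre-flight token estimate for an attached PDF. Both constants (about 4 characters per token, and the per-page image cost) are placeholder heuristics of mine, not official figures:

```python
def estimate_pdf_input_tokens(extracted_chars, page_count, tokens_per_page_image=1100):
    """Rough pre-flight estimate of input tokens for an attached PDF.

    extracted_chars: length of the text you extracted locally.
    tokens_per_page_image: assumed cost of each page rendered as an
    image -- a placeholder figure, not an official number.
    """
    text_tokens = extracted_chars // 4            # ~4 chars/token heuristic
    image_tokens = page_count * tokens_per_page_image
    return text_tokens + image_tokens
```

Run this against every attachment before sending and you quickly see how a handful of multi-page PDFs can balloon a single request.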

Other file types are simply not supported; the user message attachment is solely for PDF.

If you want knowledge and searching, rather than the full contents of every document on every turn, you'd want to commit to vector-store attachment of the PDF files as well, and drop the other methods being attempted.
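A sketch of what a vector-store-only request payload might look like, with all the `input_file` parts dropped. This is shape only, following the field names from the request above; the IDs are illustrative:

```python
def build_file_search_request(question, vector_store_id, model="gpt-4o"):
    """Build a Responses API payload that relies solely on file_search,
    instead of attaching PDFs to the user message as input_file parts."""
    return {
        "model": model,
        "input": [
            {"role": "user", "type": "message", "content": question}
        ],
        "tools": [
            {"type": "file_search", "vector_store_ids": [vector_store_id]}
        ],
        "stream": True,
    }
```

The model then retrieves only the relevant chunks per turn, instead of re-ingesting every document on every request.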

Having the PDFs in code interpreter, and paying for a container, is also wasteful, as there is nothing more to “extract” that hasn't already been extracted and placed.


If your product is “talk to your PDFs”, you really need to make that your product: do the document extraction and RAG placement in your own code, so a bad PDF fails at your extraction step and not on the API call employing it.

Thank you for the detailed explanation! This makes sense now.

For all PDFs I use `input_file`, and for non-PDF, non-image files I use the file_search and code_interpreter tools.

What I will try instead is to extract the contents myself and send the text to the API, rather than letting the API process the files.
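Something like the following, where `documents` holds (filename, text) pairs produced by my own extraction step with whatever PDF library I choose:

```python
def build_text_only_input(prompt, documents):
    """Replace input_file attachments with locally extracted text.

    documents: list of (filename, extracted_text) pairs produced by
    your own extraction step, outside the API.
    """
    content = [{"type": "input_text", "text": prompt}]
    for name, text in documents:
        content.append({
            "type": "input_text",
            "text": f"--- {name} ---\n{text}",
        })
    return [{"role": "user", "type": "message", "content": content}]
```

This way I control (and can truncate) exactly what goes into the context, instead of paying for an unobservable extraction on every request.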

Good luck to you! The more you think about possible applications, the more you will likely want an agent workflow that discovers the user’s intentions and intelligently provides what they need:

  • Are they asking to have an entire PDF summarized?
  • Are they wanting a knowledge database to inquire against?
  • Are they wanting a PDF operated on (“delete the blank pages”)?
  • Are they smarter than you, with a new use case?

Then you’d start to think:

  • Should I have independent AI calls to summarize whole documents for later understanding and retrieval?
  • Should I have a UI that starts processing PDFs before a question is even posed?
  • Can I provide a PDF OCR product without even involving AI if that is what is desired?
  • Do I want single turns, or a chat history with persistent PDFs as past user messages? Should those be stripped out and expired, or retrieved on demand?

If you want every facet of task possibility done well, and an AI with high attention to the task, you wouldn’t use a generic offering.