Regression - Support for File Uploads in Chat Completions

As of Sept 12, 2025, the Chat Completions API no longer supports files as inputs, and we need workarounds to re-implement or refactor this on the client side. The announcement is here: https://platform.openai.com/docs/guides/pdf-files?api-mode=chat. This introduces regressions in multiple implementations of Chat Completions where file uploads are used as inputs. This request is about keeping basic feature parity between Chat Completions and the Responses API.

https://platform.openai.com/docs/guides/pdf-files?api-mode=responses → now has File Inputs

but https://platform.openai.com/docs/api-reference/chat/create → No longer does.

This change landed recently (within the last two weeks) and is causing production issues and delays in feature releases.

The API reference still has files for me:

Expand “input” → “user messages” down to the types of “content”.

Here is example usage with the Python SDK, showing how to send a list of PDF files (vision and audio are similar).

Title: Chat Completions with PDF Attachments
Endpoint: POST /v1/chat/completions
Content-Type: application/json

Request Body Schema:
  • model: string
  • messages: [
      {
        role: "system" | "user" | "assistant",
        content: string | [
          { type: "text", text: string }
          | { type: "file", file: { filename: string, file_data: string } }
          | { type: "file", file: { file_id: string } }
        ]
      }
    ]

Example in Python 3.12:
from pathlib import Path
import base64
from typing import List, Dict, Any
from openai import OpenAI

def build_pdf_contents(pdf_paths: List[str]) -> List[Dict[str, Any]]:
    """Convert local PDF file paths into chat-API file message objects."""
    pdf_contents: List[Dict[str, Any]] = []
    for p in pdf_paths:
        path = Path(p)
        if path.suffix.lower() != ".pdf":
            raise ValueError(f"{path!s} is not a PDF file")
        raw = path.read_bytes()
        b64 = base64.b64encode(raw).decode("utf-8")
        pdf_contents.append({
            "type": "file",
            "file": {
                "filename": path.name,
                "file_data": f"data:application/pdf;base64,{b64}"
            }
        })
    return pdf_contents

# 1) Prepare any local PDFs you want to inline:
pdf_files = ["learners_paper.pdf"]
pdf_contents = build_pdf_contents(pdf_files)

# 2) Build your messages array:
messages = [
    {
        "role": "system",
        "content": "You are a meticulous research assistant."
    },
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": (
                    "Summarize the attached PDFs and list "
                    "three key findings with page references."
                )
            },
            *pdf_contents,
            # Or reference a PDF already uploaded to OpenAI:
            #{
            #    "type": "file",
            #    "file": { "file_id": "file-abc123def4567890" }
            #}
        ]
    }
]

# 3) Call the Chat Completions API:
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
    max_completion_tokens=2000,
)

print(response.choices[0].message.content)

The output shows it is still answering about the user-attached PDFs:

Here’s a summary of the key findings from the document “Language Models are Unsupervised Multitask Learners” along with page references:

Summary of Key Findings

  1. Zero-shot Task Performance: Language models, particularly GPT-2, can achieve competitive results on various natural language processing tasks without any task-specific fine-tuning or supervised training. For instance, GPT-2 achieved a 55 F1 score on the CoQA dataset, surpassing many supervised baseline models without any manually collected examples (Page 7).

  2. Impact of Model Size: The performance of language models improves significantly with increased model size. Larger models tend to outperform smaller counterparts across multiple datasets, indicating a log-linear relationship between model capacity and performance. This trend is evidenced in experiments where GPT-2 results improved as model dimensions increased (Page 5).

  3. Training Dataset Diversity: The performance of language models is heavily influenced by the diversity and volume of training data. The authors emphasize that models trained on large, diverse datasets like WebText showed enhanced ability to generalize across tasks, demonstrating that they can learn patterns and relationships within the data without explicit supervision (Page 4).

These findings highlight the capabilities of unsupervised language models in performing complex language tasks, emphasizing the importance of data diversity and model scaling.

There have been instances where, with multiple files in one message, only the last one got “encoded”. Since it happened once, I would create multiple user messages for the PDF content before a final user query, just for assurance that it does not happen again.
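That defensive pattern can be sketched as a small helper: one user message per PDF content part, followed by a final user message carrying the actual question. A minimal sketch (the helper name is mine, not from the SDK):

```python
from typing import Any


def split_pdf_messages(pdf_parts: list[dict[str, Any]], question: str) -> list[dict[str, Any]]:
    """Put each PDF content part in its own user message, then append
    a final user message with the query text (hypothetical helper)."""
    messages: list[dict[str, Any]] = [
        {"role": "user", "content": [part]} for part in pdf_parts
    ]
    messages.append({"role": "user", "content": [{"type": "text", "text": question}]})
    return messages


# Example: two inline PDFs become two user messages plus the query message.
parts = [
    {"type": "file", "file": {"filename": "a.pdf", "file_data": "data:application/pdf;base64,AAAA"}},
    {"type": "file", "file": {"filename": "b.pdf", "file_data": "data:application/pdf;base64,BBBB"}},
]
msgs = split_pdf_messages(parts, "Compare the two PDFs.")
print(len(msgs))  # one message per PDF, plus the final query
```

The resulting list can be passed to `chat.completions.create` as-is (prepend your system message first).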

Do you have any issue with such a usage pattern? Remember: Chat Completions does not have internal tools; you have to build your own functions.


Thanks @_j, but all the examples and the File Inputs section show Responses rather than Chat Completions. We have built our own functions and have been on Chat Completions for over a year now (already a Tier 4 customer, if that helps). I add complexity based on a clear user need, and just recently we had a clear use case where a user-uploaded file needed to be compared/reviewed against our external vector store, hence the uptake of file uploads in Chat Completions. This was working as of last week, before these regressions, which are likely related to an internal change in the File Uploads API to clamp down on purpose = "user_data". What I am noticing is that the change deprecating file uploads for Chat Completions was perhaps unintentional, but it seems to be pushing everyone towards the Responses API.

Specifically, this part of your code example will not work with chat.completions.create. Of course I will implement a workaround, but file uploads are basically broken from chat.completions.create, and file inputs with URLs are no longer allowed, so any external links (S3, Vercel, or any other blob storage) will not work either.
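Since URL file inputs are no longer accepted, one workaround is to fetch the blob-store object server-side and inline it as a base64 data URI. A sketch, assuming the URL is reachable from your backend (the helper name is mine):

```python
import base64
from typing import Any


def pdf_bytes_to_file_part(filename: str, data: bytes) -> dict[str, Any]:
    """Wrap raw PDF bytes as an inline `file` content part for Chat Completions."""
    b64 = base64.b64encode(data).decode("utf-8")
    return {
        "type": "file",
        "file": {"filename": filename, "file_data": f"data:application/pdf;base64,{b64}"},
    }


# Real usage: download from S3/Vercel/blob storage first (stdlib shown;
# swap in boto3 or your storage SDK as needed):
# import urllib.request
# with urllib.request.urlopen("https://example.com/report.pdf") as resp:
#     part = pdf_bytes_to_file_part("report.pdf", resp.read())

# Offline demo with stand-in bytes:
part = pdf_bytes_to_file_part("report.pdf", b"%PDF-1.4 ...")
print(part["file"]["file_data"][:28])
```

This keeps the external URL out of the API request entirely; the model only ever sees the inlined data.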

This is what I previously documented:

# FILES: -- Messages example for Chat Completions, demonstrating PDF `file` content parts. --
# - PDFs are extracted text + page image, provided to the model in-context.
# - Only the "user" role may include `file` parts. 
# - Provide one content part per PDF.
# - Use exclusively one of `file_id` OR base64 `file_data` inside each `file` object.

file_messages = [
  {
    "role": "system",
    "content": "You are a meticulous research assistant."
  },
  {
    "role": "user",
    "content": [
      {
        "type": "text",
        "text": "Summarize the attached PDFs and list three key findings with page references."
      },

      # --- PDF via uploaded file_id (stored in OpenAI Files) ---
      {
        "type": "file",
        "file": {
          "file_id": "file-abc123def4567890"
        }
      },

      # --- PDF via inline base64 data (no prior upload) ---
      {
        "type": "file",
        "file": {
          "filename": "product-brochure.pdf",
          "file_data": "data:application/pdf;base64,JVBERi0xLjQKJcTl8uXr..."
        }
      }
    ]
  }
]
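The constraints in the comments above (file parts only on "user" messages; exactly one of `file_id` or `file_data` per `file` object) can be checked client-side before sending. A minimal validator sketch, not an official SDK feature:

```python
from typing import Any


def validate_file_parts(messages: list[dict[str, Any]]) -> None:
    """Raise ValueError if any `file` content part breaks the documented rules."""
    for msg in messages:
        content = msg.get("content")
        if not isinstance(content, list):
            continue  # plain string content carries no file parts
        for part in content:
            if part.get("type") != "file":
                continue
            if msg.get("role") != "user":
                raise ValueError("file parts are only allowed on user messages")
            f = part.get("file", {})
            has_id, has_data = "file_id" in f, "file_data" in f
            if has_id == has_data:  # both present, or neither
                raise ValueError("use exactly one of file_id or file_data")


ok = [{"role": "user", "content": [{"type": "file", "file": {"file_id": "file-abc"}}]}]
validate_file_parts(ok)  # passes silently

bad = [{"role": "system", "content": [{"type": "file", "file": {"file_id": "file-abc"}}]}]
try:
    validate_file_parts(bad)
    raised = False
except ValueError:
    raised = True
```

Failing fast locally is cheaper than a rejected request, especially when the message array is assembled from several sources.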

I tried it with a fresh upload, and the request succeeds as before (maybe a little slower with file_id).
openai.__version__
'1.101.0'

Ensure:

  • upload "purpose": "user_data"
  • uploaded by the same project ID (api key) making the API call (data scoping)
  • not sending alternate organization or project headers in request
  • don’t use SDKs that are consistently out-of-date or that mangle requests
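The first bullet can be verified programmatically by retrieving the file's metadata before building the message. A sketch (only `client.files.retrieve` is the real SDK call; the check helper is mine):

```python
def assert_user_data_purpose(file_obj) -> None:
    """Raise if a retrieved file object was not uploaded with purpose 'user_data'."""
    purpose = getattr(file_obj, "purpose", None)
    if purpose != "user_data":
        raise ValueError(f"expected purpose 'user_data', got {purpose!r}")


# Real usage (same project / API key that uploaded the file):
# from openai import OpenAI
# client = OpenAI()
# assert_user_data_purpose(client.files.retrieve("file-abc123def4567890"))

# Offline demo with a stand-in object:
from types import SimpleNamespace

assert_user_data_purpose(SimpleNamespace(purpose="user_data"))  # ok
try:
    assert_user_data_purpose(SimpleNamespace(purpose="assistants"))
    rejected = False
except ValueError:
    rejected = True
```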

Here is a function with a “content” part creator for any type of attachment you want – I just tried all modalities expected on Chat Completions: OK

from pathlib import Path
import base64
from openai import OpenAI


def make_content_part(item: str, *, kind: str | None = None, detail: str = "auto") -> dict[str, object]:
    """
    Create a single chat content part for Chat Completions:
      - text           => {"type": "text", "text": "..."}
      - PDF file       => {"type": "file", "file": {"file_id": "..."}}
                          or {"type": "file", "file": {"filename": "...", "file_data": "data:application/pdf;base64,..."}}  # local only
      - audio (mp3/wav)=> {"type": "input_audio", "input_audio": {"data": "...", "format": "mp3|wav"}}
      - image          => {"type": "image_url", "image_url": {"url": "<http(s) or data: URI>", "detail": "low|high|auto"}}

    Classification precedence (default):
      1) file id: length and startswith "file-" or "file_"
      2) image URL: http(s) with no whitespace OR data:image/*;base64,...
      3) local file: existing path:
           - .pdf  -> PDF "file"
           - .mp3/.wav -> input_audio
           - else -> image as data URI (JPEG if no extension)
      4) fallback -> plain text

    'kind' is a hint ("text" | "file" | "image" | "audio"), used if it can be satisfied without violating the spec.
    'detail' is passed only for image content and defaults to "auto".
    """
    def _is_http_url(s: str) -> bool:
        return s.startswith(("http://", "https://")) and not any(ch.isspace() for ch in s)

    def _is_data_image_uri(s: str) -> bool:
        # Minimal detection for spec: only image data URIs are treated as images
        return s.startswith("data:image/")

    def _image_mime_for_ext(ext: str) -> str:
        e = ext.lower().lstrip(".")
        if e in ("", "jpg", "jpeg"):
            return "image/jpeg"
        if e == "png":
            return "image/png"
        if e == "gif":
            return "image/gif"
        if e == "webp":
            return "image/webp"
        if e in ("tif", "tiff"):
            return "image/tiff"
        if e == "bmp":
            return "image/bmp"
        if e == "svg":
            return "image/svg+xml"
        return f"image/{e}"

    def _as_image_url(url: str) -> dict[str, object]:
        return {"type": "image_url", "image_url": {"url": url, "detail": detail}}

    s = item
    is_file_id = (len(s) in range(20, 40)) and s.startswith(("file-", "file_"))  # len(fileID) currently 27
    is_http = _is_http_url(s)
    is_data_img = _is_data_image_uri(s)
    p = Path(s)
    exists = p.exists() and p.is_file()
    ext = p.suffix.lower().lstrip(".") if exists else ""

    # Try the hint first, but only if satisfiable per spec; otherwise fall through to auto.
    if kind:
        k = kind.lower().strip()
        if k in ("text", "input_text"):
            return {"type": "text", "text": s}
        if k in ("file", "pdf"):
            if is_file_id:
                return {"type": "file", "file": {"file_id": s}}
            if exists and ext == "pdf":
                b64 = base64.b64encode(p.read_bytes()).decode()
                return {"type": "file", "file": {"filename": p.name, "file_data": f"data:application/pdf;base64,{b64}"}}
            # else: cannot satisfy 'file' hint here; fall through to auto
        if k in ("audio", "input_audio"):
            if exists and ext in ("mp3", "wav"):
                b64 = base64.b64encode(p.read_bytes()).decode()
                return {"type": "input_audio", "input_audio": {"data": b64, "format": ext}}
            # else: fall through to auto
        if k in ("image", "image_url"):
            if is_http or is_data_img:
                return _as_image_url(s)
            if exists:
                raw = p.read_bytes()
                b64 = base64.b64encode(raw).decode()
                mime = _image_mime_for_ext(ext)
                return _as_image_url(f"data:{mime};base64,{b64}")
            # else: fall through to auto

    # Default auto-classification (spec precedence)
    if is_file_id:
        return {"type": "file", "file": {"file_id": s}}

    if is_http or is_data_img:
        return _as_image_url(s)

    if exists:
        raw = p.read_bytes()
        b64 = base64.b64encode(raw).decode()
        if ext == "pdf":
            return {"type": "file", "file": {"filename": p.name, "file_data": f"data:application/pdf;base64,{b64}"}}
        if ext in ("mp3", "wav"):
            return {"type": "input_audio", "input_audio": {"data": b64, "format": ext}}
        mime = _image_mime_for_ext(ext)
        return _as_image_url(f"data:{mime};base64,{b64}")

    return {"type": "text", "text": s}



# -- Procedural demo --

# Start with a list of attachments (could be filenames, URLs, or file IDs).
# Each assignment below overrides the previous one; keep whichever you want to test:
attached_files = ["241207-164148-ballad.mp3"]
attached_files = ["catcube.png"]
attached_files = ["learners_paper.pdf"]
attached_files = ["file-AHNwnQacW8EbfQpudJMYPm"]

# Convert each attachment into a content part
content_parts = [make_content_part(item) for item in attached_files]

# Build messages array for chat.completions
messages = [
    {"role": "system", "content": "You are a meticulous research assistant."},
    {
        "role": "user",
        "content": [
            {
                "type": "text",
                "text": """
What is being shown in the following content?
""".strip()
            },
            *content_parts
        ]
    }
]

# Send request
from openai import OpenAI
client = OpenAI()
response = client.chat.completions.create(
    model="gpt-4.1-mini",  # gpt-4o-audio-preview for audio types
    messages=messages,
    max_completion_tokens=2500,
    service_tier="priority",
)

print(response.choices[0].message.content)

(I distilled down the original function a bit with gpt-5 style mangling.)


Thank you for this example; it works out of the box using the API natively. I have a LangChain implementation which adds a bit of overhead while passing the content_parts as part of the PromptTemplate using a custom MessagesPlaceholder. The LangChain implementation will remain as v1 tech debt until we refactor in the future. I appreciate the detailed example; it is detailed enough to help anyone use file uploads with the Chat Completions API until URL or file-ID fetching for user_data is available.

FYI, LangChain code supplementing your example:

from langchain_core.messages import HumanMessage
from langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder

# Wrap any structured content parts (e.g., files) as a user message
content_parts_messages = [HumanMessage(content=content_parts)] if content_parts else []
prompt = ChatPromptTemplate.from_messages([
    ("system", "{system_prompt}"),
    ("system", "{context}"),
    ("user", "{input}"),
    MessagesPlaceholder("content_parts"),
])
# Example inputs for invocation (keys must match the template variables)
inputs = {
    "system_prompt": system_prompt_text,
    "context": context_text,
    "input": user_text,
    "content_parts": content_parts_messages,
}