Chat-completions API: multi-file input, only uses last file?

GPT-4o and GPT-4.1

I tried sending multiple base64-encoded files in one user message, passing each file as its own content dict, using the syntax from the docs linked below.

https://platform.openai.com/docs/guides/pdf-files?api-mode=chat

When I ask a question that requires info from more than one file, the model only seems to consider the last file. Is this expected (i.e. multi-file input is not supported), or is it a bug?
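For reference, this is roughly how I build each file's content dict per the linked docs (filenames here are placeholders, not my actual files):

```python
# Sketch of one "file" content part for chat completions, per the docs.
# The filename passed in is a placeholder for illustration.
import base64


def pdf_part(path: str) -> dict:
    """Encode one PDF on disk as a chat-completions file content part."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return {
        "type": "file",
        "file": {
            "filename": path,
            "file_data": f"data:application/pdf;base64,{b64}",
        },
    }


# One user message = a text question plus one such part per PDF:
# content = [{"type": "text", "text": "..."}] + [pdf_part(p) for p in paths]
```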

Hi @DarthFader

There are certain limits that should be considered when using PDFs with chat completions:

File size limitations

You can upload up to 100 pages and 32MB of total content in a single request to the API, across multiple file inputs.
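If you want a quick pre-flight check against the size limit, something like this works with only the standard library (page counting would need a PDF library, and whether the 32 MB is measured before or after base64 encoding is my assumption; I'm checking raw bytes here):

```python
# Pre-flight check of combined file size against the 32 MB total-content
# limit. Assumes the limit applies to raw bytes; only size is checked,
# not the 100-page limit, which would require a PDF library.
import os

MAX_TOTAL_BYTES = 32 * 1024 * 1024  # 32 MB across all file inputs


def within_size_limit(paths: list[str]) -> bool:
    """Return True if the combined raw size of all files fits the limit."""
    total = sum(os.path.getsize(p) for p in paths)
    return total <= MAX_TOTAL_BYTES
```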

Yes, I know those limits; I was only sending two small toy PDFs of 2-3 pages each.

It definitely looks like multi-file inputs do not work, i.e. only the last file is used.

I can confirm it does not work, although it previously seemed you could even go over the context limit with lists of PDFs. You don't even need a needle in this mini-haystack of image-only test PDFs I made:

# 1 = jokes, 2 = pricing
myfilepaths = ["test1.pdf", "test2.pdf"]
# myfilepaths.reverse()

Asking about both using chat completions + gpt-4.1:

Here are the answers based on the content of the PDF you provided (see page 1):

  1. Document’s Website Title:
    The title of the website as shown on page 1 is “Pricing”.

  2. Costs of GPT-4.1 to make similar (per 1M tokens):
    According to the pricing table on page 1 for gpt-4.1:

    • Input: $2.00
    • Cached Input: $0.50
    • Output: $8.00

These prices are for processing 1 million text tokens. If you want to create a service similar to GPT-4.1 using this API, these would be your base costs per 1 million tokens processed for input and output.

Reversed:

Here are the answers based on the provided PDF:

1. What’s the document’s website title?

  • The website title, as shown at the top of the page (page 1), is “My Joke Website”.

2. What are costs of gpt-4.1 to make similar?

  • The PDF itself does not contain any information about GPT-4.1 or its costs.

  • However, based on OpenAI’s current pricing for GPT-4.1 (as of June 2024):

    • API pricing (approximate, may change):
      • GPT-4.1 (8K context): ~$10 per 1M input tokens and ~$30 per 1M output tokens.
      • GPT-4.1 (32K context): ~$30 per 1M input tokens and ~$60 per 1M output tokens.
    • If you use GPT-4.1 to generate jokes or content like the “My Joke Website” shown:
      • Simple joke generation (very short text) will cost only a fraction of a cent per request.
      • Monthly costs depend on website usage. For 1,000,000 jokes generated per month, expect under ~$20–$60 if heavily optimized.

Summary:

  • The website in the PDF is “My Joke Website”.
  • GPT-4.1 cost for a similar project would mainly depend on the number

In both cases, the AI refers to a PDF in the singular. Also, GPT-4.1 fabricates its prices if it doesn’t see them.

gpt-4o behaves the same:

  1. The document does not provide a website title, as it’s a pricing table from an image in the PDF.

  2. According to the image on page 1, the costs for gpt-4.1 are as follows:

    • Input: $2.00 per 1M tokens
    • Cached Input: $0.50 per 1M tokens
    • Output: $8.00 per 1M tokens
Quickie test code for your two PDFs:

import base64
from openai import OpenAI

client = OpenAI()

myfilepaths = ["test1.pdf", "test2.pdf"]  # 1 = jokes, 2 = pricing
# myfilepaths.reverse()
myfiles = []
for path in myfilepaths:
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
        myfiles.append(b64)

content_items = []
content_items.append({
    "type": "text",
    "text": ("Use PDFs to answer:\n"
             "1. What's the document's website title?\n"
             "2. What are costs of gpt-4.1 to make similar?"
             ),
})
for idx, b64data in enumerate(myfiles, start=1):
    content_items.append({
        "type": "file",
        "file": {
            "file_data": f"data:application/pdf;base64,{b64data}",
            "filename": f"file_{idx}.pdf",
            # or just "file_id": "id-12345"
        }
    })
req = {
    "model": "gpt-4.1", "max_tokens": 300,
    "messages": [
        {
            "role": "user",
            "content": content_items
        },
    ]
}

response = client.chat.completions.create(**req)
print(response.choices[0].message.content)
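Until multi-file input works, one possible workaround (an untested sketch; model name, max_tokens, and the file-part syntax are taken from the test code above) is to send one request per PDF and stitch the answers together yourself:

```python
# Hypothetical workaround: build one chat-completions request per PDF,
# then combine the per-file answers client-side. This only constructs
# the request dicts; the actual call is client.chat.completions.create(**req).
import base64


def build_single_file_request(path: str, question: str,
                              model: str = "gpt-4.1") -> dict:
    """Build a chat-completions request containing exactly one PDF."""
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
    return {
        "model": model,
        "max_tokens": 300,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "file", "file": {
                    "filename": path,
                    "file_data": f"data:application/pdf;base64,{b64}",
                }},
            ],
        }],
    }


# requests = [build_single_file_request(p, "What's the website title?")
#             for p in ["test1.pdf", "test2.pdf"]]
# answers = [client.chat.completions.create(**r).choices[0].message.content
#            for r in requests]
```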