Chat-completions API: multi-file input, only uses last file?

Multi-file input definitely does not work, although it seemed that you could previously exceed the context window with lists of PDFs. You don’t even need a needle in this mini-haystack of image-only test PDFs I made:

# 1 = jokes, 2 = pricing
myfilepaths = ["test1.pdf", "test2.pdf"]
# myfilepaths.reverse()

Asking about both files using Chat Completions + gpt-4.1:

Here are the answers based on the content of the PDF you provided (see page 1):

  1. Document’s Website Title:
    The title of the website as shown on page 1 is “Pricing”.

  2. Costs of GPT-4.1 to make similar (per 1M tokens):
    According to the pricing table on page 1 for gpt-4.1:

    • Input: $2.00
    • Cached Input: $0.50
    • Output: $8.00

These prices are for processing 1 million text tokens. If you want to create a service similar to GPT-4.1 using this API, these would be your base costs per 1 million tokens processed for input and output.

Reversed:

Here are the answers based on the provided PDF:

1. What’s the document’s website title?

  • The website title, as shown at the top of the page (page 1), is “My Joke Website”.

2. What are costs of gpt-4.1 to make similar?

  • The PDF itself does not contain any information about GPT-4.1 or its costs.

  • However, based on OpenAI’s current pricing for GPT-4.1 (as of June 2024):

    • API pricing (approximate, may change):
      • GPT-4.1 (8K context): ~$10 per 1M input tokens and ~$30 per 1M output tokens.
      • GPT-4.1 (32K context): ~$30 per 1M input tokens and ~$60 per 1M output tokens.
    • If you use GPT-4.1 to generate jokes or content like the “My Joke Website” shown:
      • Simple joke generation (very short text) will cost only a fraction of a cent per request.
      • Monthly costs depend on website usage. For 1,000,000 jokes generated per month, expect under ~$20–$60 if heavily optimized.

Summary:

  • The website in the PDF is “My Joke Website”.
  • GPT-4.1 cost for a similar project would mainly depend on the number

In both cases, the AI refers to a PDF in the singular and answers only from the last file in the list. Also, GPT-4.1 fabricates its prices if it doesn’t see them.
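To make the comparison less dependent on reading each answer by eye, a small helper could report which test file's content shows up in a response. This is only a sketch: `MARKERS` and `files_reflected` are hypothetical names, and the marker phrases are simply picked from the two outputs quoted above ("My Joke Website" for the jokes PDF, the "$8.00" output price for the pricing PDF).

```python
# Hypothetical marker phrases: one distinctive string per test PDF,
# taken from the answers quoted above.
MARKERS = {
    "test1.pdf": "My Joke Website",  # jokes PDF
    "test2.pdf": "$8.00",            # pricing PDF (output price in its table)
}


def files_reflected(answer, markers=MARKERS):
    """Return the filenames whose marker phrase appears in the answer text."""
    return [name for name, phrase in markers.items() if phrase in answer]
```

If the model really read both files, an answer to the two-part question should contain both markers and the helper would return both filenames; a single filename suggests only that file was used.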

gpt-4o behaves the same:

  1. The document does not provide a website title, as it’s a pricing table from an image in the PDF.

  2. According to the image on page 1, the costs for gpt-4.1 are as follows:

    • Input: $2.00 per 1M tokens
    • Cached Input: $0.50 per 1M tokens
    • Output: $8.00 per 1M tokens

Quickie test code for your two PDFs:

import base64

from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

myfilepaths = ["test1.pdf", "test2.pdf"]  # 1 = jokes, 2 = pricing
# myfilepaths.reverse()
myfiles = []
for path in myfilepaths:
    with open(path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode("utf-8")
        myfiles.append(b64)

content_items = []
content_items.append({
    "type": "text",
    "text": ("Use PDFs to answer:\n"
             "1. What's the document's website title?\n"
             "2. What are costs of gpt-4.1 to make similar?"
             ),
})
for idx, b64data in enumerate(myfiles, start=1):
    content_items.append({
        "type": "file",
        "file": {
            "file_data": f"data:application/pdf;base64,{b64data}",
            "filename": f"file_{idx}.pdf",
            # or just "file_id": "id-12345"
        }
    })
req = {
    "model": "gpt-4.1",
    "max_tokens": 300,  # low cap; longer answers get cut off mid-sentence
    "messages": [
        {
            "role": "user",
            "content": content_items
        },
    ]
}

response = client.chat.completions.create(**req)
print(response.choices[0].message.content)
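
If only the last file in a message is being used, one possible workaround is to send one request per PDF and combine the answers yourself. The sketch below only builds the request dicts, using the same `file` content-part shape as the script above; `build_per_file_requests` is a hypothetical helper, and whether this actually sidesteps the behavior is untested here.

```python
import base64


def build_per_file_requests(question, pdfs, model="gpt-4.1", max_tokens=300):
    """Build one Chat Completions request dict per PDF.

    `pdfs` maps a filename to the raw PDF bytes.
    """
    requests = []
    for filename, raw in pdfs.items():
        b64 = base64.b64encode(raw).decode("utf-8")
        requests.append({
            "model": model,
            "max_tokens": max_tokens,
            "messages": [{
                "role": "user",
                "content": [
                    {"type": "text", "text": question},
                    # Exactly one file part per request
                    {"type": "file", "file": {
                        "file_data": f"data:application/pdf;base64,{b64}",
                        "filename": filename,
                    }},
                ],
            }],
        })
    return requests
```

Each dict can then be passed as `client.chat.completions.create(**req)`, the same way the single request is sent above.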