GPT-4o / GPT-4 API pricing differences when using API/Playground


In my testing, GPT-4o (through the Playground and the API) seems to use far more tokens than GPT-4 Turbo.
Using the same Assistant (file retrieval, same settings) on four different questions, this is what I found:

  • GPT-4o (one thread) used between 17968 and 21288 input tokens per question, totalling 76860 tokens ($0.3843 in input tokens), or about 9.6 cents per question.
  • GPT-4 Turbo (one thread, a different one) used between 2304 and 3220 input tokens per question, totalling 11012 tokens ($0.11012 in input tokens), or about 2.75 cents per question.
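
The per-question figures follow directly from the totals. Here is a minimal sketch of the arithmetic, assuming input prices of $5 per 1M tokens for GPT-4o and $10 per 1M tokens for GPT-4 Turbo (inferred from the dollar amounts above, not taken from a price list):

```python
# Input prices in USD per 1M tokens. These are assumptions inferred from the
# totals quoted above ($0.3843 / 76860 tokens and $0.11012 / 11012 tokens).
PRICE_PER_MILLION = {"gpt-4o": 5.00, "gpt-4-turbo": 10.00}

def input_cost(model: str, total_tokens: int, questions: int) -> tuple[float, float]:
    """Return (total cost in USD, cost per question in cents)."""
    total_usd = total_tokens * PRICE_PER_MILLION[model] / 1_000_000
    cents_per_question = total_usd * 100 / questions
    return total_usd, cents_per_question

for model, tokens in [("gpt-4o", 76860), ("gpt-4-turbo", 11012)]:
    total_usd, cents = input_cost(model, tokens, questions=4)
    print(f"{model}: ${total_usd:.5f} total, {cents:.2f} cents per question")
```

So 4o is cheaper per token, but with roughly seven times the input tokens it comes out about 3.5 times more expensive per question here.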

Now, I understand that this could be due to the text formatting in my files, or a bug.
I also know that I could counter the problem by setting a max input token limit when using the API.
But I would like to know if anyone else is having the same issue, because to me it seems that GPT-4o just prefers to use more tokens by default, increasing the cost on my end (unless I set a max input token limit).

No one seems to be mentioning this, so I might be the only one with the issue, or I might be doing something wrong…

Has anyone else noticed similar differences? Any advice on managing token usage better with 4o? Thanks!

PS: It’s in the Default project, so the organization/project limits should be the same for GPT-4 Turbo and 4o.

Let’s assume:

  • file_search gives a similar quantity of results for similar inputs

So that leaves us with:

  • the language the AI generates, and the number of search calls and follow-up calls it makes iteratively before responding.

You can find out what's actually going on in those particular threads by inspecting the run steps. You might find that the AI is not using tools effectively, or is using them too effectively. The cheap outcome is a model that ignores tools that would have been useful.

It also depends on how well or poorly the model follows this tool instruction: "Issue multiple queries to the msearch command only when the user's question needs to be decomposed to find different facts."

Run steps let you see the usage of each individual iteration, but the context you paid for, which would be useful for diagnosis, is hidden behind…

  "step_details": {
    "type": "message_creation",
    "message_creation": {
      "message_id": "msg_abc123"
    }
  }


For now, this is always going to be an empty object.
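
A minimal sketch of that investigation, assuming each run step has been converted to a plain dict shaped like the API's run step object (a `usage` object with `prompt_tokens`, and `step_details` whose `type` is `message_creation` or `tool_calls`). The helper name `summarize_steps` and the sample numbers are made up for illustration:

```python
from collections import Counter

def summarize_steps(steps: list[dict]) -> dict:
    """Total prompt tokens and count tool calls by type across run steps."""
    prompt_tokens = 0
    tool_calls = Counter()
    for step in steps:
        usage = step.get("usage") or {}  # usage is null while a step is in progress
        prompt_tokens += usage.get("prompt_tokens", 0)
        details = step.get("step_details", {})
        if details.get("type") == "tool_calls":
            for call in details.get("tool_calls", []):
                tool_calls[call.get("type", "unknown")] += 1
    return {
        "steps": len(steps),
        "prompt_tokens": prompt_tokens,
        "tool_calls": dict(tool_calls),
    }

# Hypothetical run: one file_search step followed by the final message.
sample_steps = [
    {"usage": {"prompt_tokens": 1200, "completion_tokens": 30, "total_tokens": 1230},
     "step_details": {"type": "tool_calls",
                      "tool_calls": [{"id": "call_1", "type": "file_search", "file_search": {}}]}},
    {"usage": {"prompt_tokens": 9800, "completion_tokens": 250, "total_tokens": 10050},
     "step_details": {"type": "message_creation",
                      "message_creation": {"message_id": "msg_abc123"}}},
]
print(summarize_steps(sample_steps))
```

With the official Python SDK, the steps would come from something like `client.beta.threads.runs.steps.list(thread_id=..., run_id=...)`, with each item converted via `model_dump()`. Treat that call as an assumption to check against your SDK version. Comparing the summaries for a 4o thread and a 4 Turbo thread should show whether the extra tokens come from extra file_search steps.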

Same problem here. The moment I switched to the new “cheaper” version, the system started registering twice as many tokens from my apps, leading to exactly the same spending.

Same thing: as soon as we switched to GPT-4o, we saw an almost 2x increase in token usage.

Hey, I’m here with an update!
It turns out that GPT-4o will automatically use file search, but GPT-4 Turbo won’t do it on its own (unless you force it to use file_search).
So that was the reason for it. I wonder why, given the same set of parameters, GPT-4o will always use the files and GPT-4 Turbo won’t…
But anyway, no real mystery!
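
For a like-for-like comparison, one option is to pin the tool behaviour on both models instead of leaving it to each model's judgment. A minimal sketch, assuming the Assistants API run parameters `tool_choice` and `max_prompt_tokens`; the helper `build_run_kwargs` and the assistant ID are hypothetical:

```python
def build_run_kwargs(assistant_id, force_file_search=True, max_prompt_tokens=None):
    """Build keyword arguments for creating a run with pinned tool behaviour."""
    kwargs = {
        "assistant_id": assistant_id,
        # Force file_search on every run, or disable tools entirely, so both
        # models are compared under the same conditions.
        "tool_choice": {"type": "file_search"} if force_file_search else "none",
    }
    if max_prompt_tokens is not None:
        # Optional cap on input tokens used over the course of the run.
        kwargs["max_prompt_tokens"] = max_prompt_tokens
    return kwargs

kwargs = build_run_kwargs("asst_example", force_file_search=True, max_prompt_tokens=5000)
print(kwargs)
# Assumed usage: client.beta.threads.runs.create(thread_id=thread_id, **kwargs)
```

Running the same four questions with identical `kwargs` on both models should make the token counts comparable, since neither model gets to skip the search.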
