“You’ll never need 128k context,” some say. Hold my beer.

This is not a joke.

The assistant called a function that returns the full FAQ (~10k characters) for almost every chat message. It’s a search tool that splits the query into an array of words and uses each term in a contains filter.
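A hypothetical reconstruction of what such a search tool looks like (the `FAQ` data and function names are my own, not the actual implementation): because every word of the query, including "how" and "i", goes into a contains filter, nearly every message matches nearly every entry, and the full text of each match is returned.

```python
# Hypothetical sketch of the naive FAQ search described above.
FAQ = [
    {"q": "How do I reset my password?", "a": "Go to settings and click Reset."},
    {"q": "How do I change my email?",   "a": "Open your profile and edit it."},
]

def search_faq(query: str) -> str:
    # Split the query into words, use every word as a `contains` filter.
    terms = query.lower().split()
    hits = [
        e for e in FAQ
        if any(t in (e["q"] + " " + e["a"]).lower() for t in terms)
    ]
    # Every hit is returned in full -- with generic terms like "how"
    # or "i", that is effectively the whole FAQ on every single call.
    return "\n\n".join(f"Q: {e['q']}\nA: {e['a']}" for e in hits)

results = search_faq("how do i log in?")
print(results.count("Q:"))  # generic words match every entry
```

With a real 10k-character FAQ, that string lands in the thread on every tool call, and the Assistants API keeps replaying it as part of the prompt on every subsequent run.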

{
    "id": "run_ZLu3e3EMTjOsSYRtTiMgeQf6",
    "object": "thread.run",
    "created_at": 1743867475,
    "assistant_id": "asst_eDTguXSJ1i1zjmhgaDwOEtDe",
    "thread_id": "thread_c10prvB5Jn1mlx0RAySzhRLx",
    "status": "completed",
    "started_at": 1743867491,
    "expires_at": null,
    "cancelled_at": null,
    "failed_at": null,
    "completed_at": 1743867495,
    "required_action": null,
    "last_error": null,
    "model": "gpt-4o-mini",
    "instructions": "- REMOVED - a total of 2k token",
    "tools": "- REMOVED - 8 tools",
    "tool_resources": [],
    "metadata": [],
    "temperature": 0.3,
    "top_p": 1,
    "reasoning_effort": null,
    "max_completion_tokens": null,
    "max_prompt_tokens": null,
    "truncation_strategy": {
        "type": "auto",
        "last_messages": null
    },
    "incomplete_details": null,
    "usage": {
        "prompt_tokens": 155693,
        "completion_tokens": 174,
        "total_tokens": 155867,
        "prompt_token_details": {
            "cached_tokens": 0
        },
        "completion_tokens_details": {
            "reasoning_tokens": 0
        }
    },
    "response_format": {
        "type": "json_object"
    },
    "tool_choice": "auto",
    "parallel_tool_calls": true
}
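Note the `"truncation_strategy": {"type": "auto"}` in the run above: it trims old messages when the context fills up, but it does nothing about a single 10k-character tool result landing on every turn. A minimal sketch of capping the tool result yourself before handing it back (the limit and function name are my own, not an API feature):

```python
MAX_TOOL_OUTPUT_CHARS = 4_000  # hypothetical per-result budget

def cap_tool_output(text: str, limit: int = MAX_TOOL_OUTPUT_CHARS) -> str:
    """Trim an oversized tool result before submitting it back to the run."""
    if len(text) <= limit:
        return text
    return text[:limit] + "\n[truncated - refine the search instead]"

capped = cap_tool_output("x" * 10_000)
print(len(capped))  # bounded, no matter how big the FAQ dump is
```

The Assistants API also accepts `truncation_strategy={"type": "last_messages", "last_messages": N}` to hard-limit how much history each run sees, which bounds the damage even when a tool misbehaves.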

The whole thread ended at:

Length: 70 messages
Tokens: 2378073 · 2372764 in, 5309 out
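Back-of-the-envelope math for that thread, assuming gpt-4o-mini pricing of $0.15 per 1M input tokens and $0.60 per 1M output tokens (rates at the time of writing; check the current pricing page):

```python
# Rough cost of the thread above at assumed gpt-4o-mini rates.
tokens_in, tokens_out = 2_372_764, 5_309
cost = tokens_in * 0.15 / 1e6 + tokens_out * 0.60 / 1e6
print(f"${cost:.2f}")  # → $0.36
```

Cheap in absolute terms, but that is 2.3M tokens spent to deliver 5k tokens of answers: over 99.7% of the spend is the same FAQ being re-read on every turn.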

I use an auto-chat tool, and speak of the devil, just as I’m writing this:

{
    "prompt_tokens": 213621,
    "completion_tokens": 139,
    "total_tokens": 213760,
    "prompt_token_details": {
        "cached_tokens": 0
    },
    "completion_tokens_details": {
        "reasoning_tokens": 0
    }
}

The whole chat came to:

Length: 92 messages
Tokens: 3,432,500

Did I break the Matrix?