Why does GPT-4.1 get weekdays wrong in long date tasks?

Hi everyone,

I’m a developer and I’ve been noticing some strange behavior when using the GPT-4.1 model for relatively simple tasks involving dates.

Even when I provide multiple references in the prompt (current day, time, year, etc.), the model often gets the day of the week wrong when processing large volumes of information that include dates. In different parts of the response, it ends up calculating the weekday incorrectly, usually being one or more days ahead or behind.

I even tried integrating a calculator step to help with this, but the issue still occurs occasionally, which seems like a type of “hallucination” related to temporal calculations.

I’d like to know:

  • Has anyone else experienced this behavior?

  • Are there any best practices to reduce this kind of error?

  • Any prompt, validation, or architectural strategies that work well in these cases?

Thanks in advance for any insights or experiences you can share.

What "calculator step" did you try? I'm a bit confused as to what your problem is.

Are you giving the LLM a date (1/29/26 for example) and then asking it to find out what day of the week that is?

If that's the case, I would supply a simple tool backed by a function that does this automatically. For example, the LLM calls a tool named something like `date_to_weekday` that takes the full ISO-formatted date as input and returns the day of the week, computed by deterministic code.

This would prevent any issues with the LLM needing to calculate it on its own and making mistakes.
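Here is a minimal sketch of such a tool in Python, in case it helps. The name `date_to_weekday` comes from the suggestion above; the `iso_date` parameter name, the error-string convention, and the `DATE_TO_WEEKDAY_TOOL` schema are my own assumptions. The schema follows the general JSON Schema shape used for function calling, so check it against the API version you are on.

```python
import datetime

# Fixed English names so the answer does not depend on the host locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def date_to_weekday(iso_date: str) -> str:
    """Return the weekday name for an ISO 8601 date string (YYYY-MM-DD)."""
    try:
        d = datetime.date.fromisoformat(iso_date.strip())
    except (ValueError, AttributeError):
        # Return the error as text so the model can read it and retry.
        return f"Error: expected an ISO date like 2026-01-29, got {iso_date!r}."
    return WEEKDAYS[d.weekday()]  # date.weekday(): Monday == 0


# Hypothetical tool definition to register with the model (assumed shape).
DATE_TO_WEEKDAY_TOOL = {
    "type": "function",
    "function": {
        "name": "date_to_weekday",
        "description": "Return the day of the week for an ISO date.",
        "parameters": {
            "type": "object",
            "properties": {
                "iso_date": {
                    "type": "string",
                    "description": "Date formatted as YYYY-MM-DD.",
                }
            },
            "required": ["iso_date"],
        },
    },
}


if __name__ == "__main__":
    print(date_to_weekday("2026-01-29"))
```

The point of the design is that the model only extracts the date and passes it along; it never does the weekday arithmetic itself.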

A quick brainstorm of what you can furnish to the AI, either automatically in the prompt or as a tool it can call.

"""ai_cal.py - produce markdown calendars for AI consumption"""
from __future__ import annotations

import calendar
import datetime


def markdown_calendar(year: int, month: int, month_count: int = 1) -> str:
    def coerce_int(name: str, value: object) -> tuple[int | None, str | None]:
        if type(value) is bool:
            return None, f"Error: {name} must be an integer, got bool ({value})."
        if isinstance(value, int):
            return value, None
        if isinstance(value, str):
            s = value.strip()
            if not s:
                return None, f"Error: {name} must be an integer, got an empty string."
            try:
                return int(s, 10), None
            except ValueError:
                return None, f"Error: {name} must be an integer, got {value!r}."
        return None, f"Error: {name} must be an integer, got {type(value).__name__}."

    y, err = coerce_int("year", year)
    if err:
        return err
    m, err = coerce_int("month", month)
    if err:
        return err
    mc, err = coerce_int("month_count", month_count)
    if err:
        return err

    if y is None or m is None or mc is None:
        return "Error: invalid input."

    if not (1 <= y <= 9999):
        return f"Error: year must be in 1..9999, got {y}."
    if not (1 <= m <= 12):
        return f"Error: month must be in 1..12, got {m}."
    if mc < 1:
        return f"Error: month_count must be >= 1, got {mc}."

    cal = calendar.Calendar(firstweekday=6)  # Sunday-first

    def ym_add(y0: int, m0: int, add: int) -> tuple[int, int]:
        t = y0 * 12 + (m0 - 1) + add
        return (t // 12, (t % 12) + 1)

    header = "| Su | M | T | W | Th | F | Sa |"
    sep = "| --- | --- | --- | --- | --- | --- | --- |"

    try:
        blocks: list[str] = []
        for i in range(mc):
            yy, mm = ym_add(y, m, i)
            month_name = calendar.month_name[mm]
            lines: list[str] = [f"### {month_name} {yy}", "", header, sep]

            for week in cal.monthdayscalendar(yy, mm):
                cells = [str(d) if d else "" for d in week]
                lines.append("| " + " | ".join(cells) + " |")

            blocks.append("\n".join(lines))

        return "\n\n".join(blocks)
    except Exception as e:
        return f"Error: failed to generate calendar ({type(e).__name__}: {e})."


def main() -> None:
    today = datetime.date.today()
    print(markdown_calendar(today.year, today.month, month_count=3))


if __name__ == "__main__":
    main()

Result:

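### January 2026

| Su | M | T | W | Th | F | Sa |
| --- | --- | --- | --- | --- | --- | --- |
|  |  |  |  | 1 | 2 | 3 |
| 4 | 5 | 6 | 7 | 8 | 9 | 10 |
| 11 | 12 | 13 | 14 | 15 | 16 | 17 |
| 18 | 19 | 20 | 21 | 22 | 23 | 24 |
| 25 | 26 | 27 | 28 | 29 | 30 | 31 |

### February 2026

| Su | M | T | W | Th | F | Sa |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 | 12 | 13 | 14 |
| 15 | 16 | 17 | 18 | 19 | 20 | 21 |
| 22 | 23 | 24 | 25 | 26 | 27 | 28 |

### March 2026

| Su | M | T | W | Th | F | Sa |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | 2 | 3 | 4 | 5 | 6 | 7 |
| 8 | 9 | 10 | 11 | 12 | 13 | 14 |
| 15 | 16 | 17 | 18 | 19 | 20 | 21 |
| 22 | 23 | 24 | 25 | 26 | 27 | 28 |
| 29 | 30 | 31 |  |  |  |  |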

3 months: 408 tokens

Decide whether roughly 300 tokens included in every request (and cached) are cheaper than a tool call, which sends the entire input context again, billed, in a second request.
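A rough way to make that comparison, sketched in Python. Everything here is an assumption for illustration: the function names, the idea of counting cost in "full-price input token equivalents", and all the example numbers except the 300-token calendar. Plug in your own context size, cache discount, and how often the model actually needs a weekday.

```python
def inline_cost(calendar_tokens: int, cached_rate: float) -> float:
    """Extra billed tokens per request when the calendar is always in the prompt.

    cached_rate is the fraction of full price paid for cached input tokens.
    """
    return calendar_tokens * cached_rate


def tool_cost(context_tokens: int, tool_output_tokens: int,
              call_fraction: float, resend_rate: float) -> float:
    """Expected extra billed input tokens per request for a tool round trip.

    call_fraction is the share of requests that trigger the tool;
    resend_rate is the fraction of full price paid when the context is
    sent again in the follow-up request.
    """
    return call_fraction * (context_tokens + tool_output_tokens) * resend_rate


if __name__ == "__main__":
    # Hypothetical numbers: 4000-token context, tool used on 1 in 4 requests.
    always_inline = inline_cost(calendar_tokens=300, cached_rate=0.5)
    via_tool = tool_cost(context_tokens=4000, tool_output_tokens=20,
                         call_fraction=0.25, resend_rate=1.0)
    print(f"inline calendar: {always_inline:.0f} token-equivalents per request")
    print(f"tool call:       {via_tool:.0f} token-equivalents per request")
```

This ignores output tokens and latency, so treat it as a first-pass estimate only.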

Do note that OpenAI injects its own date into its system message, one that is not aware of the user's locale, and this can break your API application.
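One way to guard against that is to state the user's local date, weekday, and UTC offset explicitly in your own prompt. A minimal sketch, assuming you already know the user's UTC offset; the helper name `date_context_line` and the wording of the line are hypothetical:

```python
import datetime

# Fixed English names so the line does not depend on the host locale.
WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday"]


def date_context_line(now: datetime.datetime) -> str:
    """Format a timezone-aware datetime as an unambiguous prompt line."""
    if now.tzinfo is None or now.utcoffset() is None:
        raise ValueError("pass a timezone-aware datetime in the user's zone")
    weekday = WEEKDAYS[now.weekday()]
    # isoformat keeps the UTC offset, e.g. 2026-01-29T23:30-05:00
    stamp = now.isoformat(timespec="minutes")
    return f"Current date and time for the user: {weekday}, {stamp}"


if __name__ == "__main__":
    # Assumed example: a user at UTC-05:00, late on a Thursday evening.
    user_tz = datetime.timezone(datetime.timedelta(hours=-5))
    local_now = datetime.datetime(2026, 1, 29, 23, 30, tzinfo=user_tz)
    print(date_context_line(local_now))
    # The same instant in UTC already falls on the next calendar day.
    print(date_context_line(local_now.astimezone(datetime.timezone.utc)))
```

The second print shows why a date that ignores locale is a problem: the same moment can be a different weekday depending on the zone.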
