Why does GPT-4.1 get weekdays wrong in long date tasks?

Hi everyone,

I’m a developer and I’ve been noticing some strange behavior when using the GPT-4.1 model for relatively simple tasks involving dates.

Even when I provide multiple references in the prompt (current day, time, year, etc.), the model often gets the day of the week wrong when processing large volumes of information that include dates. In different parts of the response, it ends up calculating the weekday incorrectly, usually being one or more days ahead or behind.

I even tried integrating a calculator step to help with this, but the issue still occurs occasionally, which seems like a type of “hallucination” related to temporal calculations.

I’d like to know:

  • Has anyone else experienced this behavior?

  • Are there any best practices to reduce this kind of error?

  • Any prompt, validation, or architectural strategies that work well in these cases?

Thanks in advance for any insights or experiences you can share.

What “calculator step” did you try? I’m a bit confused about what your problem is.

Are you giving the LLM a date (1/29/26 for example) and then asking it to find out what day of the week that is?

If that’s the case, I would supply a simple tool backed by a function that does this automatically. For example, the LLM calls a tool named something like date_to_weekday that takes the full ISO-formatted date as input and returns the day of the week, computed by deterministic code.

This would prevent any issues with the LLM needing to calculate it on its own and making mistakes.
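A minimal sketch of such a tool in Python, assuming the date arrives as an ISO `YYYY-MM-DD` string. The name `date_to_weekday` comes from the suggestion above; the `WEEKDAYS` tuple, the error-string convention, and the `DATE_TO_WEEKDAY_SCHEMA` dictionary are illustrative assumptions, and the exact wrapper your API expects around the JSON schema may differ.

```python
import datetime

# Fixed English names so the result does not depend on the process locale.
# Order matches datetime.date.weekday(): Monday is 0, Sunday is 6.
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")


def date_to_weekday(iso_date: str) -> str:
    """Return the weekday name for an ISO date string such as '2026-01-29'."""
    try:
        parsed = datetime.date.fromisoformat(iso_date.strip())
    except (ValueError, AttributeError):
        # Return the error as text so the model can read it and retry.
        return f"Error: expected an ISO date like 2026-01-29, got {iso_date!r}."
    return WEEKDAYS[parsed.weekday()]


# Hypothetical parameter schema describing the tool to the model.
DATE_TO_WEEKDAY_SCHEMA = {
    "name": "date_to_weekday",
    "description": "Return the day of the week for an ISO-formatted date.",
    "parameters": {
        "type": "object",
        "properties": {
            "iso_date": {"type": "string", "description": "Date as YYYY-MM-DD."}
        },
        "required": ["iso_date"],
    },
}

if __name__ == "__main__":
    print(date_to_weekday("2026-01-29"))  # Thursday
```

Returning the error as a string rather than raising keeps the tool result readable by the model, which can then retry with a corrected date.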

Quick brainstorm of what you can furnish the AI automatically or as a tool to call.

Calendar - token-tuned

"""Generate compact markdown calendar tables optimized for AI/LLM consumption."""

from __future__ import annotations

import calendar
import datetime

__all__ = ["markdown_calendar"]

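def markdown_calendar(
    year: int | str,
    month: int | str,
    month_count: int | str = 1,
) -> str:
    """Return Sunday-first markdown calendar tables for consecutive months.

    Arguments may be ints or numeric strings. Problems are reported as a
    returned message string rather than a raised exception.
    """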
    def coerce_int(name: str, value: object) -> tuple[int | None, str | None]:
        if type(value) is bool:
            return None, f"Error: {name} must be an integer, got bool ({value})."
        if isinstance(value, int):
            return value, None
        if isinstance(value, str):
            s = value.strip()
            if not s:
                return None, f"Error: {name} must be an integer, got an empty string."
            try:
                return int(s, 10), None
            except ValueError:
                return None, f"Error: {name} must be an integer, got {value!r}."
        return None, f"Error: {name} must be an integer, got {type(value).__name__}."

    y, err = coerce_int("year", year)
    if err:
        return err
    m, err = coerce_int("month", month)
    if err:
        return err
    mc, err = coerce_int("month_count", month_count)
    if err:
        return err

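    # Two-digit years 13..99 are read as 2013..2099; 1..12 stay literal years.
    if 13 <= y <= 99:
        y += 2000
    elif not (1 <= y <= 9999):
        return f"Error: year must be in 1..9999, got {y}."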

    if not (1 <= m <= 12):
        return f"Error: month must be in 1..12, got {m}."
    if not (1 <= mc <= 12):
        return f"Error: month_count must be 1-12 months; got {mc}."

    cal = calendar.Calendar(firstweekday=6)  # Sunday-first

    def ym_add(y0: int, m0: int, add: int) -> tuple[int, int]:
        t = y0 * 12 + (m0 - 1) + add
        return (t // 12, (t % 12) + 1)

    header = " Sunday| Monday| Tuesday| Wednesday| Thursday| Friday| Saturday"
    delim = "---|---|---|---|---|---|---"
    no_day = "x"
    try:
        blocks: list[str] = []
        for i in range(mc):
            yy, mm = ym_add(y, m, i)
            month_name = calendar.month_name[mm]
            lines: list[str] = [
                f"### Calendar for {yy}-{mm} {month_name}",
                "",
                header,
                delim,
            ]

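            for week in cal.monthdayscalendar(yy, mm):
                # Days outside the month come back as 0; render them as no_day.
                cells = [str(d) if d else no_day for d in week]
                # Trim trailing empty cells: rstrip drops any run of "|" and "x".
                lines.append("|".join(cells).rstrip("|" + no_day))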

            blocks.append("\n".join(lines))

        return "\n\n".join(blocks)
    except Exception as e:
        return f"Error: failed to generate calendar ({type(e).__name__}: {e})."


def main() -> None:
    today = datetime.date.today()
    print(markdown_calendar(today.year, today.month, month_count=3))


if __name__ == "__main__":
    main()

Result:

### Calendar for 2026-2 February

 Sunday| Monday| Tuesday| Wednesday| Thursday| Friday| Saturday
---|---|---|---|---|---|---
1|2|3|4|5|6|7
8|9|10|11|12|13|14
15|16|17|18|19|20|21
22|23|24|25|26|27|28

### Calendar for 2026-3 March

 Sunday| Monday| Tuesday| Wednesday| Thursday| Friday| Saturday
---|---|---|---|---|---|---
1|2|3|4|5|6|7
8|9|10|11|12|13|14
15|16|17|18|19|20|21
22|23|24|25|26|27|28
29|30|31

### Calendar for 2026-4 April

 Sunday| Monday| Tuesday| Wednesday| Thursday| Friday| Saturday
---|---|---|---|---|---|---
x|x|x|1|2|3|4
5|6|7|8|9|10|11
12|13|14|15|16|17|18
19|20|21|22|23|24|25
26|27|28|29|30

3 months: 294 tokens

Decide whether about 300 tokens, always present in the prompt and cached, are cheaper than a tool call, which resends the entire billed input context in a second request and adds latency.
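That trade-off can be sketched as back-of-the-envelope arithmetic. Everything below is an assumption for illustration: the function names, the per-million-token pricing model, and especially the example prices and token counts, which are placeholders and not real rates.

```python
def inline_cost(calendar_tokens: int, cached_price_per_m: float) -> float:
    """Cost per request of always carrying the calendar in the cached prompt."""
    return calendar_tokens * cached_price_per_m / 1_000_000


def tool_call_cost(
    context_tokens: int,
    input_price_per_m: float,
    call_rate: float = 1.0,
) -> float:
    """Expected extra cost per request when a tool call resends the context.

    call_rate is the fraction of requests that actually trigger the tool.
    """
    return call_rate * context_tokens * input_price_per_m / 1_000_000


if __name__ == "__main__":
    # Placeholder numbers only; substitute your own prices and measurements.
    inline = inline_cost(calendar_tokens=294, cached_price_per_m=0.50)
    tool = tool_call_cost(
        context_tokens=8_000, input_price_per_m=2.00, call_rate=0.25
    )
    print(f"inline calendar: ${inline:.6f} per request")
    print(f"tool call:       ${tool:.6f} per request")
    print("inline is cheaper" if inline < tool else "tool call is cheaper")
```

Latency is not modelled here; the extra round trip of a tool call is a separate cost to weigh.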

Do note that OpenAI is actively trying to break your API application by injecting its own date, which is not aware of the user’s locale, into its system message.
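One possible mitigation is to state the user's local date and weekday explicitly in your own prompt. A minimal sketch, assuming you know the user's UTC offset or time zone; `local_date_line` is a hypothetical helper, and the fixed UTC-3 offset is only an example (a `zoneinfo.ZoneInfo` instance can be passed instead for a named zone).

```python
from __future__ import annotations

import datetime

# Order matches datetime.date.weekday(): Monday is 0, Sunday is 6.
WEEKDAYS = ("Monday", "Tuesday", "Wednesday", "Thursday",
            "Friday", "Saturday", "Sunday")


def local_date_line(
    tz: datetime.tzinfo,
    now_utc: datetime.datetime | None = None,
) -> str:
    """Build a prompt line stating the date and weekday in the user's zone.

    now_utc must be timezone-aware; it defaults to the current UTC time.
    """
    if now_utc is None:
        now_utc = datetime.datetime.now(datetime.timezone.utc)
    local = now_utc.astimezone(tz)
    weekday = WEEKDAYS[local.weekday()]
    offset = local.strftime("%z")  # e.g. "-0300"
    return f"Today is {weekday}, {local.date().isoformat()} (UTC{offset})."


if __name__ == "__main__":
    # The same instant falls on different calendar days in different zones.
    instant = datetime.datetime(2026, 2, 1, 2, 0, tzinfo=datetime.timezone.utc)
    utc_minus_3 = datetime.timezone(datetime.timedelta(hours=-3))
    print(local_date_line(datetime.timezone.utc, instant))
    print(local_date_line(utc_minus_3, instant))
```

With the sample instant, the UTC line reports Sunday, 2026-02-01, while the UTC-3 line reports Saturday, 2026-01-31, which is the kind of one-day drift described at the top of the thread.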