GPTBot ignoring robots.txt and hammering single URL in a loop — potential infinite crawl bug

I’m running a WooCommerce/WordPress site and noticed GPTBot has been aggressively crawling one specific URL repeatedly, hitting it approximately once every 10 seconds around the clock. I want to flag this because it looks like a bug in how GPTBot handles dynamic query strings, and it’s causing real server load.

What I’m seeing:

GPTBot continuously requests the same base URL with variations of these query strings:

  • ?jet_blog_ajax=1&nocache=

  • ?nocache=

Each variation has a unique or changing nocache= value, which means GPTBot appears to be treating each request as a distinct URL. This is creating what looks like an infinite crawl loop — it never stops returning to this URL because each response likely contains a new variation of the query string.
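To illustrate the loop: every distinct `nocache=` value makes the URL unique to a crawler that compares raw strings, even though the page behind it is the same. A minimal sketch of the canonicalization a crawler could do to collapse these variants (the parameter names are taken from the logs above; the function is illustrative, not GPTBot's actual behavior):

```python
from urllib.parse import urlsplit, urlunsplit, parse_qsl, urlencode

# Illustrative: query params that act as cache-busters and should not
# create "new" URLs from the crawler's point of view.
CACHE_BUSTERS = {"nocache", "jet_blog_ajax"}

def canonicalize(url: str) -> str:
    """Strip cache-buster params so all variants map to one crawl key."""
    parts = urlsplit(url)
    kept = [(k, v) for k, v in parse_qsl(parts.query, keep_blank_values=True)
            if k not in CACHE_BUSTERS]
    return urlunsplit((parts.scheme, parts.netloc, parts.path,
                       urlencode(kept), parts.fragment))

a = canonicalize("https://example.com/blog/?jet_blog_ajax=1&nocache=1712345")
b = canonicalize("https://example.com/blog/?nocache=9988776")
assert a == b == "https://example.com/blog/"
```

Without some normalization like this, each response that embeds a fresh `nocache=` value hands the crawler another "new" URL, and the loop never terminates.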

The robots.txt problem:

My robots.txt explicitly disallows these patterns for all bots:

Disallow: /*?nocache=
Disallow: /*?*nocache=*

GPTBot is ignoring these rules entirely. This is not a misconfiguration on my end — the rules are valid and other well-behaved crawlers respect them.
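For reference, the Robots Exclusion Protocol (RFC 9309) requires crawlers to support `*` (any character sequence) and a trailing `$` (end anchor) in path rules. A minimal sketch of the matching a compliant crawler performs, showing that both Disallow rules above do match the URLs GPTBot keeps fetching:

```python
import re

def rule_to_regex(rule: str) -> re.Pattern:
    """Convert a robots.txt path rule to a regex per RFC 9309:
    '*' matches any character sequence; a trailing '$' anchors the end."""
    anchored = rule.endswith("$")
    body = rule[:-1] if anchored else rule
    pattern = "".join(".*" if ch == "*" else re.escape(ch) for ch in body)
    return re.compile("^" + pattern + ("$" if anchored else ""))

def disallowed(rule: str, path_and_query: str) -> bool:
    """True if the rule matches the request path (prefix match from '/')."""
    return rule_to_regex(rule).match(path_and_query) is not None

# Both rules from the robots.txt above match the looping requests:
assert disallowed("/*?nocache=", "/some-page/?nocache=1712345")
assert disallowed("/*?*nocache=*", "/some-page/?jet_blog_ajax=1&nocache=99")
# And neither blocks a normal page:
assert not disallowed("/*?*nocache=*", "/some-page/")
```

So a crawler that implements the spec's wildcard matching would skip these URLs entirely; ignoring them is a compliance bug, not a site misconfiguration.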

Why this matters:

This isn’t just a nuisance. A bot hitting a single endpoint every 10 seconds indefinitely generates unnecessary server load, inflates my bandwidth, and pollutes crawl logs. Multiply this across potentially thousands of sites with similar dynamic query string patterns and this could be a significant infrastructure problem at scale.

Furthermore, this behavior is counterproductive for OpenAI’s own goals. By getting stuck in a loop on a single URL, GPTBot is failing to crawl the rest of my site at all. Any site owner who wants their content indexed for AI training is being poorly served.

It’s entirely plausible that someone is pretending to be GPTBot. Have you checked the IP address(es) attached to the requests?

You can match them against the published ranges here, and going forward verify that any request claiming the GPTBot user agent actually comes from one of these IPs:

https://openai.com/gptbot.json
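As a sketch of that verification, assuming the published file follows the common format for crawler IP lists (a `prefixes` array with `ipv4Prefix`/`ipv6Prefix` entries; check the actual file, since the key names here are an assumption):

```python
import ipaddress
import json
from urllib.request import urlopen

def load_prefixes(url: str = "https://openai.com/gptbot.json") -> list:
    """Fetch the published ranges. The 'prefixes'/'ipv4Prefix' keys are an
    assumption based on the usual crawler-IP-list format."""
    with urlopen(url) as resp:
        data = json.load(resp)
    nets = []
    for entry in data.get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            nets.append(ipaddress.ip_network(cidr))
    return nets

def is_real_gptbot(client_ip: str, nets: list) -> bool:
    """True if the client IP falls inside any published range."""
    ip = ipaddress.ip_address(client_ip)
    return any(ip in net for net in nets)

# Offline example with a made-up prefix list (TEST-NET-3, illustrative only):
nets = [ipaddress.ip_network("203.0.113.0/24")]
assert is_real_gptbot("203.0.113.7", nets)
assert not is_real_gptbot("198.51.100.5", nets)
```

If the logged IPs fall outside the published ranges, this is an impersonator and a server-level block is the right fix rather than a GPTBot bug report.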

Lastly, you could be more explicit and add a GPTBot-specific rule to your robots.txt (though if GPTBot is already ignoring the wildcard rules, this may not help):

User-agent: GPTBot
Disallow: /*?*nocache=*