My website gets hit a lot by OpenAI’s GPTBot, far too much! I’ve got bot blockers on, but I know crawlers can be helpful for SEO (well, I hope!), so I don’t want to block them completely, I guess.
Is there a way to make them only hit at, say, 2am to 4am?
Hi and welcome to the community!
OpenAI has a dedicated mailbox where you can report your issue:
I hope the problem will be resolved soon!
Interesting suggestion! There’s not currently a web standard for this, but we can think about whether there might be a simple way for sites to express that intent at scale.
In the meantime you can find more information about OpenAI crawlers here. Note that it’s possible to block GPTBot but continue to allow SearchBot, though both are careful to maintain rate limits that are typical of other well-behaved crawlers.
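As a rough sketch of that split, a robots.txt along these lines would disallow GPTBot while leaving the search crawler alone. The token `OAI-SearchBot` is an assumption for the crawler called SearchBot above, so check the crawler documentation for the exact user-agent tokens before relying on it.

```text
# Keep GPTBot off the whole site
User-agent: GPTBot
Disallow: /

# Assumed token for the search crawler; still allowed everywhere
User-agent: OAI-SearchBot
Allow: /
```

The file has to be served from the root of the site (for example `/robots.txt`) for crawlers to pick it up.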
I just want them to do it at, say, 2am when my site is quiet. robots.txt doesn’t do it.
The user agent always seems to end in …https://openai.com/gptbot
I have to keep upping my AWS server from t3.medium to t3.xlarge to make it work, which costs me more money!
I can’t think of any benefit of allowing AI bots to crawl your site. They are not search engines (which are welcome crawlers) and are unlikely to provide links to your site or increase traffic.
You are best off blocking them completely. OpenAI’s GPTBot seems to ignore robots.txt, but does currently include a detectable string in the user-agent.
It may be possible to block it in the Apache .htaccess file, or you can add the following first in your index.php (if using PHP):
```php
// Reject any request whose user agent contains OpenAI's crawler URL.
if (\strpos($_SERVER['HTTP_USER_AGENT'] ?? '', 'openai.com') !== false) {
    \http_response_code(403);
    exit();
}
```
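For the .htaccess route, something like the following might do the same job. It is only a sketch and assumes Apache with mod_rewrite enabled and .htaccess overrides allowed; it matches the same `openai.com` substring in the user agent and answers with 403 Forbidden.

```apache
# Assumes mod_rewrite is available and AllowOverride permits it
RewriteEngine On
# Match the "openai.com" string in the user agent, case-insensitively
RewriteCond %{HTTP_USER_AGENT} openai\.com [NC]
# Leave the URL untouched and return 403 Forbidden
RewriteRule ^ - [F]
```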
I suggest all website owners or maintainers do likewise.
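If you would rather keep the bot but push it into the quiet 2am to 4am window asked about at the top, a variation on the same user-agent check might help. This is only a sketch: the helper names `inCrawlWindow` and `secondsUntilWindow` are made up for illustration, the window uses the server's local time, and nothing in this thread confirms that GPTBot honours a 429 response or the `Retry-After` header.

```php
<?php
// Hypothetical helper: true when $hour (0-23) is inside the window [$start, $end).
function inCrawlWindow(int $hour, int $start = 2, int $end = 4): bool
{
    return $hour >= $start && $hour < $end;
}

// Hypothetical helper: seconds from $now until the window next opens.
function secondsUntilWindow(DateTimeImmutable $now, int $start = 2): int
{
    $next = $now->setTime($start, 0);
    if ($next <= $now) {
        // Today's window start has already passed, so aim for tomorrow's.
        $next = $next->modify('+1 day');
    }
    return $next->getTimestamp() - $now->getTimestamp();
}

$userAgent = $_SERVER['HTTP_USER_AGENT'] ?? '';
if (strpos($userAgent, 'openai.com') !== false) {
    // Assumption: the server's configured timezone is the one you care about.
    $now = new DateTimeImmutable('now');
    if (!inCrawlWindow((int) $now->format('G'))) {
        // Outside the window: ask the client to come back later.
        http_response_code(429);
        header('Retry-After: ' . secondsUntilWindow($now));
        exit();
    }
}
```

As with the 403 version, this would need to run first in index.php. A crawler that ignores `Retry-After` will simply keep getting 429 responses, which should at least be cheaper to serve than full pages.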
GPTBot does respect robots.txt directives. Aside from simple misconfigurations, a common mistake is when sites (or their hosting providers) inadvertently fail to actually serve their robots.txt file to web crawlers. Please share more information, such as a domain name that you’re referring to, to help understand your issue. You can email gptbot (at) openai.com if you prefer not to share more info here.
I’m even worried about adding names here, might get more marketers!