We have observed that GPTBot is making requests to malformed API endpoints on our production server, such as:
/workspace/web/brand/filters-{/
/reviews/list-{/
These endpoints do not exist in our codebase, and our server correctly returns a 404 Not Found for each of them. However, the requests are still logged and show up in our monitoring tools (e.g., Kibana), causing unnecessary noise.
Details
- Legitimate endpoint: /workspace/web/filters
- Malformed endpoint hit by GPTBot: /workspace/web/filters-{/
- User-Agent: GPTBot
- Stacktrace/Logs: The requests are being logged as 404s, and the stacktrace shows the request passing through our standard Express middleware and error handling.
Investigation:-
- We have thoroughly searched our frontend and backend codebase and confirmed that no code is generating or referencing this malformed endpoint.
- The requests are only coming from GPTBot, not from any user or internal process.
- This appears to be a result of GPTBot “guessing” or mutating known endpoints during its crawl.
Impact:-
- These requests are not breaking anything, but they are cluttering our logs and monitoring dashboards.
- We do not want to block GPTBot, but we would like to understand why it is making these malformed requests and whether this behavior can be improved.
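For reference, here is a minimal sketch of how these requests could be told apart from genuine 404s by path shape rather than by User-Agent. It assumes that none of our legitimate routes contain curly braces. The helper names `isMalformedPath` and `isCrawlerNoise` are hypothetical and do not exist in our codebase.

```typescript
// Hypothetical helper: flags paths containing a stray "{" or "}",
// either raw or percent-encoded (%7B / %7D).
// Assumption: no legitimate route of ours ever contains a curly brace.
function isMalformedPath(path: string): boolean {
  return /[{}]|%7b|%7d/i.test(path);
}

// Hypothetical helper: true only for 404 responses whose path matches
// the malformed pattern described in this report.
function isCrawlerNoise(path: string, status: number): boolean {
  return status === 404 && isMalformedPath(path);
}

// The paths reported above are flagged; the legitimate endpoint is not.
console.log(isCrawlerNoise("/workspace/web/filters-{/", 404));
console.log(isCrawlerNoise("/workspace/web/filters", 404));
```

A check like this could be called from the 404 handler to log matching requests at a lower severity, but that only hides the symptom on our side and does not explain the crawler's behavior.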
Request:-
- Is this expected behavior from GPTBot?
- Can GPTBot be made less aggressive in mutating/crawling API endpoints, especially when the base endpoint exists and is well-formed?
- Is there a recommended way to suppress or reduce such noise, short of blocking GPTBot or filtering logs by User-Agent?