AI Bot Blocker
Overview
The AI Bot Blocker protects services exposed through GateControl from unwanted AI crawlers. It detects requests that originate from IP address ranges associated with known AI and cloud providers (OpenAI, Google Cloud, AWS, DeepSeek, GitHub Copilot, Microsoft Azure) and blocks them directly at the reverse proxy level, before the request reaches the backend.
How It Works
The Bot Blocker is based on the caddy-defender plugin for Caddy. It is inserted as the first handler in the Caddy route chain — before request tracing, rate limiting, authentication, and compression. This ensures bots are rejected immediately without burdening other handlers.
Detected IP Ranges
| Provider | Description |
|---|---|
| OpenAI | GPTBot, ChatGPT-User, and other OpenAI services |
| AWS | Amazon Web Services (frequently used by AI crawlers) |
| Google Cloud | Google-Extended, Gemini, and other Google AI services |
| GitHub Copilot | GitHub Copilot requests |
| DeepSeek | DeepSeek AI crawler |
| Azure | Microsoft Azure Public Cloud |
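Matching a request against these ranges amounts to a CIDR membership test on the client IP. The following is a minimal Python sketch of that idea, not the plugin's implementation. The names `RANGES` and `match_provider` are hypothetical, and the CIDR blocks are reserved documentation ranges used as placeholders, since the real provider ranges are maintained by the plugin and are not reproduced here.

```python
import ipaddress

# Placeholder CIDR blocks taken from reserved documentation ranges.
# These are NOT the real provider ranges maintained by the plugin.
RANGES = {
    "openai": ["192.0.2.0/24"],
    "deepseek": ["198.51.100.0/24", "2001:db8::/32"],
}


def match_provider(client_ip, ranges=RANGES):
    """Return the first provider whose ranges contain client_ip, else None."""
    addr = ipaddress.ip_address(client_ip)
    for provider, cidrs in ranges.items():
        for cidr in cidrs:
            net = ipaddress.ip_network(cidr)
            # Only compare addresses and networks of the same IP version.
            if addr.version == net.version and addr in net:
                return provider
    return None
```

Because the decision depends only on the source address, a crawler that runs from an unlisted network is not matched, and any other client inside a listed range is.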
Configuration
Enable Bot Blocker
- Open the route's Edit modal and switch to the Security tab
- Enable the AI Bot Blocker toggle
- Choose a mode
- Save the route
Available Modes
| Mode | Behavior | Use Case |
|---|---|---|
| Block (403) | Returns HTTP 403 Forbidden | Default — clear and unambiguous |
| Tarpit | Responds extremely slowly (drip-feed) | Wastes crawler resources |
| Drop | Drops the TCP connection immediately | Most aggressive, no response |
| Garbage | Sends random data as response | Poisons crawler training data |
| Redirect (308) | Redirects to another URL | E.g. to an "access denied" page |
| Custom | Custom message with configurable status code | E.g. 451 "Unavailable For Legal Reasons" |
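The table implies that some modes need extra settings: Redirect needs a target URL, and Custom needs a message and a status code. A minimal Python sketch of such a check follows. The settings keys (`mode`, `redirect_url`, `status_code`, `message`) and the function name are assumptions made for illustration and do not reflect GateControl's actual data model.

```python
# Mode identifiers are hypothetical short names for the six documented modes.
MODES = {"block", "tarpit", "drop", "garbage", "redirect", "custom"}


def validate_bot_blocker(settings):
    """Return a list of problems found in a hypothetical settings dict."""
    problems = []
    # Block (403) is the documented default mode.
    mode = settings.get("mode", "block")
    if mode not in MODES:
        problems.append(f"unknown mode: {mode}")
    if mode == "redirect" and not settings.get("redirect_url"):
        problems.append("redirect mode needs a target URL")
    if mode == "custom":
        status = settings.get("status_code")
        if not isinstance(status, int) or not 100 <= status <= 599:
            problems.append("custom mode needs a status code between 100 and 599")
        if not settings.get("message"):
            problems.append("custom mode needs a message")
    return problems
```

For example, `validate_bot_blocker({"mode": "custom", "status_code": 451, "message": "Unavailable For Legal Reasons"})` returns an empty list.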
Bot Counter
A background task updates the number of blocked requests per route every 60 seconds. The count appears as an orange badge in the route list (e.g. 🤖 42).
Note: The counter includes every HTTP 403 response on the route, not only those produced by the Bot Blocker.
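The effect of counting by status code can be sketched in Python. This example assumes the counts are derived from JSON access log entries that carry a `status` field and a `request.host` field, as Caddy's structured access logs do. How GateControl actually collects the numbers is not stated here, so `count_403_per_host` is a hypothetical helper.

```python
import json
from collections import Counter


def count_403_per_host(log_lines):
    """Count HTTP 403 responses per host from JSON access log lines."""
    counts = Counter()
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(entry, dict):
            continue  # skip JSON values that are not log objects
        # Every 403 is counted, whether it came from the Bot Blocker,
        # another handler, or the backend itself.
        if entry.get("status") == 403:
            host = entry.get("request", {}).get("host", "unknown")
            counts[host] += 1
    return counts
```

A 403 returned by the backend itself is indistinguishable from a Bot Blocker rejection in this scheme, which is why the badge can overstate the number of blocked bots.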
Handler Position
1. defender (Bot Blocker) ← blocks bots immediately
2. trace (Request Tracing)
3. headers (Custom Headers)
4. rate_limit
5. mirror (Request Mirroring)
6. encode (Compression)
7. reverse_proxy (Backend)
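The ordering above can be expressed as a fixed list from which only the enabled handlers are kept. This is a Python sketch, not GateControl's code. `HANDLER_ORDER` mirrors the list above, while `build_chain` and the rule that `reverse_proxy` is always present are assumptions made for illustration.

```python
# Handler names in the documented order; defender always comes first.
HANDLER_ORDER = [
    "defender",
    "trace",
    "headers",
    "rate_limit",
    "mirror",
    "encode",
    "reverse_proxy",
]


def build_chain(enabled):
    """Return the enabled handlers in the documented order."""
    # Assumption: every HTTP route ends in reverse_proxy.
    active = set(enabled) | {"reverse_proxy"}
    unknown = active - set(HANDLER_ORDER)
    if unknown:
        raise ValueError(f"unknown handlers: {sorted(unknown)}")
    return [name for name in HANDLER_ORDER if name in active]
```

Filtering a fixed order, rather than appending handlers as they are enabled, keeps `defender` ahead of every other handler regardless of the order in which features were switched on.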
Limitations
- HTTP routes only: L4/TCP routes do not support bot blocking
- IP-based: Blocking is based on IP addresses, not User-Agent strings
- No custom IP ranges: Only the standard ranges maintained by the plugin are used
- No whitelist: Individual IPs cannot be exempted from blocking