AI Bot Blocker
Overview
The AI Bot Blocker protects services exposed via GateControl from unwanted AI crawlers. It detects and blocks requests that originate from the IP address ranges of known AI and cloud providers (OpenAI, Google Cloud, AWS, DeepSeek, GitHub Copilot, Microsoft Azure). Blocking happens directly at the reverse-proxy level, before the request reaches the backend.
License feature key: bot_blocking
How it works
The Bot Blocker is based on the caddy-defender plugin for Caddy. It is inserted as the first handler in the Caddy route chain — before Request Tracing, Rate Limiting, Authentication, and Compression. This means bots are rejected immediately without burdening other handlers.
Detected IP ranges
The plugin automatically maintains up-to-date IP lists for the following providers:
| Provider | Description |
|---|---|
| OpenAI | GPTBot, ChatGPT-User, and other OpenAI services |
| AWS | Amazon Web Services (frequently used by AI crawlers) |
| Google Cloud | Google-Extended, Gemini, and other Google AI services |
| GitHub Copilot | GitHub Copilot requests |
| DeepSeek | DeepSeek AI crawler |
| Azure | Microsoft Azure Public Cloud |
The IP lists are regularly updated by the plugin maintainer.
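To illustrate what IP-range matching means in practice, the following Python sketch checks a client address against a set of CIDR ranges. It is not the plugin's code (caddy-defender is a Go module). The provider names and ranges are placeholders taken from reserved documentation networks, not real provider ranges, and `match_provider` is a hypothetical helper.

```python
import ipaddress

# Placeholder ranges (reserved documentation networks), NOT real provider ranges.
EXAMPLE_RANGES = {
    "provider-a": ["192.0.2.0/24", "2001:db8::/32"],
    "provider-b": ["198.51.100.0/24"],
}


def match_provider(client_ip, ranges=EXAMPLE_RANGES):
    """Return the first provider whose range contains client_ip, else None."""
    addr = ipaddress.ip_address(client_ip)
    for provider, cidrs in ranges.items():
        for cidr in cidrs:
            network = ipaddress.ip_network(cidr)
            # Only compare addresses of the same IP version.
            if addr.version == network.version and addr in network:
                return provider
    return None
```

A request whose source address matches any listed range would be rejected; an address outside all ranges passes through, regardless of its User-Agent.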
Bot counter
How it works
A background task counts the blocked requests per route every 60 seconds. The counter is based on HTTP 403 responses in the Caddy access log, filtered by the route's domain.
Display
An orange badge is shown in the route list:
- Bot icon + number (e.g. 🤖 42): Number of requests blocked so far
- Bot icon only (no number): Bot Blocker is active, but no bots have been blocked yet
The badge is only shown for HTTP routes (not for L4/TCP routes).
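The display rules can be summarized as a small decision function. This is a sketch only: `badge_label` and the `route_type` values are hypothetical, and it assumes that no badge is rendered when the Bot Blocker is disabled.

```python
def badge_label(route_type, enabled, count):
    """Return the badge text for a route, or None if no badge is shown."""
    # Assumption: L4/TCP routes and routes with the feature disabled get no badge.
    if route_type != "http" or not enabled:
        return None
    # Icon plus number once something was blocked, icon only otherwise.
    return f"🤖 {count}" if count > 0 else "🤖"
```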
Known limitation
The counter counts all HTTP 403 responses on the route, not only those from the Bot Blocker. If IP access control or ACL is also active on the same route, these can produce 403 responses that are counted as well. The accuracy is sufficient for most use cases.
Testing
Verify bot blocking
# Normal request — should go through
curl -s -o /dev/null -w "%{http_code}" https://your-route.com/
# Expected result: 200 (or 302 on auth)
# Requests from an OpenAI IP can only be simulated on the local network.
# Instead, look for "defender" entries in the GateControl log:
docker logs gatecontrol 2>&1 | grep "defender"
Check the counter
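# Fetch route data: bot_blocker_count contains the current counter
# (<gatecontrol-host> and :id are placeholders for your instance and route ID)
curl -s "https://<gatecontrol-host>/api/v1/routes/:id" | jq '.route.bot_blocker_count'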
Limitations
- HTTP routes only: L4/TCP routes do not support bot blocking (caddy-defender is an HTTP handler)
- IP-based: Blocking is based on IP addresses, not User-Agent strings. Bots coming from non-listed IP ranges are not detected.
- No custom IP ranges: The default ranges maintained by the plugin are used
- No whitelist: Individual IPs cannot be exempted from blocking
- Counter accuracy: Counts all 403s, not just bot blocks (see above)
Database
Fields in the routes table
| Column | Type | Default | Description |
|---|---|---|---|
| bot_blocker_enabled | INTEGER | 0 | Feature enabled (0/1) |
| bot_blocker_mode | TEXT | 'block' | Active mode |
| bot_blocker_count | INTEGER | 0 | Cumulative block counter |
| bot_blocker_config | TEXT | null | JSON with mode-specific options |
Migration
Version 28 (add_bot_blocker) — created on 2026-03-28.
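As a rough illustration of what such a migration does, the sketch below adds the four columns to a `routes` table. It assumes a SQLite database (suggested by the INTEGER/TEXT column types, but not confirmed here). The helper name and the minimal `routes` table are hypothetical, and the real migration may differ.

```python
import sqlite3

# Column definitions taken from the "Fields in the routes table" section.
BOT_BLOCKER_COLUMNS = [
    "bot_blocker_enabled INTEGER DEFAULT 0",
    "bot_blocker_mode TEXT DEFAULT 'block'",
    "bot_blocker_count INTEGER DEFAULT 0",
    "bot_blocker_config TEXT DEFAULT NULL",
]


def add_bot_blocker_columns(conn):
    """Add the bot blocker columns to an existing routes table."""
    # PRAGMA table_info yields (cid, name, type, ...); index 1 is the column name.
    existing = {row[1] for row in conn.execute("PRAGMA table_info(routes)")}
    for definition in BOT_BLOCKER_COLUMNS:
        name = definition.split()[0]
        if name not in existing:  # makes the sketch safe to run twice
            conn.execute(f"ALTER TABLE routes ADD COLUMN {definition}")
    conn.commit()


# Usage with a throwaway in-memory database (hypothetical minimal schema).
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE routes (id INTEGER PRIMARY KEY, domain TEXT)")
conn.execute("INSERT INTO routes (domain) VALUES ('your-route.com')")
add_bot_blocker_columns(conn)
row = conn.execute(
    "SELECT bot_blocker_enabled, bot_blocker_mode, "
    "bot_blocker_count, bot_blocker_config FROM routes"
).fetchone()
```

Existing rows pick up the column defaults, so routes created before the migration start with the feature disabled.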
Backup/Restore
The Bot Blocker configuration (bot_blocker_enabled, bot_blocker_mode, bot_blocker_config) is included in backup/restore. The bot_blocker_count is not exported — the counter starts at 0 after a restore.
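The export and restore behavior described above can be sketched as follows. The function names and the dictionary-based route representation are hypothetical; only the field names and the reset of the counter come from this page.

```python
# Fields that are part of a backup; bot_blocker_count is deliberately absent.
EXPORTED_FIELDS = ("bot_blocker_enabled", "bot_blocker_mode", "bot_blocker_config")


def export_bot_blocker(route):
    """Return only the bot blocker fields that belong in a backup."""
    return {field: route.get(field) for field in EXPORTED_FIELDS}


def restore_bot_blocker(backup):
    """Rebuild the bot blocker fields from a backup; the counter restarts at 0."""
    restored = {field: backup.get(field) for field in EXPORTED_FIELDS}
    restored["bot_blocker_count"] = 0
    return restored
```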
Technical details
Caddy handler config
{
  "handler": "defender",
  "raw_responder": "block",
  "ranges": ["openai", "aws", "gcloud", "githubcopilot", "deepseek", "azurepubliccloud"]
}
Handler position in the route chain
1. defender (Bot Blocker) ← blocks bots immediately
2. trace (Request Tracing)
3. headers (Custom Headers)
4. rate_limit
5. mirror (Request Mirroring)
6. encode (compression)
7. reverse_proxy (backend)
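The ordering above could be produced by a route builder along these lines. This is a minimal sketch, not GateControl's actual code: `build_handlers` is a hypothetical helper, the remaining handlers are passed in as ready-made dictionaries, and only the documented "block" configuration is covered.

```python
DEFENDER_RANGES = [
    "openai", "aws", "gcloud", "githubcopilot", "deepseek", "azurepubliccloud",
]


def build_handlers(route, other_handlers):
    """Return the handler list for a route, with defender first when enabled."""
    handlers = []
    if route.get("bot_blocker_enabled"):
        # Same shape as the handler config shown under "Caddy handler config".
        handlers.append({
            "handler": "defender",
            "raw_responder": "block",
            "ranges": list(DEFENDER_RANGES),
        })
    # Everything else (trace, headers, rate_limit, ...) follows in its usual order.
    handlers.extend(other_handlers)
    return handlers
```

Because the defender entry is always inserted at index 0, a blocked request never reaches tracing, rate limiting, or the backend.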
Go module
pkg.jsn.cam/caddy-defender (originally github.com/JasonLovesDoggo/caddy-defender)
Background task
- Interval: 60 seconds
- Source: /data/caddy/access.log
- Logic: Parses JSON lines, filters by status === 403, matches request.host against routes with bot_blocker_enabled, increments bot_blocker_count
- Log rotation: Timestamp-based tracking (no offset), compatible with Caddy's log rotation (10 MB, 3 files)
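The counting logic can be sketched as follows. This is an illustration under assumptions, not the actual task: it assumes Caddy's default JSON access log fields (`ts` as a Unix timestamp, `status`, `request.host`), and `count_blocked` and its parameters are hypothetical names.

```python
import json


def count_blocked(log_lines, enabled_hosts, since_ts):
    """Count 403 responses per host for entries newer than since_ts.

    Returns (counts, newest_ts) so the caller can store newest_ts and
    pass it as since_ts on the next run (timestamp-based tracking).
    """
    counts = {}
    newest_ts = since_ts
    for line in log_lines:
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip partial or non-JSON lines
        if not isinstance(entry, dict):
            continue
        ts = entry.get("ts", 0)
        if ts <= since_ts:
            continue  # already processed in an earlier run
        newest_ts = max(newest_ts, ts)
        if entry.get("status") != 403:
            continue
        host = entry.get("request", {}).get("host")
        if host in enabled_hosts:
            counts[host] = counts.get(host, 0) + 1
    return counts, newest_ts
```

Tracking the newest timestamp instead of a file offset means the position stays valid when the log file is rotated and a new file starts from the beginning.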