Traefik + Fail2Ban: The "Black Hole" Integration

This setup monitors Traefik container logs and injects DROP rules into the DOCKER-USER iptables chain. By using a “Black Hole” approach, we discard malicious packets at the kernel level before they ever reach your application code.

Prerequisites

  • Traefik: v3.x+ in Docker with access logs enabled.
  • Host Tools: Fail2Ban (v0.11+) and iptables on the host OS.
  • Mounts: Traefik log directory must be volume-mounted (e.g., ./logs:/var/log/traefik).

Step 1: Configure Traefik (JSON Format)

JSON is the most reliable log format to parse with Fail2Ban. Every value sits in a named, quoted field, so a bot with an odd User-Agent (for example, one containing [ or ]) cannot shift field positions and break your regex.

traefik.yml:

accessLog:
  filePath: "/var/log/traefik/access.log"
  format: json
  bufferingSize: 0 # Instant logging for immediate protection
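
Before writing a filter, it can help to pull individual fields out of a log line by hand. The sketch below is illustrative only: the sample line is hypothetical and trimmed to the fields this guide relies on (real Traefik lines carry many more), and json_field is a helper name invented here. It uses sed rather than a JSON parser, so it only handles flat string or numeric fields.

```shell
#!/bin/bash
# json_field NAME: read one JSON log line on stdin and print the value of NAME.
# Hypothetical helper for quick inspection; not a real JSON parser.
json_field() {
    sed -n -E "s/.*\"$1\":\"?([^\",}]*)\"?[,}].*/\1/p"
}

# Hypothetical, trimmed access-log line (field names as used in this guide).
sample_line='{"ClientAddr":"203.0.113.7:51234","DownstreamStatus":404,"RequestAddr":"example.com","RequestPath":"/.env","StartLocal":"2024-01-01T12:00:00+00:00"}'

printf '%s\n' "$sample_line" | json_field ClientAddr
printf '%s\n' "$sample_line" | json_field DownstreamStatus
printf '%s\n' "$sample_line" | json_field RequestPath
```

On a live host the same helper could be fed from tail -n 1 /var/log/traefik/access.log to confirm that the fields the filter expects are really present.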


Step 2: Define the “Zero Tolerance” Filter

The filter has two rules: 4xx responses on sensitive paths, and requests addressed directly to the server's IP instead of a hostname.

Filter (/etc/fail2ban/filter.d/traefik-bot.conf):

[Definition]
# Rule 1: 4xx responses on sensitive paths
# Rule 2: direct-to-IP access (replace YOUR_VPS_IP with your actual IP, dots escaped, e.g. 203\.0\.113\.10)
failregex = ^.*"ClientAddr":"<HOST>:\d+".*?"DownstreamStatus":(401|403|404|429).*?"RequestPath":"/(\.env|\.git|wp-login|setup\.php|config\.php|admin/config\.php|shell).*".*$
            ^.*"ClientAddr":"<HOST>:\d+".*?"RequestAddr":"YOUR_VPS_IP(:\d+)?",.*$

# Tell Fail2Ban where the timestamp sits inside each JSON line
datepattern = "StartLocal":"%%Y-%%m-%%dT%%H:%%M:%%S

# Only matches if the User-Agent header is kept in the access log; User-Agent strings are easy to spoof
ignoreregex = ^.*"request_User-Agent":".*(Googlebot|Bingbot).*",.*$
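
As a rough offline check of the sensitive-path rule, the same pattern can be approximated with grep. This is a sketch under assumptions: <HOST> is replaced by a simple IPv4 pattern, the three sample lines are hypothetical, and count_sensitive_hits is a name made up for this example. fail2ban-regex remains the authoritative test.

```shell
#!/bin/bash
# Approximation of the first failregex: <HOST> becomes a plain IPv4 pattern.
PATTERN='"ClientAddr":"[0-9.]+:[0-9]+".*"DownstreamStatus":(401|403|404|429).*"RequestPath":"/(\.env|\.git|wp-login|setup\.php|config\.php|admin/config\.php|shell)'

# count_sensitive_hits FILE: print how many lines of FILE match the pattern.
count_sensitive_hits() {
    grep -cE "$PATTERN" "$1"
}

# Hypothetical sample log: two probes and one normal request.
sample=$(mktemp)
cat > "$sample" <<'EOF'
{"ClientAddr":"203.0.113.7:51234","DownstreamStatus":404,"RequestAddr":"example.com","RequestPath":"/.env","StartLocal":"2024-01-01T12:00:00+00:00"}
{"ClientAddr":"203.0.113.7:51235","DownstreamStatus":200,"RequestAddr":"example.com","RequestPath":"/index.html","StartLocal":"2024-01-01T12:00:01+00:00"}
{"ClientAddr":"198.51.100.9:40000","DownstreamStatus":403,"RequestAddr":"example.com","RequestPath":"/wp-login.php","StartLocal":"2024-01-01T12:00:02+00:00"}
EOF

count_sensitive_hits "$sample"
```

The normal request with status 200 is not counted, which is the behaviour the filter is meant to have.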



Step 3: Create the Jail

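We target the DOCKER-USER chain because traffic to published container ports is forwarded rather than delivered to the host, so it bypasses the INPUT chain. Docker evaluates DOCKER-USER before its own forwarding rules, which makes it the right place for the ban.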

Jail (/etc/fail2ban/jail.d/traefik-bot.local):

[traefik-bot]
enabled  = true
port     = http,https
filter   = traefik-bot
logpath  = /var/log/traefik/access.log
maxretry = 3
findtime = 15m
bantime  = 24h
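# The "Black Hole" Action: blocktype="DROP" discards packets silently
# (the iptables actions default to REJECT)
action   = iptables-allports[chain="DOCKER-USER", blocktype="DROP"]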


Step 4: High-Efficiency Log Rotation

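Large logs make Fail2Ban slower to start and reload, because it has to scan the file. We rotate daily, or sooner if the file has grown past 100MB when logrotate runs, and signal Traefik to start a fresh file.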

/etc/logrotate.d/traefik:

/var/log/traefik/*.log {
    daily
    rotate 14
    maxsize 100M
    missingok
    compress
    delaycompress
    postrotate
        # Signal Traefik to release the file handle
        docker kill --signal=USR1 traefik >/dev/null 2>&1
        # Tell Fail2Ban to refresh its tailing of the new file
        fail2ban-client reload traefik-bot >/dev/null 2>&1
    endscript
}
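
To check by hand whether a log has already crossed the rotation threshold, a small size test is enough. This is a minimal sketch; over_limit is a hypothetical helper and the demo file is a temporary stand-in for the real log.

```shell
#!/bin/bash
# over_limit FILE BYTES: succeed when FILE is larger than BYTES.
over_limit() {
    local size
    size=$(( $(wc -c < "$1") ))   # arithmetic expansion strips any padding from wc
    [ "$size" -gt "$2" ]
}

# Demo on a temporary 10-byte file instead of the real access log.
demo=$(mktemp)
printf '0123456789' > "$demo"
if over_limit "$demo" 5; then echo "rotate"; else echo "ok"; fi
```

Against the real file the call would look like over_limit /var/log/traefik/access.log 104857600. Running logrotate -d /etc/logrotate.d/traefik performs a dry run of the rotation config without touching any files.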


Step 5: The “Safe-Check” Validation Script

Run this script to see exactly what Fail2Ban would catch before you commit to the bans.

Create a file named test-traefik-f2b.sh:

#!/bin/bash
LOG_FILE="/var/log/traefik/access.log"
FILTER_FILE="/etc/fail2ban/filter.d/traefik-bot.conf"

echo "--- Testing Regex Match Count ---"
# The summary should report a non-zero "Failregex: N total"
fail2ban-regex "$LOG_FILE" "$FILTER_FILE"

echo -e "\n--- Direct IP Hits Count ---"
# Count requests whose RequestAddr was not your domain
grep '"RequestAddr"' "$LOG_FILE" | grep -vc "yourdomain.com"

Replace yourdomain.com with your own domain before running. The script needs only fail2ban-regex and grep.
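
A plain count does not say who is behind the direct-IP hits. The sketch below breaks the same check down per client address. It rests on assumptions: direct_ip_hits is a hypothetical helper, the sample lines are invented, and the sed expression only handles IPv4 ClientAddr values.

```shell
#!/bin/bash
# direct_ip_hits FILE DOMAIN: print "count ip" for requests not addressed to DOMAIN,
# most active client first. IPv4 only in this sketch.
direct_ip_hits() {
    grep '"RequestAddr"' "$1" | grep -v -F "$2" \
        | sed -n -E 's/.*"ClientAddr":"([^":]+):[0-9]+".*/\1/p' \
        | sort | uniq -c | sort -rn | awk '{print $1, $2}'
}

# Hypothetical sample: three direct-IP requests and one normal request.
hits_log=$(mktemp)
cat > "$hits_log" <<'EOF'
{"ClientAddr":"203.0.113.7:50001","DownstreamStatus":404,"RequestAddr":"192.0.2.10","RequestPath":"/"}
{"ClientAddr":"203.0.113.7:50002","DownstreamStatus":404,"RequestAddr":"192.0.2.10","RequestPath":"/admin"}
{"ClientAddr":"198.51.100.9:40000","DownstreamStatus":404,"RequestAddr":"192.0.2.10:443","RequestPath":"/"}
{"ClientAddr":"198.51.100.20:40001","DownstreamStatus":200,"RequestAddr":"yourdomain.com","RequestPath":"/"}
EOF

direct_ip_hits "$hits_log" "yourdomain.com"
```

Addresses that show up at the top of this list are the ones the second failregex rule is aimed at.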


Verification & Monitoring

  • Live Jail Status: sudo fail2ban-client status traefik-bot
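  • Check the Wall: sudo iptables -L DOCKER-USER -n -v shows the jump to the jail's own chain (typically f2b-traefik-bot); sudo iptables -L f2b-traefik-bot -n -v lists the per-IP rules (look for the DROP target, which requires blocktype="DROP" on the action).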
  • Manual Unban: sudo fail2ban-client set traefik-bot unbanip <IP>
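
The iptables check can also be scripted. The sketch below reads rule listings in iptables -S format from standard input and reports whether a given address has a DROP rule. The helper name is_dropped, the chain name f2b-traefik-bot, and the sample listing are assumptions made for illustration.

```shell
#!/bin/bash
# is_dropped IP: succeed if stdin (output of `iptables -S <chain>`) has a DROP rule for IP.
is_dropped() {
    local ip_re
    ip_re=$(printf '%s' "$1" | sed 's/\./\\./g')   # escape dots for the regex
    grep -qE -- "-s ${ip_re}(/32)? .*-j DROP"
}

# Hypothetical listing, in the format printed by `iptables -S`.
rules='-N f2b-traefik-bot
-A f2b-traefik-bot -s 192.0.2.1/32 -j DROP
-A f2b-traefik-bot -j RETURN'

if printf '%s\n' "$rules" | is_dropped 192.0.2.1; then
    echo "192.0.2.1 is dropped"
fi
```

On a real host the input would come from sudo iptables -S f2b-traefik-bot instead of the sample variable.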

Common Troubleshooting

| Problem | Solution |
| --- | --- |
| IPs aren’t being dropped | Ensure the action is iptables-allports and the chain is exactly DOCKER-USER. |
| Logs are empty after rotation | Verify your Traefik container is actually named traefik. If not, update the docker kill command. |
| I banned myself! | Add your IP to the ignoreip line in the jail config: ignoreip = 127.0.0.1/8 ::1 123.123.123.123. |

To wrap up your setup, we’ll build a dashboard script. It acts as a “Daily Recon” tool, pulling data from the Fail2Ban SQLite database to show how many IPs were banned, which addresses are repeat offenders, and which countries they come from.


The “Daily Recon” Dashboard

This script provides a high-level summary of your security posture. It’s perfect for running once a week or as a daily cron job.

Step 1: Create the Dashboard Script

Create a file named f2b-dashboard.sh:

#!/bin/bash

# Configuration
DB_PATH="/var/lib/fail2ban/fail2ban.sqlite3"
JAIL="traefik-bot"
DAYS_BACK=7

echo "=================================================="
echo "  FAIL2BAN SUMMARY: LAST $DAYS_BACK DAYS ($JAIL)"
echo "=================================================="

# 1. Total Bans in the last X days
TOTAL=$(sqlite3 "$DB_PATH" "SELECT count(*) FROM bans WHERE jail='$JAIL' AND timeofban > strftime('%s', 'now', '-$DAYS_BACK days');")
echo " Total IPs Banned: $TOTAL"

# 2. Top 5 Most Active Banned IPs
echo -e "\n Top 5 Repeat Offenders:"
sqlite3 "$DB_PATH" <<EOF
.headers off
.mode column
SELECT ip, count(*) as count 
FROM bans 
WHERE jail='$JAIL' AND timeofban > strftime('%s', 'now', '-$DAYS_BACK days')
GROUP BY ip 
ORDER BY count DESC 
LIMIT 5;
EOF

# 3. Geo-Location (requires curl; free ipapi.co lookups are rate-limited)
echo -e "\n Top Banned Countries:"
sqlite3 "$DB_PATH" "SELECT ip FROM bans WHERE jail='$JAIL' AND timeofban > strftime('%s', 'now', '-$DAYS_BACK days');" | \
while read -r ip; do
    curl -s "https://ipapi.co/$ip/country_name/"
    echo ""
done | sort | uniq -c | sort -nr | head -n 5

echo -e "==================================================\n"

Step 2: Make it Executable and Test

chmod +x f2b-dashboard.sh
sudo ./f2b-dashboard.sh


How It All Fits Together

Now that you have all the pieces (log management, regex filtering, and the dashboard), here is how traffic flows through your system:

  1. Incoming Request: A bot hits Traefik looking for /.env.
  2. Logging: Traefik writes the 404/403 event to access.log in JSON format.
  3. Detection: Fail2Ban’s traefik-bot filter matches the sensitive path; once the same IP reaches maxretry (3 matches within findtime), the jail bans it.
  4. Action (Ban): Fail2Ban injects a DROP rule into the DOCKER-USER iptables chain.
  5. Persistence: The IP is logged in the SQLite database for your weekly Dashboard report.
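
The detection step can be pictured as a sliding window per IP. The sketch below is a simplified simulation, not Fail2Ban's actual implementation: would_ban is a hypothetical helper that reads "epoch ip" pairs sorted by time and prints each IP once it has maxretry hits within findtime seconds. The sample events are invented.

```shell
#!/bin/bash
# would_ban MAXRETRY FINDTIME: stdin is "epoch ip" lines sorted by time.
# Prints each IP the first time it reaches MAXRETRY hits within FINDTIME seconds.
would_ban() {
    awk -v maxretry="$1" -v findtime="$2" '
    {
        t = $1; ip = $2
        n[ip]++
        ts[ip, n[ip]] = t
        if (!(ip in start)) start[ip] = 1
        # Slide the window: forget hits older than findtime.
        while (t - ts[ip, start[ip]] > findtime) start[ip]++
        if (n[ip] - start[ip] + 1 >= maxretry && !(ip in banned)) {
            banned[ip] = 1
            print ip
        }
    }'
}

# Hypothetical events: one IP probes three times in five minutes,
# another spreads three probes over more than an hour.
events='1000 203.0.113.7
1100 203.0.113.7
1200 198.51.100.9
1300 203.0.113.7
5000 198.51.100.9
5100 198.51.100.9'

# maxretry = 3, findtime = 15m = 900 seconds, as in the jail above
printf '%s\n' "$events" | would_ban 3 900
```

Only the first address is reported. The second one never has three hits inside a single 15-minute window, which is why slow scanners can stay under the threshold unless findtime is raised.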

Final Best Practices

| Feature | Best Practice | Why? |
| --- | --- | --- |
| Bantime | Use 24h or more for botnets. | Short bans (1h) often just result in the bot returning later. |
| IgnoreIP | Always include your home/office IP. | Prevents “friendly fire” during maintenance or testing. |
| JSON Logs | Avoid the common (CLF) format. | JSON is unambiguous; bots can’t “spoof” log fields with spaces. |
| Maxsize | Keep logs under 100MB. | Fail2Ban has to re-read the log file on every reload; size kills performance. |

One Final Addition: Cloudflare Users (Optional)

If you use Cloudflare as a proxy, the IP addresses in your logs will be Cloudflare’s IPs, not the attacker’s. If this is your setup, you must:

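  1. Make sure Traefik logs the real client IP, for example through a real-IP middleware plugin or by trusting Cloudflare's forwarded headers, and confirm that the visitor's address (not a Cloudflare address) appears in the access log.
  2. Use a Cloudflare action in Fail2Ban (such as cloudflare-apiv4) instead of iptables to ban the IP at the Cloudflare Edge level; an iptables rule on the host only ever sees Cloudflare's addresses.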

Real-Time Alerts: Discord & Email

Integrating alerts ensures you know exactly when someone is poking at your sensitive files (like .env or /wp-login).

Option A: Discord Webhook (Recommended)

This uses a simple curl command to post a formatted message to your Discord channel.

1. Create the Discord Action

Create /etc/fail2ban/action.d/discord-notify.conf:

[Definition]
# When the jail starts
actionstart = curl -H "Content-Type: application/json" -X POST -d '{"content": " **Fail2Ban Started**: Traefik-Bot jail is now active on <fq-hostname>."}' <webhook_url>

# When the jail stops
actionstop = curl -H "Content-Type: application/json" -X POST -d '{"content": " **Fail2Ban Stopped**: Traefik-Bot jail has been shut down."}' <webhook_url>

# When an IP is banned
actionban = curl -H "Content-Type: application/json" -X POST -d '{"content": " **IP Banned**: `<ip>` has been dropped after <failures> attempts against Traefik. \n**Map:** <https://db-ip.com/<ip>>"}' <webhook_url>

# When an IP is unbanned
actionunban = curl -H "Content-Type: application/json" -X POST -d '{"content": " **IP Unbanned**: `<ip>` is no longer blocked."}' <webhook_url>

[Init]
# Replace this with your actual Discord Webhook URL
webhook_url = https://discord.com/api/webhooks/YOUR_WEBHOOK_HERE
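
Before wiring the webhook into Fail2Ban, it can be worth posting a test message by hand. The sketch below builds the JSON body with basic escaping so that quotes in the message do not break the payload. json_escape and build_payload are hypothetical helper names, and the escaping covers only backslashes and double quotes, not newlines or control characters.

```shell
#!/bin/bash
# json_escape TEXT: escape backslashes and double quotes for use inside a JSON string.
json_escape() {
    printf '%s' "$1" | sed -e 's/\\/\\\\/g' -e 's/"/\\"/g'
}

# build_payload TEXT: print a Discord-style {"content": "..."} body.
build_payload() {
    printf '{"content": "%s"}' "$(json_escape "$1")"
}

build_payload 'Test message from "fail2ban" host'
echo

# To actually send it (WEBHOOK_URL is a placeholder you must set yourself):
# curl -s -H "Content-Type: application/json" -X POST \
#      -d "$(build_payload 'Test message')" "$WEBHOOK_URL"
```

If the test message arrives in the channel, the same URL can go into the webhook_url setting of the action file.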


Option B: Email Notifications (The Classic Way)

This requires sendmail or mailutils installed on your host OS.

1. Install Mail Tools

sudo apt install mailutils -y

2. Configure the Jail to use mail-whois

You don’t need a new action file for this; Fail2Ban comes with mail-whois pre-configured. You just need to tell the jail to use it.


Step 2: Update the Jail Config

Now, link your chosen notification method to your traefik-bot jail.

Update /etc/fail2ban/jail.d/traefik-bot.local:

[traefik-bot]
enabled  = true
port     = http,https
filter   = traefik-bot
logpath  = /var/log/traefik/access.log
maxretry = 3
findtime = 15m
bantime  = 24h

# --- MULTI-ACTION SETUP ---
# Action 1: The Firewall Ban (DOCKER-USER)
# Action 2: The Discord Notification (or Email)
action = iptables-allports[chain="DOCKER-USER", blocktype="DROP"]
         discord-notify
         mail-whois[dest="your-email@example.com", sender="fail2ban@yourdomain.com"]


Step 3: Test the Notification

You don’t want to wait for a real attack to see if your alerts work. Use fail2ban-client to trigger a manual ban on a test IP. Pick an address from a documentation range such as 192.0.2.1, since 1.2.3.4 is a real, routable address.

# Manually trigger a ban to test notifications
sudo fail2ban-client set traefik-bot banip 192.0.2.1

# Verify it appeared in Discord/Email, then unban
sudo fail2ban-client set traefik-bot unbanip 192.0.2.1


Pro-Tip: “Whois” Enrichment

If you use the Discord action, you can enrich the message with more data by adding the whois command to the actionban line.

Discord Message Example:

:prohibited: IP Banned: 192.x.x.x

Reason: Hit Zero-Tolerance path /.env

Organization: DigitalOcean, LLC

Country: Germany :germany:
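
One way to get at the organization and country is to filter the whois output for the relevant field. The sketch below is an assumption-heavy illustration: whois_field is a hypothetical helper, the sample record is invented, and field names differ between registries (ARIN uses OrgName, RIPE uses org-name), so the name to ask for depends on who owns the address.

```shell
#!/bin/bash
# whois_field NAME: read whois output on stdin and print the first value of NAME
# (case-insensitive match on the field name).
whois_field() {
    awk -F': *' -v want="$1" 'tolower($1) == tolower(want) { print $2; exit }'
}

# Hypothetical whois excerpt in ARIN-like layout.
record='NetRange:       192.0.2.0 - 192.0.2.255
OrgName:        Example Hosting, LLC
Country:        DE'

printf '%s\n' "$record" | whois_field OrgName
printf '%s\n' "$record" | whois_field country
```

In practice the input would be the output of whois for the banned address, and the two values could be appended to the message text used in actionban.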


Final Checklist

  1. Webhook Security: Keep your Discord webhook URL private. If it leaks, anyone can spam your channel.

  2. Rate Limiting: If you are under a massive DDoS attack, Fail2Ban might spam your Discord. If this happens, you can change the action to only notify you on the first 10 bans.

  3. Persistence: Ensure fail2ban is set to start on boot: sudo systemctl enable fail2ban.
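
The rate-limiting idea from item 2 can be sketched with a simple counter file. This is not a built-in Fail2Ban feature: should_notify is a hypothetical helper, the counter location is an arbitrary choice, and resetting the counter (for example from a daily cron job) is left to you.

```shell
#!/bin/bash
# should_notify COUNTER_FILE LIMIT: bump the counter and succeed only
# while the number of calls so far is within LIMIT.
should_notify() {
    local file="$1" limit="$2" count=0
    if [ -f "$file" ]; then
        count=$(cat "$file")
    fi
    count=$((count + 1))
    printf '%s\n' "$count" > "$file"
    [ "$count" -le "$limit" ]
}

# Demo with a limit of 2: the third ban is not announced.
counter="$(mktemp -d)/ban-count"
for i in 1 2 3; do
    if should_notify "$counter" 2; then
        echo "ban $i: notify"
    else
        echo "ban $i: suppressed"
    fi
done
```

A wrapper script around the curl call could use this check to skip the notification once the limit is reached, while the firewall ban itself still goes ahead.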

This is very cool :slight_smile:

Having done a similar setup in the past, I'd suggest checking for SNI (Server Name Indication) errors in particular. The majority of my Discord alerts are people connecting to my host over HTTPS using the bare IP instead of a hostname. This is a clear indication that they do not know what the server is hosting, so they are instantly banned for 15 minutes (with an increment if they try again).

I documented it at the time; based on a lot of reading, testing and checking guides here and the Discord server.[link]

The small change I always propose is use ufw-docker to prevent even port 80 from being exposed (and every other non essential port); only 443 and the newt connection port are open.

Sadly, we can’t close port 80; most people use HTTP validation.

I didn’t understand your point about SNI validation. I might be missing something.

Maybe I am the one missing something. Why would port 80 be needed?

I block 80 using ufw_docker (Pangolin or Traefik depending) meaning that my site’s IP will not answer on HTTP.

I therefore only get traffic on port 443 in, which means I get https://IP/ ← meaning they do not know the domain I am serving and therefore SNI fail, and I block them instantly using fail2ban

The SNI rule is simple:

failregex = ^\{.*"ClientHost":"<HOST>".*"RequestAddr":"REPLACE_VPS_IP.*\}$

Oh, I see: you mean that they are not using the DNS challenge. Okay, yes, in that case the SNI rule would indeed not work.

If they are using DNS Challenge, nothing needs to answer on port 80 for it to work, so ban on SNI failure is valid in that case (hopefully this helps)

Most people still depend on HTTP validation, so I took a general approach.
