Zip bombs to frustrate AI crawlers

  • Zip bombs to frustrate AI crawlers

    A nifty trick that redirects abusive AI crawlers to a gzipped file containing 100 GB of zeros, using a few lines of nginx config:

    set $redir_to_gz 1;
    if ($host = gz.niko.lgbt) {
        set $redir_to_gz 0;
    }
    if ($http_user_agent !~* (claudebot|ZoominfoBot|GPTBot|SeznamBot|DotBot|Amazonbot|DataForSeoBot|2ip|paloaltonetworks.com|SummalyBot|incestoma)) {
        set $redir_to_gz 0;
    }
    if ($redir_to_gz) {
        return 301 
    }
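
The flag variable is there because nginx `if` blocks cannot be nested or combined with a logical AND, so the config starts from "redirect" and clears the flag for each exception. A rough Python sketch of the same decision, purely for illustration (the function name `should_redirect` and the `example.com` host used below are made up here and are not part of the original config):

```python
import re

# Same alternation as the nginx regex. `!~*` in nginx means
# "does not match, case-insensitive", and the match is unanchored.
# The dot in "paloaltonetworks.com" is left unescaped, as in the original.
BOT_PATTERN = re.compile(
    r"(claudebot|ZoominfoBot|GPTBot|SeznamBot|DotBot|Amazonbot|DataForSeoBot"
    r"|2ip|paloaltonetworks.com|SummalyBot|incestoma)",
    re.IGNORECASE,
)

BOMB_HOST = "gz.niko.lgbt"


def should_redirect(host: str, user_agent: str) -> bool:
    """Mirror the three `if` blocks: start at 1, clear on each exception."""
    redir_to_gz = True
    # Requests already on the bomb host are never redirected again.
    if host == BOMB_HOST:
        redir_to_gz = False
    # Anything that does not look like a listed crawler is left alone.
    if not BOT_PATTERN.search(user_agent):
        redir_to_gz = False
    return redir_to_gz


if __name__ == "__main__":
    print(should_redirect("example.com", "Mozilla/5.0 (compatible; GPTBot/1.0)"))
    print(should_redirect("example.com", "Mozilla/5.0 Firefox/120.0"))
```

The host check presumably exists so that a crawler which follows the redirect actually receives the file instead of being bounced again.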
    
    As for the actual config behind gz.niko.lgbt:
    
    server {
        # SSL and listen -- snipped
    
        # static files
        root /var/www/gz.niko.lgbt;
        location / {
            add_header Content-Encoding gzip;
            try_files /42.gz =404;
            gunzip off;
            types { text/html gz; }
        }
    
        # gunzip off is very important: without it, nginx may try to decompress
        # the file itself for any client that doesn't advertise gzip support
        # 42.gz (100 GiB of zeros) is generated with:
        #   dd if=/dev/zero bs=1M count=102400 | gzip -c - > 42.gz
    
        # additional config -- snipped
    }
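
To get a feel for why this is cheap to host, here is a scaled-down Python sketch of the `dd | gzip` pipeline. It assumes 64 MiB of zeros instead of 100 GiB so that it runs quickly, and the helper name `gzip_zeros` is invented for this example. Deflate tops out at roughly 1000:1 on constant input, so the full-size `42.gz` should land somewhere around 100 MB on disk.

```python
import gzip
import io

MIB = 1024 * 1024


def gzip_zeros(total_mib: int) -> bytes:
    """Stream `total_mib` MiB of zero bytes through gzip, 1 MiB at a time."""
    chunk = bytes(MIB)  # 1 MiB of zeros, like one dd block with bs=1M
    buf = io.BytesIO()
    with gzip.GzipFile(fileobj=buf, mode="wb") as gz:
        for _ in range(total_mib):
            gz.write(chunk)
    return buf.getvalue()


# Scaled-down stand-in for count=102400 (assumption: 64 MiB is enough to
# show the ratio without needing much time or memory).
compressed = gzip_zeros(64)
expanded_size = len(gzip.decompress(compressed))
ratio = expanded_size / len(compressed)

if __name__ == "__main__":
    print(f"compressed: {len(compressed)} bytes")
    print(f"expanded:   {expanded_size} bytes")
    print(f"ratio:      {ratio:.0f}:1")
```

The client pays the expanded size in memory or CPU, while the server only ever reads and sends the compressed bytes.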
    

    Nice one @niko, I'm definitely going to use that :)

    Tags: gzip zip zip-bombs defence bots crawling scraping crawlers dev-zero abuse