
Resolving the 'Earnings at Risk' Ads.txt Error for Raptive Publishers

For a publisher, few notifications trigger as much immediate anxiety as Raptive's "Earnings at Risk" alert. The specific error—"Ads.txt Not Found"—is particularly frustrating because, for most site owners, the file is technically there. You can navigate to yourdomain.com/ads.txt in a browser and see the list of authorized sellers. Yet the crawler fails, and your revenue potential plummets.

This is rarely a file upload issue. It is almost always a server configuration issue.

When a browser loads ads.txt, it is forgiving. When a programmatic crawler (like Googlebot or Raptive’s verification bot) requests it, strict adherence to HTTP standards, MIME types, and routing logic is required. If your Nginx or Apache configuration unintentionally routes this request through a CMS (like WordPress), adds incorrect headers, or creates a redirect loop, the verification fails.

Here is the technical root cause analysis and the server-level configuration required to fix it permanently.

The Root Cause: Why Browsers See It But Bots Don’t

To resolve this, we must understand how the verification bot acts. The bot expects a raw, static resource returning a 200 OK status code immediately.

1. The CMS Interception

In modern stacks, requests are often routed blindly to an index file (e.g., index.php in WordPress or index.js in Node apps) to handle routing via software. If the web server passes the ads.txt request to the CMS, the response is generated dynamically. This increases Time to First Byte (TTFB) and risks returning a text/html Content-Type header instead of text/plain. Crawlers strictly require text/plain.

2. The Redirect Loop

Publishers frequently force HTTPS or www subdomains via global rewrite rules. If the crawler requests http://domain.com/ads.txt and gets a 301 to https://domain.com/ads.txt, followed by a 301 to https://www.domain.com/ads.txt, the crawler may abort due to redirect chain limits.
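One way to avoid the chain is to redirect straight to the final canonical host in a single hop. The Nginx sketch below illustrates the idea; domain.com is a placeholder for your actual host, and the SSL directives are elided:

```nginx
# Sketch: collapse http -> https -> www into a single 301.
# "domain.com" is a placeholder for your actual host.
server {
    listen 80;
    server_name domain.com www.domain.com;
    return 301 https://www.domain.com$request_uri;
}

server {
    listen 443 ssl;
    server_name domain.com;
    # ssl_certificate / ssl_certificate_key omitted for brevity
    return 301 https://www.domain.com$request_uri;
}
```

With this shape, http://domain.com/ads.txt resolves in one redirect rather than two, well inside any crawler's chain limit.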

3. Nginx Location Priority

In Nginx, regex matches (location ~ or location ~*) take precedence over standard prefix matches (unless the prefix uses the ^~ modifier). If you have a block handling static assets (location ~* \.(css|js|txt)$), it may carry caching directives, security headers, or access restrictions intended for assets that cause the bot's request for ads.txt to fail.
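The interaction can be sketched in two blocks (directives here are illustrative, not your actual config): the exact match is checked first and wins, so the regex block never touches this one URI.

```nginx
# The exact match is evaluated before regex blocks, so the asset
# block below never applies to /ads.txt.
location = /ads.txt {
    default_type text/plain;
}

# This regex block would otherwise capture the .txt extension.
location ~* \.(css|js|txt)$ {
    expires 7d;
}
```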

The Fix: Nginx Configuration

The most robust solution for Nginx is to define an exact match location block. This isolates ads.txt from your CMS logic, rewrite rules, and security filters.

Open your site configuration file (usually found in /etc/nginx/sites-available/ or /etc/nginx/conf.d/).

Add the following block. The = (exact match) modifier guarantees it takes precedence over prefix and regex blocks regardless of position, but placing it above your PHP/CMS location blocks keeps the intent obvious.

server {
    # ... existing server configuration ...

    # EXACT MATCH for ads.txt
    # The "=" modifier ensures this block takes precedence over regex matches
    location = /ads.txt {
        # 1. Define the actual path to the file
        root /var/www/your-site-root/public_html;

        # 2. Force the correct MIME type
        default_type text/plain;

        # 3. Allow all access (overriding potential strict security rules)
        allow all;

        # 4. Optimization: distinct logging
        access_log /var/log/nginx/ads_access.log;
        error_log /var/log/nginx/ads_error.log;

        # 5. Caching headers (Ads.txt doesn't change hourly)
        # 86400 seconds = 24 hours
        add_header Cache-Control "public, max-age=86400";
        
        # 6. Serve the file directly from disk; there is no fastcgi_pass
        #    in this block, so the request never reaches PHP
        try_files $uri =404;
    }

    # ... remaining configuration (PHP blocks, etc) ...
}

Applying the Changes

Always test your configuration syntax before reloading to prevent downtime.

sudo nginx -t
sudo systemctl reload nginx

The Fix: Apache (.htaccess) Configuration

If you are on a LAMP stack or shared hosting, you likely rely on .htaccess. The goal here is to prevent WordPress (or another CMS) from handling the file and to ensure the correct headers are sent.

Add the following snippet to the very top of your root .htaccess file.

<IfModule mod_headers.c>
    # Match the specific file
    <Files "ads.txt">
        # Force the Content-Type
        Header set Content-Type "text/plain"
        
        # Ensure it is accessible (Apache 2.4+ syntax; on Apache 2.2,
        # use "Order Allow,Deny" followed by "Allow from all" instead)
        Require all granted
        
        # Disable caching logic that might serve stale 404s
        FileETag None
        Header unset ETag
        Header set Cache-Control "max-age=86400, public"
    </Files>
</IfModule>

<IfModule mod_rewrite.c>
    RewriteEngine On
    
    # CRITICAL: Exclude ads.txt from further rewrite rules
    # This prevents the request from being passed to index.php
    RewriteRule ^ads\.txt$ - [L]
</IfModule>

Handling Redirect Loops in Apache

If you have forced HTTPS redirection, ensure ads.txt is handled gracefully. Sometimes it is safer to check for the file specifically before forcing the redirect, although modern crawlers handle a single HTTP-to-HTTPS hop well.

If you encounter redirect loops, ensure your HTTPS force rule looks like this:

RewriteCond %{HTTPS} off
RewriteCond %{REQUEST_URI} !^/ads\.txt$ [NC]
RewriteRule ^(.*)$ https://%{HTTP_HOST}%{REQUEST_URI} [L,R=301]

Note: The second line excludes ads.txt from the forced redirect. While not always necessary, this acts as a failsafe if SSL handshakes are failing for the bot.

Deep Dive: Verifying the Solution

Do not rely on your browser to verify the fix. Browsers render content liberally. Use curl to mimic a programmatic request and inspect the headers directly.

Run the following command in your terminal:

curl -I -L http://yourdomain.com/ads.txt

What to Look For (Success Criteria)

  1. HTTP/1.1 200 OK: If you see 403 (Forbidden), 404 (Not Found), or 500 (Server Error), the configuration is still wrong.
  2. Content-Type: text/plain: If this says text/html, your CMS is still intercepting the request, or your server defaults are incorrect.
  3. Content-Length: Ensure this is not 0.
  4. X-Powered-By: If you see X-Powered-By: PHP/8.1, your Nginx try_files or Apache RewriteRule exclusion failed. The web server should serve this file directly without invoking PHP.
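These checks can be scripted. The sketch below parses saved header output against the criteria above; the $sample text is illustrative, not live curl output, so substitute your real headers in practice:

```shell
# Sketch: validate saved `curl -I` output against the success criteria.
# The $sample block is illustrative; feed in your real headers instead.
check_adstxt_headers() {
    headers="$1"
    # Status code is the second field of the first line
    status=$(printf '%s\n' "$headers" | head -n 1 | awk '{print $2}')
    # Normalize Content-Type: strip CRs, lowercase the value
    ctype=$(printf '%s\n' "$headers" | grep -i '^Content-Type:' | tr -d '\r' | awk '{print tolower($2)}')
    # Count X-Powered-By lines; any hit means PHP handled the request
    powered=$(printf '%s\n' "$headers" | grep -ic '^X-Powered-By:')
    if [ "$status" = "200" ] && printf '%s' "$ctype" | grep -q '^text/plain' && [ "$powered" -eq 0 ]; then
        echo "PASS"
    else
        echo "FAIL: status=$status type=$ctype x-powered-by-lines=$powered"
    fi
}

sample='HTTP/1.1 200 OK
Content-Type: text/plain; charset=utf-8
Content-Length: 128'

check_adstxt_headers "$sample"
```

Swap in headers captured from your own domain; a FAIL line tells you which criterion broke.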

Common Edge Cases and Pitfalls

Even with the correct web server config, external factors can block Raptive's verification.

1. Cloudflare "Bot Fight Mode"

If you use Cloudflare, the "Super Bot Fight Mode" or high security settings often flag programmatic requests to text files as suspicious.

  • Check: Go to Cloudflare Dashboard -> Security -> WAF -> Tools.
  • Action: Look up the IP addresses of the ad verification bots (if known) or create a Firewall Rule:
    • If URI Path equals /ads.txt -> Action: Skip / Bypass Legacy CAPTCHA.

2. File Permissions

The web server user (usually www-data or apache) must have read access to the file.

# Correct permissions
sudo chown www-data:www-data /var/www/html/ads.txt
sudo chmod 644 /var/www/html/ads.txt

If permissions are set to 640 or 600 and the file is owned by root, the web server returns a 403 Forbidden error.
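A quick sanity check, sketched here with a throwaway temp file so it is safe to run anywhere; point the path at your real ads.txt (e.g., under /var/www/html) in practice:

```shell
# Sketch: confirm a file is world-readable (644), as the web server needs.
# Uses a throwaway temp file for illustration; substitute your real ads.txt path.
FILE=$(mktemp)
chmod 644 "$FILE"
# GNU stat first, BSD/macOS stat as a fallback
mode=$(stat -c '%a' "$FILE" 2>/dev/null || stat -f '%Lp' "$FILE")
echo "mode=$mode"
rm -f "$FILE"
```

Anything other than 644 (or at least world-readable) on the real file is a candidate cause of a 403.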

3. Case Sensitivity

Linux file systems are case-sensitive. Ads.txt is not the same as ads.txt.

  • Standard: The IAB standard strictly requires lowercase ads.txt.
  • Fix: Rename the file. Do not use rewrite rules to correct the casing; simply ensure the physical file is lowercase.
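A hedged sketch for spotting a wrongly cased file: it builds a temp directory with a deliberately wrong "Ads.txt" for demonstration, so run the `find` against your real web root instead.

```shell
# Sketch: find files named ads.txt in any casing EXCEPT correct lowercase.
# A temp dir with a deliberately wrong "Ads.txt" stands in for your web root.
DOCROOT=$(mktemp -d)
touch "$DOCROOT/Ads.txt"

# -iname matches case-insensitively; ! -name excludes the correct lowercase form
wrong=$(find "$DOCROOT" -maxdepth 1 -iname 'ads.txt' ! -name 'ads.txt')
if [ -n "$wrong" ]; then
    echo "Rename to lowercase: $wrong"
fi
rm -rf "$DOCROOT"
```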

Conclusion

The "Ads.txt Not Found" error is rarely a mystery; it is a conflict between strict bot requirements and loose browser behaviors. By overriding the default CMS routing using an Nginx location = block or an Apache RewriteRule exclusion, you ensure the file is served as a raw, static resource with the correct text/plain headers.

Once the headers are correct (verified via curl), the Raptive crawler will successfully parse the file on its next pass, securing your ad revenue pipeline.