robots.txt unreachable: Why it happens and how to fix it

By Official

Key takeaways

  • "robots.txt unreachable": what Googlebot is seeing, common causes (timeouts, 403/5xx, WAF), and how to validate the fix in Search Console

What "robots.txt unreachable" means

Googlebot tried to fetch https://your-domain.com/robots.txt and could not reliably access it.

This matters because robots.txt is a gatekeeper file:

  • if Google cannot fetch it, crawling can become conservative
  • many systems treat the site as unstable until the file is reachable again

It is not a ranking factor by itself, but it can cause a cascade:

  • fewer pages crawled
  • slower re-crawls
  • more "crawl anomaly" style noise

The common root causes

1) robots.txt returns 403/401 to Googlebot

This is typically caused by WAF/CDN/security rules.

Symptoms:

  • you can load robots.txt in the browser
  • but Googlebot (or some IP ranges) gets blocked

2) robots.txt returns 5xx intermittently

Often:

  • origin is unstable
  • serverless cold starts
  • timeouts under load

3) Redirect chains on /robots.txt

/robots.txt should not bounce through multiple redirects.

Goal:

  • one stable URL
  • 200 OK

4) Rate limiting / bot protection

Some bot protection policies block everything that looks like a bot, including Googlebot.

The 10-minute checks

  1. Fetch robots.txt from a clean network (see the fetch sketch after this list).
  2. Check headers:
  • do you see caching headers?
  • is there a weird redirect?
  3. In GSC:
  • use robots testing tools (or URL Inspection on /robots.txt if available)
  4. Check logs (if you have them):
  • requests to /robots.txt
  • response codes over time
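
A quick way to run the first two checks together is to fetch the file twice, once with a normal browser user agent and once with a Googlebot user agent, and compare the responses. This is a minimal sketch using Python's requests library; the URL is a placeholder, and a spoofed user agent only approximates real Googlebot (IP-based blocks will not show up here).

```python
import requests

URL = "https://your-domain.com/robots.txt"  # placeholder: use your own domain

USER_AGENTS = {
    "browser": "Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36",
    "googlebot": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

for label, ua in USER_AGENTS.items():
    resp = requests.get(URL, headers={"User-Agent": ua}, timeout=10)
    print(f"--- {label} ---")
    print("status:       ", resp.status_code)
    print("final URL:    ", resp.url)
    print("redirects:    ", len(resp.history))
    print("content-type: ", resp.headers.get("Content-Type"))
    print("cache-control:", resp.headers.get("Cache-Control"))
```

If the browser UA gets a 200 and the Googlebot UA gets a 403, the problem is almost certainly a WAF/bot-protection rule; if both fail intermittently, look at origin stability instead.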

Fix checklist

Fix A: Make /robots.txt boring and static

The best robots.txt is:

  • served directly
  • consistent 200
  • cached safely

Avoid:

  • middleware that rewrites /robots.txt
  • geo redirects
  • auth
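
As an illustration of "boring and static" when your application has to own the route at all, here is a rough sketch assuming a Python/Flask app (which may not match your stack); serving a plain static file at the web server or CDN layer is even better.

```python
from flask import Flask, Response

app = Flask(__name__)

# Constant body: no templates, no geo logic, no auth, no redirects.
ROBOTS_TXT = """User-agent: *
Disallow:

Sitemap: https://your-domain.com/sitemap.xml
"""

@app.route("/robots.txt")
def robots():
    # Always 200, always text/plain, safe to cache at the edge.
    return Response(
        ROBOTS_TXT,
        mimetype="text/plain",
        headers={"Cache-Control": "public, max-age=3600"},
    )
```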

Fix B: Whitelist Googlebot (carefully)

If WAF rules block bots, you often need allow rules for:

  • verified Googlebot IPs (or your WAF's built-in Googlebot verification)

User-agent checks alone are not enough; the user-agent string is trivial to spoof (see the verification sketch below).
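
The standard way to verify a claimed Googlebot request is a reverse DNS lookup on the client IP followed by a forward lookup that must resolve back to the same IP. A minimal sketch of that check using only the Python standard library (in production you would cache results and prefer Google's published IP ranges or your WAF's built-in verification):

```python
import socket

def is_verified_googlebot(ip: str) -> bool:
    """Reverse DNS must end in googlebot.com or google.com,
    and the forward lookup must map back to the same IP."""
    try:
        hostname, _, _ = socket.gethostbyaddr(ip)            # reverse lookup
        if not hostname.endswith((".googlebot.com", ".google.com")):
            return False
        forward_ips = socket.gethostbyname_ex(hostname)[2]   # forward lookup
        return ip in forward_ips
    except OSError:  # unresolvable host (herror/gaierror)
        return False

# Example: an IP from a known Googlebot range should normally pass.
print(is_verified_googlebot("66.249.66.1"))
```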

Fix C: Remove redirect chains

If /robots.txt redirects:

  • collapse to one hop
  • ideally serve directly
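
To see how many hops /robots.txt takes before settling, inspect the redirect history of a single fetch. A small sketch (placeholder URL, requests library):

```python
import requests

resp = requests.get("https://your-domain.com/robots.txt", timeout=10)

for hop in resp.history:                      # empty when served directly
    print(hop.status_code, "->", hop.headers.get("Location"))
print("final:", resp.status_code, resp.url)
print("hops:", len(resp.history))             # aim for 0, tolerate at most 1
```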

Fix D: Fix origin instability

If it is a 5xx/timeout issue, stabilize the origin first.

Validation

You want to see:

  • robots.txt fetch succeeds consistently (200)
  • GSC stops reporting it as unreachable (after a delay)

Expect lag:

  • GSC is not real-time; give it 3-14 days.
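
While you wait for GSC to catch up, you can confirm the fix on your side by polling the file and logging the results. A rough monitoring sketch (placeholder URL; run it from a network other than your office/VPN if you can):

```python
import datetime
import time

import requests

URL = "https://your-domain.com/robots.txt"  # placeholder

# A healthy fix shows an unbroken run of 200s with zero redirects.
while True:
    ts = datetime.datetime.now(datetime.timezone.utc).isoformat(timespec="seconds")
    try:
        resp = requests.get(URL, timeout=10)
        print(ts, resp.status_code, f"redirects={len(resp.history)}")
    except requests.RequestException as exc:
        print(ts, "FETCH FAILED:", exc)
    time.sleep(300)  # every 5 minutes
```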

FAQ

Will this prevent indexing entirely?

Not necessarily, but it often slows crawling and makes Google conservative until robots.txt becomes reachable again.

Should I block more stuff in robots.txt to "save crawl budget"?

Be careful. Blocking crawling can prevent Google from seeing noindex and canonicals. For content sites, cleaner URL hygiene and sitemaps usually do more than aggressive robots blocks.

How Google behaves when robots.txt is unreachable

Google does not want to crawl blindly if it cannot verify your robots policy.

In practice you often see one of these patterns:

  • crawling slows down until robots.txt is reachable again
  • Google retries robots.txt periodically (which can show up as recurring errors)
  • indexing of new URLs becomes more conservative because the site looks unstable

This is why "robots.txt unreachable" is less about the single file and more about site reliability.

Practical fixes by setup (common patterns)

If you use a CDN/WAF

Common mistake: bot protection blocks /robots.txt.

Fix:

  • ensure /robots.txt is always allowed
  • do not challenge it with JS/captcha
  • avoid country rules on this path

If you deploy frequently (serverless)

Sometimes the file is unreachable only during deploy windows.

Fix:

  • keep robots.txt static
  • cache it at the edge/CDN
  • avoid runtime logic on the route

A safe robots.txt baseline

If you are not sure, start with something boring like:

```
User-agent: *
Disallow:

Sitemap: https://your-domain.com/sitemap.xml
```

Then add disallows only when you are confident the path should never be crawled.

Common mistakes

  • redirecting /robots.txt multiple times
  • blocking /robots.txt in WAF rules
  • returning 200 with HTML instead of plain text
  • serving different robots rules based on cookies/geo
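
The "200 with HTML" mistake is easy to catch with a quick response check: the file should come back as plain text and the body should start with robots directives, not markup. A small sketch (placeholder URL):

```python
import requests

resp = requests.get("https://your-domain.com/robots.txt", timeout=10)
body = resp.text.lstrip()

content_type = resp.headers.get("Content-Type", "")
print("content-type ok:   ", content_type.startswith("text/plain"))
print("looks like HTML:   ", body[:1] == "<")   # an HTML error page or SPA shell
print("starts as expected:",
      body.lower().startswith(("user-agent:", "sitemap:", "#")))
```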
