Key takeaways
- “Sitemap could not be read” means Google failed to fetch or parse your sitemap as a sitemap
- This guide explains the failure modes (HTTP, redirects, content-type, format, size), how to diagnose fast, and what changes actually remove the error
“Sitemap could not be read” is not an indexing verdict. It’s a transport + parsing failure.
It means: Google tried to fetch the sitemap URL you submitted, and it could not reliably interpret the response as a valid sitemap.
That usually reduces discovery (fewer URLs enter the pipeline), which then shows up later as:
- “Discovered — currently not indexed”
- “Crawled — currently not indexed”
What it means (plain English)
Google expected a sitemap file. It got something else:
- an error response
- a login / blocked response
- HTML instead of XML
- a redirect chain that breaks, loops, or changes content
- a file that is too large or malformed
So it can’t trust the sitemap as a stable input.
The 80/20 causes
1) The sitemap URL returns a non‑200 or unstable response
Common culprits:
- 403/401 (blocked, WAF, auth)
- 404 (wrong URL)
- 5xx (server issues)
- timeouts
If you see access or server errors elsewhere in GSC, fix those first.
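A minimal first check with curl, assuming a placeholder sitemap URL of https://example.com/sitemap.xml:

```bash
# Status code and total fetch time; a healthy sitemap returns 200 consistently
curl -s -o /dev/null -w '%{http_code} %{time_total}s\n' https://example.com/sitemap.xml

# WAFs sometimes block by user-agent; compare the default against a crawler-like UA
curl -s -o /dev/null -A 'Googlebot' -w '%{http_code}\n' https://example.com/sitemap.xml
```

Repeat the first command a few times: an intermittent 5xx or timeout is just as damaging as a permanent one.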
2) Redirects on the sitemap URL
Redirects aren’t automatically wrong, but they are a frequent source of “could not be read” because they create instability:
- redirect loops
- redirect chains
- different content per user-agent
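A minimal sketch of the redirect check, again against a placeholder URL:

```bash
# Follow redirects and report the hop count, final destination, and final status
curl -s -o /dev/null -L \
  -w 'hops: %{num_redirects}\nfinal: %{url_effective}\nstatus: %{http_code}\n' \
  https://example.com/sitemap.xml
```

Ideally: zero hops. If you must redirect (e.g. http to https), keep it to a single 301 that lands directly on the sitemap file.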
3) Content-type / format mismatch (HTML, not XML)
The most common “silent” failure: the sitemap URL serves a normal webpage (HTML), not a sitemap.
Typical reasons:
- wrong route (points to /sitemappage, not /sitemap.xml)
- proxy/cache misconfig serving an HTML fallback
- framework route not matching in production
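To catch the HTML-instead-of-XML case, inspect both the header and the first bytes of the body (placeholder URL again):

```bash
# The Content-Type header is a hint; the body is what actually gets parsed
curl -sI https://example.com/sitemap.xml | grep -i '^content-type'

# The first bytes should be an XML declaration or <urlset>, not <!DOCTYPE html>
curl -s https://example.com/sitemap.xml | head -c 200
```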
4) The sitemap is malformed XML (or wrong sitemap syntax)
Google can be tolerant, but not infinitely. Common mistakes:
- invalid XML
- wrong encoding
- invalid <loc> URLs
- invalid date formats in <lastmod>
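xmllint (part of libxml2, preinstalled on most systems) gives a fast well-formedness check; this sketch assumes the same placeholder URL:

```bash
# Exit code 0 means well-formed XML; otherwise the offending line is printed
curl -s https://example.com/sitemap.xml | xmllint --noout -
```

Well-formed is necessary but not sufficient: also make sure <loc> values are absolute URLs and <lastmod> uses W3C datetime format (e.g. 2026-01-15).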
5) Size and compression issues
Hard limits matter because “could not be read” often happens when you exceed them (the sitemap protocol caps a single file at 50,000 URLs and 50 MB uncompressed) or produce a file that’s heavy to fetch:
- too many URLs in one sitemap
- too large uncompressed
- broken gzip
If you have many URLs, split them into chunked sitemaps referenced by a sitemap index, as sketched below.
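A minimal sitemap index, with placeholder child sitemap URLs:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <sitemap><loc>https://example.com/sitemaps/pages-1.xml</loc></sitemap>
  <sitemap><loc>https://example.com/sitemaps/pages-2.xml</loc></sitemap>
</sitemapindex>
```

And two quick checks for the compression and size failure modes:

```bash
# Verify the gzip archive is intact and decompresses to valid XML
gzip -t sitemap.xml.gz && echo 'gzip OK'
curl -s https://example.com/sitemap.xml.gz | gunzip | xmllint --noout -

# Rough URL count against the 50,000-per-file limit
curl -s https://example.com/sitemap.xml | grep -o '<loc>' | wc -l
```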
How to diagnose fast (without guessing)
- Open the submitted sitemap URL in a browser:
- It should look like XML, not a webpage.
- The file should contain <urlset> or <sitemapindex>.
- Fetch it with a simple HTTP client (curl / PowerShell) and check:
- status code is 200
- redirects are minimal (ideally none)
- response is stable on repeat requests
- Validate the XML quickly:
- if the file isn’t valid XML, fix generation first
- Confirm the URLs inside are canonical, indexable representations:
- avoid stuffing parameter URLs, duplicate paths, or URLs that redirect
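If you want all of those checks in one pass, here is a minimal sketch (SITEMAP is a placeholder; point it at your submitted URL):

```bash
#!/usr/bin/env bash
# One-shot sitemap health check: status, redirects, content-type, XML, root element
SITEMAP="https://example.com/sitemap.xml"

curl -s -o /dev/null -L "$SITEMAP" \
  -w 'status: %{http_code}\nredirect hops: %{num_redirects}\ncontent-type: %{content_type}\nfinal url: %{url_effective}\n'

body=$(curl -sL "$SITEMAP")

# Well-formed XML?
echo "$body" | xmllint --noout - && echo 'xml: well-formed'

# Expected root element?
echo "$body" | grep -qE '<(urlset|sitemapindex)' && echo 'root element: ok'
```

Run it a few times; the output should be identical on every run. Instability is itself a failure mode here.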
If you’re treating sitemaps as a “crawl budget lever”, recalibrate first: a sitemap is a discovery input, not a crawl-priority control.
What to do next (the honest version)
Fix the sitemap so it becomes a stable, parseable input. Then measure what changes in the pipeline:
- discovery and crawl cadence
- the ratio of discovered vs indexed
- whether “discovered not indexed” drops over time
If your pages are indexed but still not visible, that’s a different problem (selection, not storage).