Sitemap could not be read (Google Search Console): what it means and how to fix it (2026)

Key takeaways

  • “Sitemap could not be read” means Google failed to fetch or parse your sitemap as a sitemap
  • This guide explains the failure modes (HTTP, redirects, content-type, format, size), how to diagnose fast, and what changes actually remove the error

“Sitemap could not be read” is not an indexing verdict. It’s a transport + parsing failure.

It means: Google tried to fetch the sitemap URL you submitted, and it could not reliably interpret the response as a valid sitemap.

That usually reduces discovery (fewer URLs enter the pipeline), which then shows up later as:

  • “Discovered — currently not indexed”
  • “Crawled — currently not indexed”

What it means (plain English)

Google expected a sitemap file. It got something else:

  • an error response
  • a login / blocked response
  • HTML instead of XML
  • a redirect chain that breaks, loops, or changes content
  • a file that is too large or malformed

So it can’t trust the sitemap as a stable input.

The 80/20 causes

1) The sitemap URL returns a non‑200 or unstable response

Common culprits:

  • 403/401 (blocked, WAF, auth)
  • 404 (wrong URL)
  • 5xx (server issues)
  • timeouts
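
A quick way to rule these out is to fetch the submitted sitemap URL a few times and watch the status code and response time. A minimal sketch using Python’s standard library (the URL is a placeholder):

```python
import time
import urllib.error
import urllib.request

SITEMAP_URL = "https://example.com/sitemap.xml"  # placeholder: use the URL you submitted

# Fetch the sitemap several times to see whether the response is stable.
for attempt in range(3):
    start = time.monotonic()
    try:
        with urllib.request.urlopen(SITEMAP_URL, timeout=10) as resp:
            elapsed = time.monotonic() - start
            print(f"attempt {attempt + 1}: HTTP {resp.status} in {elapsed:.2f}s")
    except urllib.error.HTTPError as e:
        print(f"attempt {attempt + 1}: HTTP {e.code} ({e.reason})")
    except (urllib.error.URLError, TimeoutError) as e:
        print(f"attempt {attempt + 1}: failed ({e})")
    time.sleep(1)
```

Anything other than a consistent 200 here is what to fix before resubmitting.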

If you see access/server errors elsewhere in GSC, fix those first.

2) Redirects on the sitemap URL

Redirects aren’t automatically wrong, but they are a frequent source of “could not be read” because they create instability:

  • redirect loops
  • redirect chains
  • different content per user-agent
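
One way to make the chain visible is to log every hop the fetcher has to follow. A minimal sketch, again with Python’s standard library (placeholder URL):

```python
import urllib.request

SITEMAP_URL = "https://example.com/sitemap.xml"  # placeholder: use the URL you submitted

class RedirectLogger(urllib.request.HTTPRedirectHandler):
    """Prints every redirect hop so the full chain is visible."""
    def redirect_request(self, req, fp, code, msg, headers, newurl):
        print(f"{code}: {req.full_url} -> {newurl}")
        return super().redirect_request(req, fp, code, msg, headers, newurl)

opener = urllib.request.build_opener(RedirectLogger())
with opener.open(SITEMAP_URL, timeout=10) as resp:
    print(f"final URL: {resp.geturl()} (HTTP {resp.status})")
```

If this prints more than one hop, or a different destination on repeat runs, submit the final, stable URL instead.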

3) Content-type / format mismatch (HTML, not XML)

The most common “silent” failure: the sitemap URL serves a normal webpage (HTML), not a sitemap.

Typical reasons:

  • wrong route (points to /sitemap page, not /sitemap.xml)
  • proxy/cache misconfig serving HTML fallback
  • framework route not matching in production
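
A quick check is to look at both the Content-Type header and the first bytes of the body. A minimal sketch (standard library; placeholder URL):

```python
import urllib.request

SITEMAP_URL = "https://example.com/sitemap.xml"  # placeholder: use the URL you submitted

with urllib.request.urlopen(SITEMAP_URL, timeout=10) as resp:
    content_type = resp.headers.get("Content-Type", "")
    body_start = resp.read(512).lstrip()

print(f"Content-Type: {content_type}")

# A sitemap should be served as XML with a <urlset> or <sitemapindex> root,
# not as an HTML document.
if body_start.lower().startswith((b"<!doctype", b"<html")):
    print("Looks like an HTML page, not a sitemap")
elif b"<urlset" in body_start or b"<sitemapindex" in body_start:
    print("Looks like a sitemap")
else:
    print("Unclear: inspect the response body manually")
```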

4) The sitemap is malformed XML (or wrong sitemap syntax)

Google can be tolerant, but not infinitely. Common mistakes:

  • invalid XML
  • wrong encoding
  • invalid <loc> URLs
  • invalid date formats in <lastmod>
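
These are cheap to catch locally before Google sees them. A minimal sketch that parses a sitemap file and flags suspicious <loc> and <lastmod> values (the file path and the check_sitemap helper are placeholders; the lastmod pattern covers the common W3C Datetime forms):

```python
import re
import xml.etree.ElementTree as ET

NS = "{http://www.sitemaps.org/schemas/sitemap/0.9}"
# W3C Datetime as used by <lastmod>: YYYY-MM-DD, optionally with a time and timezone.
LASTMOD_RE = re.compile(r"^\d{4}-\d{2}-\d{2}(T\d{2}:\d{2}(:\d{2}(\.\d+)?)?(Z|[+-]\d{2}:\d{2}))?$")

def check_sitemap(path: str) -> None:
    # ET.parse raises ParseError on invalid XML, the hard "could not be read" case.
    root = ET.parse(path).getroot()
    print(f"root element: {root.tag}")

    for url in root.iter(f"{NS}url"):
        loc = url.findtext(f"{NS}loc", default="").strip()
        lastmod = url.findtext(f"{NS}lastmod", default="").strip()
        if not loc.startswith(("http://", "https://")):
            print(f"suspicious <loc>: {loc!r}")
        if lastmod and not LASTMOD_RE.match(lastmod):
            print(f"invalid <lastmod>: {lastmod!r} for {loc}")

check_sitemap("sitemap.xml")  # placeholder path
```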

5) Size and compression issues

Hard limits matter because “could not be read” often happens when you exceed them or produce a file that’s heavy to fetch:

  • more than 50,000 URLs in one sitemap file
  • larger than 50 MB uncompressed
  • broken gzip compression

If you have many URLs, use a sitemap index + chunked sitemaps.
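
A minimal sketch of that split, assuming static files served from the site root (the 50,000-URL chunk size is the protocol limit; the file names, site URL, and write_sitemaps helper are placeholders):

```python
import math
from xml.sax.saxutils import escape

URLS_PER_FILE = 50_000            # sitemap protocol limit per file
SITE = "https://example.com"      # placeholder

def write_sitemaps(urls: list[str]) -> None:
    # Split the URL list into chunks and write one sitemap per chunk.
    chunks = math.ceil(len(urls) / URLS_PER_FILE) or 1
    for i in range(chunks):
        chunk = urls[i * URLS_PER_FILE:(i + 1) * URLS_PER_FILE]
        with open(f"sitemap-{i + 1}.xml", "w", encoding="utf-8") as f:
            f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
            f.write('<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
            for u in chunk:
                f.write(f"  <url><loc>{escape(u)}</loc></url>\n")
            f.write("</urlset>\n")

    # The index file is the single URL you submit in Search Console.
    with open("sitemap.xml", "w", encoding="utf-8") as f:
        f.write('<?xml version="1.0" encoding="UTF-8"?>\n')
        f.write('<sitemapindex xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">\n')
        for i in range(chunks):
            f.write(f"  <sitemap><loc>{SITE}/sitemap-{i + 1}.xml</loc></sitemap>\n")
        f.write("</sitemapindex>\n")
```

Submit only the index file; Google discovers the child sitemaps from it.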

How to diagnose fast (without guessing)

  1. Open the submitted sitemap URL in a browser:
  • It should look like XML, not a webpage.
  • The file should contain <urlset> or <sitemapindex>.
  2. Fetch it with a simple HTTP client (curl / PowerShell) and check:
  • status code is 200
  • redirects are minimal (ideally none)
  • response is stable on repeat requests
  3. Validate the XML quickly:
  • if the file isn’t valid XML, fix generation first
  4. Confirm the URLs inside are canonical, indexable representations:
  • avoid stuffing parameter URLs, duplicate paths, or URLs that redirect
If you’re treating sitemaps as a “crawl budget lever”, pause on that framing first: a sitemap is a discovery input, not a guarantee of crawling or indexing.

What to do next (the honest version)

Fix the sitemap so it becomes a stable, parseable input. Then measure what changes in the pipeline:

  • discovery and crawl cadence
  • the ratio of discovered vs indexed
  • whether “discovered not indexed” drops over time

If your pages are indexed but still not visible, that’s a different problem (selection, not storage).

Up next:

Sitemap errors (Google Search Console): what they mean and what to fix first (2026)

Sitemap errors are not “bad SEO” — they’re input integrity failures. This guide classifies sitemap errors into fetch, format, and URL-level problems, explains why they matter, and shows the fastest checks to remove them.