GSC Indexing Statuses Explained: What They Mean and How to Fix Them (2026)

Key takeaways

  • A practical map of Google Search Console indexing statuses (the Coverage / Page indexing report): what each status means, the most common root causes (canonicals, duplicates, robots, redirects, soft 404s), and the fastest way to validate fixes

Google Search Console indexing statuses are not "errors". They are classification labels.

The fastest way to make them useful is to treat each status as a question:

  • Is this a hard gate? (noindex, robots, 4xx/5xx, redirect loops)
  • Is this duplication? (canonicals, parameters, alternate URLs)
  • Is this prioritization? (Google can fetch it, but does not want to store it yet)

If you want the mental model for the prioritization side first, read:


The short version (decision tree)

  1. Does the URL return 200 for Google? If not, fix status/redirects first.
  2. Any noindex / robots.txt block? If yes, the status is expected.
  3. Canonical points elsewhere? If yes, you are telling Google to index the other URL.
  4. Does the page look like "nothing" (soft 404)? If yes, it will be dropped.
  5. If none of the above, it is often priority (site-level trust, internal linking, crawl debt).
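
If you want to run steps 1 to 4 from a terminal before reaching for the URL Inspection tool, a short script covers the technical gates. Below is a minimal sketch in Python using the `requests` library; the URL and User-Agent are placeholders, the regexes are deliberately crude (a real audit should parse HTML and render JavaScript where needed), and step 4 (soft 404) still needs a human look at the page.

```python
import re
import requests

def triage(url: str) -> None:
    """Walk the indexing decision tree for one URL: status, redirect,
    robots directives, and the declared canonical."""
    resp = requests.get(
        url,
        allow_redirects=False,  # we want the first hop, not the final page
        timeout=10,
        headers={"User-Agent": "indexing-triage-script"},  # placeholder UA
    )

    # 1. Hard gates: status code and redirect target
    print("status:", resp.status_code)
    if 300 <= resp.status_code < 400:
        print("redirects to:", resp.headers.get("Location"))

    # 2. noindex via HTTP header or meta robots tag
    print("X-Robots-Tag:", resp.headers.get("X-Robots-Tag", "none"))
    meta_robots = re.search(
        r'<meta[^>]+name=["\']robots["\'][^>]+content=["\']([^"\']+)', resp.text, re.I
    )
    print("meta robots:", meta_robots.group(1) if meta_robots else "none")

    # 3. Declared canonical (crude regex; assumes rel= appears before href=)
    canonical = re.search(
        r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)', resp.text, re.I
    )
    print("canonical:", canonical.group(1) if canonical else "none")

if __name__ == "__main__":
    triage("https://example.com/some-page/")  # replace with the URL you are debugging
```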

Status -> meaning -> what to do

Crawled - currently not indexed

Meaning: Google fetched the page, but chose not to store it (yet).

Most common causes: priority + weak internal hierarchy, duplication signals, too much crawl noise.

Fix: make the site easier to understand (hubs/pillars), reduce URL noise, and promote the page from core entry points.

Discovered - currently not indexed

Meaning: Google knows the URL exists (sitemap/internal links) but hasn't crawled it yet; the fetch was deferred.

Most common causes: crawl budget allocation, weak internal linking, lots of low-value URLs competing.

Fix: improve internal linking hierarchy and reduce crawl debt before requesting indexing for everything.

Duplicate without user-selected canonical

Meaning: Google considers this URL a duplicate, and because you didn't declare a canonical, Google chose one on its own (often a different URL).

Most common causes: multiple URL variants, inconsistent canonical tags, parameter URLs, thin archives.

Fix: make the preferred URL explicit (canonical + redirects), then confirm Google's selected canonical matches.
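
A quick way to confirm the preferred URL is explicit is to fetch the common variants and compare the canonical each one declares. A rough sketch, assuming the hypothetical variant list below matches the patterns your site actually generates (the regex is crude; use an HTML parser for anything serious):

```python
import re
import requests

# Hypothetical variant list: replace with the URL patterns your stack actually produces.
VARIANTS = [
    "https://example.com/guide/",
    "https://example.com/guide",                          # no trailing slash
    "https://www.example.com/guide/",                     # www host
    "https://example.com/guide/?utm_source=newsletter",   # tracking parameters
]

def declared_canonical(url: str) -> str:
    resp = requests.get(url, timeout=10)
    match = re.search(
        r'<link[^>]+rel=["\']canonical["\'][^>]+href=["\']([^"\']+)', resp.text, re.I
    )
    return match.group(1) if match else f"(none, status {resp.status_code})"

if __name__ == "__main__":
    results = {url: declared_canonical(url) for url in VARIANTS}
    for url, canonical in results.items():
        print(f"{url} -> {canonical}")
    if len(set(results.values())) > 1:
        print("Canonicals disagree: pick one preferred URL and redirect the rest to it.")
```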

Duplicate: submitted URL not selected as canonical

Meaning: You submitted a URL, but Google decided a different URL is the canonical.

Fix: treat it as a canonicalization and duplication problem. Choose one preferred URL per intent and make it boring (200, stable, not redirecting).

Alternate page with proper canonical tag

Meaning: This URL is a known alternate of the canonical URL you declared.

Most common causes: intentional alternates (UTM, pagination, print view) or accidental duplicates.

Fix: keep it (if intentional) or consolidate (if accidental). The "fix" is often not indexing this URL.

Soft 404

Meaning: Google fetched a page that returns 200 but looks like a "not found" or empty page.

Most common causes: template placeholders, missing data in dynamic routes, thin pages with no unique value.

Fix: return a real 404/410 for missing content, or add real content + clear purpose.
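
The server-side half of the fix is to stop answering 200 for content that doesn't exist. A minimal sketch with Flask, where `get_product` stands in for whatever your data lookup is; the same pattern applies to any framework's dynamic routes:

```python
from flask import Flask, abort, render_template

app = Flask(__name__)

def get_product(slug: str):
    """Hypothetical lookup; replace with your data layer. Returns None when missing."""
    ...

@app.route("/products/<slug>")
def product_page(slug: str):
    product = get_product(slug)
    if product is None:
        # Return a real 404 instead of rendering an empty template with a 200,
        # which is exactly what gets classified as a soft 404.
        abort(404)
    return render_template("product.html", product=product)
```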

Not found (404)

Meaning: Googlebot got a 404.

Fix: classify URLs (keep/move/remove), remove dead URLs from sitemap, 301 only to true successors, 410 for intentional removals.

Submitted URL not found (404)

Meaning: You submitted the URL (usually in sitemap), but Googlebot gets a 404.

Fix: fix the sitemap (remove dead URLs), then decide: restore (200), move (301 to successor), or remove (410).
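
Before resubmitting anything, it helps to confirm the sitemap only lists URLs that return 200. A rough sketch, assuming a single standard sitemap.xml (no sitemap index) at a placeholder address:

```python
import xml.etree.ElementTree as ET
import requests

SITEMAP_URL = "https://example.com/sitemap.xml"  # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

def sitemap_urls(sitemap_url: str) -> list[str]:
    """Return every <loc> entry from a standard (non-index) sitemap."""
    root = ET.fromstring(requests.get(sitemap_url, timeout=10).content)
    return [loc.text for loc in root.findall(".//sm:loc", NS)]

if __name__ == "__main__":
    for url in sitemap_urls(SITEMAP_URL):
        status = requests.head(url, allow_redirects=False, timeout=10).status_code
        if status != 200:
            # Anything listed here should be restored, redirected, or dropped from the sitemap.
            print(status, url)
```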

Blocked due to other 4xx

Meaning: Googlebot received a 4xx other than a plain 404, often a 410, a 429, or a custom response from middleware or a WAF.

Fix: identify the exact code, decide intent (indexable or not), then make responses consistent and fix sitemap/linking.

Server error (5xx)

Meaning: Googlebot got a server error or timed out.

Fix: stabilize origin, reduce timeouts, remove redirect chains, and keep canonicals fast and boring.

Submitted URL has crawl issue

Meaning: a catch-all status; the submitted URL couldn't be crawled successfully for a reason Google doesn't break out further.

Fix: confirm status code, check robots/noindex/canonicals/redirects/rendering.

Indexed, though blocked by robots.txt

Meaning: Google indexed a representation of the page earlier, but robots.txt currently blocks it from being crawled.

Most common causes: robots rules changed after indexing; blocking CSS/JS; overly broad disallow patterns.

Fix: decide whether you want it indexed. If yes, remove the block; if no, add noindex and allow crawling so Google can see it.
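
To see what Googlebot is currently allowed to fetch under your live robots.txt, the standard library's robotparser is enough. A small sketch with placeholder URLs:

```python
from urllib.robotparser import RobotFileParser

robots = RobotFileParser("https://example.com/robots.txt")  # placeholder
robots.read()

for url in [
    "https://example.com/blog/some-post/",
    "https://example.com/search?q=widgets",
]:
    allowed = robots.can_fetch("Googlebot", url)
    print("allowed" if allowed else "blocked", url)
```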

Redirect error

Meaning: Googlebot hit a redirect chain/loop, timeout, or a target that fails.

Most common causes: redirect loops, mixed host rules (www/apex), middleware logic, inconsistent trailing slash.

Fix: make redirects deterministic: 301 to one canonical destination, no chains.
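
Chains and loops are easy to surface by following each hop manually instead of letting the HTTP client resolve them silently. A sketch using `requests` with a placeholder starting URL:

```python
from urllib.parse import urljoin
import requests

def trace_redirects(url: str, max_hops: int = 10) -> None:
    """Print every hop so chains (more than one hop) and loops are obvious."""
    seen = {url}
    for hop in range(max_hops):
        resp = requests.get(url, allow_redirects=False, timeout=10)
        print(f"hop {hop}: {resp.status_code} {url}")
        if resp.status_code not in (301, 302, 303, 307, 308):
            return  # reached a non-redirect response
        url = urljoin(url, resp.headers.get("Location", ""))
        if url in seen:
            print("loop detected at:", url)
            return
        seen.add(url)
    print(f"gave up after {max_hops} hops; the chain is too long")

if __name__ == "__main__":
    trace_redirects("http://www.example.com/old-path")  # placeholder start URL
```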

Page with redirect

Meaning: The inspected/submitted URL redirects, so Google treats that URL as non-indexable.

Fix: update internal links and sitemaps to point directly to the final canonical URL. Collapse chains to one hop.

Redirect loop

Meaning: the redirect chain cycles back on itself, so Googlebot never reaches a final 200 URL.

Fix: pick one canonical pattern and collapse rules to one deterministic redirect.

Submitted URL marked 'noindex'

Meaning: You submitted a URL (usually via sitemap) that contains a noindex directive.

Fix: decide whether it should be indexed. If yes, remove noindex (meta + headers). If no, remove it from the sitemap.

Submitted URL blocked by robots.txt

Meaning: You submitted a URL but robots.txt blocks crawling.

Fix: choose a strategy: allow crawl + index (if it should rank) or allow crawl + noindex (if it should be dropped).

robots.txt unreachable

Meaning: Googlebot cannot fetch your robots.txt reliably.

Fix: make /robots.txt return 200 consistently (no timeouts, no WAF blocks, no redirect chains).
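
A quick sanity check is to hit /robots.txt repeatedly and watch for non-200 responses, redirects, or slow replies. A rough sketch with a placeholder host (run it from outside your own network if you can):

```python
import time
import requests

ROBOTS_URL = "https://example.com/robots.txt"  # placeholder

for attempt in range(10):
    start = time.monotonic()
    try:
        resp = requests.get(ROBOTS_URL, allow_redirects=False, timeout=5)
        elapsed = time.monotonic() - start
        print(f"{attempt}: {resp.status_code} in {elapsed:.2f}s")
    except requests.RequestException as exc:
        print(f"{attempt}: failed ({exc})")
    time.sleep(1)
```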

Blocked due to access forbidden (403)

Meaning: Googlebot received a 403.

Fix: remove accidental blocks (WAF, geo rules, auth leaks) for pages you want indexed; otherwise remove from sitemap and keep access consistent.
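
WAF and bot-protection rules often return 403 only for crawler user-agents, so the page looks fine in a normal browser. The sketch below compares a browser-like and a Googlebot-like User-Agent string against a placeholder URL; note it only catches UA-based rules, since real Googlebot is also identified by IP.

```python
import requests

URL = "https://example.com/pricing/"  # placeholder

USER_AGENTS = {
    "browser-like": "Mozilla/5.0 (Windows NT 10.0; Win64; x64)",
    "googlebot-like": "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)",
}

for label, ua in USER_AGENTS.items():
    status = requests.get(URL, headers={"User-Agent": ua}, timeout=10).status_code
    print(f"{label}: {status}")
# Different status codes usually point at a WAF or bot rule, not the application itself.
```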

Crawl anomaly

Meaning: Unstable crawl behavior (often timeouts, intermittent 5xx, redirect loops/chains).

Fix: stabilize responses and simplify canonicalization.

What to index (and what not to fight for)

Indexing is not a moral goal.

Pick 5-10 "core URLs" (pillars, hubs, best essays), request indexing for those, and build signals (internal linking + coherence).

If a URL exists for navigation, experiments, or legacy access, it may be better as:

  • noindex, follow (utility pages you still want crawled)
  • 410 (obsolete legacy URLs)

If you pivoted recently, this helps clean up crawl debt:

FAQ

Should I request indexing for every page?

No. Request indexing for a small set of core pages. If the site is noisy or hierarchy is weak, mass requests rarely change outcomes.

Can "not indexed" mean a penalty?

Usually no. It is most often prioritization + duplication + crawl debt. Fix the obvious technical gates, then work on coherence and internal linking.
