Key takeaways
- A practical map of Google Search Console indexing statuses (Coverage): what each status means, the most common root causes (canonicals, duplicates, robots, redirects, soft 404s), and the fastest way to validate fixes
Google Search Console indexing statuses are not "errors". They are classification labels.
The fastest way to make them useful is to treat each status as a question:
- Is this a hard gate? (noindex, robots, 4xx/5xx, redirect loops)
- Is this duplication? (canonicals, parameters, alternate URLs)
- Is this prioritization? (Google can fetch it, but does not want to store it yet)
If you want the mental model for the prioritization side first, start with the two "currently not indexed" statuses below.
The short version (decision tree)
- Does the URL return 200 for Google? If not, fix status/redirects first.
- Any noindex / robots.txt block? If yes, the status is expected.
- Canonical points elsewhere? If yes, you are telling Google to index the other URL.
- Does the page look like "nothing" (soft 404)? If yes, it will be dropped.
- If none of the above, it is often priority (site-level trust, internal linking, crawl debt). A quick script covering the status, noindex, and canonical checks follows this list.
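A minimal sketch in Python, assuming the third-party requests library and a placeholder URL. It fetches from your network, not Google's, so treat the output as a first pass rather than proof of what Googlebot sees:

```python
# Rough first pass for one URL: status code, noindex (header or meta), and
# declared canonical. Assumes `pip install requests`; the URL is a placeholder.
import re
import requests

def quick_check(url: str) -> dict:
    resp = requests.get(url, timeout=15, allow_redirects=False)
    html = resp.text if "html" in resp.headers.get("Content-Type", "") else ""

    # Naive regexes: they assume the usual attribute order in the tags.
    noindex = (
        "noindex" in resp.headers.get("X-Robots-Tag", "").lower()
        or bool(re.search(r'<meta[^>]+name=["\']robots["\'][^>]*noindex', html, re.I))
    )
    canonical = re.search(r'rel=["\']canonical["\'][^>]*href=["\']([^"\']+)', html, re.I)

    return {
        "status": resp.status_code,                    # anything but 200 is a hard gate
        "redirects_to": resp.headers.get("Location"),  # present on 3xx responses
        "noindex": noindex,                            # hard gate if True
        "canonical": canonical.group(1) if canonical else None,  # should match the URL itself
    }

print(quick_check("https://example.com/some-page/"))
```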
Status -> meaning -> what to do
Crawled - currently not indexed
Meaning: Google fetched the page, but chose not to store it (yet).
Most common causes: priority + weak internal hierarchy, duplication signals, too much crawl noise.
Fix: make the site easier to understand (hubs/pillars), reduce URL noise, and promote the page from core entry points.
Discovered - currently not indexed
Meaning: Google knows the URL exists (sitemap/internal links), but hasn't fetched it enough to store it.
Most common causes: crawl budget allocation, weak internal linking, lots of low-value URLs competing.
Fix: improve internal linking hierarchy and reduce crawl debt before requesting indexing for everything.
Duplicate without user-selected canonical
Meaning: Google considers this URL a duplicate, and it picked a canonical that you didn't specify.
Most common causes: multiple URL variants, inconsistent canonical tags, parameter URLs, thin archives.
Fix: make the preferred URL explicit (canonical + redirects), then confirm Google's selected canonical matches.
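One quick way to confirm the signal is consistent: fetch the usual variants of the page and check that each one either redirects to the preferred URL or declares it as canonical. A sketch assuming Python and the requests library; the variant list is a placeholder to adapt to your own host rules and parameters:

```python
# Fetch the usual variants of one page; each should either 301 to the preferred
# URL or declare it as canonical. Assumes `pip install requests`; edit the list.
import re
import requests

VARIANTS = [
    "https://example.com/page",
    "https://example.com/page/",
    "https://www.example.com/page/",
    "http://example.com/page/",
    "https://example.com/page/?utm_source=newsletter",
]

for url in VARIANTS:
    resp = requests.get(url, timeout=15)  # follows redirects to the final URL
    m = re.search(r'rel=["\']canonical["\'][^>]*href=["\']([^"\']+)', resp.text, re.I)
    canonical = m.group(1) if m else "-"
    print(f"{url}\n  final: {resp.url}  status: {resp.status_code}  canonical: {canonical}")
```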
Duplicate: submitted URL not selected as canonical
Meaning: You submitted a URL, but Google decided a different URL is the canonical.
Fix: treat it as a canonicalization and duplication problem. Choose one preferred URL per intent and make it boring (200, stable, not redirecting).
Alternate page with proper canonical tag
Meaning: This URL is a known alternate of the canonical URL you declared.
Most common causes: intentional alternates (UTM, pagination, print view) or accidental duplicates.
Fix: keep it (if intentional) or consolidate (if accidental). The "fix" is often not indexing this URL.
Soft 404
Meaning: Google fetched a page that returned 200 but looks like "not found" or has no real content.
Most common causes: template placeholders, missing data in dynamic routes, thin pages with no unique value.
Fix: return a real 404/410 for missing content, or add real content + clear purpose.
- Deep dive: Soft 404: what it means and how to fix it
- Related: Submitted URL seems to be a soft 404
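For the fix above, the key change is at the application layer: when the underlying record does not exist, respond with a real 404/410 instead of rendering an empty template with a 200. A minimal sketch, assuming a Flask app with a hypothetical product route and lookup:

```python
# Sketch (Flask; the route and lookup are hypothetical): return a real 404
# when the record is missing instead of a 200 with an empty template.
from flask import Flask, abort

app = Flask(__name__)

def find_product(slug):
    # Placeholder lookup: return None when the slug has no backing data.
    return None

@app.route("/products/<slug>")
def product(slug):
    item = find_product(slug)
    if item is None:
        abort(404)  # use 410 instead if the item is permanently gone
    return f"<h1>{item['name']}</h1>"
```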
Not found (404)
Meaning: Googlebot got a 404.
Fix: classify URLs (keep/move/remove), remove dead URLs from sitemap, 301 only to true successors, 410 for intentional removals.
- Deep dive: Not found (404): what to do
Submitted URL not found (404)
Meaning: You submitted the URL (usually in sitemap), but Googlebot gets a 404.
Fix: fix the sitemap (remove dead URLs), then decide: restore (200), move (301 to successor), or remove (410).
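A small audit script makes the sitemap cleanup concrete: pull every <loc> from the sitemap and list the ones that no longer return 200. Sketch assuming Python, the requests library, and a plain urlset sitemap (not a sitemap index); the sitemap URL is a placeholder:

```python
# List sitemap URLs that no longer return 200, as candidates to restore,
# 301 to a successor, or retire with 410. Assumes `pip install requests`
# and a plain <urlset> sitemap (not a sitemap index).
import xml.etree.ElementTree as ET
import requests

SITEMAP = "https://example.com/sitemap.xml"   # placeholder
NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}

root = ET.fromstring(requests.get(SITEMAP, timeout=15).content)
for loc in root.findall(".//sm:loc", NS):
    url = loc.text.strip()
    # Some servers mishandle HEAD; switch to requests.get() if results look off.
    status = requests.head(url, timeout=15, allow_redirects=False).status_code
    if status != 200:
        print(status, url)
```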
Blocked due to other 4xx
Meaning: Googlebot received a 4xx (not a plain 404), often 410/429/custom middleware/WAF.
Fix: identify the exact code, decide intent (indexable or not), then make responses consistent and fix sitemap/linking.
- Deep dive: Blocked due to other 4xx: fix checklist
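To identify the exact code, fetch the URL with and without a Googlebot-style User-Agent, since WAF and middleware rules often key on it. A sketch assuming Python and the requests library; note it runs from your network, not Google's IP ranges, so a block that triggers only for Google's IPs will not reproduce here:

```python
# See the exact code your stack returns, with a default and a Googlebot-style
# User-Agent. Assumes `pip install requests`; the URL is a placeholder.
import requests

URL = "https://example.com/blocked-page/"
GOOGLEBOT_UA = "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"

for label, headers in [("default UA  ", {}), ("Googlebot UA", {"User-Agent": GOOGLEBOT_UA})]:
    resp = requests.get(URL, headers=headers, timeout=15, allow_redirects=False)
    print(f"{label}: {resp.status_code} {resp.reason}")
```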
Server error (5xx)
Meaning: Googlebot got a server error or timed out.
Fix: stabilize origin, reduce timeouts, remove redirect chains, and keep canonicals fast and boring.
- Deep dive: Server error (5xx): debug checklist
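Intermittent 5xx and timeouts rarely show up in a single manual check, so it helps to fetch the same URL repeatedly and log status plus latency. A rough sketch assuming Python and the requests library; the URL, attempt count, and pacing are placeholders:

```python
# Fetch the same URL a few times and log status + latency to surface
# intermittent 5xx and timeouts. Assumes `pip install requests`.
import time
import requests

URL = "https://example.com/slow-page/"

for attempt in range(1, 11):
    try:
        resp = requests.get(URL, timeout=10)
        print(f"try {attempt}: {resp.status_code} in {resp.elapsed.total_seconds():.1f}s")
    except requests.RequestException as exc:
        print(f"try {attempt}: failed ({type(exc).__name__})")
    time.sleep(2)  # small gap so this measures steady behavior, not a burst
```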
Submitted URL has crawl issue
Meaning: A catch-all status; the submitted URL could not be crawled successfully.
Fix: confirm status code, check robots/noindex/canonicals/redirects/rendering.
- Deep dive: Submitted URL has crawl issue: debug flow
Indexed, though blocked by robots.txt
Meaning: Google indexed the page at some point, but robots.txt currently blocks crawling it.
Most common causes: robots rules changed after indexing; blocking CSS/JS; overly broad disallow patterns.
Fix: decide whether you want it indexed. If yes, remove the block; if no, add noindex and allow crawling so Google can see it.
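If the decision is "do not index", remember the order of operations: Google must be able to crawl the page to see the noindex, so the robots.txt block has to go first. One way to serve the directive is the X-Robots-Tag response header; a minimal sketch assuming a Flask app with a hypothetical internal-search route:

```python
# Sketch (Flask; hypothetical route): keep the page crawlable but excluded
# from the index by sending noindex as a response header. This only takes
# effect once robots.txt stops blocking the URL.
from flask import Flask, make_response

app = Flask(__name__)

@app.route("/internal-search")
def internal_search():
    resp = make_response("<h1>Search results</h1>")
    resp.headers["X-Robots-Tag"] = "noindex, follow"
    return resp
```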
Redirect error
Meaning: Googlebot hit a redirect chain/loop, timeout, or a target that fails.
Most common causes: redirect loops, mixed host rules (www/apex), middleware logic, inconsistent trailing slash.
Fix: make redirects deterministic: 301 to one canonical destination, no chains.
- Deep dive: GSC redirect error: the fastest fix checklist
- Related: Redirect loop: how to fix it
- Related: Page with redirect (GSC): what it means
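A small tracer that follows redirects one hop at a time makes chains and loops obvious. Sketch assuming Python and the requests library; the start URL is a placeholder:

```python
# Follow redirects one hop at a time and flag chains and loops.
# Assumes `pip install requests`.
from urllib.parse import urljoin
import requests

def trace(url: str, max_hops: int = 10) -> None:
    seen = set()
    for _ in range(max_hops):
        if url in seen:
            print("LOOP back to:", url)
            return
        seen.add(url)
        resp = requests.get(url, timeout=15, allow_redirects=False)
        print(resp.status_code, url)
        if resp.status_code not in (301, 302, 303, 307, 308):
            return  # final destination; ideally a 200 reached in one hop
        url = urljoin(url, resp.headers["Location"])
    print(f"gave up after {max_hops} hops")

trace("https://example.com/old-page")
```

More than one 3xx line before the final status means a chain to collapse; a repeated URL means a loop to break.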
Page with redirect
Meaning: The inspected/submitted URL redirects, so Google treats that URL as non-indexable.
Fix: update internal links and sitemaps to point directly to the final canonical URL. Collapse chains to one hop.
Redirect loop
Meaning: The redirect chain cycles back on itself, so Googlebot never reaches a final destination.
Fix: pick one canonical pattern and collapse rules to one deterministic redirect.
- Deep dive: Redirect loop: how to find it and fix it
Submitted URL marked 'noindex'
Meaning: You submitted a URL (usually via sitemap) that contains a noindex directive.
Fix: decide whether it should be indexed. If yes, remove noindex (meta + headers). If no, remove it from the sitemap.
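Noindex can be served in two places, and it is easy to remove one and miss the other. A quick check of both, sketched in Python with the requests library; the URL is a placeholder:

```python
# Check both places a noindex can hide: the X-Robots-Tag header and the
# robots meta tag. Assumes `pip install requests`.
import re
import requests

url = "https://example.com/should-be-indexed/"
resp = requests.get(url, timeout=15)

header = resp.headers.get("X-Robots-Tag", "")
meta = re.findall(r'<meta[^>]+name=["\']robots["\'][^>]*content=["\']([^"\']+)', resp.text, re.I)

print("X-Robots-Tag:", header or "-")
print("meta robots :", meta or "-")
if "noindex" in header.lower() or any("noindex" in m.lower() for m in meta):
    print("-> a noindex is still being served")
```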
Submitted URL blocked by robots.txt
Meaning: You submitted a URL but robots.txt blocks crawling.
Fix: choose a strategy: allow crawl + index (if it should rank) or allow crawl + noindex (if it should be dropped).
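You can check what robots.txt currently allows for a URL with the standard library alone; a sketch with placeholder host and path. Python's parser does not implement every extension Google supports (wildcard patterns, for example), so confirm edge cases in Search Console's robots.txt report:

```python
# Check whether robots.txt currently allows Googlebot to fetch a URL,
# using only the standard library.
from urllib.robotparser import RobotFileParser

rp = RobotFileParser("https://example.com/robots.txt")
rp.read()

url = "https://example.com/landing-page/"
print("Googlebot allowed:", rp.can_fetch("Googlebot", url))
```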
robots.txt unreachable
Meaning: Googlebot cannot fetch your robots.txt reliably.
Fix: make /robots.txt return 200 consistently (no timeouts, no WAF blocks, no redirect chains).
- Deep dive: robots.txt unreachable: fixes and validation
Blocked due to access forbidden (403)
Meaning: Googlebot received a 403.
Fix: remove accidental blocks (WAF, geo rules, auth leaks) for pages you want indexed; otherwise remove from sitemap and keep access consistent.
Crawl anomaly
Meaning: Unstable crawl behavior (often timeouts, intermittent 5xx, redirect loops/chains).
Fix: stabilize responses and simplify canonicalization.
- Deep dive: Crawl anomaly: debug flow
What to index (and what not to fight for)
Indexing is not a moral goal.
Pick 5-10 "core URLs" (pillars, hubs, best essays), request indexing for those, and build signals (internal linking + coherence).
If a URL exists for navigation, experiments, or legacy access, it may be better as:
- noindex, follow (utility pages you still want crawled)
- 410 (obsolete legacy URLs)
If you pivoted recently, cleaning up that legacy crawl debt (redirects, 410s, sitemap pruning) is usually the first step.
FAQ
Should I request indexing for every page?
No. Request indexing for a small set of core pages. If the site is noisy or hierarchy is weak, mass requests rarely change outcomes.
Can "not indexed" mean a penalty?
Usually no. It is most often prioritization + duplication + crawl debt. Fix the obvious technical gates, then work on coherence and internal linking.
Next in GSC statuses
Browse the cluster: GSC indexing statuses.
- Page with redirect (Google Search Console): What it means and how to fix it
- Redirect loop: How to find it and fix it (SEO + GSC)
- GSC redirect error: The fastest fix checklist (chains, loops, and canonical URLs)
- Submitted URL marked 'noindex': The fastest fix checklist (GSC)
- Submitted URL blocked by robots.txt: What it means and what to do (GSC)
- robots.txt unreachable: Why it happens and how to fix it
Next in SEO & Search
Up next:
- Why Search Console can show impressions but no clicks: positions, intent mismatch, weak snippets, and how to fix CTR without guesswork.