
Crawled but not indexed reasons (2026): 9 causes and what to do first

Key takeaways

  • If Google crawled your page but did not index it, the bottleneck is rarely “one on-page fix”
  • This page lists the most common causes (technical gates + prioritization), how to tell them apart fast, and the few actions that reliably change the outcome

If Search Console says “Crawled — currently not indexed”, you’re being told one thing: Google can fetch the URL, but it isn’t committing storage.

People interpret that as a page-level failure.

Most of the time it’s not. It’s a site-level prioritization decision with a small number of repeatable causes.

This page is a reasons list (search-intent entry point). If you want a step-by-step checklist, use:

If you want the underlying model (what actually changes the decision), use:

The two buckets (this decides everything)

Every “crawled but not indexed” case ends up in one of two buckets:

  1. Hard gates: Google can crawl you, but something makes indexing unstable or ambiguous (robots, canonicals, duplicates, soft-404 patterns, rendering).
  2. Priority: indexing is possible, but the site/page isn’t important enough yet relative to everything else Google could store.

Your job is not to guess. Your job is to classify.

9 common reasons (ranked by how often they happen)

1) Canonical points somewhere else (or is effectively ignored)

If your inspected URL canonicals to a different URL, Google often refuses to index the inspected one.

High-signal failure modes:

  • canonical points to a URL that redirects
  • canonical points to a URL that is 404/410
  • canonical target is inconsistent (changes across builds/hosts)
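A quick way to test the first two failure modes: fetch the page, read its canonical, then fetch the canonical target without following redirects. A minimal sketch (Node 18+, built-in fetch, a naive regex instead of a real HTML parser; the URL is a placeholder):

```ts
// check-canonical.ts: does the canonical target resolve cleanly? (sketch, Node 18+)
// A naive regex stands in for a real HTML parser; the URL below is a placeholder.

async function checkCanonical(pageUrl: string): Promise<void> {
  const pageRes = await fetch(pageUrl);
  const html = await pageRes.text();

  // naive extraction of <link rel="canonical" href="...">
  const match = html.match(/<link[^>]+rel=["']canonical["'][^>]+href=["']([^"']+)["']/i);
  if (!match) {
    console.log("no canonical tag found");
    return;
  }

  const canonical = new URL(match[1], pageUrl).toString();

  // fetch the canonical target WITHOUT following redirects:
  // a canonical that redirects or 404s is a high-signal failure mode
  const target = await fetch(canonical, { redirect: "manual" });
  if (target.status >= 300 && target.status < 400) {
    console.log(`canonical target redirects (${target.status}) to ${target.headers.get("location")}`);
  } else if (target.status !== 200) {
    console.log(`canonical target is not 200 (got ${target.status})`);
  } else if (canonical !== pageUrl) {
    console.log(`page canonicals to a different URL: ${canonical}`);
  } else {
    console.log("canonical is self-referencing and resolves 200");
  }
}

checkCanonical("https://example.com/blog/some-post"); // placeholder
```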


2) Duplication via parameters / host / slash variants

Even if the content is “unique”, Google may still treat your URL as a variant of something it already saw.

Typical footprints:

  • www vs apex
  • ?m=1 and other low-value query params
  • trailing slash variants
  • legacy feed endpoints

The fix is boring and decisive: one stable canonical pattern and 301 the rest.
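If the site runs on Next.js, a minimal middleware sketch of that pattern might look like this (assumptions: apex host, no trailing slash, ?m= treated as low-value; host and slash rules are often better handled in next.config or at the DNS/CDN layer):

```ts
// middleware.ts: one canonical URL pattern, 301 everything else (sketch).
// Assumptions: apex host, no trailing slash, ?m= is a low-value param.
import { NextRequest, NextResponse } from "next/server";

const STRIP_PARAMS = ["m"]; // query params that should never create a separate indexable URL

export function middleware(req: NextRequest) {
  const url = req.nextUrl.clone();
  let changed = false;

  // www -> apex
  if (url.hostname.startsWith("www.")) {
    url.hostname = url.hostname.slice(4);
    changed = true;
  }

  // trailing slash -> no trailing slash (keep "/" itself)
  if (url.pathname.length > 1 && url.pathname.endsWith("/")) {
    url.pathname = url.pathname.slice(0, -1);
    changed = true;
  }

  // drop low-value params
  for (const p of STRIP_PARAMS) {
    if (url.searchParams.has(p)) {
      url.searchParams.delete(p);
      changed = true;
    }
  }

  // one permanent hop to the canonical variant
  return changed ? NextResponse.redirect(url, 301) : NextResponse.next();
}
```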

3) Soft-404 patterns (200 OK, but “nothing”)

Google can crawl a 200 page and still classify it as “not a real document”.

Examples:

  • thin templates
  • empty states
  • “not found” text without a real 404
  • placeholder pages generated by a route
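A rough way to surface candidates: fetch your URLs and flag 200 responses whose text is near-empty or reads like a “not found” message. A minimal sketch; the 500-character threshold and the phrase list are assumptions, not Google’s classifier:

```ts
// soft404-scan.ts: flag 200 responses that look like "nothing" (sketch, Node 18+).
// The threshold and phrase list are assumptions, not Google's rules.

const NOT_FOUND_PHRASES = ["not found", "no results", "nothing here"];

async function looksLikeSoft404(url: string): Promise<boolean> {
  const res = await fetch(url);
  if (res.status !== 200) return false; // a real 404/410 is not a soft-404

  const text = (await res.text())
    .replace(/<script[\s\S]*?<\/script>/gi, " ")
    .replace(/<[^>]+>/g, " ")
    .replace(/\s+/g, " ")
    .trim();

  const tooThin = text.length < 500; // assumption
  const saysNotFound = NOT_FOUND_PHRASES.some(p => text.toLowerCase().includes(p));
  return tooThin || saysNotFound;
}

// usage: feed it sitemap URLs and review anything flagged by hand
looksLikeSoft404("https://example.com/tags/empty-tag").then(flagged =>
  console.log(flagged ? "soft-404 candidate" : "looks like a real document")
);
```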


4) Internal linking doesn’t express that the page matters

Discovery is not priority.

If a URL is only reachable from the /blog list (or deep pagination), Google may crawl it but never promote it into “worth storing”.

Fix patterns:

  • link the page from /topics/seo (hub)
  • link it from a pillar or guide
  • add 2–4 contextual links from neighboring pages in the cluster
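To turn “matters” into a number, count how many internal pages link to each URL. A minimal sketch over a folder of built/exported HTML; the ./out directory and the fewer-than-3 threshold are assumptions:

```ts
// inlink-count.ts: how many internal pages link to each URL? (sketch, Node 18+)
// Assumes a folder of static HTML (e.g. a Next.js export in ./out); adjust to your build.
import { readdirSync, readFileSync } from "node:fs";
import { join } from "node:path";

const BUILD_DIR = "./out"; // assumption
const counts = new Map<string, number>();

function walk(dir: string): string[] {
  return readdirSync(dir, { withFileTypes: true }).flatMap(e =>
    e.isDirectory()
      ? walk(join(dir, e.name))
      : e.name.endsWith(".html")
        ? [join(dir, e.name)]
        : []
  );
}

for (const file of walk(BUILD_DIR)) {
  const html = readFileSync(file, "utf8");
  // count site-relative hrefs (naive regex, ignores query strings and fragments)
  for (const m of html.matchAll(/href=["'](\/[^"'#?]*)["']/g)) {
    const target = m[1].replace(/\/$/, "") || "/";
    counts.set(target, (counts.get(target) ?? 0) + 1);
  }
}

// pages with very few inlinks are crawlable, but the graph never says they matter
for (const [url, n] of [...counts].sort((a, b) => a[1] - b[1])) {
  if (n < 3) console.log(`${n}\t${url}`); // threshold is an assumption
}
```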


5) The page fails the “incremental value” test

Indexing is not just quality. It’s also distinctiveness.

If your page reads like “another generic summary”, Google may crawl it but not store it.

The fix is usually not “more words”, but sharper constraints:

  • one intent
  • one promise
  • a specific angle (for new sites, for GSC statuses, for 2026 SERPs, etc.)

6) The site has too much crawl debt (noise)

If the site still contains lots of low-value legacy URLs, Google becomes conservative overall.

Symptoms:

  • many 404/410/redirect chains
  • old topic slugs still being crawled
  • archive-like URLs being discovered

This is why cleanup matters: reduce the URL surface area before you publish more.
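To size the debt, take the URLs Google is actually hitting (logs or a GSC export) and bucket their responses. A minimal sketch; assumes a plain urls.txt with one URL per line:

```ts
// crawl-debt.ts: how much of the crawled URL surface is noise? (sketch, Node 18+)
// Assumes urls.txt holds one URL per line (from server logs or a GSC export).
import { readFileSync } from "node:fs";

async function main() {
  const urls = readFileSync("urls.txt", "utf8")
    .split("\n").map(s => s.trim()).filter(Boolean);

  const buckets = new Map<string, number>();
  for (const url of urls) {
    const res = await fetch(url, { redirect: "manual" });
    const bucket =
      res.status === 200 ? "200 ok" :
      res.status >= 300 && res.status < 400 ? "redirect" :
      res.status === 404 || res.status === 410 ? "gone" :
      `other (${res.status})`;
    buckets.set(bucket, (buckets.get(bucket) ?? 0) + 1);
  }

  // a large redirect/gone share means crawl attention is spent on debt, not on new pages
  for (const [bucket, n] of buckets) console.log(`${bucket}: ${n}`);
}

main();
```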


7) New site / topic pivot probation (sampling under uncertainty)

If your domain is new or recently changed topic, Google often samples instead of committing.

In that phase:

  • indexing is conservative
  • refresh is slower
  • duplicates are judged more harshly


8) Rendering is expensive / unstable

This isn’t the most common reason, but it’s real for Next.js sites that rely on heavy client-side rendering.

If the fetched HTML is missing key content, Google may defer indexing.
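A cheap sanity check: fetch the raw HTML without executing JavaScript and confirm the content you care about is already there. A minimal sketch; the URL and key phrase are placeholders:

```ts
// render-check.ts: is the key content already in the server HTML? (sketch, Node 18+)
// Plain fetch executes no JavaScript, so whatever is missing here depends on rendering.

async function contentInRawHtml(url: string, keyPhrase: string): Promise<boolean> {
  const res = await fetch(url);
  const html = await res.text();
  return html.toLowerCase().includes(keyPhrase.toLowerCase());
}

// placeholders: use a sentence that only exists in the page's main content
contentInRawHtml("https://example.com/blog/some-post", "a sentence from the article body")
  .then(ok => console.log(ok ? "content is in the raw HTML" : "content likely injected client-side"));
```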


9) You’re fighting for a URL that shouldn’t be indexed

Sometimes the correct “fix” is: don’t fight for this URL.

Examples:

  • thin paginated archives
  • legacy category/tag pages
  • experiments and utility routes

Index the pages that define your topic. Let everything else be cheap to discard.
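If a route belongs in the “cheap to discard” bucket, say so explicitly. A minimal Next.js App Router sketch (assumed setup; the route is illustrative) that marks a thin archive as noindex while keeping its links followable:

```ts
// app/tags/[tag]/page.tsx (excerpt): explicitly noindex a low-value archive route (sketch).
// Assumes the Next.js App Router; the route is illustrative, not a universal rule.
import type { Metadata } from "next";

export const metadata: Metadata = {
  robots: {
    index: false, // don't fight for this URL in the index
    follow: true, // but let its links still pass discovery
  },
};
```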

What to do first (a minimal sequence)

  1. Confirm it’s a real document (not soft-404, not placeholder, 200 OK).
  2. Confirm canonical clarity (one canonical, stable, 200 OK).
  3. Confirm duplication is controlled (host/params/slash variants).
  4. If all of that is clean: treat it as priority.
  5. Then improve the only two priority signals you control:
    • internal linking (role in the graph)
    • coherence (cluster density around the intent)
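Steps 1–3 can be scripted as a single pass per URL, so you classify instead of guessing. A minimal sketch that reuses the checks above; the thin-content threshold and the variant probe are rough assumptions, and the output is a triage hint, not a verdict:

```ts
// triage.ts: classify a URL before touching content (sketch, Node 18+).
// Output is a rough bucket: hard gate vs. probably priority.

async function triage(url: string): Promise<string> {
  const res = await fetch(url, { redirect: "manual" });
  if (res.status !== 200) return `hard gate: page returns ${res.status}`;

  const html = await res.text();
  const textLength = html.replace(/<[^>]+>/g, " ").trim().length;
  if (textLength < 500) return "hard gate: thin / soft-404 candidate"; // threshold is an assumption

  const canon = html.match(/<link[^>]+rel=["']canonical["'][^>]+href=["']([^"']+)["']/i)?.[1];
  if (canon && new URL(canon, url).toString() !== url) {
    return `hard gate: canonical points elsewhere (${canon})`;
  }

  // naive duplication probe: does the trailing-slash sibling also serve 200?
  const variant = url.endsWith("/") ? url.slice(0, -1) : url + "/";
  const v = await fetch(variant, { redirect: "manual" });
  if (v.status === 200) return "hard gate: slash variant also serves 200 (duplication)";

  return "probably priority: improve internal linking and cluster coherence";
}

triage("https://example.com/blog/some-post").then(console.log); // placeholder
```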

What not to do

  • Don’t request indexing for 50 URLs. Pick 5–10 core pages.
  • Don’t “rewrite the intro” as your first move.
  • Don’t redirect every old URL to /blog (soft‑404 smell). Use 410 for true removals.
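For the last point: a true removal should answer 410, not a blanket redirect. A minimal Next.js middleware sketch (assumed setup; the path list is a placeholder):

```ts
// middleware.ts (excerpt): answer 410 Gone for truly removed URLs (sketch).
// Assumes Next.js middleware; the path list is a placeholder.
import { NextRequest, NextResponse } from "next/server";

const GONE_PATHS = new Set(["/old-topic/some-post", "/2019/legacy-page"]);

export function middleware(req: NextRequest) {
  if (GONE_PATHS.has(req.nextUrl.pathname)) {
    // 410 says the removal is intentional; redirecting these to /blog reads as a soft-404
    return new NextResponse(null, { status: 410 });
  }
  return NextResponse.next();
}
```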


Next step

If the page is indexed but still gets no impressions or traffic, you’re in the retrieval/visibility layer:

