Entity recognition before indexing: why Google can understand who you are before it trusts all your pages

Apr 27, 2026

Google can sometimes understand a person, brand, or alias before it indexes the whole site. This article explains the difference between entity recognition, URL indexing, and visibility, and how to use branded search, evidence pages, schema, and external profiles to move from recognition to a stable Knowledge Panel.

Most people treat indexing as the first milestone.

That is too simple.

A search system can sometimes understand who an entity is before it decides to index, rank, or distribute every page around that entity.

This is why a site can have only a homepage indexed, but still trigger signs of entity understanding:

a branded result that connects a handle to a person
an AI answer that describes the person correctly
a right-side entity card for a longer query
a social profile ranking first for a short name query
a person page that is crawlable but not yet fully trusted as the primary result

That does not mean the site is finished. It means the system is starting to separate two decisions that publishers often mix together:

Entity recognition: "Who or what is this?"
URL trust: "Which pages should we index, select, and show?"

When you understand that separation, a frustrating indexing problem becomes a useful diagnostic signal.

I am Mikhail Drozdov, also known as Casinokrisa. I study indexing-first visibility models: how search systems crawl, store, interpret, trust, and distribute information. The live identity layer for this site is here:

This article explains the model behind the work.

The short version

If Google understands your entity but indexes only one or two pages, the problem is usually not "Google knows nothing."

It is usually one of these:

the entity is recognizable, but the site is still young or narrow
the homepage is the strongest source, so Google treats it as the safest canonical answer
supporting pages are crawlable, but not yet trusted as separate destinations
the external evidence graph is stronger than the internal page graph
Google has enough confidence to describe you, but not enough confidence to expand the panel

That is a different problem from a technical block.

A technical block says:

"I cannot process this."

An entity trust bottleneck says:

"I can see this, but I am not ready to distribute it broadly."

The fix is not mass publishing. The fix is a cleaner graph.

Recognition is not the same as indexing

Indexing is a URL-level decision.

Entity recognition is a graph-level decision.

A URL can be ignored while the entity behind it becomes clearer. A person can be understood through:

LinkedIn
Instagram
X
Google Scholar
ORCID
Amazon
SSRN
press mentions
a homepage
a person page

The system does not need every page indexed before it starts connecting those signals.

That is why branded search is often the earliest visible indicator of entity formation.

If a query like who is casinokrisa produces an answer that connects the alias to a person, the system has already built a basic bridge:

Casinokrisa -> Mikhail Drozdov -> role -> website -> external profiles

If a query like Mikhail Drozdov casinokrisa triggers a small right-side entity block, the bridge is stronger:

name + alias + image + date/fact + profiles

The site may still have a coverage problem. But the entity model is no longer blank.

The three layers: identity, evidence, distribution

I use a simple model for diagnosing this stage:

Identity layer
Evidence layer
Distribution layer

Each layer has a different failure mode.

1. Identity layer

The identity layer answers:

What is the canonical name?
What aliases point to the same person?
What is the primary role?
What is the official site?
Which image represents the person?
Which external profiles are true identity matches?

For a person entity, this layer should be boring.

Use one name, one role line, one canonical URL, and one primary image. Variation feels natural to humans, but every variant is another candidate the system has to reconcile, which raises the cost of resolution.

For this site, the intended identity line is:

Mikhail Drozdov (Casinokrisa) is an AI Search & Indexing Systems Researcher and Founder of Casinokrisa.

That line should appear consistently on the homepage, person page, LinkedIn, Instagram, X, and strong external references.
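Consistency of this kind is easy to check by hand, and just as easy to script. The sketch below is a minimal illustration, not a tool I am claiming exists: it assumes you have pasted the bio text from each surface into a dictionary yourself, and the LinkedIn snippet is a made-up example of drift.

```python
import re

IDENTITY_LINE = (
    "Mikhail Drozdov (Casinokrisa) is an AI Search & Indexing Systems "
    "Researcher and Founder of Casinokrisa."
)


def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return re.sub(r"\s+", " ", text).strip().lower()


def surfaces_missing_line(surfaces: dict[str, str], line: str = IDENTITY_LINE) -> list[str]:
    """Return the names of surfaces whose text does not contain the identity line."""
    needle = normalize(line)
    return [name for name, text in surfaces.items() if needle not in normalize(text)]


# Hypothetical bio snippets, copied in by hand for illustration only.
surfaces = {
    "homepage": "Welcome. " + IDENTITY_LINE,
    "linkedin": "Mikhail Drozdov, SEO consultant and blogger.",
}

print(surfaces_missing_line(surfaces))  # ['linkedin']
```

An exact-match check is deliberately strict. If a platform forces a shorter bio, the point is to notice the difference and decide on it, not to let it drift unnoticed.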

2. Evidence layer

The evidence layer answers:

What independent sources mention the person?
What formal profiles exist?
Are there publications, papers, books, or research objects?
Does the external web repeat the same name and role?
Do third-party pages connect the person to the same site or organization?

This is where a press page becomes useful.

A press page is not a vanity page. It is an evidence router.

It should separate:

official profiles
external references
quotes and mentions
academic or research listings
book or publication records

The page should not pretend every mention has equal weight. A topical quote in an SEO publication is stronger for an SEO entity than a generic business quote. Both can help, but they should not be treated the same.
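One way to keep that separation honest is to store the evidence as data before it becomes a page. The sketch below is an assumption about how such a list could be organized, not a description of how this site is built. The category names come from the list above; the `topical` flag and the sample entries are hypothetical.

```python
from dataclasses import dataclass

# Categories from the list above, in the order they would appear on the page.
CATEGORIES = [
    "official profiles",
    "external references",
    "quotes and mentions",
    "academic or research listings",
    "book or publication records",
]


@dataclass
class Evidence:
    title: str
    category: str
    topical: bool  # True if the source covers the entity's own field


def route(items: list[Evidence]) -> dict[str, list[Evidence]]:
    """Group evidence by category, listing topical sources first within each group."""
    grouped: dict[str, list[Evidence]] = {category: [] for category in CATEGORIES}
    for item in items:
        grouped[item.category].append(item)
    for entries in grouped.values():
        # False sorts before True, so topical entries come first; the sort is stable.
        entries.sort(key=lambda entry: not entry.topical)
    return grouped


# Hypothetical entries for illustration only.
items = [
    Evidence("Quote in a generic business roundup", "quotes and mentions", topical=False),
    Evidence("Quote in an SEO publication", "quotes and mentions", topical=True),
]

grouped = route(items)
print([entry.title for entry in grouped["quotes and mentions"]])
```

A single boolean is the crudest possible weighting. It is enough to stop the page from presenting every mention as equal.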

3. Distribution layer

The distribution layer answers:

Which pages get indexed?
Which pages get selected for snippets or AI answers?
Which sources appear in branded search?
Which image is promoted?
Which facts make it into the panel?

Distribution is where people get nervous because it is visible.

But distribution is downstream. If the identity and evidence layers are unstable, the distribution layer will keep changing.

That is why a small panel can appear, disappear, and reappear. The system is not necessarily confused. It may be recalculating confidence as new signals arrive.

Why only the homepage may index first

For a narrow or new expert site, the homepage is often the safest URL.

It has:

the most internal links
the clearest domain-level signal
the strongest branded relevance
the fewest duplicate intent conflicts
the highest chance of matching navigational queries

If the homepage is indexed and the person page is only crawled, that does not automatically mean the person page is bad.

It can mean the homepage is currently doing the job of:

site root
entity summary
brand page
person summary
topic doorway

That is heavy, but it is common.

The solution is not to force 50 pages into the index. The solution is to make the homepage a clean router that points to the few pages that matter:

person page
press page
research page
book page
one strong topical hub

Then let the supporting URLs earn separate selection over time.

The wrong response: publishing more generic articles

When a site has one indexed page, the instinct is to publish more.

That can make the problem worse.

If the new pages are generic, overlapping, or read like templated AI output, they dilute the graph. The site looks like it is trying to expand surface area before it has proven a stable center.

For an entity-driven site, the better sequence is:

stabilize the homepage
stabilize the person page
build an evidence page
create durable work objects, such as research pages or a book page
publish supporting articles only when they clarify the model

The question is not:

"Can we publish another post?"

The better question is:

"Does this post make the entity easier to understand, verify, or select?"

If the answer is no, do not publish it.

The right response: make the graph legible

There are five moves that usually help.

1. Use the homepage as the current entity anchor

If the homepage is the only reliably indexed URL, do not fight that reality.

Make the homepage say the essential facts cleanly:

name
alias
role
founder relationship
canonical person page
book or research object
evidence page
primary image

This does not mean turning the homepage into a biography. It means making the identity layer unavoidable.

2. Keep the person page as the canonical identity page

The homepage can be the strongest indexed node, but the person page should still be the canonical identity page.

That page should contain:

full name
aliases
role
birth facts if publicly used
official image
official profiles
evidence links
author-of relationships
structured data

The homepage points to it. The person page explains it.
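The structured data on that page can stay small. Here is a minimal sketch of a schema.org Person object, generated in Python so it can be validated before it is pasted into a page. Every URL below is a placeholder on example.com or an obviously fake profile path; they are assumptions for illustration, not the real addresses.

```python
import json

# Placeholder URLs: replace with the real canonical addresses.
SITE = "https://example.com"

person = {
    "@context": "https://schema.org",
    "@type": "Person",
    "@id": f"{SITE}/about#person",       # hypothetical person page URL
    "name": "Mikhail Drozdov",
    "alternateName": "Casinokrisa",
    "jobTitle": "AI Search & Indexing Systems Researcher",
    "url": f"{SITE}/about",
    "image": f"{SITE}/images/portrait.jpg",  # one primary image, reused everywhere
    "sameAs": [
        "https://www.linkedin.com/in/example-profile",  # hypothetical profile paths
        "https://x.com/example_handle",
    ],
}

json_ld = json.dumps(person, indent=2)
print(f'<script type="application/ld+json">\n{json_ld}\n</script>')
```

The `@id` gives other pages a stable reference to the same person, so the homepage and any work pages can point at one node instead of redefining it.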

3. Treat the book as a work entity

A book listing on Amazon is useful, but it is not fully controlled by the site.

An on-site book page helps create a work object:

Person -> authorOf -> Book -> sameAs -> Amazon

This gives the site a cleaner way to explain that the person has authored a specific work.

That is why the Indexing-First Search Systems book page matters.

It is not there to sell harder. It is there to make the work entity explicit.
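As a sketch, the work object could look like the following. Note one assumption: to my knowledge schema.org has no `authorOf` property on Person, so the relationship in the diagram above is expressed here with `author` on the Book, pointing back at the person. The example.com URLs and the Amazon listing path are placeholders, not real addresses.

```python
import json

SITE = "https://example.com"  # placeholder domain

book = {
    "@context": "https://schema.org",
    "@type": "Book",
    "@id": f"{SITE}/book#book",               # hypothetical on-site book page
    "name": "Indexing-First Search Systems",
    "url": f"{SITE}/book",
    # The person is referenced by @id rather than redefined on this page.
    "author": {"@id": f"{SITE}/about#person"},
    # The specific listing page, never the bare amazon.com domain.
    "sameAs": ["https://www.amazon.com/dp/EXAMPLE"],
}

book_json_ld = json.dumps(book, indent=2)
print(book_json_ld)
```

Here `sameAs` is attached to the book, because the Amazon listing is the same work. It is not an identity match for the person.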

4. Separate `sameAs` from evidence

This is one of the most common structured data mistakes.

sameAs should point to profiles or pages that represent the same entity:

LinkedIn profile
X profile
Instagram profile
Google Scholar profile
ORCID profile
Crunchbase profile

It should not point to generic domains like amazon.com or ssrn.com.

Articles, quotes, press mentions, and book listings are better modeled as evidence:

subjectOf
authorOf (in schema.org terms, author declared on the work itself)
mentions
visible links in a press page

This distinction matters because it tells the system which URLs are identity equivalents and which URLs are supporting proof.
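The bare-domain mistake is mechanical enough to catch with a few lines. This is a rough heuristic under a simple assumption: a URL with no path names a site, not a profile. It cannot tell you whether a deeper URL is a true identity match; that still needs a human. The profile URLs are hypothetical.

```python
from urllib.parse import urlparse


def is_bare_domain(url: str) -> bool:
    """True when the URL has no path or query, i.e. it names a site, not a profile."""
    parsed = urlparse(url)
    return parsed.path in ("", "/") and not parsed.query


def split_same_as(urls: list[str]) -> tuple[list[str], list[str]]:
    """Split candidate sameAs URLs into (keep, flagged-as-bare-domain)."""
    keep = [url for url in urls if not is_bare_domain(url)]
    flagged = [url for url in urls if is_bare_domain(url)]
    return keep, flagged


# Hypothetical candidates for illustration only.
candidates = [
    "https://www.linkedin.com/in/example-profile",
    "https://amazon.com",
    "https://ssrn.com/",
]

keep, flagged = split_same_as(candidates)
print(keep)     # ['https://www.linkedin.com/in/example-profile']
print(flagged)  # ['https://amazon.com', 'https://ssrn.com/']
```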

5. Keep the image boring and consistent

A person image is not just a design asset.

It is a candidate for entity resolution.

The best image is not always the most flattering. It is the one that is:

stable
repeated across strong profiles
high resolution
face-forward or clearly recognizable
not competing with many other versions

If Google keeps using a small or awkward image, the answer is usually not to upload ten new photos. The answer is to make one photo the obvious winner.

For the deeper playbook, read Photo authority for Person entities.

How to diagnose the stage you are in

Use queries, not feelings.

Track the same set every week:

Mikhail Drozdov
Mikhail Drozdov casinokrisa
who is casinokrisa
Casinokrisa
Mikhail Drozdov AI Search
Mikhail Drozdov indexing

Then classify what you see.
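The tracking itself can live in a plain CSV. The sketch below only builds an empty weekly sheet for the queries listed above; the column names are my own assumption, and the observations are filled in by hand after looking at the results. It does not query Google.

```python
import csv
import io
from datetime import date

QUERIES = [
    "Mikhail Drozdov",
    "Mikhail Drozdov casinokrisa",
    "who is casinokrisa",
    "Casinokrisa",
    "Mikhail Drozdov AI Search",
    "Mikhail Drozdov indexing",
]

# Hypothetical columns: adjust to whatever you actually record.
FIELDS = ["week", "query", "top_result", "entity_card", "notes"]


def blank_rows(week: date) -> list[dict[str, str]]:
    """One empty observation row per tracked query for the given week."""
    return [
        {"week": week.isoformat(), "query": query, "top_result": "", "entity_card": "", "notes": ""}
        for query in QUERIES
    ]


def to_csv(rows: list[dict[str, str]]) -> str:
    """Serialize rows to CSV text with a header line."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=FIELDS)
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


sheet = to_csv(blank_rows(date(2026, 4, 27)))
print(sheet)
```

Keeping the query set fixed is the whole point. A log that changes its questions every week cannot show movement between stages.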

Stage 1: No entity resolution

Symptoms:

unrelated people dominate
no branded understanding
no consistent image
your site does not appear for your name

Fix:

strengthen identity profiles
use one role line
add a canonical person page
connect official profiles

Stage 2: Alias recognition

Symptoms:

Google connects the handle to the person
a social profile ranks first
the site appears for branded queries
AI answers may describe the entity

Fix:

strengthen homepage and person page
add evidence page
reduce conflicting internal pages
avoid mass content expansion

Stage 3: Partial panel

Symptoms:

right-side card appears for longer queries
image and basic facts show
panel is unstable by query or locale
Google still prefers social profiles for short queries

Fix:

keep signals stable
add stronger independent evidence
improve image consensus
send only core URLs for recrawl

Stage 4: Stable panel

Symptoms:

panel appears for short name or brand queries
image is stable
role is stable
site and profiles are consistently connected

Fix:

stop changing core facts
add new evidence slowly
maintain clean pages
keep external profiles aligned
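The four stages can be reduced to a small decision rule. This is a simplification of the symptom lists above, not a formal model: the four flags are my own condensation, and real observations will be messier than four booleans.

```python
from dataclasses import dataclass


@dataclass
class Observation:
    alias_connected: bool         # a branded query links the handle to the person
    card_on_long_queries: bool    # right-side card appears for longer queries
    panel_on_short_queries: bool  # panel appears for short name or brand queries
    image_stable: bool            # the same image shows across checks


def classify_stage(obs: Observation) -> int:
    """Map a week's observations to stage 1-4, checking the strongest signals first."""
    if obs.panel_on_short_queries and obs.image_stable:
        return 4  # stable panel
    if obs.card_on_long_queries or obs.panel_on_short_queries:
        return 3  # partial panel: present, but not yet stable
    if obs.alias_connected:
        return 2  # alias recognition
    return 1      # no entity resolution


this_week = Observation(
    alias_connected=True,
    card_on_long_queries=True,
    panel_on_short_queries=False,
    image_stable=False,
)
print(f"Stage {classify_stage(this_week)}")  # Stage 3
```

Checking the strongest signal first means a site is never classified below what it has already demonstrated, even if a weaker signal is missing that week.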

What to send for recrawl

Do not request indexing for every support page.

Send the core graph:

homepage
person page
press page
research page
book page

If Google indexes only the homepage first, that is still useful. The homepage can carry the strongest summary while the rest of the graph becomes more trusted.

What this means for AI answers

AI answers are often the first place where early, still-weak entity understanding becomes visible.

If an AI Overview correctly says:

"Casinokrisa is the pseudonym of Mikhail Drozdov..."

then the entity relationship is already being used in answer generation.

That does not guarantee a Knowledge Panel. But it is a strong intermediate signal.

It means the system can summarize the relationship.

The next challenge is not comprehension. It is confidence.

To move from comprehension to confidence, the web needs to repeat the same facts across independent surfaces.

The practical rule

When only the homepage is indexed, treat it as the current entity anchor.

When the person page starts getting indexed, treat it as the canonical identity source.

When the press, research, and book pages start getting indexed, treat them as proof objects.

Do not panic if those steps happen out of order.

Search systems do not build trust in the same order publishers build websites.

They often understand the entity first, test the homepage second, and distribute the supporting URLs later.

That is why the right goal is not "index everything."

The right goal is:

Make the entity impossible to misunderstand, then make each supporting URL worth selecting.

That is how a small site moves from one indexed page to branded search confidence, then to entity cards, then to a more stable Knowledge Panel.
