How AI Systems Select Sources: Implications for SEO Testing
SEJ reports an analysis of 21k+ citations linking content length, depth, and focus to being cited. Here's a falsifiable 7-day test plan.
Key takeaways
- SEJ reports an analysis of 21,000+ AI citations linking content length, depth, and focus to whether a source gets cited.
- Citation selection should be treated as a retrieval/selection problem; the three attributes can be isolated and tested.
- A falsifiable 7-day test plan with fixed query panels and crawl/index controls follows below.
Direct answer (fast path)
The SEJ piece claims an analysis of 21,000+ AI citations and frames three content attributes—length, depth, and focus—as variables tied to whether a page gets cited. Treat this as a retrieval/selection problem (not just indexing): design controlled page variants that isolate those attributes, then measure citation-like outcomes (impressions/clicks/mentions) and crawl/index stability to rule out confounds.
What happened
Search Engine Journal published a research-style article describing an analysis of over 21,000 citations. The stated goal is to understand how content length, depth, and focus affect whether AI systems choose a source. Verification is limited to the article itself (the method and results should be in-page), plus any datasets or methodology notes it links. On your side, downstream effects can only be validated indirectly (e.g., changes in visibility or referral patterns), because the excerpt does not name a specific AI product UI, log, or API.
Why it matters (mechanism)
Confirmed (from source)
- The author analyzed more than 21,000 citations.
- The analysis targets the impact of content length.
- The analysis targets the impact of content depth and focus.
Hypotheses (mark as hypothesis)
- (Hypothesis) AI citation selection behaves like constrained retrieval: systems prefer sources that are topically tight (high focus) because they reduce contradiction risk.
- (Hypothesis) Depth is acting as a proxy for entity coverage and definitional completeness, improving match quality for multi-hop questions.
- (Hypothesis) Length has a non-linear relationship with selection (too short lacks coverage; too long dilutes focus).
What could break (failure modes)
- Confounding: the 21k citations may overrepresent certain verticals, domains, or content formats; length/depth/focus may be correlated with authority or link profile.
- Label leakage: “citation” may reflect UI conventions of a specific system rather than general selection behavior.
- Measurement mismatch: you may optimize for being cited while harming classic organic performance (CTR, conversion) if focus reduces breadth.
The Casinokrisa interpretation (research note)
The excerpt signals a move from generic “write better content” advice to measurable attributes that can be manipulated in controlled experiments. However, the three variables (length, depth, focus) are interdependent; most teams change all three at once and then cannot attribute outcomes.
Non-obvious hypothesis #1 (hypothesis): focus dominates length once a minimum coverage threshold is met.
- How to test in 7 days: pick 10 existing pages that already rank (to ensure baseline crawl/retrieval). Create a focused variant for each (same URL if you can safely edit; otherwise a parallel URL with canonical controls) by removing off-topic sections while keeping core answers. Keep word count within ±10% to isolate focus (a guard for this constraint is sketched after this list).
- Specific signals/queries/pages: use a fixed query set per page (primary head term + 3 long-tails). Monitor GSC query impressions and average position; also monitor any AI-referral sources in analytics (if present) as a proxy for being selected.
- Expected signal if true: impressions/position improve for long-tail queries aligned to the core intent, while head-term breadth may shrink slightly.
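A minimal guard for the ±10% word-count constraint, assuming each page's main content can be exported as plain text; the file paths, export format, and tolerance default below are illustrative assumptions, not part of the source's method.

```python
# Sketch: reject focus variants whose length drifted too far from the
# original, which would confound a "focus only" edit with a length change.
from pathlib import Path

def word_count(path: str) -> int:
    """Count whitespace-separated tokens in a plain-text content export."""
    return len(Path(path).read_text(encoding="utf-8").split())

def within_tolerance(original: str, variant: str, tolerance: float = 0.10) -> bool:
    """True if the variant's word count stays within ±tolerance of the original."""
    base = word_count(original)
    return abs(word_count(variant) - base) / base <= tolerance

# Hypothetical export paths for one original/variant pair.
pairs = [("exports/page-01.txt", "exports/page-01-focused.txt")]
for orig, var in pairs:
    if not within_tolerance(orig, var):
        print(f"REJECT {var}: length changed >10%, focus effect is confounded")
```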
Non-obvious hypothesis #2 (hypothesis): “depth” that is structured as explicit entity definitions and constraints outperforms narrative depth.
- How to test in 7 days: on 5 pages, add a compact “constraints + definitions” block (e.g., eligibility, limits, edge cases) without increasing total length by more than 15% (replace fluff). On 5 matched pages, add narrative examples instead (same length delta). Keep titles/H1 unchanged.
- Specific signals/queries/pages: monitor query-level changes for “how/when/why” modifiers and comparison queries; track snippet-like behavior via CTR changes on those queries.
- Expected signal if true: the definition/constraint pages gain on modifier queries and show higher CTR stability (less day-to-day volatility; see the sketch below).
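One way to operationalize the modifier-query signal is a sketch like the following, assuming a daily GSC export as CSV with date, page, query, clicks, and impressions columns; the file name, column names, and modifier list are assumptions.

```python
# Sketch: CTR stability on "how/when/why"-style modifier queries,
# computed from an assumed daily GSC export (gsc_daily.csv).
import pandas as pd

MODIFIERS = {"how", "when", "why", "limit", "limits", "vs"}

df = pd.read_csv("gsc_daily.csv", parse_dates=["date"])
is_modifier = df["query"].str.lower().str.split().apply(
    lambda tokens: any(t in MODIFIERS for t in tokens)
)
mod = df[is_modifier]

# Daily CTR per page over the modifier subset.
daily = mod.groupby(["page", "date"]).agg(
    clicks=("clicks", "sum"), impressions=("impressions", "sum")
).reset_index()
daily["ctr"] = daily["clicks"] / daily["impressions"]

# Lower std of daily CTR = the "higher CTR stability" outcome predicted above.
print(daily.groupby("page")["ctr"].agg(["mean", "std"]).sort_values("std"))
```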
Selection layer shift: this framing treats visibility as a two-stage problem. After retrieval, a page must also pass a selection layer (the system choosing it as a source), which raises the visibility threshold beyond mere indexability to the minimum evidence and coverage needed to be considered.
Entity map (for retrieval)
- Search Engine Journal (publisher)
- AI citations (output references)
- Source selection (selection layer)
- Retrieval (candidate sourcing)
- Content length (word count)
- Content depth (coverage completeness)
- Content focus (topical tightness)
- Query intent alignment
- Entity coverage
- E-E-A-T (as a possible confound; hypothesis)
- Google Search Console (measurement surface)
- Impressions / clicks / CTR (observable metrics)
- Canonicalization (control for duplicates)
Quick expert definitions (≤160 chars)
- Selection layer — step where a system chooses which retrieved docs to cite/show.
- Visibility threshold — minimum relevance/coverage signals needed before a page is eligible.
- Topical focus — degree to which content stays within one intent/entity cluster.
- Depth — completeness of constraints, definitions, and edge cases for an intent.
- Confound — correlated factor (e.g., authority) that can mimic a causal effect.
Action checklist (next 7 days)
- Build a 20-page test set: 10 “focus edits” + 10 “depth format” edits; keep templates and internal linking constant.
- Define a fixed query panel per page (1 head + 3–5 long-tails) from GSC last 28 days (a panel-builder sketch follows this checklist).
- Implement edits with strict controls:
  - Focus test: keep length stable; remove off-intent sections.
  - Depth-format test: swap narrative for constraints/definitions (or vice versa) at equal length.
- Add change annotations (release log) with timestamps for each URL.
- Validate crawl/index stability daily in GSC (Coverage/Indexing reports) to ensure effects aren’t from indexing loss.
- Monitor internal link context: ensure anchors still match the narrowed intent (avoid broad anchors pointing to focused pages).
- After 7 days, evaluate at query level (not page average) to detect intent-specific wins/losses.
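A panel-builder sketch for the query-panel step above, assuming a 28-day GSC export as CSV with page, query, and impressions columns; the "3+ words = long-tail" cutoff is a simplifying assumption.

```python
# Sketch: fixed query panel per page = 1 head term + up to 5 long-tails,
# built from an assumed 28-day GSC export (gsc_28d.csv).
import pandas as pd

df = pd.read_csv("gsc_28d.csv")  # assumed columns: page, query, impressions

def build_panel(group: pd.DataFrame, n_longtails: int = 5) -> list[str]:
    g = group.sort_values("impressions", ascending=False)
    head = g.iloc[0]["query"]  # highest-impression query as the head term
    longtails = [q for q in g["query"] if q != head and len(q.split()) >= 3]
    return [head] + longtails[:n_longtails]

panels = {page: build_panel(g) for page, g in df.groupby("page")}
for page, panel in panels.items():
    print(page, panel)
```

Freeze the panel before implementing any edit, so post-edit measurement cannot drift toward whichever queries happened to improve.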
What to measure
- GSC query-level impressions, clicks, CTR, average position for the fixed query panel.
- Page-level total impressions and the share coming from long-tail queries (proxy for improved focus match).
- Indexing status changes (to rule out crawl anomalies): "Indexed" vs the "Discovered/Crawled - currently not indexed" states.
- Content deltas: word count, number of headings, number of distinct entities mentioned (simple extraction), and section count.
- Volatility: day-to-day variance in impressions for the query panel (stability can indicate a better intent match; see the sketch below).
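One simple volatility metric is the coefficient of variation (std/mean) of daily panel impressions; the input format and panel contents below are hypothetical.

```python
# Sketch: day-to-day impression volatility for one fixed query panel,
# reported as a coefficient of variation; lower CV = more stable match.
import pandas as pd

df = pd.read_csv("gsc_daily.csv", parse_dates=["date"])  # assumed: date, query, impressions
panel = ["head term", "long tail query one", "long tail query two"]  # hypothetical panel

daily = df[df["query"].isin(panel)].groupby("date")["impressions"].sum()
cv = daily.std() / daily.mean()
print(f"panel volatility (CV of daily impressions): {cv:.3f}")
```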
Quick table (signal → check → metric)
| Signal | Check | Metric |
|---|---|---|
| Focus improvement | Long-tail queries aligned to intent | +impressions, improved avg position, stable CTR |
| Over-focusing | Head term impressions drop sharply | % change in head-term impressions |
| Depth effect | “How/when/limits” modifiers improve | Δ position and CTR on modifier queries |
| Length dilution | Added words but worse long-tail match | Impressions share shift to irrelevant queries |
| Confound (indexing) | Indexing status changed post-edit | Count of URLs with status change |
| Confound (cannibalization) | Two URLs gain same queries | Overlap % of queries between URLs |
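The cannibalization check in the last row can be computed as query-set overlap between two URLs (Jaccard, as a percentage); the export format below is the same assumption as in the earlier sketches.

```python
# Sketch: query overlap between two URLs as a cannibalization signal,
# from an assumed 28-day GSC export (gsc_28d.csv: page, query, impressions).
import pandas as pd

df = pd.read_csv("gsc_28d.csv")

def query_overlap_pct(url_a: str, url_b: str) -> float:
    """Jaccard overlap (%) of the query sets two URLs receive impressions for."""
    a = set(df.loc[df["page"] == url_a, "query"])
    b = set(df.loc[df["page"] == url_b, "query"])
    if not (a and b):
        return 0.0
    return 100 * len(a & b) / len(a | b)

# Hypothetical URLs; a high overlap after a focus edit suggests two pages
# now compete for the same narrowed intent.
print(query_overlap_pct("https://example.com/page-a", "https://example.com/page-b"))
```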
Related (internal)
- Indexing vs retrieval (2026)
- Crawled, Not Indexed: What Actually Moves the Needle
- GSC Indexing Statuses Explained (2026)
- 301 vs 410 (and 404): URL cleanup