A Phoenix startup has built what it describes as the missing translation
layer between human-authored content and modern AI infrastructure. As proof
of concept, five months into a live deployment, three of four grounded AI
systems in an April 26, 2026 evaluation identified Top10Lists.us as a
gold-standard GEO exemplar under the formal Generative Engine Optimization
(GEO) criteria published by Aggarwal et al. (KDD '24). The fourth returned
a negative result that is included unedited in the methodology archive.
The Tourist and the City
When an AI system visits a website, it arrives the way a tourist arrives
in a foreign city — with a phrasebook, a map, a budget, only a
partial grasp of the language, and a flight out tomorrow morning. There
is no time to enroll in a class. No time to absorb the cultural rhythms.
He recognizes landmarks, guesses at signs, misses the idioms entirely.
He walks away with fragments, fills the gaps with assumptions, and is
occasionally confident about things he never actually understood.
Most publishers respond the way people respond to a confused tourist:
by speaking louder and slower in their own language, waving and
pointing, and eventually handing him a map and walking away.
GEOlocus.ai took a different position. Instead of giving the tourist more
to translate, they rebuilt the city in the tourist's language. Every
street sign legible on first read. Every citizen fluent enough to answer
his questions clearly. The roads kept clear of congestion so he can move
where he needs to go quickly. Every local reference traceable. Every
number current. Nothing that requires guessing — rather like
Switzerland.
GEOlocus.ai refers to this practice as GEO as a Service (GaaS),
a term they coined.
Cold Start to Gold Standard in Five Months
In December 2025, GEOlocus.ai initiated a cold-start deployment with
Top10Lists.us. The domain was new. The brand aligned with patterns AI
systems associated with low-authority content. In its first month, the
site recorded approximately 200 AI-bot crawls. None were user-initiated.
Things have changed. Four of four major AI products with live retrieval
(Anthropic Claude Sonnet 4.5, OpenAI GPT-5, Google Gemini 2.5 Pro, and
the Perplexity consumer web interface) independently identified
Top10Lists.us as a Gold Standard GEO exemplar in evaluations run
April 26-28, 2026. A separate test of Perplexity's Sonar Pro API endpoint
returned a no-retrieval response. That suggests an API-layer behavior
issue distinct from the consumer-facing Perplexity product, which
retrieved live Top10Lists.us pages and reached the same Gold Standard
verdict as the other three systems.
Context matters. During the same window, AI systems were actively
cutting citations to “top 10” content. Seer Interactive
reported a 30% month-over-month decline in ChatGPT listicle citations
between December 2025 and January 2026, and Gemini's overall citation
rate dropped from 99% in February 2026 to 76% in March 2026, a
23-percentage-point decline. Top10Lists.us moved in the opposite
direction: its AI citations and consumer-triggered retrievals increased
sharply from March through April 2026, while the category as a whole
was contracting and being filtered.
In the 30-day period ending April 30, 2026, Top10Lists.us logged
1,695,112 AI-bot crawl events from 29 distinct bot fleets. Of those,
3.50% were consumer-triggered — PerplexityBot, OAI-SearchBot,
ChatGPT-User, YouBot — aligning closely with Cloudflare's reported
3.2% user-action share in its AI crawler dataset.
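As a rough check on these figures, the consumer-triggered share can be turned into an approximate event count. The sketch below uses only the numbers quoted above; the variable names are illustrative, and the count is approximate because the 3.50% share is itself a rounded figure.

```python
# Figures quoted in this release for the 30 days ending April 30, 2026.
total_crawl_events = 1_695_112
consumer_share = 0.0350  # 3.50%, as reported (rounded)

# Approximate number of consumer-triggered events implied by that share.
consumer_events = round(total_crawl_events * consumer_share)

# Cloudflare's reported user-action share, for comparison.
cloudflare_share = 0.032
gap_points = (consumer_share - cloudflare_share) * 100

print(f"about {consumer_events:,} consumer-triggered events")
print(f"{gap_points:.1f} percentage points above Cloudflare's 3.2%")
```

On these inputs the implied count is roughly 59,000 events, and the gap to Cloudflare's figure is about 0.3 percentage points.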
“We've essentially created the hot nightclub for AI. Every major
AI is showing up because we show them that every other major AI is
showing up. The signal is self-reinforcing and compounds over time.”
— Mark Garland, Cofounder, GEOlocus.ai
The Phrasebook the City Never Sees in Line
Even the best phrasebook is consulted only occasionally once the tourist
memorizes it. The phrasebook still does the work — silently, every
time the tourist navigates a sign or tries to understand the language
— but the city sees no traffic to the phrasebook stand. A common
misreading among publishers is that low crawl counts on
llms.txt and robots.txt mean those files aren't
worth maintaining. The reasoning is wrong.
Both files are fetched on a cache-driven cadence, not a hit-driven one,
so a crawler that requests thousands of pages may request the file only
once per cache window. RFC 9309 specifies that crawlers should not use a
cached robots.txt for more than 24 hours unless the file is unreachable;
Google's documentation describes the same 24-hour cache horizon.
llms.txt has no RFC, but the empirical pattern across major AI providers
runs 30 to 180 days per domain.
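The effect of a cache-driven cadence on hit counts can be sketched in a few lines. The example below is a minimal illustration, not any crawler's actual implementation: the `RobotsCache` class, its fetcher, and the simulated clock are all hypothetical, and the only figure taken from the text is the 24-hour limit for robots.txt.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict

# Upper bound on cached robots.txt age, per the RFC 9309 guidance cited above.
ROBOTS_MAX_AGE_SECONDS = 24 * 60 * 60


@dataclass
class CachedFile:
    body: str
    fetched_at: float


class RobotsCache:
    """Hypothetical per-host cache: refetch only when the copy is over 24 hours old."""

    def __init__(self, fetch: Callable[[str], str],
                 now: Callable[[], float] = time.time):
        self._fetch = fetch
        self._now = now
        self._entries: Dict[str, CachedFile] = {}
        self.fetch_count = 0  # how many times the origin actually saw a request

    def get(self, host: str) -> str:
        entry = self._entries.get(host)
        expired = (entry is not None
                   and self._now() - entry.fetched_at > ROBOTS_MAX_AGE_SECONDS)
        if entry is None or expired:
            entry = CachedFile(self._fetch(host), self._now())
            self._entries[host] = entry
            self.fetch_count += 1
        return entry.body


# Simulated clock and fetcher so the example runs without network access.
clock = {"t": 0.0}
cache = RobotsCache(fetch=lambda host: "User-agent: *\nAllow: /",
                    now=lambda: clock["t"])

for _ in range(10_000):   # 10,000 page requests spread over about 14 hours
    cache.get("example.com")
    clock["t"] += 5       # five seconds between requests

print(cache.fetch_count)
```

In this simulation the crawler consults the rules ten thousand times, but the origin server logs a single robots.txt request. That is the pattern the paragraph above describes: the file keeps doing its work from cache while its own hit count stays low.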
The mechanism is the cache layer. Cloudflare's analysis with ETH Zurich
frames it directly: “AI bots are breaking the web's cache
layer.” ClaudeBot crawls roughly 24,000 pages per referral it
sends back; GPTBot crawls roughly 1,276 pages per referral. Citations
happen from cache; visits do not. Presence and freshness, not hit count,
are the signals that matter.
A Reproducible Metric Layer, Not an Internal Benchmark
On April 27, 2026, GEOlocus.ai published four dated, frozen methodology
pages — Signal-to-Noise Ratio (SNR / RR), Source Grounding Ratio
(SGR), Retrieval Token Cost (RTC), and Records-per-Second (RPS) —
each with a downloadable receipts.json exposing the per-site
values used in every comparison. Three established SEO content agencies'
marketing sites (DA 70+) were tested against Top10Lists.us as a
delivery-layer benchmark using the same crawler, same network, same time
of day, with redirects followed end-to-end.
| Metric | Top10Lists.us | Cohort Median |
| --- | --- | --- |
| Total records | 230,329 | 642 to 8,755 |
| RPS (records / sec throughput) | 726,412 | 372 |
| SNR (Signal-to-Noise Ratio) | 100% | 73% |
| SGR (Source Grounding Ratio) | 0.54 | 0.00 |
| RTC (Retrieval Token Cost) | $0.0493 | $0.362 |
| LMR (Lastmod Recency) | 0.74 days | 432 days |
This matters because AI systems operate within fixed time and token
constraints — the same budget and flight-out-tomorrow problem the
tourist faces. Within that window, the system must ingest, analyze,
verify, and reason. When it can do this with a properly constructed
dataset, it can rely on it — and cite it. When it cannot, it falls
back to partial data, compressed reasoning, and model-generated
approximations.
“Most sites are trying to outshout or outsmart AI in order to get
citations. So are their competitors. That is a zero-sum game. We build
sites that AI can understand efficiently and trust as sources. As live
retrieval proliferates, AI won't just quote the loudest, or even the
smartest, source — more and more, it will quote the one it
understands without having to guess.”
— Robert Maynard, Cofounder, GEOlocus.ai
What Speaking the AI's Language Looks Like
A fluent host doesn't translate phrase by phrase. He anticipates what the
visitor needs to understand and presents it the way the visitor already
thinks. Bloomberg Terminal is the analogy: it isn't valuable because it
is fast. It is valuable because every datapoint inside it is sourced,
fresh, insightful, and presented the way a trader makes decisions.
Traders pay roughly $32,000 a year for that fluency.
GEOlocus.ai applies the same logic to AI systems. It is not built for
human browsing. It is built for machine comprehension — in the form
machines comprehend.
Fire-and-Forget Deployment
When publishers attempt to optimize for AI ingestion on their existing
site, the changes introduce friction: new templates, workflow changes,
compliance and security reviews. Ultimately, the compromise leaves neither
the human nor the machine audience fully satisfied. GEOlocus.ai requires
none of that. The existing site
remains unchanged. No CMS migration. No workflow changes. No impact on
compliance or security posture. The human-facing experience is untouched.
The GEOlocus.ai system operates as a parallel layer, purpose-built for
AI — in a language it understands natively. Implemented with the
flip of a switch.
Attribution, Not Approximation
AI systems routinely extract and synthesize information without
consistent attribution. Most publishers still measure AI visibility
indirectly, through referrals, rank tracking, or synthetic prompts.
GEOlocus.ai measures bot-class behavior at the delivery layer. Because AI
bot traffic is handled there, each interaction is recorded with full
resolution. Training crawls are separated from consumer-triggered
retrieval. The result is observed behavior, not simulated attribution.
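Separating bot classes at the delivery layer can be illustrated with a short sketch. This is not GEOlocus.ai's implementation: the token lists, the `classify` function, and the sample log lines are hypothetical. The grouping follows the bots named in this release, and treating ClaudeBot and GPTBot as training crawlers is an assumption made for the example.

```python
from collections import Counter

# Hypothetical grouping based on the bots named in this release.
CONSUMER_TRIGGERED = ("PerplexityBot", "OAI-SearchBot", "ChatGPT-User", "YouBot")
TRAINING_CRAWL = ("ClaudeBot", "GPTBot")  # assumption: treated as training crawlers


def classify(user_agent: str) -> str:
    """Assign a request to a bot class by matching tokens in its User-Agent."""
    ua = user_agent.lower()
    if any(token.lower() in ua for token in CONSUMER_TRIGGERED):
        return "consumer_triggered"
    if any(token.lower() in ua for token in TRAINING_CRAWL):
        return "training_crawl"
    return "other"


# Made-up, simplified User-Agent values standing in for delivery-layer logs.
sample_log = [
    "ChatGPT-User/1.0",
    "GPTBot/1.2",
    "ClaudeBot/1.0",
    "PerplexityBot/1.0",
    "Mozilla/5.0 (X11; Linux x86_64)",
]

counts = Counter(classify(ua) for ua in sample_log)
consumer_share = counts["consumer_triggered"] / len(sample_log)

print(dict(counts))
print(f"consumer-triggered share: {consumer_share:.0%}")
```

A production classifier would also need to verify that a request really comes from the bot it claims to be, for example by checking published IP ranges, since User-Agent strings are easy to spoof.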
Most publishers hope for citation. GEOlocus.ai engineers and translates
for it. The data on its pages is delivered in a form where attribution
is inseparable from the claim — extract the fact, the source comes
with it.
The Shift Is Already Underway
AI systems are not ranking sites the way Google and Bing do. They are
selecting which sources to retrieve, ground against, and cite. The
signal they optimize for is fluency — can this source be
understood, verified, inferred, and quoted without the AI having to fill
in the blanks? When an AI fills in the blanks, that is the moment a
hallucination is born. Hallucination is a top concern for AI model
developers and for enterprises that use AI today.
When AI answers consumer queries through live retrieval rather than
pre-training recall, the question of “who's canonical for this
category?” is decided by who can be ingested, verified, and cited
right now — not by who accumulated decades of training-data
citations. That tailwind compounds for sites engineered for live
retrieval.
The full methodology archive, including per-site receipts, prompts,
reproduce scripts, and unedited model responses, is published
in the GEOlocus.ai whitepaper
and on the individual methodology pages under /methodology.
The term generative engine optimization was formally introduced in:
Aggarwal, P., et al. “GEO: Generative Engine Optimization.”
Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery
and Data Mining (KDD '24), pp. 5–16, August 2024. DOI:
10.1145/3637528.3671900.
Media Contact
Robert Maynard, Cofounder and CEO
robert@aryah.ai | (602) 758-9600