AI Terms: A Glossary of AI Search & GEO Definitions

MCP, RAG, grounding, share of AI voice… the language of AI search moves fast, and a term that didn't exist a year ago can be everywhere today. This glossary defines the AI and GEO terms every marketer and SEO needs to know.
If our SEO glossary is the dictionary for classic search, think of this one as its companion for the generative era — the words you need to understand how ChatGPT, Gemini, Perplexity and Google's AI Overviews actually find, read and cite content. Each entry is short and practical, and where we've written a full guide we link straight to it so you can go deeper.
The shift from ranking links to generating answers brought a whole new vocabulary, much of it borrowed from machine learning. You don't need a data-science degree to work in this space, but you do need to know what people mean when they talk about embeddings, grounding or query fan-out. Bookmark this page and start speaking AI search fluently.
This glossary is maintained by David Kaufmann and the SEOcrawl team — the people tracking how AI engines cite brands every day.
A
AEO (Answer Engine Optimization)
Answer Engine Optimization is the practice of optimizing content so that AI answer engines select it as the source for a direct response. It's closely related to GEO, with the emphasis on being the answer rather than one of ten blue links.
Agent (AI agent)
An AI agent is a system that uses a language model to take actions — calling tools, browsing, or completing multi-step tasks — rather than just returning text. Agents are why protocols like MCP matter: they need a safe, standard way to reach external data and services.
AI Mode
AI Mode is Google's conversational, AI-generated search experience where a chat-style interface answers follow-up questions directly. Appearances inside AI Mode can't be measured the way classic rankings are, though the clicks it sends can be tracked in analytics.
AI Overview
An AI Overview is the AI-generated summary Google places at the top of many search results, pulling from multiple sources and citing them. Earning a spot among those cited sources is a central goal of GEO.
Answer engine
An answer engine is any system that responds to a query with a synthesized answer instead of a list of links — ChatGPT, Perplexity, Gemini and Google's AI Overviews all qualify. The term frames the strategic shift behind AEO.
B
Brand mention monitoring
Brand mention monitoring in AI search means tracking when, where and how AI engines name your brand in their answers. It's the foundation of any AI visibility strategy, since you can't improve what you can't see.
C
Chunking
Chunking is the process of splitting content into smaller passages so a retrieval system can index and fetch the most relevant piece. Clear structure — short sections, descriptive headings, self-contained paragraphs — makes content easier to chunk and retrieve.
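To make the idea concrete, here is a minimal sketch of paragraph-based chunking. It assumes a simple word budget per chunk and splits on blank lines; the function name `chunk_by_paragraph` and the `max_words` parameter are illustrative, and real retrieval systems may chunk by tokens, headings or sentences instead.

```python
import re


def chunk_by_paragraph(text: str, max_words: int = 80) -> list[str]:
    """Group whole paragraphs into chunks of at most max_words words.

    A single paragraph longer than max_words is kept intact as its own chunk.
    """
    # Paragraphs are separated by one or more blank lines.
    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks: list[str] = []
    current: list[str] = []
    current_len = 0
    for para in paragraphs:
        n_words = len(para.split())
        # Start a new chunk if adding this paragraph would exceed the budget.
        if current and current_len + n_words > max_words:
            chunks.append("\n\n".join(current))
            current, current_len = [], 0
        current.append(para)
        current_len += n_words
    if current:
        chunks.append("\n\n".join(current))
    return chunks


page = (
    "Chunking splits content.\n\n"
    "Short sections help retrieval.\n\n"
    "Headings describe each section."
)
for i, chunk in enumerate(chunk_by_paragraph(page, max_words=6)):
    print(i, repr(chunk))
```

Because the splitter never cuts a paragraph in half, self-contained paragraphs survive as self-contained chunks, which is the practical reason the structure advice above matters.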
Citation
An AI citation is a reference to your site or brand inside an AI-generated answer, shown as a linked source, footnote or inline mention. Citations are to AI search what rankings are to classic SEO: the unit of visibility you're competing for.
Crawler (AI crawler)
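An AI crawler is a bot operated by an AI company to gather web content for training or live retrieval, such as GPTBot, ClaudeBot or PerplexityBot. Your robots.txt tells these bots which paths they may access, and it is also where you set control tokens like Google-Extended, which is a robots.txt token rather than a separate bot. Robots.txt is a request rather than an enforcement mechanism, so it only governs crawlers that choose to honor it.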
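As a rough illustration, the sketch below uses Python's standard `urllib.robotparser` to check how a sample robots.txt would treat different AI crawlers. The rules and the `example.com` URLs are invented for the example; they are not a recommendation for any particular site.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block GPTBot entirely, keep PerplexityBot out of
# /private/, and allow every other crawler everywhere.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: PerplexityBot
Disallow: /private/

User-agent: *
Allow: /
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def may_fetch(user_agent: str, url: str) -> bool:
    """Return True if the parsed rules allow user_agent to fetch url."""
    return parser.can_fetch(user_agent, url)


for agent in ("GPTBot", "PerplexityBot", "ClaudeBot"):
    for url in ("https://example.com/blog/", "https://example.com/private/report"):
        print(agent, url, may_fetch(agent, url))
```

A check like this only tells you what the file asks for. Whether a given bot follows it is up to the bot's operator.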
E
Embedding
An embedding is a numerical representation of text (or images) that captures meaning as a list of numbers, so a model can measure how similar two pieces of content are. Embeddings power semantic search and retrieval inside AI systems.
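A small sketch of how similarity between embeddings is commonly measured, using cosine similarity. The three-number vectors below are made up by hand purely for illustration; real embeddings come from a model and usually have hundreds or thousands of dimensions.

```python
import math


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two vectors: near 1.0 means similar direction."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical hand-written vectors standing in for model output.
EMBEDDINGS = {
    "cheap flights": [0.9, 0.1, 0.0],
    "low-cost airfare": [0.8, 0.2, 0.1],
    "chocolate cake recipe": [0.0, 0.2, 0.9],
}

query = EMBEDDINGS["cheap flights"]
for phrase, vector in EMBEDDINGS.items():
    print(f"{phrase}: {cosine_similarity(query, vector):.3f}")
```

In this toy data the two travel phrases score close to each other even though they share no words, while the recipe phrase scores near zero. That is the behavior semantic search relies on.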
Entity
An entity is a distinct, identifiable thing — a person, brand, product or place — that engines track and connect in a knowledge graph. Being recognized as a clear entity helps AI engines associate your brand with the right topics and mention it confidently.
F
Fine-tuning
Fine-tuning is the process of further training a base model on a focused dataset to specialize its behavior or knowledge. It's distinct from retrieval: fine-tuning bakes information into the model, while retrieval fetches it at answer time.
G
GEO (Generative Engine Optimization)
Generative Engine Optimization is the discipline of optimizing your content and brand presence so generative AI engines mention and cite you. It extends SEO into ChatGPT, Gemini, Perplexity and AI Overviews.
Grounding
Grounding is when an AI engine bases its answer on retrieved, verifiable sources rather than only its trained parameters. Grounded answers are the ones most likely to include citations — which is exactly why being a retrievable, trustworthy source matters.
H
Hallucination
A hallucination is a confident but false or fabricated statement produced by an AI model. Strong, well-structured, citable content reduces the chance an engine invents details about your brand instead of pulling the correct facts.
K
Knowledge graph
A knowledge graph is a structured map of entities and the relationships between them. AI engines lean on knowledge graphs to disambiguate brands and decide which facts to trust about you.
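One common way to picture a knowledge graph is as a list of subject, relation, object triples. The sketch below is a toy version built from facts stated on this page; the relation names (`is_a`, `built_by`, `founded`) and the helper `facts_about` are illustrative choices, not a standard vocabulary.

```python
# Each triple reads as: subject, relation, object.
TRIPLES = [
    ("SEOcrawl", "is_a", "SEO platform"),
    ("SEOcrawl", "built_by", "David Kaufmann"),
    ("David Kaufmann", "founded", "SEO Alive"),
]


def facts_about(entity: str) -> list[tuple[str, str, str]]:
    """Return every triple in which the entity appears as subject or object."""
    return [t for t in TRIPLES if entity in (t[0], t[2])]


for subject, relation, obj in facts_about("David Kaufmann"):
    print(subject, relation, obj)
```

Looking up an entity returns everything connected to it, which is roughly how a graph helps an engine tell one brand or person apart from another with a similar name.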
L
LLM (Large Language Model)
A large language model is an AI model trained on vast amounts of text to predict and generate language — the engine behind ChatGPT, Claude, Gemini and others. Everything in AI search ultimately runs on one.
llms.txt
llms.txt is a proposed Markdown-formatted text file, served from a site's root, that points AI models to your most important content in a clean, structured form. Adoption is still emerging, and it's no substitute for solid, crawlable content.
M
MCP (Model Context Protocol)
The Model Context Protocol is an open standard that lets AI assistants connect to external tools and data sources consistently. It's how a model can securely call a service like SEOcrawl to pull live SEO data instead of guessing from training.
Multimodal
Multimodal describes a model that can process more than one type of input — text, images, audio or video — within the same system. It's why AI engines can now read a screenshot or a chart, not just words.
P
Prompt
A prompt is the instruction or question you give an AI model to produce a response. In AI search, the prompts real users type are the queries you're trying to show up for.
Prompt tracking
Prompt tracking is monitoring how AI engines answer a defined set of prompts over time — which brands they mention, which sources they cite, and how that changes. It's the AI-search counterpart to rank tracking.
Q
Query fan-out
Query fan-out is the technique where an AI engine breaks one user question into several sub-queries, runs them in parallel, and synthesizes the results into a single answer. Understanding it explains why covering a topic thoroughly beats targeting one exact phrase.
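The flow described above can be sketched in a few lines. Everything here is a stand-in: `expand` uses a fixed list of facets where a real engine would use a model, `search` reads from a hard-coded toy index instead of a live retrieval system, and the URLs are invented.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical toy index mapping a sub-query to the sources it would return.
TOY_INDEX = {
    "crm software definition": ["example.com/what-is-crm"],
    "crm software pricing": ["example.com/crm-pricing", "example.com/what-is-crm"],
    "crm software alternatives": ["example.org/crm-alternatives"],
}


def expand(question: str) -> list[str]:
    """Break one question into several sub-queries (fixed facets for the demo)."""
    facets = ["definition", "pricing", "alternatives"]
    return [f"{question} {facet}" for facet in facets]


def search(sub_query: str) -> list[str]:
    """Stand-in for a real retrieval call."""
    return TOY_INDEX.get(sub_query, [])


def fan_out_sources(question: str) -> list[str]:
    """Run all sub-queries in parallel and merge the results without duplicates."""
    sub_queries = expand(question)
    with ThreadPoolExecutor() as pool:
        result_lists = list(pool.map(search, sub_queries))  # keeps input order
    merged: list[str] = []
    for results in result_lists:
        for url in results:
            if url not in merged:
                merged.append(url)
    return merged


print(fan_out_sources("crm software"))
```

In this sketch a site that covers the definition, pricing and alternatives facets can be picked up by several sub-queries, while a page matching only one phrase has a single chance to appear.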
R
RAG (Retrieval-Augmented Generation)
RAG is an architecture where a model retrieves relevant documents at answer time and uses them to generate a grounded, sourced response. Most AI search experiences are some form of RAG — which is why being retrievable is the whole game.
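Here is a deliberately simplified sketch of the retrieve-then-generate pattern. It assumes a three-document corpus, scores documents by plain word overlap instead of embeddings, and stops at building the prompt rather than calling a model. The names `DOCS`, `retrieve` and `build_prompt` are hypothetical.

```python
import re

# Hypothetical mini-corpus standing in for an indexed web.
DOCS = {
    "doc1": "GEO is the practice of optimizing content so generative engines cite it.",
    "doc2": "A vector database stores embeddings for similarity search.",
    "doc3": "Robots.txt tells crawlers which paths they may access.",
}


def tokenize(text: str) -> set[str]:
    """Lowercase the text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(query: str, k: int = 2) -> list[tuple[str, str]]:
    """Return the k documents sharing the most words with the query."""
    q = tokenize(query)
    ranked = sorted(DOCS.items(), key=lambda item: len(q & tokenize(item[1])), reverse=True)
    return ranked[:k]


def build_prompt(query: str, k: int = 2) -> str:
    """Assemble a grounded prompt: numbered sources first, then the question."""
    sources = retrieve(query, k)
    context = "\n".join(
        f"[{i}] ({doc_id}) {text}" for i, (doc_id, text) in enumerate(sources, 1)
    )
    return (
        "Answer using only the sources below and cite them by number.\n\n"
        f"Sources:\n{context}\n\nQuestion: {query}"
    )


print(build_prompt("What does a vector database store?", k=1))
```

Only documents that make it through `retrieve` ever reach the prompt, so content the retrieval step cannot find has no chance of being cited in the generated answer.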
Reranking
Reranking is a second-pass step that reorders retrieved passages by relevance before the model writes its answer. It's one reason the most useful, well-matched passage often wins over the merely popular one.
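A minimal sketch of a second-pass reranker, assuming the first pass has already returned a handful of candidate passages. The `relevance` function here is a naive word-overlap score chosen for readability; production systems typically use a learned relevance model for this step.

```python
def relevance(query: str, passage: str) -> float:
    """Fraction of query words that also appear in the passage (naive score)."""
    query_words = set(query.lower().split())
    passage_words = set(passage.lower().split())
    return len(query_words & passage_words) / len(query_words)


def rerank(query: str, passages: list[str], top_n: int = 2) -> list[str]:
    """Reorder first-pass candidates by relevance and keep the best top_n."""
    return sorted(passages, key=lambda p: relevance(query, p), reverse=True)[:top_n]


# Hypothetical first-pass results, listed in their original retrieval order.
candidates = [
    "our company history and awards",
    "how to block ai crawlers with robots.txt",
    "ai crawlers list",
]
print(rerank("block ai crawlers", candidates))
```

The passage that was retrieved first drops out entirely because it does not match the query, while the best-matched passage moves to the top.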
S
Share of AI voice
Share of AI voice is the percentage of AI answers, for a topic or prompt set, in which your brand appears versus competitors. It's the headline metric for measuring AI visibility and benchmarking against rivals.
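As a worked example, the sketch below computes the metric as the percentage of collected answers that mention each brand. The answers and brand names are invented, and the matching is a naive case-insensitive substring check; a real tracker would need more careful brand detection.

```python
def share_of_ai_voice(answers: list[str], brands: list[str]) -> dict[str, float]:
    """Percentage of answers in which each brand name appears at least once."""
    total = len(answers)
    shares: dict[str, float] = {}
    for brand in brands:
        hits = sum(1 for answer in answers if brand.lower() in answer.lower())
        shares[brand] = 100.0 * hits / total if total else 0.0
    return shares


# Hypothetical AI answers collected for one prompt set.
answers = [
    "BrandA and BrandB are popular options.",
    "Many teams choose BrandA.",
    "BrandA is a common pick.",
    "There are several tools to consider.",
]
print(share_of_ai_voice(answers, ["BrandA", "BrandB"]))
```

Here BrandA appears in three of four answers (75%) and BrandB in one (25%). The percentages do not need to sum to 100, because one answer can mention several brands or none.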
T
Token
A token is the unit of text a model reads and generates, roughly a word or word-piece. Tokens matter because a model's context limit and its usage costs are both typically measured in tokens.
Training data
Training data is the body of text and other content a model learned from before deployment. If your brand and facts are well represented across the web, they're more likely to be reflected in what a model already "knows."
V
Vector database
A vector database stores embeddings and finds the closest matches to a query by meaning rather than exact keywords. It's the retrieval engine behind semantic search and most RAG systems.
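To show the core operation, here is a toy in-memory store that ranks items by cosine similarity to a query vector. The class `TinyVectorStore` and its hand-written vectors are hypothetical, and the brute-force scan is an assumption made for clarity; real vector databases generally use specialized indexes to avoid comparing against every stored item.

```python
import math


class TinyVectorStore:
    """Brute-force nearest-neighbor lookup over a dictionary of vectors."""

    def __init__(self) -> None:
        self._items: dict[str, list[float]] = {}

    def add(self, item_id: str, vector: list[float]) -> None:
        self._items[item_id] = vector

    @staticmethod
    def _cosine(a: list[float], b: list[float]) -> float:
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

    def query(self, vector: list[float], k: int = 1) -> list[str]:
        """Return the ids of the k stored vectors most similar to the query."""
        ranked = sorted(
            self._items,
            key=lambda item_id: self._cosine(vector, self._items[item_id]),
            reverse=True,
        )
        return ranked[:k]


store = TinyVectorStore()
store.add("running shoes", [0.9, 0.1, 0.0])
store.add("trail sneakers", [0.8, 0.2, 0.1])
store.add("tax software", [0.0, 0.1, 0.9])

# A made-up query vector that sits close to the two footwear items.
print(store.query([0.85, 0.15, 0.05], k=2))
```

The lookup never compares words, only vectors, so "trail sneakers" can match a query about running shoes without sharing a keyword.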
Z
Zero-click
A zero-click result is one where the user gets their answer directly in the interface and never visits a website. AI answers have pushed zero-click rates higher, which is why visibility inside the answer, through citations and mentions, now matters as much as the click.
Author: David Kaufmann

I've spent the last 10+ years completely obsessed with SEO — and honestly, I wouldn't have it any other way.
My career hit a new level when I worked as a senior SEO specialist for Chess.com — one of the top 100 most visited websites on the entire internet. Operating at that scale, across millions of pages, dozens of languages, and one of the most competitive SERPs out there, taught me things no course or certification ever could. That experience changed my perspective on what great SEO really looks like — and it became the foundation for everything I've built since.
From that experience, I founded SEO Alive — an agency for brands that are serious about organic growth. We're not here to sell dashboards and monthly reports. We're here to build strategies that actually move the needle, combining the best of classical SEO with the exciting new world of Generative Engine Optimization (GEO) — making sure your brand shows up not just in Google's blue links, but inside the AI-generated answers that ChatGPT, Perplexity, and Google AI Overviews are delivering to millions of people every single day.
And because I couldn't find a tool that handled both of those worlds properly, I built one myself — SEOcrawl, an enterprise SEO intelligence platform that brings together rankings, technical audits, backlink monitoring, crawl health, and AI brand visibility tracking all in one place. It's the platform I always wished existed.
