The Only Data & AI Glossary You Need in 2026
- May 20
100 essential terms, explained in plain language for business leaders — not developers.

The language of data and AI is evolving faster than most organisations can keep up with. New terms appear weekly. Familiar ones get redefined. And the gap between what technology vendors say and what things actually mean keeps widening.
This glossary cuts through the noise.
A sample of what's inside:
Hallucination — When an AI confidently generates information that is factually incorrect or completely fabricated, without signalling any uncertainty. A critical risk in legal, medical, or financial contexts.
RAG (Retrieval-Augmented Generation) — Retrieving relevant passages from your specific, up-to-date documents at query time and passing them to the AI model, so its answers are grounded in current information rather than only in what it was trained on.
Data Lakehouse — An architecture combining the flexibility and low cost of a data lake with the structured querying and governance of a data warehouse, in a single platform. The dominant modern data architecture in 2026.
MCP (Model Context Protocol) — An open standard that gives AI agents a universal way to connect to external tools and data sources. Often described as "USB-C for AI integrations."
Agentic AI — AI systems that don't just answer questions — they plan, use tools, check results, and complete complex multi-step tasks with minimal human input. The defining technology shift in 2025–2026.
Context Window — The maximum amount of text, measured in tokens, that an AI model can see and process at one time; in effect, its working memory. Anything outside the window is not available to the model when it generates a response.
AI Governance — The organisational and regulatory frameworks for overseeing AI systems, covering risk assessment, accountability, transparency, and compliance. By 2026, a compliance requirement — not just an ethics statement.
Inference-Time Scaling — Giving a model more compute at answer time to "think" before responding, rather than only training larger models on more data. The paradigm shift underpinning reasoning models like OpenAI o3.
Data Contract — A formal agreement between data producers and consumers defining the schema, format, and quality standards for a dataset. Essential infrastructure for AI pipelines — bad data degrades AI silently.
AI Slop — Low-quality, high-volume AI-generated content with minimal human review. A recognised brand and trust risk both externally and inside organisations.
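For readers who want to see one of these terms in concrete form, here is a minimal sketch of the Data Contract entry above, written in Python. It is an illustration under stated assumptions, not a reference implementation: the `orders` dataset, its field names, and the `FieldRule` and `check_record` helpers are all hypothetical, and a real contract would usually also cover format and quality rules such as value ranges or freshness.

```python
from dataclasses import dataclass
from typing import Any


@dataclass(frozen=True)
class FieldRule:
    """One field in the (hypothetical) contract: its name, type, and whether it is required."""
    name: str
    dtype: type
    required: bool = True


# Hypothetical contract for an "orders" dataset; the field names are illustrative only.
ORDERS_CONTRACT = [
    FieldRule("order_id", str),
    FieldRule("amount", float),
    FieldRule("currency", str),
    FieldRule("coupon_code", str, required=False),
]


def check_record(record: dict[str, Any], contract: list[FieldRule]) -> list[str]:
    """Return human-readable violations; an empty list means the record passes."""
    violations = []
    for rule in contract:
        value = record.get(rule.name)
        if value is None:
            # Absent optional fields are fine; absent required fields are not.
            if rule.required:
                violations.append(f"missing required field: {rule.name}")
            continue
        if not isinstance(value, rule.dtype):
            violations.append(
                f"{rule.name}: expected {rule.dtype.__name__}, got {type(value).__name__}"
            )
    return violations


good = {"order_id": "A-100", "amount": 19.99, "currency": "EUR"}
bad = {"order_id": "A-101", "amount": "19.99"}  # amount sent as text, currency missing

print(check_record(good, ORDERS_CONTRACT))
print(check_record(bad, ORDERS_CONTRACT))
```

The second record is the kind of problem a contract is meant to surface early: the producer has sent the amount as text and dropped the currency, and the check reports both issues before the data reaches an AI pipeline.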
