What the semantic layer solves
In a typical marketing analytics stack, the same metric can be defined three different ways:
- Marketing’s “ROAS” includes only Meta and Google spend
- Finance’s “ROAS” includes all paid media plus agency fees
- The CEO’s “ROAS” is what the latest dashboard happens to show
Each definition is defensible. They’re just inconsistent, and when leadership asks “what’s our ROAS?” the answer depends entirely on which report they’re looking at.
A semantic layer fixes this by defining each metric once, in a single place, with explicit business logic. Every downstream consumer (BI dashboard, CRM, AI agent) uses the same definition. “ROAS” means the same thing everywhere.
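To make the drift concrete, here is a small arithmetic sketch. All figures are invented for illustration, and picking Finance's definition as the shared one is an arbitrary assumption; the point is only that the definition lives in one place.

```python
# Hypothetical quarterly figures, invented purely for illustration.
revenue = 500_000.0
spend = {
    "meta": 60_000.0,
    "google": 40_000.0,
    "tiktok": 25_000.0,
    "agency_fees": 25_000.0,
}

# Marketing's definition: Meta and Google spend only.
marketing_roas = revenue / (spend["meta"] + spend["google"])

# Finance's definition: all paid media plus agency fees.
finance_roas = revenue / sum(spend.values())

print(f"marketing: {marketing_roas:.2f}, finance: {finance_roas:.2f}")
# prints "marketing: 5.00, finance: 3.33"

# One shared definition. Which costs count is a business decision;
# this sketch assumes the Finance version was agreed on.
ROAS_COST_KEYS = ("meta", "google", "tiktok", "agency_fees")


def roas(revenue: float, spend: dict[str, float]) -> float:
    """Return ROAS using the single agreed list of cost components."""
    return revenue / sum(spend[key] for key in ROAS_COST_KEYS)
```

Same revenue, same spend table, and the two teams still report 5.00 and 3.33. Once every consumer calls `roas`, changing the definition means changing `ROAS_COST_KEYS` in one place.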
What a semantic layer contains
Five primitives:
- Metrics: revenue, roas, cac, aov, ltv, custom KPIs
- Dimensions: channel, campaign, country, cohort, device
- Joins: how the underlying tables connect
- Filters: e.g. “exclude internal traffic,” “only paid”
- Targets and budgets: actual vs. plan comparisons
The semantic layer translates a request like “ROAS by channel this quarter” into the right SQL against the underlying warehouse, applying the right joins, filters, and aggregations.
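The translation step can be sketched as a tiny compiler. This is a minimal illustration, not how any particular product works: the table names (`channel_daily`, `channels`), the column names, and the `compile_query` function are all hypothetical, and the fact table is assumed to hold numeric revenue and spend per channel per day.

```python
from datetime import date

# Metric name -> aggregate SQL expression (defined once).
METRICS = {
    "revenue": "SUM(f.revenue)",
    "roas": "SUM(f.revenue) / NULLIF(SUM(f.spend), 0)",
}

# Dimension name -> column expression.
DIMENSIONS = {
    "channel": "c.channel_name",
    "country": "f.country",
}

# Joins required by a dimension that lives outside the fact table.
JOINS = {
    "channel": "JOIN channels c ON c.channel_id = f.channel_id",
}

# Filters applied to every query, e.g. "exclude internal traffic".
DEFAULT_FILTERS = ["f.is_internal_traffic = FALSE"]


def compile_query(metric: str, dimension: str, start: date, end: date) -> str:
    """Compile a metric request into SQL over the half-open range [start, end)."""
    if metric not in METRICS:
        raise ValueError(f"unknown metric: {metric!r}")
    if dimension not in DIMENSIONS:
        raise ValueError(f"unknown dimension: {dimension!r}")

    dim_sql = DIMENSIONS[dimension]
    lines = [
        f"SELECT {dim_sql} AS {dimension}, {METRICS[metric]} AS {metric}",
        "FROM channel_daily f",
    ]
    if dimension in JOINS:
        lines.append(JOINS[dimension])
    where = DEFAULT_FILTERS + [
        f"f.day >= DATE '{start.isoformat()}'",
        f"f.day < DATE '{end.isoformat()}'",
    ]
    lines.append("WHERE " + " AND ".join(where))
    lines.append(f"GROUP BY {dim_sql}")
    return "\n".join(lines)


# "ROAS by channel this quarter", with the quarter resolved to dates by the caller.
print(compile_query("roas", "channel", date(2026, 1, 1), date(2026, 4, 1)))
```

The caller names a metric and a dimension; the join, the internal-traffic filter, and the divide-by-zero guard all come from the definitions rather than from whoever wrote the request.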
Why AI agents need the semantic layer
This is one of the most important patterns of 2026. An LLM can be given one of three things:
- Raw warehouse schema: it’ll write SQL that’s syntactically valid but semantically wrong
- A dashboard screenshot: it’ll read the visible numbers but miss filters and context
- A semantic-layer interface: it’ll get correct metrics with the right definitions baked in
MCP servers built on a semantic layer produce far more reliable AI analytics. The AI doesn’t need to know which columns to join or how revenue is calculated; it asks for revenue and gets the right number.
Without a semantic layer, AI-driven analytics quickly degrades into confidently wrong answers.
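What a semantic-layer interface looks like from the agent's side can be sketched as a single narrow function. Everything here is hypothetical: `CATALOG`, `query_metric`, and the in-memory `ROWS` standing in for a warehouse are illustrative names, and no real MCP server API is shown.

```python
# Catalog the agent is allowed to see: names and plain-language definitions.
CATALOG = {
    "metrics": {
        "revenue": "Total revenue, internal traffic excluded",
        "roas": "Total revenue divided by total paid media spend",
    },
    "dimensions": {"channel", "country"},
}

# Tiny in-memory stand-in for the warehouse (invented numbers).
ROWS = [
    {"channel": "meta", "country": "US", "revenue": 300.0, "spend": 100.0},
    {"channel": "google", "country": "US", "revenue": 200.0, "spend": 50.0},
    {"channel": "meta", "country": "DE", "revenue": 100.0, "spend": 300.0},
]


def query_metric(metric: str, by: str) -> dict[str, float]:
    """Return the metric grouped by one dimension, rejecting unknown names."""
    if metric not in CATALOG["metrics"]:
        known = ", ".join(sorted(CATALOG["metrics"]))
        raise ValueError(f"unknown metric {metric!r}; available: {known}")
    if by not in CATALOG["dimensions"]:
        known = ", ".join(sorted(CATALOG["dimensions"]))
        raise ValueError(f"unknown dimension {by!r}; available: {known}")

    # Sum revenue and spend per group first.
    totals: dict[str, dict[str, float]] = {}
    for row in ROWS:
        group = totals.setdefault(row[by], {"revenue": 0.0, "spend": 0.0})
        group["revenue"] += row["revenue"]
        group["spend"] += row["spend"]

    if metric == "revenue":
        return {key: t["revenue"] for key, t in totals.items()}
    # ROAS is a ratio of sums, not an average of per-row ratios.
    return {key: t["revenue"] / t["spend"] for key, t in totals.items()}


print(query_metric("roas", "channel"))  # {'meta': 1.0, 'google': 4.0}
```

An agent writing SQL directly against `ROWS` could plausibly average the per-row ratios and report about 1.67 for Meta: valid SQL, wrong number. Behind `query_metric` it can only ask for `roas` by name, and an unknown name fails loudly with the list of valid ones instead of producing a guess.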
Semantic layer in modern data stacks
Several products implement this pattern: dbt Semantic Layer, Cube, Looker’s LookML, Lightdash. The common shape:
- Metric definitions in code (YAML, SQL, or a DSL), version-controlled in git
- A query engine that compiles metric requests to SQL
- An API surface that BI tools, AI clients, and CRM integrations can call
- Often a caching layer for performance
The “headless BI” framing emphasises that the semantic layer doesn’t need its own visualisation; it serves any consumer that speaks its query API.
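The API-plus-cache part of that shape can be sketched in a few lines. This is an assumption-laden illustration rather than any listed product's design: `CachedMetricAPI` and `fake_engine` are hypothetical, and the time-to-live policy is just one simple way to cache.

```python
import time
from typing import Callable

Result = dict[str, float]


class CachedMetricAPI:
    """Front door for all consumers: caches engine results for ttl_seconds."""

    def __init__(
        self,
        run_query: Callable[[str, str], Result],
        ttl_seconds: float = 300.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._run_query = run_query
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: dict[tuple[str, str], tuple[float, Result]] = {}

    def get(self, metric: str, dimension: str) -> Result:
        key = (metric, dimension)
        now = self._clock()
        hit = self._cache.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]  # still fresh: skip the warehouse
        result = self._run_query(metric, dimension)
        self._cache[key] = (now, result)
        return result


def fake_engine(metric: str, dimension: str) -> Result:
    """Stand-in for the query engine that would compile and run SQL."""
    print(f"running warehouse query for {metric} by {dimension}")
    return {"meta": 1.0, "google": 4.0}


api = CachedMetricAPI(fake_engine, ttl_seconds=60.0)
api.get("roas", "channel")  # hits the engine
api.get("roas", "channel")  # served from cache
```

A dashboard, a CRM sync, and an AI client would all call the same `get`, which is the headless idea in miniature: one query surface, no built-in charts.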
Common mistakes
- Letting individual dashboards reimplement metrics. This is the source of the original problem. Enforce the rule: dashboards call the semantic layer, never the warehouse directly.
- Defining too many metrics. A semantic layer with 800 metrics is unusable. Prune ruthlessly to the ~30-100 that drive real decisions.
- Treating the semantic layer as a BI feature. It’s foundational infrastructure for AI agents too. Design for both consumers from the start.
FAQ about the semantic layer
What is a semantic layer?
A semantic layer is a unified set of metric and dimension definitions that sits between raw data and downstream consumers (dashboards, BI tools, AI agents). It is what ensures “ROAS” means the same thing in every report and query.
Do I need a semantic layer if I have a BI tool?
Increasingly, yes. Without a semantic layer, each dashboard defines its metrics independently and the definitions drift apart. A semantic layer makes the metric definition the single source of truth.
Why do AI agents need a semantic layer?
LLMs given raw warehouse schemas write SQL that is syntactically valid but semantically wrong. A semantic layer exposes consistent, business-defined metrics that the AI can request without making mistakes about joins, filters, or aggregations.