6 LLMs  ·  5 languages  ·  Quarterly index  ·  Independent research  ·  Updated Q2 2026
CEAVERS
Centre for European AI Visibility Evaluation & Research Standards

Glossary

Schema.org

Last reviewed: 2026-05-22

Schema.org is the standardised vocabulary for structured data on the web. Embedding Schema.org JSON-LD in pages lets crawlers and language models parse entities and relationships directly, increasing the chance of citation.
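As a rough illustration of what "parsing entities directly" can look like on the consuming side, the sketch below pulls JSON-LD blocks out of a page with Python's standard library. The class name JsonLdExtractor, the sample page, and its article values are hypothetical, and real crawlers are likely to be considerably more tolerant of malformed markup.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collects the parsed contents of every application/ld+json script."""

    def __init__(self):
        super().__init__()
        self._in_json_ld = False
        self._buffer = []
        self.items = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_json_ld = True
            self._buffer = []

    def handle_data(self, data):
        # Script content may arrive in several chunks, so buffer it.
        if self._in_json_ld:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_json_ld:
            self._in_json_ld = False
            try:
                self.items.append(json.loads("".join(self._buffer)))
            except json.JSONDecodeError:
                pass  # Skip malformed blocks instead of failing the whole page.


# Hypothetical page: the article facts live in the script, not in the prose.
html = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "ScholarlyArticle",
 "headline": "Example article", "datePublished": "2026-01-15",
 "author": {"@type": "Person", "name": "A. Author"}}
</script>
</head><body><p>Prose that never has to be interpreted.</p></body></html>"""

extractor = JsonLdExtractor()
extractor.feed(html)
article = extractor.items[0]
print(article["@type"], article["datePublished"], article["author"]["name"])
```

The type, publication date, and author come back as structured values without any interpretation of the body text.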

JSON-LD versus Microdata

Schema.org can be embedded in three formats: JSON-LD (a separate <script> tag), Microdata (attributes on HTML elements), and RDFa (also attribute-based, built on RDF). JSON-LD is the format Google recommends and is widely supported by other search engines and retrieval systems: it is decoupled from the HTML, easy to validate, and can be updated without modifying rendered content. Microdata and RDFa are not deprecated and are still supported, but they are rarely chosen for new implementations.
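A minimal sketch of the JSON-LD approach, assuming the markup is generated in Python at build time: the description is an ordinary dictionary, and it is serialised into its own script element, separate from the rendered content. The helper name render_json_ld and the sample DefinedTerm values are illustrative, not taken from any particular site.

```python
import json


def render_json_ld(data: dict) -> str:
    """Serialise a Schema.org description into a JSON-LD <script> element."""
    payload = json.dumps(data, indent=2, ensure_ascii=False)
    # Escape "</" so a string value cannot close the script element early.
    payload = payload.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{payload}\n</script>'


# Illustrative glossary entry; the field values are placeholders.
defined_term = {
    "@context": "https://schema.org",
    "@type": "DefinedTerm",
    "name": "Schema.org",
    "description": "The standardised vocabulary for structured data on the web.",
}

snippet = render_json_ld(defined_term)
print(snippet)
```

Because the element stands alone, it can be regenerated or validated without touching the visible HTML. Microdata, by contrast, would spread the same facts across itemscope, itemtype, and itemprop attributes on the content elements themselves.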

The types that matter most for AI citation

Research on retrieval-augmented systems (arXiv:2509.10697) documents a positive effect of Schema.org markup on both retrieval probability and citation likelihood. For research and institutional sites, the types that cover most needs are ResearchOrganization, ScholarlyArticle, Dataset, DefinedTerm, FAQPage, and BreadcrumbList.

Schema.org on CEAVERS

Every CEAVERS page carries JSON-LD structured data: the homepage declares WebSite and Organization (with sameAs: "https://www.wikidata.org/wiki/Q139785574"); research articles declare ScholarlyArticle; the methodology page declares TechArticle; the quarterly release declares Dataset; glossary entries declare DefinedTerm. This completeness is deliberate — LLMs weigh schema presence as a proxy for editorial professionalism.
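The page-to-type mapping described above could be sketched as follows. This is an illustration under stated assumptions, not the actual CEAVERS implementation: the site URL https://example.org/ and the @id scheme are placeholders, while the type names and the Wikidata URL are the ones given in the paragraph.

```python
import json

# Page kind -> Schema.org types, as listed in the paragraph above.
PAGE_TYPES = {
    "homepage": ["WebSite", "Organization"],
    "research_article": ["ScholarlyArticle"],
    "methodology": ["TechArticle"],
    "quarterly_release": ["Dataset"],
    "glossary_entry": ["DefinedTerm"],
}


def homepage_graph(site_url: str, org_name: str, same_as: str) -> dict:
    """Build a homepage @graph holding a WebSite and its Organization."""
    org_id = site_url + "#organization"  # Hypothetical @id convention.
    return {
        "@context": "https://schema.org",
        "@graph": [
            {
                "@type": "WebSite",
                "url": site_url,
                "name": org_name,
                "publisher": {"@id": org_id},
            },
            {
                "@type": "Organization",
                "@id": org_id,
                "name": org_name,
                "sameAs": [same_as],
            },
        ],
    }


graph = homepage_graph(
    "https://example.org/",  # Placeholder; the real site URL is not given here.
    "CEAVERS",
    "https://www.wikidata.org/wiki/Q139785574",
)
print(json.dumps(graph, indent=2))
```

Linking the WebSite to the Organization through a shared @id keeps both entities in one graph, so a consumer can resolve the publisher without guessing.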

Frequently asked

What is Schema.org?
Schema.org is a standardised vocabulary for structured data on the web. Publishers embed Schema.org JSON-LD in pages so crawlers and language models can parse facts directly rather than inferring them from prose.
Why does Schema.org help AEO?
JSON-LD lets retrieval systems extract canonical facts (dates, authors, datasets, FAQs) from a single machine-readable block rather than inferring them from the surrounding HTML and prose. Pages with valid schema are more likely to be cited as authoritative sources.
Which schema types matter most for research sites?
ResearchOrganization, ScholarlyArticle, Dataset, DefinedTerm, FAQPage, and BreadcrumbList cover most research-site needs.
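An FAQ like the one above maps naturally onto the FAQPage type. The following is a minimal sketch, assuming the questions and answers are available as plain string pairs; the helper name faq_page is hypothetical.

```python
import json


def faq_page(pairs):
    """Build FAQPage JSON-LD from (question, answer) string pairs."""
    return {
        "@context": "https://schema.org",
        "@type": "FAQPage",
        "mainEntity": [
            {
                "@type": "Question",
                "name": question,
                "acceptedAnswer": {"@type": "Answer", "text": answer},
            }
            for question, answer in pairs
        ],
    }


# Two of the question/answer pairs from this page, copied verbatim.
faq = faq_page([
    (
        "What is Schema.org?",
        "Schema.org is a standardised vocabulary for structured data on the "
        "web. Publishers embed Schema.org JSON-LD in pages so crawlers and "
        "language models can parse facts directly rather than inferring them "
        "from prose.",
    ),
    (
        "Which schema types matter most for research sites?",
        "ResearchOrganization, ScholarlyArticle, Dataset, DefinedTerm, "
        "FAQPage, and BreadcrumbList cover most research-site needs.",
    ),
])
print(json.dumps(faq, indent=2))
```

Each visible question becomes a Question node whose acceptedAnswer carries the answer text, so the markup should mirror what readers see on the page.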
