Schema.org 2.0

About a month ago Version 2.0 of the Schema.org vocabulary hit the streets.

This update includes loads of tweaks, additions and fixes that can be found in the release information.  The automotive folks have got new vocabulary for describing Cars, including useful properties such as numberOfAirbags, fuelEfficiency, and knownVehicleDamages. The new property mainEntityOfPage (and its inverse, mainEntity) provides the ability to tell the search engine crawlers which thing a web page is really about.  Add the new type ScreeningEvent to support movie/video screenings and a gtin12 property for Product, amongst others, and there is much useful stuff in there.
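As a rough illustration of how a couple of these new terms might be used together, here is a minimal Python sketch that builds a JSON-LD description of a Car and points mainEntityOfPage at the page that is about it. The helper names (car_jsonld, to_script_tag), the example.org URL and the car details are all made up for illustration; only the type and property names come from the vocabulary.

```python
import json


def car_jsonld(name, airbags, damages, page_url):
    """Build a Schema.org Car description as a JSON-LD dictionary.

    Hypothetical helper: only the type and property names are Schema.org terms.
    """
    return {
        "@context": "http://schema.org",
        "@type": "Car",
        "name": name,
        "numberOfAirbags": airbags,
        "knownVehicleDamages": damages,
        # Declares that page_url is the page primarily about this car.
        "mainEntityOfPage": page_url,
    }


def to_script_tag(data):
    """Wrap JSON-LD in the script element commonly used to embed it in HTML."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


# Made-up listing, for illustration only.
example = car_jsonld(
    "Example hatchback",
    6,
    "Scratch on rear bumper",
    "http://example.org/cars/123",
)
print(to_script_tag(example))
```

The same statement could equally be made from the page's side, by describing a WebPage whose mainEntity is the car; which direction is more convenient depends on how your pages are generated.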

But does this warrant the version number clicking over from 1.xx to 2.0?

These new types and properties are only the tip of the 2.0 iceberg.  There is a heck of a lot of other stuff going on in this release apart from these additions.  Some of it is in the vocabulary itself, some of it in the potential, documentation, supporting software, and organisational processes around it.

Sticking with the vocabulary for the moment, there has been a bit of cleanup around property names. As the vocabulary has grown organically since its release in 2011, inconsistencies and conflicts between different proposals have been introduced, so part of the 2.0 effort has been some rationalisation.  For instance, the Code type is being superseded by SoftwareSourceCode, because the term code has many different meanings, many of which have nothing to do with software; for similar reasons, surface has been superseded by artworkSurface and area is being superseded by serviceArea. Check out the release information for full details.  If you are using any of the superseded terms there is no need to panic: the original terms are still valid, but their descriptions have been updated to indicate that they have been superseded.  However, you are encouraged to move towards the updated terminology as convenient.

The question of what is in which version brings me to an enhancement to the supporting documentation.  Starting with Version 2.0, a snapshot view of the full vocabulary will be published – here is http://schema.org/version/2.0.  So if you want to refer to a term at a particular version, you now can.
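For anyone wanting to move existing markup over to the updated terminology, the superseded terms mentioned above could be handled with something like the following Python sketch. It walks a JSON-LD structure and swaps the three superseded terms named in this post for their replacements. The function name modernise and the sample data are hypothetical, and the mapping is deliberately limited to the terms mentioned here; a real migration should be driven by the full list in the release information, and should check that a key such as area really is the Schema.org property before renaming it.

```python
# Superseded terms named in this post; not a complete list.
SUPERSEDED_TYPES = {"Code": "SoftwareSourceCode"}
SUPERSEDED_PROPERTIES = {"surface": "artworkSurface", "area": "serviceArea"}


def modernise(node):
    """Return a copy of a JSON-LD structure with superseded terms replaced."""
    if isinstance(node, list):
        return [modernise(item) for item in node]
    if not isinstance(node, dict):
        return node  # plain values (strings, numbers) are left untouched

    updated = {}
    for key, value in node.items():
        if key == "@type":
            # @type may be a single name or a list of names.
            if isinstance(value, list):
                updated[key] = [SUPERSEDED_TYPES.get(v, v) for v in value]
            else:
                updated[key] = SUPERSEDED_TYPES.get(value, value)
        else:
            new_key = SUPERSEDED_PROPERTIES.get(key, key)
            updated[new_key] = modernise(value)
    return updated


# Made-up markup using the older terms.
old_markup = [
    {"@context": "http://schema.org", "@type": "Code", "name": "Hello world"},
    {"@context": "http://schema.org", "@type": "Painting", "surface": "canvas"},
]
new_markup = modernise(old_markup)
print(new_markup)
```

Because the original terms remain valid, there is no urgency to run anything like this; it is simply one way to tidy up when convenient.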

How often is Schema being used? – is a question often asked. A new feature has been introduced to give you some indication.  Check out the description of one of the newly introduced properties, mainEntityOfPage, and you will see the following: ‘Usage: Fewer than 10 domains’.  Unsurprisingly for a newly introduced property, there is virtually no usage of it yet.  If you look at the description for the type this term is used with, CreativeWork, you will see ‘Usage: Between 250,000 and 500,000 domains’.  Not a direct answer to the question, but a good and useful indication of the popularity of a particular term across the web.

Extensions
In the release information you will find the following cryptic reference: ‘Fix to #429: Implementation of new extension system.’

This refers to the introduction of the functionality, on the Schema.org site, to host extensions to the core vocabulary.  The motivation for this new approach to extending is explained thus:

Schema.org provides a core, basic vocabulary for describing the kind of entities the most common web applications need. There is often a need for more specialized and/or deeper vocabularies, that build upon the core. The extension mechanisms facilitate the creation of such additional vocabularies.
With most extensions, we expect that some small frequently used set of terms will be in core schema.org, with a long tail of more specialized terms in the extension.

As yet there are no extensions published.  However, there are some on the way.

As Chair of the Schema Bib Extend W3C Community Group I have been closely involved with a proposal by the group for an initial bibliographic extension (bib.schema.org) to Schema.org.  The proposal includes new Types for Chapter, Collection, Agent, Atlas, Newspaper & Thesis, CreativeWork properties to describe the relationship between translations, plus types & properties to describe comics.  I am also following the proposal’s progress through the system – a bit of a learning exercise for everyone.  Hopefully I can share the news in the not too distant future that bib will be one of the first released extensions.

W3C Community Group for Schema.org
A subtle change has also taken place in the way the vocabulary, its proposals, extensions and direction can be followed and contributed to.  The creation of the Schema.org Community Group has now provided an open forum for this.

So is 2.0 a bit of a milestone?  Yes, taking all things together, I believe it is. I get the feeling that Schema.org is maturing into the kind of vocabulary, supported by a professional community, that will add confidence to those using it and recommending that others should.

16 Replies to “Schema.org 2.0”

  1. Hi Richard,

    It’s great to see the bib extensions nearing official status. Here’s a question for you:

    The 2.0 spec has two types of extension mechanism –

    1) Reviewed / hosted extensions (such as bib will become) that acquire a slice of schema.org namespace (bib.schema.org in this case)

    2) External extensions that you create and host yourself (e.g., at schema.mysite.org)

    (more on https://schema.org/docs/extension.html)

    Do you think libraries have a use for the second type, if they want to describe content in a richer way (e.g., for archives)? Or should they stick to established vocabularies separate from schema.org? All that richer vocab can still be linked and expressed on your pages in RDFa or JSON-LD or whatever, for anyone who understands it – and they are more likely to understand it than my self-hosted schema extension.

    The second type seems to conflict with the text on http://schema.org/docs/datamodel.html

    “The type hierarchy presented on this site is not intended to be a ‘global ontology’ of the world. It only covers the types of entities for which we (Microsoft, Yahoo!, Google and Yandex), think we can provide some special treatment for, through our search engines, in the near future.”

    There’s a thread on public-vocabs@w3.org that addresses this (somewhat): https://lists.w3.org/Archives/Public/public-vocabs/2015Feb/thread.html#msg52

    Tom

    1. Hi Tom, Sorry for the delay in responding.

      The answer to your question about libraries needing to produce external extensions depends on the need and purpose for richer description.

      You mention archives, which I also believe need some attention. So much so that I am in the process of setting up a W3C Community Group specifically to address this, hopefully as a reviewed/hosted extension.

      Beyond archives there will almost certainly be a need to extend things further. This may well be by adding more terms to the bib.schema.org extension beyond its initial proposal. It could be via creating or adding to external vocabularies such as BiblioGraph.net, which has provided input into the bib.schema.org proposal.

      The question becomes where to draw the line between extending a general purpose vocabulary for sharing information across the web and with the search engines, and a domain specific one used to describe the internal idiosyncrasies of a specific domain such as libraries.

      By externally extending Schema you will be creating terms that are less likely to be recognised by the search engines, but you will nevertheless be encouraging the use of a combined vocabulary, the majority of which will be recognised by them. My expectation is that such vocabularies will emerge in several domains, but I cannot see them successfully replacing domain specific vocabularies/ontologies used in the heart of those domains. Fortunately the practice in the Linked Data world is to use terms from more than one vocabulary at a time. Using both Schema.org and its extensions together will satisfy the parallel descriptive needs of sharing/discovery and detailed domain specific management of resources.
