Microdata in HTML5 and what it means for technology
Recently I’ve had the opportunity to work with some of Google’s best and brightest to create a set of sports schemas for Microdata. We announced the ESPN/Google collaboration earlier this week via a guest post on Google’s Inside Search blog.
Microdata is essentially a set of five new attributes in the proposed HTML5 specification that, when used in conjunction with a Microdata schema, convey the semantic meaning of HTML code to a computer application (like a search engine).
The HTML5 attributes that serve as the building blocks of microdata markup are:
- itemid – Assigns a unique, global identifier (a URL) to the item that the Microdata describes. It belongs on the same element as itemscope and itemtype.
- itemscope – Creates a new Microdata item. The element’s descendants can then supply the item’s properties via itemprop.
- itemtype – When used in conjunction with itemscope, defines the type of item the Microdata describes. Its value is a URL identifying a schema that defines which properties are part of the item (e.g. title, director, and release date for a movie).
- itemprop – Marks a specific property defined by the Microdata schema being used. The application parsing the Microdata extracts the property’s value from the element: usually its text content, but for elements such as a, link, and img the URL in the href or src attribute, and for time the datetime attribute.
- itemref – Associates properties (and thus extractable data) with an item when the elements carrying them are not descendants of the element that has itemscope. Its value is a space-separated list of element IDs found elsewhere on the page.
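Since itemref is the least intuitive of the five, here is a minimal sketch of how it might be used. The id value and the surrounding markup are assumptions made up for illustration; the Person type and its name and birthDate properties come from schema.org.

```html
<!-- Hypothetical markup: the id "holyfield-birth" is illustrative only. -->
<!-- The item lists the id of an element that lives outside its own container. -->
<div itemscope itemtype="http://schema.org/Person" itemref="holyfield-birth">
  <span itemprop="name">Evander Holyfield</span>
</div>

<!-- Elsewhere on the page: not a descendant of the div above, -->
<!-- but the birthDate property inside it is still attached to that item. -->
<p id="holyfield-birth">
  Born <time itemprop="birthDate" datetime="1962-10-19">October 19, 1962</time>
</p>
```

A parser that honors itemref should report one Person item with both a name and a birthDate, even though the two properties sit in different parts of the page.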
A common question coming up is how Microdata is different from Microformats. The answer is pretty simple. While they are both forms of rich HTML markup intended to apply meaning to HTML code, Microformats relies on the “rel” and “class” attributes of HTML tags to supply the data. Neither of these attributes was really intended for this usage, so the implementation of Microformats is a sort of “hack” that exists because HTML4 offers no native way to achieve this. Microdata, on the other hand, has the above five attributes built into the proposed HTML5 specification. There is no need to hack existing HTML attributes for uses they weren’t intended for. HTML5 figures to solve the problem the right way.
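To make the difference concrete, here is a rough side-by-side sketch of the same person marked up both ways. The first block uses the hCard microformat’s class names (vcard, fn, url); the second uses Microdata with the schema.org Person type. The link target is a placeholder, not a real page.

```html
<!-- Microformats (hCard): meaning is carried by reserved class names. -->
<div class="vcard">
  <span class="fn">Evander Holyfield</span>
  <a class="url" href="http://somewebsite.com/evander-holyfield.html">Profile</a>
</div>

<!-- Microdata: meaning is carried by dedicated attributes. -->
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Evander Holyfield</span>
  <a itemprop="url" href="http://somewebsite.com/evander-holyfield.html">Profile</a>
</div>
```

In the Microdata version the class attribute stays free for styling, and the vocabulary is named explicitly by the itemtype URL instead of being implied by a class name.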
In this project with Google, we tackled schemas to represent athletes, teams, associations, series, and matches (games). The initial implementation is essentially a proof of concept that surfaces in Google’s results when ESPN pages are returned for MLB players, teams, or generic baseball searches. Though we started with baseball, we are planning to extend this in the coming months to other sports including football, basketball, hockey, soccer, tennis, and golf.
Here’s an example of a possible representation of a boxing athlete using Microdata:
<div itemid="boxer~253" itemscope="itemscope" itemtype="http://schema.org/SportsAthlete/Boxing">
  <img itemprop="image" src="http://someurlhere.jpg" alt="Evander Holyfield" />
  <h3 itemprop="name">Evander Holyfield</h3>
  <link itemprop="url" href="http://somewebsite.com/evander-holyfield.html" />
  <link itemprop="sport" href="http://schema.org/Boxing" />
  <time itemprop="birthDate" datetime="1962-10-19">1962-10-19</time>
  <span itemprop="height">188</span>
  <span itemprop="weight">100</span>
  <div itemprop="statistics" itemscope="itemscope" itemtype="http://schema.org/SportStat/Wins">
    <span itemprop="name">Wins</span>
    <span itemprop="abbreviation">W</span>
    <span itemprop="value">44</span>
  </div>
  <div itemprop="statistics" itemscope="itemscope" itemtype="http://schema.org/SportStat/Losses">
    <span itemprop="name">Losses</span>
    <span itemprop="abbreviation">L</span>
    <span itemprop="value">10</span>
  </div>
  <div itemprop="statistics" itemscope="itemscope" itemtype="http://schema.org/SportStat/Draws">
    <span itemprop="name">Draws</span>
    <span itemprop="abbreviation">D</span>
    <span itemprop="value">2</span>
  </div>
  <div itemprop="statistics" itemscope="itemscope" itemtype="http://schema.org/SportStat/Knockouts">
    <span itemprop="name">Knockouts</span>
    <span itemprop="abbreviation">KO</span>
    <span itemprop="value">29</span>
  </div>
</div>
The ultimate goal is to standardize the sports schemas and then publish them for other developers and publishers to use in their own development. At the end of the day, a richer online experience based on standards is better for everyone involved in digital technology, both content creators and consumers.
While Google and other search engines like Yahoo! and Bing are early adopters of Microdata and thus already using it to improve their search results, there will undoubtedly be other uses as well. If a computer application can know, in a structured way, what a block of HTML code represents, it can do some interesting things to create a richer experience. Leaving the sports world for a second, marking up a Web page with Person Microdata, for example, could allow a user to call a friend directly from their smartphone while reading that person’s name in a Facebook post or Tweet. Or Place Microdata could be used to launch Google Maps directly from a tablet to get more information about a location referenced in the Web page. When reading a review of a restaurant, a user could click the linked restaurant name to quickly reserve a table. These simple uses are just the beginning, and they promise to improve the utility of many of the smartphones, tablets, and other devices that we all use every day.
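As a rough idea of what the Person case might look like in markup, here is a small sketch using the schema.org Person type with its name and telephone properties. The name and phone number are made up, and whether a given device or app would actually offer a “call” action from this markup is an assumption on my part.

```html
<!-- Hypothetical example: the person and the phone number are placeholders. -->
<div itemscope itemtype="http://schema.org/Person">
  <span itemprop="name">Jane Doe</span>
  <!-- The itemprop sits on the span so its value is the text, not the link's href. -->
  <a href="tel:+15555550100"><span itemprop="telephone">+1 555 555 0100</span></a>
</div>
```

An application that understands the Person type would know that the string is a phone number belonging to Jane Doe, without having to guess from the surrounding text.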
Depending on how sophisticated the schemas become for Microdata, they could also potentially be used as a form of API (an insecure one, but still an API) for organizations to easily share data with one another without having to do data dumps or register for developer keys, as a formal API typically requires. Microdata, if widely adopted, could essentially wipe out the need for “scraper” technology and the complicated regular expressions that software engineers and Web developers have long used to extract data from Web pages.
I believe that Microdata is going to become a major technology that attains more uses than just improving search results. My money is on big things for Microdata as more and more smart people start experimenting with the technology and extending it beyond what is being done today.