Drupal7 RDFa XMLLiteral content processing

Drupal 7 supports RDFa 1.0 as part of the core product. RDFa 1.0 is the current specification but RDFa 1.1 is to be released shortly.

RDFa 1.0 metadata can be parsed using the RDFa Distiller and Parser, while the RDFa Distiller and Parser (Test Version for RDFa 1.1) can be used to extract RDFa 1.1 metadata.

Creating a simple Drupal 7 test blog and parsing out the RDFa 1.0 metadata with the RDFa Distiller and Parser shows that Drupal 7 uses the SIOC (Semantically-Interlinked Online Communities) ontology to describe blog posts, and that it identifies the Drupal user as the creator of the post through the sioc:has_creator property.

<sioc:Post rdf:about="http://137breakerbay.3kbo.com/test">
  <rdf:type rdf:resource="http://rdfs.org/sioc/types#BlogPost"/>
  ...
  <sioc:has_creator>
    <sioc:UserAccount rdf:about="http://137breakerbay.3kbo.com/user/2">
      <foaf:name>Richard</foaf:name>
    </sioc:UserAccount>
  </sioc:has_creator>
  ...
  <content:encoded rdf:parseType="Literal"><p xml:lang="en" xmlns="http://www.w3.org/1999/xhtml">Test Blog</p>
  </content:encoded>
</sioc:Post>
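
As a rough illustration of how the creator link can be read back out of the distiller output, here is a minimal Python sketch using only the standard library. The rdf:RDF wrapper and the namespace declarations are assumptions added so that the excerpt parses as standalone XML, the elided parts and the content:encoded literal are left out, and ElementTree is used as a plain XML reader rather than as a real RDF parser. The function name post_creators is made up for this example.

```python
import xml.etree.ElementTree as ET

# Namespace URIs for the prefixes used in the excerpt.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "sioc": "http://rdfs.org/sioc/ns#",
    "foaf": "http://xmlns.com/foaf/0.1/",
}
RDF_ABOUT = "{%s}about" % NS["rdf"]

# Assumption: the excerpt wrapped in rdf:RDF with explicit namespace
# declarations, minus the elided parts.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:sioc="http://rdfs.org/sioc/ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <sioc:Post rdf:about="http://137breakerbay.3kbo.com/test">
    <rdf:type rdf:resource="http://rdfs.org/sioc/types#BlogPost"/>
    <sioc:has_creator>
      <sioc:UserAccount rdf:about="http://137breakerbay.3kbo.com/user/2">
        <foaf:name>Richard</foaf:name>
      </sioc:UserAccount>
    </sioc:has_creator>
  </sioc:Post>
</rdf:RDF>"""


def post_creators(rdf_xml):
    """Return (post URI, account URI, account name) per sioc:has_creator."""
    root = ET.fromstring(rdf_xml)
    results = []
    for post in root.findall("sioc:Post", NS):
        for account in post.findall("sioc:has_creator/sioc:UserAccount", NS):
            name = account.findtext("foaf:name", default=None, namespaces=NS)
            results.append((post.get(RDF_ABOUT), account.get(RDF_ABOUT), name))
    return results


for post_uri, account_uri, name in post_creators(RDF_XML):
    print(post_uri, "has creator", account_uri, "named", name)
```

This only works because the distiller output has this particular nested shape; a general solution would use a proper RDF library instead.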

sioc:UserAccount is a subclass of foaf:OnlineAccount.

What I would like to do is add RDFa metadata within the content of a blog post that links me, the Drupal 7 user described by a sioc:UserAccount, to me, the foaf:Person identified by my FOAF file.

Drupal 7 wraps the content in an XHTML element carrying the attribute property="content:encoded" (shown below), and an RDFa parser treats this content as an XMLLiteral.

<div property="content:encoded">
...
</div>

The problem is that RDFa 1.0 parsers don’t extract metadata contained within the XMLLiteral.

This was raised a while back in the issue “XMLLiteral content isn’t processed for RDFa attributes in RDFa 1.0 – should this change in RDFa 1.1?”, with the result that RDFa 1.1 parsers should now also process the XMLLiteral content.
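
To see which markup is affected, the following Python sketch lists the RDFa attributes that sit inside a property="content:encoded" wrapper, that is, the attributes an RDFa 1.0 parser would leave buried in the XMLLiteral. It is only a sketch under assumptions: the fragment is a simplified, hypothetical stand-in for a Drupal 7 page, the function name nested_rdfa is made up, and the fragment has to be well-formed XML for ElementTree to read it.

```python
import xml.etree.ElementTree as ET

# RDFa attributes this sketch looks for (not an exhaustive list).
RDFA_ATTRS = ("about", "rel", "rev", "typeof", "property", "resource")

# Hypothetical, simplified page fragment.
FRAGMENT = """<div property="content:encoded">
  <div about="http://www.3kbo.com/people/richard.hancock/foaf.rdf#i" typeof="foaf:Person">
    <div rel="foaf:account" resource="http://137breakerbay.3kbo.com/user/2"/>
  </div>
</div>"""


def nested_rdfa(fragment, literal_property="content:encoded"):
    """Return (tag, RDFa attributes) for elements nested in the wrapper."""
    root = ET.fromstring(fragment)
    found = []
    for wrapper in root.iter():
        if wrapper.get("property") != literal_property:
            continue
        for element in wrapper.iter():
            if element is wrapper:
                continue  # skip the wrapper itself
            attrs = {
                name: element.get(name)
                for name in RDFA_ATTRS
                if element.get(name) is not None
            }
            if attrs:
                found.append((element.tag, attrs))
    return found


for tag, attrs in nested_rdfa(FRAGMENT):
    print(tag, attrs)
```

Anything this reports is metadata that depends on the parser descending into the literal content.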

To make sure that RDFa parsers know that I want RDFa 1.1 processing, I need to update Drupal 7 to use the XHTML+RDFa Driver Module defined in the XHTML+RDFa 1.1 spec.

This turns out to be a simple update of one Drupal 7 file, site/modules/system/html.tpl.php.

Near the top of the file, the version is changed to 1.1 (in two places) and the DTD is changed to “http://www.w3.org/MarkUp/DTD/xhtml-rdfa-2.dtd”.

?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.1//EN"
  "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-2.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" xml:lang="<?php print $language->language; ?>" version="XHTML+RDFa 1.1" dir="<?php print $language->dir; ?>"<?php print $rdf_namespaces; ?>>
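
Since the change touches three separate strings, a small check can help confirm that none was missed. The Python sketch below is an assumption-laden illustration: the function name missing_rdfa11_markers is made up, the template text is a shortened stand-in for html.tpl.php, and the "unchanged" variant is derived by assuming the stock file uses 1.0 and xhtml-rdfa-1.dtd in the same positions.

```python
# The three strings the edit above is expected to leave in the template.
MARKERS = {
    "public identifier": "-//W3C//DTD XHTML+RDFa 1.1//EN",
    "DTD URL": "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-2.dtd",
    "version attribute": 'version="XHTML+RDFa 1.1"',
}


def missing_rdfa11_markers(template_text):
    """Return the names of the RDFa 1.1 markers not found in the text."""
    return [name for name, marker in MARKERS.items() if marker not in template_text]


# Shortened, hypothetical stand-in for the top of the updated template.
UPDATED = '''?><!DOCTYPE html PUBLIC "-//W3C//DTD XHTML+RDFa 1.1//EN"
  "http://www.w3.org/MarkUp/DTD/xhtml-rdfa-2.dtd">
<html xmlns="http://www.w3.org/1999/xhtml" version="XHTML+RDFa 1.1">'''

# Assumed shape of the file before the edit.
UNCHANGED = UPDATED.replace("1.1", "1.0").replace("rdfa-2.dtd", "rdfa-1.dtd")

print("updated template is missing:", missing_rdfa11_markers(UPDATED))
print("unchanged template is missing:", missing_rdfa11_markers(UNCHANGED))
```

To run it against a real site, read the template file into a string and pass that to the function instead of the stand-in text.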

With these changes made, I can create another blog post containing the following RDFa metadata:

<div about="http://www.3kbo.com/people/richard.hancock/foaf.rdf#i" typeof="foaf:Person">
<div rel="foaf:account" resource="http://137breakerbay.3kbo.com/user/2">
...
</div>
</div>

knowing that an RDFa 1.1 parser will create the RDF triples below, which link the Drupal 7 user to me, the person identified in my FOAF file.

  <foaf:Person rdf:about="http://www.3kbo.com/people/richard.hancock/foaf.rdf#i">
    <foaf:account>
      <sioc:UserAccount rdf:about="http://137breakerbay.3kbo.com/user/2">
        <foaf:name>Richard</foaf:name>
      </sioc:UserAccount>
    </foaf:account>
  </foaf:Person>
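
Written out as individual triples, the RDF/XML above amounts to four statements. The Python sketch below flattens it into (subject, predicate, object) tuples so the foaf:account link is easy to spot. It is not a general RDF/XML parser: it assumes every node element carries rdf:about and every property element holds either one nested node or plain text, and the rdf:RDF wrapper, the namespace declarations and the helper names are additions for this example.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDF_ABOUT = "{%s}about" % RDF

# Assumption: the excerpt wrapped in rdf:RDF with explicit namespaces.
RDF_XML = """<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:sioc="http://rdfs.org/sioc/ns#"
    xmlns:foaf="http://xmlns.com/foaf/0.1/">
  <foaf:Person rdf:about="http://www.3kbo.com/people/richard.hancock/foaf.rdf#i">
    <foaf:account>
      <sioc:UserAccount rdf:about="http://137breakerbay.3kbo.com/user/2">
        <foaf:name>Richard</foaf:name>
      </sioc:UserAccount>
    </foaf:account>
  </foaf:Person>
</rdf:RDF>"""


def uri(tag):
    """Turn an ElementTree tag like '{ns}local' into the URI 'nslocal'."""
    return tag[1:].replace("}", "", 1)


def triples(node):
    """Yield triples for a node element and the nodes nested inside it."""
    subject = node.get(RDF_ABOUT)
    yield (subject, RDF + "type", uri(node.tag))
    for prop in node:
        children = list(prop)
        if children:
            # Property pointing at a nested node: link to it, then descend.
            nested = children[0]
            yield (subject, uri(prop.tag), nested.get(RDF_ABOUT))
            yield from triples(nested)
        else:
            # Property holding plain text: emit it as a quoted literal.
            yield (subject, uri(prop.tag), '"%s"' % (prop.text or "").strip())


root = ET.fromstring(RDF_XML)
for node in root:
    for triple in triples(node):
        print(" ".join(triple))
```

The second triple printed is the one that matters here: it connects the foaf:Person URI to the Drupal user account through foaf:account.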

The differences between the RDF extracted with an RDFa 1.0 parser and an RDFa 1.1 parser can be seen using the two links below.

Now that I know the RDFa 1.1 metadata embedded in the content will be processed, I can move on to the task of building 137 Breaker Bay, a simple accommodation site. The plan is to use RDFa and ontologies such as GoodRelations to describe both the accommodation services available and the attractions and services of the surrounding area.


One Response to “Drupal7 RDFa XMLLiteral content processing”

  1. Atrus says:

    I made a reference to your findings here at this Drupal bug: http://drupal.org/node/1015948#comment-4398184