Archive for the ‘RESTful’ Category

Using Groovy to Upload RDF files to the Talis Platform

Saturday, March 13th, 2010

The Talis Platform provides free stores for developers to host RDF data online. Each store has its own SPARQL endpoint for querying the RDF data.

Options for uploading individual RDF files into a store include:

A nice-to-have option would be the ability to upload all the RDF files found in a directory directly into a store using a simple command like TalisStore.load.

Groovy, with its flexible scripting, is a good candidate for this type of work. Code like the following makes it easy to traverse directories and list the RDF files

  • in the current directory:
    new File(".").eachFileMatch(~/.*\.rdf/) { println it }
    
  • or in a specific directory:
    new File("/data/rdf").eachFileMatch(~/.*\.rdf/) { println it }
    

Once Groovy is installed, the above lines of code can be run directly in both the Groovy Shell (groovysh) and the Groovy Console (groovyConsole). For example, when run in the Groovy Shell:

$ groovysh
Groovy Shell (1.6.4, JVM: 1.6.0_15)
Type 'help' or '\h' for help.
-------------------------------------------------------------------------------------
groovy:000> new File(".").eachFileMatch(~/.*\.rdf/) { println it }
./WO0002.rdf
./WO0003.rdf
./WO0004.rdf
./WO0005.rdf

The Groovy RESTClient simplifies REST operations like POSTing (uploading) files to a web site. It is an extension of HTTPBuilder, which in turn is a wrapper around Apache’s HttpClient. The main addition required for the RESTClient to upload RDF/XML files to a Talis store is an “application/rdf+xml” encoder. This is easy to create by following the example provided in the article Groovy RESTClient and Putting Zip Files.

The result is the encodeRDF method shown below.

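import groovyx.net.http.RESTClient
import org.apache.http.entity.FileEntity

class TalisStoreLoader {
    // Placeholder credentials; replace with the values for your own store.
    static final String TALIS_USERNAME = "user"
    static final String TALIS_PASSWORD = "password"

    def talis

    TalisStoreLoader() {
        talis = new RESTClient( "http://api.talis.com/" )
        talis.auth.basic TALIS_USERNAME, TALIS_PASSWORD
        // Register the RDF/XML encoder defined below.
        talis.encoder.'application/rdf+xml' = this.&encodeRDF
    }

    // Wraps a File in an HTTP entity that carries the RDF/XML content type.
    def encodeRDF( Object data ) throws UnsupportedEncodingException {
        if ( data instanceof File ) {
            def entity = new FileEntity( (File) data, "application/rdf+xml" )
            entity.setContentType( "application/rdf+xml" )
            return entity
        } else {
            throw new IllegalArgumentException(
                "Don't know how to encode ${data.class.name} as application/rdf+xml" )
        }
    }
}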

The line talis.encoder.'application/rdf+xml' = this.&encodeRDF registers the encoder with an instance of the RESTClient.

With the RDF encoder in place, a file can be uploaded to a store’s metabox as follows.

def res = talis.post( path: metaboxPath, body: file, requestContentType: "application/rdf+xml" )
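As a rough sketch of how the directory listing and the upload call might be combined, the helper below applies an upload step to every .rdf file in a directory. The name uploadAll and its closure parameter are illustrative assumptions, not part of TalisStoreLoader. The demo uses a stand-in step that only returns the file size, so it runs without a Talis store.

```groovy
import java.nio.file.Files

// Hypothetical helper: runs the given upload closure on each .rdf file in dir
// and returns a map of file name to whatever the closure returned.
def uploadAll(File dir, Closure upload) {
    def files = []
    dir.eachFileMatch(~/.*\.rdf/) { files << it }

    def results = [:]
    // Sort by name so the files are processed in a predictable order.
    files.sort { it.name }.each { file ->
        results[file.name] = upload(file)
    }
    return results
}

// Demo data: one RDF file and one file that should be ignored.
def demoDir = Files.createTempDirectory("rdf-demo").toFile()
new File(demoDir, "a.rdf").text = "<rdf/>"
new File(demoDir, "b.txt").text = "ignored"

// Stand-in upload step: report the file size instead of POSTing.
def uploaded = uploadAll(demoDir) { File f -> f.length() }
println uploaded
```

In a real session the closure could wrap the post call shown above, for example { f -> talis.post( path: metaboxPath, body: f, requestContentType: "application/rdf+xml" ) }.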

This functionality is encapsulated in the class com._3kbo.talis.TalisStoreLoader which is part of a maven project available for download as a zip file. It includes the script TalisStore.groovy which is a simplified wrapper of com._3kbo.talis.TalisStoreLoader.

The jar file created by the project, talis-store-0.2.jar, can be downloaded separately.

The RESTClient is not bundled with the standard Groovy install. Trying to access it from the shell or console without explicitly installing it results in errors like the following:

groovy:000> import groovyx.net.http.RESTClient
ERROR org.codehaus.groovy.tools.shell.CommandException:
Invalid import definition: 'import groovyx.net.http.RESTClient';
reason: startup failed, script1266050039289.groovy:
1: unable to resolve class groovyx.net.http.RESTClient
 @ line 1, column 1.
1 error
        at java_lang_Runnable$run.call (Unknown Source)

Installing the RESTClient requires downloading HTTPBuilder and adding it and its dependencies (http-builder-xxx-all.zip) to the ${user.home}/.groovy/lib directory. Also add talis-store-0.2.jar to this directory. The ${user.home}/.groovy/lib directory may need to be created manually, but the Groovy install should have created a file named “$GROOVY_HOME/conf/groovy-starter.conf” containing the line

load ${user.home}/.groovy/lib/*

which enables the loading of the additional jar files required by RESTClient, plus the jar containing com._3kbo.talis.TalisStoreLoader, i.e.:

  • http-builder-0.5.0-RC2.jar
  • httpclient-4.0.jar
  • httpcore-4.0.1.jar
  • json-lib-2.3-jdk15.jar
  • xml-resolver-1.2.jar
  • commons-collections-3.2.1.jar
  • commons-logging-1.1.1.jar
  • talis-store-0.2.jar

Using the Groovy Shell to Upload

With the RESTClient and talis-store-0.2.jar installed, the Groovy Shell (groovysh) makes it easy to run the TalisStore.groovy script and upload either individual RDF files or all the RDF files in a directory to a Talis store.

The four options for running the TalisStore.groovy script are:

  1. TalisStore.load "mystore","user","password","file_or_directory"
  2. TalisStore.load "mystore","user","password"
  3. TalisStore.load "file_or_directory"
  4. TalisStore.load()

The first and second options both explicitly set the store, user and password. The first option also nominates either a specific RDF file to upload or a directory to scan and upload all the RDF files found. The second option uploads all the RDF files found in the current directory, i.e. the directory in which the Groovy Shell (groovysh) was invoked.

The third and fourth options read the store, user and password from the configuration file TalisConfig.groovy, which must be updated for a specific store and available on the classpath (see below).

With the configuration file TalisConfig.groovy in place, uploading a specific RDF file or a directory simplifies to TalisStore.load "file_or_directory"

Uploading the RDF files in the current directory is just TalisStore.load(), as shown in the example “Loading all RDF files from the current directory” below.
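The way the four call forms fall back to the configuration file and the current directory can be illustrated with a small sketch. This is not the actual TalisStore implementation: resolveLoadArgs and the shape of the settings map are assumptions made purely for illustration.

```groovy
// Hypothetical helper: maps the four call forms onto one settings map.
// 'config' stands in for the values read from TalisConfig.groovy.
def resolveLoadArgs(List args, Map config) {
    switch (args.size()) {
        case 4:  // store, user, password, file_or_directory
            return [store: args[0], user: args[1], password: args[2], target: args[3]]
        case 3:  // store, user, password; target defaults to the current directory
            return [store: args[0], user: args[1], password: args[2], target: "."]
        case 1:  // file_or_directory; credentials come from the config
            return [store: config.store, user: config.user, password: config.password, target: args[0]]
        case 0:  // everything defaulted
            return [store: config.store, user: config.user, password: config.password, target: "."]
        default:
            throw new IllegalArgumentException("Expected 0, 1, 3 or 4 arguments, got ${args.size()}")
    }
}

def talisConfig = [store: "mystore", user: "myusername", password: "mypassword"]
println resolveLoadArgs(["/data/rdf"], talisConfig)
println resolveLoadArgs([], talisConfig)
```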

Using the Script to Upload

Adding the line #!/usr/bin/env groovy to the TalisStore.groovy script and making the script executable allows it to be run independently of the Groovy Shell (groovysh). For example, ./TalisStore.groovy /sioc/forum/WO0902.rdf explicitly loads the RDF file, using the configuration file to set the store, user and password.

See the TalisStore.groovy javadoc for more details on running as an executable script.

Summary

There is a bit of configuration to set everything up but once in place the combination of Groovy, the RESTClient and the TalisStore loader code described here makes it easy to load RDF files to the Talis Platform.

My preference is to run the Groovy Shell (groovysh) and use simple commands like TalisStore.load().

Possible future extensions include commands like TalisStore.sparql.select.

Appendix A: Examples

Loading a specific file

$ groovysh
Groovy Shell (1.7.1, JVM: 1.6.0_15)
Type 'help' or '\h' for help.
-------------------------------------------------------------------------------
groovy:000> TalisStore.load "mystore","user","password","/sioc/WO0401.rdf"
Using store: mystore user password
Loading a file or directory: /sioc/WO0401.rdf
Loading /sioc/WO0401.rdf
Loaded 1565688 bytes in 58518 milliseconds. (Status: 204)

Loading all RDF files from the current directory

$ cd /scoop/forum/
$ ls -l
-rw-r--r--  1  3847192  2 Jan 12:11 WO0903.rdf
-rw-r--r--  1  2485605  2 Jan 12:11 WO0904.rdf
-rw-r--r--  1  2321233  2 Jan 12:12 WO0905.rdf
-rw-r--r--  1  2551787  2 Jan 12:12 WO0906.rdf
$ groovysh
Groovy Shell (1.7.1, JVM: 1.6.0_17)
Type 'help' or '\h' for help.
--------------------------------------------
groovy:000> TalisStore.load()
Classpath:
...
Loading RDF files from directory /scoop/forum/.
2010-03-14 11:32:31.477: Loading /scoop/forum/./WO0903.rdf
2010-03-14 11:33:49.289: Loaded 3847192 bytes in 77808 milliseconds. (Status: 204)
2010-03-14 11:33:49.304: Loading /scoop/forum/./WO0904.rdf
2010-03-14 11:34:38.288: Loaded 2485605 bytes in 48984 milliseconds. (Status: 204)
2010-03-14 11:34:38.289: Loading /scoop/forum/./WO0905.rdf
2010-03-14 11:35:25.429: Loaded 2321233 bytes in 47140 milliseconds. (Status: 204)
2010-03-14 11:35:25.43: Loading /scoop/forum/./WO0906.rdf
2010-03-14 11:36:15.952: Loaded 2551787 bytes in 50523 milliseconds. (Status: 204)
Loaded 4 files in 224488 milliseconds.
===> 4
groovy:000>

Appendix B: Adding the Groovy Configuration File to the Classpath

The structure of the config file is:

// TalisConfig.groovy
talis {
    user = "myusername"
    password = "mypassword"
    store = "mystore"
}

Once the values have been updated for a specific store the steps for adding to the classpath and also verifying that it is being read correctly are as follows:

  • Create a directory to hold property files (e.g. ${user.home}/.groovy/conf/)
  • Add a matching line to “$GROOVY_HOME/conf/groovy-starter.conf” to add the directory to the classpath, e.g. load ${user.home}/.groovy/conf/./
  • Place the Groovy configuration file TalisConfig.groovy in the directory (i.e. ${user.home}/.groovy/conf/)

ConfigSlurper is used to read the configuration file. The shell input below shows how to:

  • Check what is on the classpath using loader.URLs.each{ println it }
  • Get the config file using url = loader.getResource("TalisConfig.groovy")
  • Read the config file using def config = new ConfigSlurper().parse(url)

groovy:000> import groovyx.net.http.RESTClient
===> [import groovyx.net.http.RESTClient]
groovy:000> talis = new RESTClient( "http://api.talis.com/" )
===> groovyx.net.http.RESTClient@1798928
groovy:000> loader = talis.class.classLoader.rootLoader
===> org.codehaus.groovy.tools.RootLoader@4d20a47e
groovy:000> loader.URLs.each{ println it }
file:/Users/richardhancock/./
file:/Users/richardhancock/groovy-1.6.4/lib/ant-1.7.1.jar
file:/Users/richardhancock/groovy-1.6.4/lib/ant-junit-1.7.1.jar
file:/Users/richardhancock/groovy-1.6.4/lib/ant-launcher-1.7.1.jar
file:/Users/richardhancock/groovy-1.6.4/lib/antlr-2.7.7.jar
file:/Users/richardhancock/groovy-1.6.4/lib/asm-2.2.3.jar
file:/Users/richardhancock/groovy-1.6.4/lib/asm-analysis-2.2.3.jar
file:/Users/richardhancock/groovy-1.6.4/lib/asm-tree-2.2.3.jar
file:/Users/richardhancock/groovy-1.6.4/lib/asm-util-2.2.3.jar
file:/Users/richardhancock/groovy-1.6.4/lib/bsf-2.4.0.jar
file:/Users/richardhancock/groovy-1.6.4/lib/commons-cli-1.2.jar
file:/Users/richardhancock/groovy-1.6.4/lib/commons-logging-1.1.jar
file:/Users/richardhancock/groovy-1.6.4/lib/groovy-1.6.4.jar
file:/Users/richardhancock/groovy-1.6.4/lib/ivy-2.1.0-rc2.jar
file:/Users/richardhancock/groovy-1.6.4/lib/jline-0.9.94.jar
file:/Users/richardhancock/groovy-1.6.4/lib/jsp-api-2.0.jar
file:/Users/richardhancock/groovy-1.6.4/lib/junit-3.8.2.jar
file:/Users/richardhancock/groovy-1.6.4/lib/servlet-api-2.4.jar
file:/Users/richardhancock/groovy-1.6.4/lib/xstream-1.3.1.jar
file:/Users/richardhancock/.groovy/lib/http-builder-0.5.0-RC2.jar
file:/Users/richardhancock/.groovy/lib/httpclient-4.0.jar
file:/Users/richardhancock/.groovy/lib/httpcore-4.0.1.jar
file:/Users/richardhancock/.groovy/lib/json-lib-2.3-jdk15.jar
file:/Users/richardhancock/.groovy/lib/xml-resolver-1.2.jar
file:/Users/richardhancock/.groovy/conf/./
===> [Ljava.net.URL;@3ebc312f
groovy:000> url = loader.getResource("TalisConfig.groovy")
===> file:/Users/richardhancock/.groovy/conf/TalisConfig.groovy
groovy:000> def config = new ConfigSlurper().parse(url)
===> {talis={username=myusername, password=mypassword, store=mystore}}
groovy:000>
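Once parsed, the values can be read with ordinary property access. The following is a minimal sketch that parses the same structure from an inline string rather than from the classpath, so it can be tried without any setup. The key names follow the TalisConfig.groovy listing at the start of this appendix and are assumed to match what the loader expects.

```groovy
// Inline copy of the TalisConfig.groovy structure, used here only for the demo.
def configText = '''
talis {
    user = "myusername"
    password = "mypassword"
    store = "mystore"
}
'''

// ConfigSlurper can parse a script held in a String as well as a URL.
def config = new ConfigSlurper().parse(configText)

// Nested blocks become nested properties.
def store = config.talis.store
def user = config.talis.user
println "Store: ${store}, user: ${user}"
```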

Appendix C: Using Maven to run the Groovy Script

The TalisStore script can also be run via Maven. This approach uses the jar file dependencies defined in the Maven project and does not require the standard Groovy install. If a valid “TalisConfig.groovy” configuration file is available on the classpath, the parameters for “store”, “username” and “password” are not required. By default the pom.xml file excludes the dummy configuration file, but once it has been updated with real values it can be included by changing the exclude(s) to include(s). The TalisStore script can be run by executing command lines such as the following, which invoke the TalisStore main method (optionally with parameters).

mvn exec:java -Dexec.mainClass=TalisStore

mvn exec:java -Dexec.mainClass=TalisStore -Dexec.args="/sioc/forum/2007"

Appendix D: Authentication

The method “talis.auth.basic TALIS_USERNAME, TALIS_PASSWORD” is a bit of an anomaly since the Talis Platform uses HTTP Digest Authentication. RESTClient uses the groovyx.net.http.AuthConfig “basic” method, which works for “digest” authentication as well.

DBpedia Examples using Linked Data and Sparql

Monday, August 11th, 2008

Using Wikipedia, the largest online encyclopedia, users can browse and perform full-text searches, but programmatic access to the knowledge-base is limited.

The DBpedia project extracts structured information from Wikipedia, opening it up to programmatic access using Semantic Web technologies such as Linked Data and SPARQL. This means that the linking and reasoning abilities of RDF and OWL can be utilized and queries for specific information can be made using SPARQL.

Simplistically, the mapping from the Wikipedia HTML based web pages to the DBpedia RDF based resources can be thought of as replacing “http://en.wikipedia.org/wiki/” with “http://dbpedia.org/resource/”, but in reality there are some additional subtleties, which are described in the article From Wikipedia URI-s to DBpedia URI.
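The simplistic prefix replacement can be sketched in a few lines of Groovy. The function name toDbpediaUri is an illustrative assumption, and the sketch deliberately ignores the subtleties mentioned above (such as differences in capitalisation or character encoding between the two sites).

```groovy
// Naive mapping: swap the Wikipedia prefix for the DBpedia resource prefix.
String toDbpediaUri(String wikipediaUrl) {
    def wikiPrefix = "http://en.wikipedia.org/wiki/"
    def dbpediaPrefix = "http://dbpedia.org/resource/"
    if (!wikipediaUrl.startsWith(wikiPrefix)) {
        throw new IllegalArgumentException("Not an English Wikipedia article URL: ${wikipediaUrl}")
    }
    // Keep the article name exactly as given; no case or encoding fixes are attempted.
    return dbpediaPrefix + wikipediaUrl.substring(wikiPrefix.length())
}

println toDbpediaUri("http://en.wikipedia.org/wiki/Civil_engineering")
```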

The Wikipedia entry for “Civil Engineering” (http://en.wikipedia.org/wiki/Civil_Engineering) is used as an example to show how specific data can be retrieved from its DBpedia equivalent (http://dbpedia.org/resource/Civil_engineering).

When both the Wikipedia entry (http://en.wikipedia.org/wiki/Civil_Engineering) and its DBpedia equivalent (http://dbpedia.org/resource/Civil_engineering) are opened in a standard web browser they display similar information. However, the request for the DBpedia resource is redirected to http://dbpedia.org/page/Civil_engineering.

This redirect can be viewed in Firefox using the Tamper Data Firefox Extension as shown in the image below.

Loading the DBpedia Resource

The initial status of 303 is the HTTP response code “303 See Other”. The server replied with the HTTP response code 303 in order to direct the browser to the URI http://dbpedia.org/page/Civil_engineering, which is an HTML page the browser can display. The original URI http://dbpedia.org/resource/Civil_engineering is an RDF resource that would not display as well in the HTML browser.

DBpedia implements an HTTP mechanism called content negotiation in order to provide clients such as web browsers with the information they request in a form they can display. The tutorial How to publish Linked Data on the Web describes this and other Linked Data techniques that are used by applications such as DBpedia.

In order to access the RDF resource directly a web client needs to tell the server to send it RDF data. A client can do this by sending the HTTP Request Header Accept: application/rdf+xml as part of its initial request. (The HTML browser had sent an Accept: text/html HTTP header indicating that it was requesting an HTML page.)
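A client written in Groovy could set the same header using the JDK's HttpURLConnection. The sketch below is only an illustration: the function names openRdfRequest and fetchRdf are assumptions, and the server's redirect behaviour may differ from what is described here. Only the request is prepared at the top level; nothing is sent until fetchRdf is called.

```groovy
// Prepare a request that asks the server for RDF/XML instead of HTML.
HttpURLConnection openRdfRequest(String uri) {
    def conn = (HttpURLConnection) new URI(uri).toURL().openConnection()
    conn.setRequestProperty("Accept", "application/rdf+xml")
    return conn
}

// Send the request and return the response body as text.
// Requires network access, so it is defined here but not called.
String fetchRdf(String uri) {
    def conn = openRdfRequest(uri)
    try {
        if (conn.responseCode != 200) {
            throw new IOException("Unexpected status ${conn.responseCode} for ${uri}")
        }
        return conn.inputStream.getText("UTF-8")
    } finally {
        conn.disconnect()
    }
}

def request = openRdfRequest("http://dbpedia.org/resource/Civil_engineering")
println "Accept: " + request.getRequestProperty("Accept")
```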

The Firefox Addon RESTTest can be used to set Accept: application/rdf+xml in the HTTP Request Header and directly request http://dbpedia.org/resource/Civil_engineering as shown in the image below.

In this case the request to http://dbpedia.org/resource/Civil_engineering succeeded, as shown by the “Response Status 200”, and an RDF document was received, as shown in the “Response Text”.

In both the RDF fragment shown in the image above and in the HTML page http://dbpedia.org/page/Civil_engineering the multiple language support is visible. The SPARQL queries below show how to extract specific information for a particular language.

SPARQL

DBpedia provides a public SPARQL endpoint at http://dbpedia.org/sparql which enables users to query the RDF datasource with SPARQL queries such as the following.

SELECT ?abstract
WHERE {
{ <http://dbpedia.org/resource/Civil_engineering> <http://dbpedia.org/property/abstract> ?abstract }
}

The query returns all the abstracts for Civil Engineering, in each of the available languages.

The next query refines the abstracts returned to just the language specified, in this case ‘en’ (English).

SELECT ?abstract
WHERE {
{ <http://dbpedia.org/resource/Civil_engineering> <http://dbpedia.org/property/abstract> ?abstract .
FILTER langMatches( lang(?abstract), 'en') }
}
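Queries like these can also be sent programmatically. Under the SPARQL Protocol, a query can be passed to an endpoint as a URL-encoded query parameter on a GET request. The sketch below only builds such a request URL; the helper name sparqlRequestUrl is an illustrative assumption, and any additional endpoint-specific parameters are left out.

```groovy
// Build a GET request URL for a SPARQL endpoint by URL-encoding the query text.
String sparqlRequestUrl(String endpoint, String query) {
    return endpoint + "?query=" + URLEncoder.encode(query, "UTF-8")
}

def query = '''SELECT ?abstract
WHERE {
  <http://dbpedia.org/resource/Civil_engineering> <http://dbpedia.org/property/abstract> ?abstract .
  FILTER langMatches( lang(?abstract), 'en' )
}'''

def requestUrl = sparqlRequestUrl("http://dbpedia.org/sparql", query)
println requestUrl
```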

The SNORQL query explorer provides a simpler interface to the DBpedia SPARQL endpoint. The image below shows both the query and the result returned.

Other SPARQL endpoints such as http://demo.openlinksw.com/sparql/ (shown below) can query DBpedia by specifying the FROM NAMED clause to describe the RDF dataset. E.g.

SELECT ?abstract
FROM NAMED <http://dbpedia.org>
WHERE {
{ <http://dbpedia.org/resource/Civil_engineering> <http://dbpedia.org/property/abstract> ?abstract.
FILTER langMatches( lang(?abstract), 'en') }
}

Other Related DBpedia Articles

RDF as self-describing Data uses DBpedia and its SPARQL support to show how RDF is essentially ‘self-describing’ – there is no need to know about traditional metadata (schemas) before exploring a data set.

Linking to DBpedia with TopBraid outlines the benefit of DBpedia in terms of providing relatively stable URIs for all relevant real-world concepts, thus making it a natural place to connect specific domain models with each other using the OWL built-in property owl:sameAs (this property indicates that two URI references actually refer to the same thing). TopBraid Composer provides support to link domain models with DBpedia.

Querying DBpedia provides examples of using SPARQL to query DBpedia.

Adding Semantic Markup to Your Rails Application with DBpedia and ActiveRDF and Get Semantic with DBPedia and ActiveRDF describe using ActiveRDF to integrate DBpedia resources into web based applications. ActiveRDF is a library for accessing RDF data from Ruby and Ruby On Rails programs and can perform SPARQL queries.

Linking to New Zealand Legislation

Saturday, January 12th, 2008

The web page Public Access to Legislation – Creating links to the New Zealand Legislation website gives information on how to link to New Zealand Legislation.

The legislative documents are identified by:

  • the information type (Act, Regulation, Bill, SOP)
  • the legislation type or category (public, local, members, government, imperial etc)
  • the year
  • the number, padded with initial zeros to 4 digits. For Bills, the number will also include the Bar number and split letter (if applicable).
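The four identifying parts, including the zero padding of the number, can be illustrated with a short sketch. The function name legislationKey and the slash-separated layout are assumptions made for illustration only; they are not the link format used by the legislation website, and the extra Bar number and split letter for Bills are not handled.

```groovy
// Hypothetical helper: combine the identifying parts into a single key,
// padding the number with leading zeros to 4 digits.
String legislationKey(String infoType, String category, int year, int number) {
    def paddedNumber = String.format("%04d", number)
    return [infoType.toLowerCase(), category.toLowerCase(), year, paddedNumber].join("/")
}

println legislationKey("Act", "public", 2004, 72)
```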

And a legislative document can currently be linked to in the following ways:

In the same way that I want to link to photo sharing sites from within my web application there will be occasions when I want to link to legislation, standards and regulation documents.

For example, in the context of a web based building project it could be useful to link to the Building Act 2004 Table of Content, which gives an overview of the individual sections of the Building Act.

This is useful as a general reference but there will be occasions where I want to show a provision in a specific context relevant to the project. For example a building project needs to be issued with a building consent which can lapse after a period of time.

When showing the status of a project which has not yet started building it would be useful to indicate if its building consent is about to expire and if it is then link to the relevant provision to clarify the situation.

Currently there are two simple ways of linking to the specific provision: open it in the same page or open it in a new page.

Both of these approaches are a bit rough for today’s Ajax-based web applications, which would ideally take a smoother approach, i.e. take just the relevant content and slide it into the page at the required location, in this case inserting just the following:

“A building consent lapses and is of no effect if the building work to which it relates does not commence within—
(a) 12 months after the date of issue of the building consent; or
(b) any further period that the building consent authority may allow.”

This Ajax insertion can be achieved by first using a customized HTML reader which extracts the relevant content from the original provisions page.

The simpler display rendered by the customized HTML reader would also be more appropriate for a mobile phone based web application.

Note that in January 2008, as part of the PAL Project, a new site for accessing New Zealand legislation will be available.

The PAL Project stores the legislation documents as XML fragments that are combined for publication as HTML and PDF. It is likely that the documents will also be available as XML.

If the XML document is available then it should be simpler to access the content of the specific provisions when using the customized HTML reader discussed above.

A further simplification would be to provide a REST based web service for accessing the provisions. This would allow the content of the provision “Lapse of Building Consent” to be accessed via a URI similar to the following http://www.legislation.govt.nz/act/public/2004/se/072se52.xml.

Linking to Photo Sharing Sites

Saturday, January 5th, 2008

I have users with photos published on photo sharing sites such as Flickr, Fotopic, ImageShack, Livejournal, Photobucket, Picasaweb, SmugMug, Webshots and others.

What I want to do is provide these users with the ability to link these published photos to projects managed by our web application. Within our application only the published URL to the image would be stored.

The Flickr API, with its support for REST, is the ideal site to use as a proof of concept. At least one other site, 23, also implements the Flickr API, making it easy to support both 23 and Flickr.

The Ajax based Flickr Related Tag Browser provides a good example of how to use the Flickr API to browse and select photos by tag. Within our application a user would browse their Flickr account and search by photo album, tag or date to retrieve specific photos which would illustrate the project site itself, tasks to be accomplished, problems associated with a task or as visual confirmation that a task had been completed.
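As a rough illustration of what a tag search against the Flickr REST API might look like, the sketch below builds a request URL for the flickr.photos.search method. The helper name flickrSearchUrl is an assumption, and YOUR_API_KEY and USER_NSID are placeholders for a real API key and the account's Flickr user id. The sketch only constructs the URL and does not call the service.

```groovy
// Build a flickr.photos.search request URL for one user's photos with a given tag.
String flickrSearchUrl(String apiKey, String userId, String tag) {
    def params = [
        method        : "flickr.photos.search",
        api_key       : apiKey,
        user_id       : userId,
        tags          : tag,
        format        : "json",
        nojsoncallback: "1"
    ]
    // URL-encode each value and join the pairs into a query string.
    def queryString = params.collect { key, value ->
        "${key}=${URLEncoder.encode(value, 'UTF-8')}"
    }.join("&")
    return "https://api.flickr.com/services/rest/?" + queryString
}

println flickrSearchUrl("YOUR_API_KEY", "USER_NSID", "gate")
```

Within the application, only the photo URLs derived from the response would need to be stored against a project or task.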

As an example I have two projects that I am about to start.

  • The first is to widen a driveway, add a front gate and landscape the front garden to link in with the new driveway and gate.
  • The second is to complete the garage at the back of the property, which includes adding a couple of rooms that will link up with the existing studio kitchen and main room.

I have tagged the photos for these two projects gate and garage respectively on my Flickr account.

The default Flickr web pages that open for

  • http://www.flickr.com/photos/richard3kbo/tags/gate/ and
  • http://www.flickr.com/photos/richard3kbo/tags/garage/

by themselves give good overviews of the two different project sites.

Linking specific photos to project tasks helps illustrate and clarify these tasks.

Using photos published on Flickr and similar sites allows our users to continue using the photo sharing sites they are used to and which have more features for image processing and sharing than our site.

A Semantic Web Architecture for a Rails Hosted Environment

Saturday, October 20th, 2007

Last weekend I installed ActiveRDF on my Mac OS X Powerbook, together with the Sparql, RDFLite and Redland adapters. Ideally I am working towards setting up an environment that allows me to build RESTful Semantic Web Applications that support reasoning over RDF data and implement a SPARQL query endpoint. Support for OpenID authentication, integrated with FOAF, is also at the top of the list.

On the Powerbook I could also install the ActiveRDF adapters for Sesame and Jena to give me the functionality that I am after, but that only works in my development environment. Sesame and Jena are Java based. When it comes to deploying an application onto the web, my options are currently more limited. 3kbo is deployed into a hosted environment which supports PHP, Python, Perl, Ruby and Ruby On Rails, but no Java. (There is C/C++, limited to my local user account.)

Currently there are two PHP SPARQL implementations, ARC and RAP. RAP also provides a reasoning engine InfModel, with support for owl:sameAs and owl:inverseOf.

So at this stage the architecture that is emerging is an ActiveRDF RESTful Ruby On Rails application that uses RAP as the triple store, SPARQL query engine and reasoning engine. To integrate Rails with PHP I am planning to implement a RESTful PHP interface that acts as a facade to RAP.