Posts Tagged ‘Sparql’

DBpedia Examples using Linked Data and Sparql

Monday, August 11th, 2008

Using Wikipedia, the largest online encyclopedia, users can browse and perform full-text searches, but programmatic access to the knowledge-base is limited.

The DBpedia project extracts structured information from Wikipedia opening it up to programmatic access using Semantic Web technologies such as Linked Data and SPARQL. This means that the linking and reasoning abilities of RDF and OWL can be utilized and queries for specific information can be made using SPARQL.

Simplistically the mapping from the Wikipedia HTML based web pages to the DBpedia RDF based resources can be thought of as replacing “http://en.wikipedia.org/wiki/” with “http://dbpedia.org/resource/” but in reality there are some additional subtleties which are described in the article From Wikipedia URI-s to DBpedia URI.

The Wikipedia entry for “Civil Engineering” (http://en.wikipedia.org/wiki/Civil_Engineering) is used as an example to show how specific data can be retrieved from its DBpedia equivalent (http://dbpedia.org/resource/Civil_engineering).

When both the Wikipedia entry (http://en.wikipedia.org/wiki/Civil_Engineering) and its DBpedia equivalent (http://dbpedia.org/resource/Civil_engineering) are opened in a standard web browser they display similar information, however the DBpedia equivalent has been redirected to http://dbpedia.org/page/Civil_engineering.

This redirect can be viewed in Firefox using the Tamper Data Firefox Extension as shown in the image below.

Loading the DBpedia Resource

The initial status of 303 is the HTTP response code “303 See Other“. The server replied with the HTTP response code 303 in order to direct the browser to URI http://dbpedia.org/page/Civil_engineering which is a HTML page the browser can display. The original URI http://dbpedia.org/resource/Civil_engineering is an RDF resource that would not display as well in the HTML browser.

DBpedia implements a HTTP mechanism called content negotiation in order to provide clients such as web browsers with the information they request in a form they can display. The tutorial How to publish Linked Data on the Web describe this and other Linked Data techniques that are used by applications such as DBpedia.

In order to access the RDF resource directly a web client needs to tell the server to send it RDF data. A client can do this by sending the HTTP Request Header Accept: application/rdf+xml as part of its initial request. (The HTML browser had sent an Accept: text/html HTTP header indicating that it was requesting an HTML page.)

The Firefox Addon RESTTest can be used to set Accept: application/rdf+xml in the HTTP Request Header and directly request http://dbpedia.org/resource/Civil_engineering as shown in the image below.

In this case the request to http://dbpedia.org/resource/Civil_engineering succeeded as shown by the “Response Status 200″ and a RDF document was received as shown in the “Response Text”.

In both the RDF fragment shown in the image above and in the HTML page http://dbpedia.org/page/Civil_engineering the multiple language support is visible. The SPARQL queries below show how to extract specific information for a particular language.

SPARQL

DBpedia provides a public SPARQL endpoint at http://dbpedia.org/sparql which enables users to query the RDF datasource with SPARQL queries such as the following.

SELECT ?abstract
WHERE {
{ <http://dbpedia.org/resource/Civil_engineering> <http://dbpedia.org/property/abstract> ?abstract }
}

The query returns all the abstracts for Civil Engineering, in each of the available languages.

The next query refines the abstracts returned to just the language specified, in this case ‘en’ (English).

SELECT ?abstract
WHERE {
{ <http://dbpedia.org/resource/Civil_engineering> <http://dbpedia.org/property/abstract> ?abstract .
FILTER langMatches( lang(?abstract), 'en') }
}

The SNORQL query explorer shown in the image below, provides a simpler interface to the DBpedia SPARQL endpoint. The image below shows both the query and the result returned.

Other SPARQL endpoints such as http://demo.openlinksw.com/sparql/ (shown below) can query DBpedia by specifying the FROM NAMED clause to describe the RDF dataset. E.g.

SELECT ?abstract
FROM NAMED <http://dbpedia.org>
WHERE {
{ <http://dbpedia.org/resource/Civil_engineering> <http://dbpedia.org/property/abstract> ?abstract.
FILTER langMatches( lang(?abstract), ‘en’) }
}

Other Related DBpedia Articles

RDF as self-describing Data uses DBpedia and its SPARQL support to show how RDF is essentially ’self-describing’ - there is no need to know about traditional metadata (schemas) before exploring a data set.

Linking to DBpedia with TopBraid outlines the benefit of DBpedia in terms of providing relatively stable URIs for all relevant real-world concepts, thus making it a natural place to connect specific domain models with each other using the OWL built in propery owl:sameAs ( This property indicates that two URI references actually refer to the same thing ). TopBraid Composer provides support to link domain models with DBpedia .

Querying DBpedia provides examples of using SPARQL to query DBpedia.

Adding Semantic Markup to Your Rails Application with DBpedia and ActiveRDF and
Get Semantic with DBPedia and ActiveRDF describe using ActiveRDF to integrate DBpedia resources into web based applications. ActiveRDF is a library for accessing RDF data from Ruby and Ruby On Rails programs and can perform SPARQL queries.

Constructing an Ontology - Common Inspection and Test Plans

Sunday, June 15th, 2008

ABE Services has developed a web based application, the Compliance Data Management Service (CDMS) for checking work performed on site as part of building and construction projects. These are projects undertaken by the building, construction and related industries.

The diagram below gives an overview of how CDMS works.

The three main parts of CDMS are:

  • Designing Projects by identifying the tasks to be performed and allocating those tasks to the people who will perform them.
  • Using a mobile phone on site to check that completed tasks comply with industry standards and best practice
  • Monitoring and Managing the project via the progress and status of the completed tasks, the tasks outstanding and the non-compliant tasks.

Inspection and Test Plans (ITPs) are central to the three main parts of CDMS.

An Inspection and Test Plan (ITP) identifies the inspection, testing (verification) and acceptance requirements of a particular type of task, eg brickwork. Conceptually an ITP could be thought of as a refined checklist, identifying the usual steps to be followed when undertaking a particular type of task. A specific task may also have additional requirements specific just to it.

An Inspection and Test Plan is composed of one or more:

  • Verification Point(s) which identify the parts of the task to verify. Each Verification Point is composed of one or more:
  • Criterion (plural Criteria) which define the specific requirements by which a verification point can be deemed compliant. This could include referencing specific industry standards which the task is required to comply with.

A set of commonly recurring Inspection and Test Plans (ITPs) has been identified and published on the ABE Services web site. To help promote a higher standard of work in the building and construction industries ABE Services decided to make these common Inspection and Test Plans (ITPs) freely and publicly available. Currently though the details of these common Inspection and Test Plans (ITPs) are hidden within the CDMS application. To place them in the public domain an OWL ontology, named the Common Inspection and Test Plans ontology, is being constructed based on this initial set of commonly recurring Inspection and Test Plans (ITPs).

The definition of ontology in this context is along the lines of “an ontology is a specification of a conceptualization“. In this case the specification is that which constitutes a generic Inspection and Test Plan and the more specialized Inspection and Test Plans listed in the set of common Inspection and Test Plans (ITPs). Other related definitions of ontology include the more philosophical “the study of the nature of being” and the more detailed Wikipedia definitions Ontology and Ontology (Information Science)

The Common Inspection and Test Plans ontology is being implemented using the Web Ontology Language (OWL). OWL is an ontology language that can formally describe the meaning of terminology used in Semantic Web documents (See Why OWL?). In turn the Semantic Web is about two things. It is about common formats for integration and combination of data drawn from diverse sources, where on the original Web mainly concentrated on the interchange of documents. It is also about language for recording how the data relates to real world objects. That allows a person, or a machine, to start off in one database, and then move through an unending set of databases which are connected not by wires but by being about the same thing.

In the world of building and construction projects, and activities related to Real Estate, the buying and selling of houses, house and building maintenance, and potentially in the selection of contractors to perform building work, the Common Inspection and Test Plans ontology will form one of these databases, connected by the concept of how to perform a specific type of building, construction or related task.

In the first instance the Common Inspection and Test Plans ontology has been published in a very simple draft form, listing only the Inspection and Test Plans and not the more detailed the Verification Points and Criteria.

This draft version is available at the URI: http://www.abeservices.com.au/schema/2008/05/InspectionTestPlans.rdf.

A good way to view it is to install the Tabulator rdf browser plugin for Firefox as outlined in Description of a Project

It is intended that the Common Inspection and Test Plans ontology evolve as a result of community participation. As well as general feedback it is hoped to also make available a web-based vocabulary editor which would allow for greater collaboration . (Potential options include using Neologism and OntoWiki).

The next steps are to

  • add Verification Point(s) and Criteria to the currently identified Inspection and Test Plans
  • identify additional concepts related to Inspection and Test Plans, e.g. Hold Points
  • add some worked examples

In a future article I’ll also outline how to query the Common Inspection and Test Plans ontology for specific informaton using the RDF query language SPARQL. A SPARQL end point is currently available at http://abeserver.isa.net.au:2020/ providing a SPARQL Query form .