Archive for the ‘OWL’ Category

Linked Data and the SOA Software Development Process

Thursday, November 17th, 2011

We have quite a rigorous SOA software development process; however, the full value of the collected information is not being realized because the artifacts are stored in disconnected information silos. So far, attempts to introduce tools which could improve the situation (e.g. zAgile Teamwork and Semantic MediaWiki) have been unsuccessful, possibly because the value of a Linked Data approach is not yet fully appreciated.

To provide an example Linked Data view of the SOA services and their associated artifacts I created a prototype consisting of Sesame running on a Tomcat server, with Pubby providing the Linked Data view via the Sesame SPARQL endpoint. TopBraid was connected directly to the Sesame native store (configured via the Sesame Workbench) to create a subset of services sufficient to demonstrate the value of publishing information as Linked Data. In particular, the prototype showed how easy it became to navigate from the requirements for a SOA service through to the details of its implementation.

The prototype also highlighted that auto-generation of the RDF graph (the data providing the Linked Data view) from the actual source artifacts would be preferable to manual entry, especially if this could be transparently integrated with the current software development process. This has become the focus of the next step: automated knowledge extraction from the source artifacts.

Artifacts

Key artifact types of our process include:

A Graph of Concepts and Instances

There is a rich graph of relationships linking the things described in the artifacts listed above. For example, the business entities defined in the UML analysis model are the subject of the services and service operations defined in the Service Contracts. The services and service operations are mapped to the WSDLs, which utilize the XML Schemas that provide an XML view of the business entities. The JAX-WS implementations are linked to the WSDLs and XML Schemas and deployed to the Oracle WebLogic Application Server, where the configuration files list the external dependencies. The log files and defects link back to specific parts of the code base (Subversion revisions) within the context of specific service operations. The people associated with the different artifacts can often be determined from artifact metadata.

RDF, OWL and Linked Data are a natural fit for modelling and viewing this graph since there is a mix of concepts plus a lot of instances, many of which already have an HTTP representation. The graph also contains a number of transitive relationships (for example, a WSDL may import an XML Schema which in turn imports another XML Schema, and so on), which makes owl:TransitiveProperty useful for obtaining a full picture of all the dependencies a component may have.
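As a rough illustration of what a reasoner derives from a transitive property, the Python sketch below computes every schema reachable from a WSDL through a chain of imports. The file names and the DIRECT_IMPORTS mapping are invented for the example; in an actual ontology the property would be declared as an owl:TransitiveProperty and the inference left to the reasoner.

```python
from collections import deque

# Hypothetical direct "imports" relationships between artifacts.
DIRECT_IMPORTS = {
    "OrderService.wsdl": ["Order.xsd"],
    "Order.xsd": ["Customer.xsd", "Common.xsd"],
    "Customer.xsd": ["Common.xsd"],
    "Common.xsd": [],
}


def all_imports(artifact, direct=DIRECT_IMPORTS):
    """Return every artifact reachable from `artifact` via a chain of imports."""
    seen = set()
    queue = deque(direct.get(artifact, []))
    while queue:
        current = queue.popleft()
        if current in seen:
            continue  # already visited, which also guards against import cycles
        seen.add(current)
        queue.extend(direct.get(current, []))
    return seen
```

Calling all_imports("OrderService.wsdl") returns the directly imported schema together with the schemas it imports in turn.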

Knowledge Extraction

Another advantage of the RDF, OWL and Linked Data approach is the use of unique URIs for identifying concepts and instances. This allows information contained in one artifact, e.g. a WSDL, to be extracted as RDF triples which can later be combined with the RDF triples extracted from the JAX-WS annotations in the Java source code. The combined RDF triples tell us more about the WSDL and its Java implementation than could be derived from either artifact alone.
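The merging step can be sketched in a few lines of Python, treating triples as plain (subject, predicate, object) tuples. Every URI and property name here is hypothetical and stands in for whatever the extraction tools would actually emit; the point is only that triples from separate sources join up because they share the same URI for the WSDL.

```python
# All URIs below are hypothetical, chosen only for illustration.
EX = "http://example.org/soa/"

# Triples that might be extracted from the WSDL itself.
wsdl_triples = {
    (EX + "OrderService.wsdl", EX + "definesOperation", EX + "OrderService/getOrder"),
    (EX + "OrderService.wsdl", EX + "importsSchema", EX + "Order.xsd"),
}

# Triples that might be extracted from JAX-WS annotations and version control.
java_triples = {
    (EX + "OrderServiceImpl.java", EX + "implements", EX + "OrderService.wsdl"),
    (EX + "OrderServiceImpl.java", EX + "committedBy", EX + "people/jsmith"),
}


def describe(uri, triples):
    """Return all triples in which `uri` appears as subject or object."""
    return {t for t in triples if t[0] == uri or t[2] == uri}


# Merging graphs is a set union: shared URIs link the two sources.
combined = wsdl_triples | java_triples
about_wsdl = describe(EX + "OrderService.wsdl", combined)
```

The combined graph yields one more statement about the WSDL than the WSDL extraction alone, namely which Java class implements it.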

We have made some progress with knowledge extraction but this is still definitely a work in progress. Sites such as ConverterToRdf, RDFizers and the Virtuoso Sponger provide tools and information on generating RDF from different artifact types. Part of the current experimentation is around finding tools that can be transparently layered over the top of the current software development process. Finding the best way to extract the full set of desired RDF triples from Microsoft Word documents is also proving problematic since some natural language processing is required.

Tools currently being evaluated include:

The Benefits of Linked Data

The prototype showed the benefits of Linked Data for navigating from the requirements for a SOA service through to details of its implementation. Looking at all the information that could be extracted leads on to a broader view of the benefits Linked Data would bring to the SOA software development process.

One specific use being planned is the creation of a Service Registry application providing the following functionality:

  • Linking the services to the implementations running in a given environment, e.g. dev, test and production. This includes linking the specific versions of the requirement, design or implementation artifacts and detailing the runtime dependencies of each service implementation.
  • Listing the consumers of each service and providing summary statistics on the performance, e.g. daily usage figures derived from audit logs.
  • Providing a list of who to contact when a service is not available. This includes notifying consumers of a service outage and also contacting providers if a service is being affected by an external component being offline, e.g. a database or an external web service.
  • Searching the services by different criteria, e.g. business entity.
  • Tracking the evolution of services and being able to assist with refactoring, e.g. answering questions such as “Are there older versions of the XML Schemas that can be deprecated?”
  • Simplifying the running of a specific soapUI test case for a service operation in a given environment.
  • Providing the equivalent of a class lookup that includes all project classes plus all required infrastructure classes and returns information such as the jar file the class is contained in and related JIRA and Subversion information.

Developing a Semantic Web Strategy

Tuesday, August 10th, 2010

In the last chapter of his book “Pull: The Power of the Semantic Web to Transform Your Business” David Siegel outlines some steps for developing a successful Semantic Web strategy for your business or organization.

One approach that worked for me recently was to organize a meeting titled “Developing a Semantic Web Strategy” and invite along developers, architects, analysts and managers. This was in the context of a government organization and the managers were from the applications development area.

Sharing out books like Semantic Web for the Working Ontologist, Semantic Web For Dummies, Programming the Semantic Web and Semantic Web Programming prior to the meeting helped people get familiar with concepts like URIs as names for things, RDF, RDFS, OWL, SPARQL and RDFa.

To highlight how rapidly the Web of Data is evolving and the amount of information now being published as Linked Open Data, I stepped through Mark Greaves's excellent presentation The Maturing Semantic Web: Lessons in Web-Scale Knowledge Representation.

During the meeting I took a business strategy first, technology second approach, taking the time to explore how an approach that has worked for someone else might fit with our organization.

Areas explored included:

Enterprise Modeling

I spent some time comparing RDF/OWL modeling with UML modeling, highlighting how URIs enable modeling across distributed information sources without the need to consolidate everything in a central repository, as UML tools require.

Also touched on OWL features such as:

Because it is a government department I highlighted the Federal Enterprise Architecture Reference Model Ontology (FEA-RMO) and how such an ontology could be used to map a parliamentary initiative to the software providing its implementation.

Open Government

Given the current trend for governments to make datasets freely available I presented the Linked Data approaches taken by http://data.gov and http://data.gov.uk as examples to follow in this area.

The business case for Linked Data in this scenario is that Linked Data is seen as the best available approach for publishing data in hugely diverse and distributed environments, in a gradual and sustainable way (see Why Linked Data for data.gov.uk? for details).

RDFa Based Integration

One example that struck a chord was RDFa and Linked Data in UK Government Websites, where job vacancy details from different sites can easily be combined since each web site publishes its web pages using HTML with RDFa added to annotate the job vacancy. Using RDFa allows the same page to be read as either HTML or RDF. The end result is that integration can be achieved with minimal changes to the original sites.

Search Engine Optimisation (SEO)

For anyone advertising products and services online, the example to follow is BestBuy.com, which describes its stores and products using the GoodRelations ontology and embeds these descriptions in its web pages using RDFa, increasing search engine traffic by 30%.

Enterprise Web of Data

Within our software development process, from project inception to production release and subsequent maintenance release, information is being copied and duplicated in a number of different places. Silos abound, in the form of Word documents, spreadsheets and the sticky notes that are part of the “Agile” process. There is some good information on our wiki pages but it is unstructured and not machine readable.

The information that forms our internal processes fails David Siegel’s Semantic Web Acid Test:

  • It’s not semantic and
  • It’s not on the web.

Introducing a semantic wiki such as Semantic MediaWiki to hold project information and link it to other data sources was raised as a candidate for a Semantic Web proof of concept.

Outcomes

Just scheduling the meeting was in itself a successful outcome since it started discussion around the role Semantic Web technologies could play in our organization. For a number of people, including the Applications Development manager, this is new technology and they need time to absorb it, but the end result was agreement that it was technology that couldn’t be ignored.

In order to gain some practical experience two internal prototypes were agreed to, both with practical value for the organization.

The first is a small application that will show the full set of runtime dependencies for a given software component as well as the other components affected when the specified component is changed. The application will be based on a simple ontology that defines dependencies between components using the owl:TransitiveProperty and uses a reasoner (e.g. Pellet) to infer the full set of dependencies for a component.
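The two questions this prototype is meant to answer can be sketched without a reasoner. The Python below is only an illustration under assumed data: the component names and the DEPENDS_ON mapping are hypothetical, and a plain graph traversal stands in for the inference that Pellet would perform over an owl:TransitiveProperty.

```python
# Hypothetical direct dependencies: component -> components it depends on.
DEPENDS_ON = {
    "web-ui": ["order-service"],
    "order-service": ["customer-service", "orders-db"],
    "customer-service": ["customers-db"],
}


def closure(start, edges):
    """Follow `edges` transitively from `start` and return everything reached."""
    reached, stack = set(), list(edges.get(start, []))
    while stack:
        node = stack.pop()
        if node not in reached:
            reached.add(node)
            stack.extend(edges.get(node, []))
    return reached


def invert(edges):
    """Build the inverse relation: component -> components that depend on it."""
    inverse = {}
    for source, targets in edges.items():
        for target in targets:
            inverse.setdefault(target, []).append(source)
    return inverse


def dependencies(component):
    """Full set of runtime dependencies of `component`."""
    return closure(component, DEPENDS_ON)


def affected_by_change(component):
    """Every component that depends, directly or indirectly, on `component`."""
    return closure(component, invert(DEPENDS_ON))
```

Here dependencies("order-service") lists everything the service needs at runtime, while affected_by_change("customers-db") lists everything that may break if the database changes.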

The second prototype will trial Semantic MediaWiki for project management (potentially using the Teamwork Ontology). The longer term view is to customize Semantic MediaWiki to include artifacts created as part of the software development process, addressing some of the silo problems found in our current internal enterprise web of data.

Once practical knowledge has been gained from the internal prototypes, a meeting will be scheduled with the Enterprise Architecture team to canvass the establishment of a wider vision for the use of Linked Data and Semantic Web technologies, potentially leading to their use on the public web sites and to actively publishing to the Web of Data.

A GoodRelations Semantic Web Description of a Business

Saturday, April 11th, 2009

Tried out the newly released GoodRelations Annotator to create a Semantic Web description of a business.

The GoodRelations Annotator is an online form-based tool that creates an RDF/XML file “semanticweb.rdf” containing a description of the key aspects of the business. The description is based on concepts defined in the GoodRelations OWL ontology. In particular the description contains a BusinessEntity representing the business and one or more Offerings. Each Offering describes the intent to provide a Business Function for a certain Product or Service to a specified target audience.

The generated RDF/XML file can either be published directly on the company’s Web site or used as a skeleton for developing a more fine-grained description.

The link Publishing GoodRelations Data on the Web provides guidelines on publishing to the web.

In my case I created a description for my embryonic business 3kbo.

I’m interested in linking the generated semanticweb.rdf to other things, in particular linking the BusinessEntity with people and with other BusinessEntitys.

Initially I added the URI of my foaf file to the BusinessEntity instance using rdfs:seeAlso, but after reading the definition of BusinessEntity, i.e. that it represents the legal agent making a particular offering and can be a legal body or a person, I changed it to owl:sameAs.

E.g.

<gr:BusinessEntity rdf:ID="BusinessEntity">
<owl:sameAs rdf:resource="http://www.3kbo.com/people/richard.hancock/foaf.rdf#i"/>
</gr:BusinessEntity>

This makes sense for my simple case, since as a sole trader I am the BusinessEntity. When viewed in Firefox using the Tabulator Extension owl:sameAs also provides an inferred link from my foaf file to my semanticweb.rdf as shown below.

[Image: foaf-infers-goodrelations]
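The inferred link works because owl:sameAs is symmetric: a statement made in one direction also holds in the other. A minimal Python sketch of that idea follows, using only the standard library. The wrapping rdf:RDF element, the namespace declarations and the BASE document URI are assumptions added so the fragment parses on its own; a real RDF toolkit or reasoner would handle this far more generally.

```python
import xml.etree.ElementTree as ET

RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
OWL = "http://www.w3.org/2002/07/owl#"
GR = "http://purl.org/goodrelations/v1#"

# Assumed location of the generated file; substitute the real document URI.
BASE = "http://example.org/semanticweb.rdf"

DOC = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:owl="{OWL}" xmlns:gr="{GR}">
  <gr:BusinessEntity rdf:ID="BusinessEntity">
    <owl:sameAs rdf:resource="http://www.3kbo.com/people/richard.hancock/foaf.rdf#i"/>
  </gr:BusinessEntity>
</rdf:RDF>"""


def same_as_links(rdf_xml, base):
    """Return (subject, object) pairs for owl:sameAs, in both directions."""
    links = set()
    root = ET.fromstring(rdf_xml)
    for entity in root.findall(f"{{{GR}}}BusinessEntity"):
        # rdf:ID is relative to the document, so prefix it with the base URI.
        subject = base + "#" + entity.get(f"{{{RDF}}}ID")
        for same in entity.findall(f"{{{OWL}}}sameAs"):
            target = same.get(f"{{{RDF}}}resource")
            links.add((subject, target))
            links.add((target, subject))  # owl:sameAs is symmetric
    return links
```

The second pair added for each statement is the link from the foaf resource back to the BusinessEntity.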

A part of the business description I don’t understand yet is how best to use the eClassOWL ontology to describe the Product or Service.

For example using the GoodRelations Annotator I selected “19 information, communication and media technology” as the Category and “1904 Software” as the Group.

[Image: eClassProductCategory]

This leads to http://www.ebusiness-unibw.org/ontologies/eclass/5.1.4/#C_AKJ317003-tax being used in the definition of the product or service, i.e.

<gr:typeOfGood>
<gr:ProductOrServicesSomeInstancesPlaceholder rdf:ID="ProductOrServicesSomeInstancesPlaceholder_1">
<rdf:type rdf:resource="&eco;#C_AKJ317003-tax"/>
</gr:ProductOrServicesSomeInstancesPlaceholder>
</gr:typeOfGood>

Because of the size of the eClassOWL ontology it takes a while to dereference this link. It would be good to be able to provide a more user-friendly reference at this point, one that gives a description of the product or service.

Beyond this simple example I am interested in semantic web descriptions of other more complex relationships between a BusinessEntity (when not a person) and the people involved with the business (e.g. directors, CEO etc …) and between other BusinessEntitys.

Potentially GoodRelations and eClassOWL could be used as part of an Enterprise Architecture describing the who, what, how, when, where and why of a business.

DBpedia Examples using Linked Data and Sparql

Monday, August 11th, 2008

Using Wikipedia, the largest online encyclopedia, users can browse and perform full-text searches, but programmatic access to the knowledge-base is limited.

The DBpedia project extracts structured information from Wikipedia, opening it up to programmatic access using Semantic Web technologies such as Linked Data and SPARQL. This means that the linking and reasoning abilities of RDF and OWL can be utilized and queries for specific information can be made using SPARQL.

Simplistically, the mapping from the Wikipedia HTML-based web pages to the DBpedia RDF-based resources can be thought of as replacing “http://en.wikipedia.org/wiki/” with “http://dbpedia.org/resource/”, but in reality there are some additional subtleties, which are described in the article From Wikipedia URI-s to DBpedia URI.
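The simplistic version of the mapping amounts to a prefix swap, as in the Python sketch below. The function name is made up for the example, and the sketch deliberately ignores the subtleties mentioned above (capitalisation, redirects and character encoding), so it is only a first approximation.

```python
WIKIPEDIA_PREFIX = "http://en.wikipedia.org/wiki/"
DBPEDIA_PREFIX = "http://dbpedia.org/resource/"


def wikipedia_to_dbpedia(url):
    """Naive mapping: swap the Wikipedia prefix for the DBpedia one."""
    if not url.startswith(WIKIPEDIA_PREFIX):
        raise ValueError(f"not an English Wikipedia article URL: {url}")
    # Keep the article name exactly as given; no normalisation is attempted.
    return DBPEDIA_PREFIX + url[len(WIKIPEDIA_PREFIX):]
```

For example, wikipedia_to_dbpedia("http://en.wikipedia.org/wiki/Civil_engineering") gives "http://dbpedia.org/resource/Civil_engineering".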

The Wikipedia entry for “Civil Engineering” (http://en.wikipedia.org/wiki/Civil_Engineering) is used as an example to show how specific data can be retrieved from its DBpedia equivalent (http://dbpedia.org/resource/Civil_engineering).

When both the Wikipedia entry and its DBpedia equivalent are opened in a standard web browser they display similar information; however, the browser's request for the DBpedia resource is redirected to http://dbpedia.org/page/Civil_engineering.

This redirect can be viewed in Firefox using the Tamper Data Firefox Extension as shown in the image below.

[Image: Loading the DBpedia Resource]

The initial status of 303 is the HTTP response code “303 See Other”. The server replied with 303 in order to direct the browser to the URI http://dbpedia.org/page/Civil_engineering, which is an HTML page the browser can display. The original URI http://dbpedia.org/resource/Civil_engineering identifies an RDF resource that would not display as well in an HTML browser.

DBpedia implements an HTTP mechanism called content negotiation in order to provide clients such as web browsers with the information they request in a form they can display. The tutorial How to publish Linked Data on the Web describes this and other Linked Data techniques that are used by applications such as DBpedia.

In order to access the RDF resource directly a web client needs to tell the server to send it RDF data. A client can do this by sending the HTTP Request Header Accept: application/rdf+xml as part of its initial request. (The HTML browser had sent an Accept: text/html HTTP header indicating that it was requesting an HTML page.)
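The same request can be made from a script. The Python sketch below uses only the standard library; the function names are invented for the example, and what the server returns today (including any redirects it issues) may differ from the behaviour described in this post.

```python
import urllib.request

RESOURCE = "http://dbpedia.org/resource/Civil_engineering"


def build_rdf_request(url):
    """Build a GET request that asks the server for RDF/XML."""
    return urllib.request.Request(url, headers={"Accept": "application/rdf+xml"})


def fetch_rdf(url, timeout=30):
    """Send the request and return (status, final URL, raw body bytes).

    urllib follows any redirect the server issues, so the final URL
    may differ from the one requested.
    """
    with urllib.request.urlopen(build_rdf_request(url), timeout=timeout) as response:
        return response.status, response.geturl(), response.read()
```

Calling fetch_rdf(RESOURCE) needs network access; build_rdf_request can be inspected offline to confirm that the Accept header is set.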

The Firefox Addon RESTTest can be used to set Accept: application/rdf+xml in the HTTP Request Header and directly request http://dbpedia.org/resource/Civil_engineering as shown in the image below.

In this case the request to http://dbpedia.org/resource/Civil_engineering succeeded, as shown by the “Response Status 200”, and an RDF document was received, as shown in the “Response Text”.

In both the RDF fragment shown in the image above and in the HTML page http://dbpedia.org/page/Civil_engineering the multiple language support is visible. The SPARQL queries below show how to extract specific information for a particular language.

SPARQL

DBpedia provides a public SPARQL endpoint at http://dbpedia.org/sparql which enables users to query the RDF datasource with SPARQL queries such as the following.

SELECT ?abstract
WHERE {
{ <http://dbpedia.org/resource/Civil_engineering> <http://dbpedia.org/ontology/abstract> ?abstract }
}

The query returns all the abstracts for Civil Engineering, in each of the available languages.

The next query refines the abstracts returned to just the language specified, in this case ‘en’ (English).

SELECT ?abstract
WHERE {
{ <http://dbpedia.org/resource/Civil_engineering> <http://dbpedia.org/ontology/abstract> ?abstract .
FILTER langMatches( lang(?abstract), 'en') }
}
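A query like this can also be sent to the endpoint from a program. The Python sketch below follows the usual SPARQL protocol convention of passing the query text in a `query` parameter of a GET request and asking for JSON results; the function names are made up for the example, and it assumes the endpoint honours the application/sparql-results+json media type.

```python
import json
import urllib.parse
import urllib.request

ENDPOINT = "http://dbpedia.org/sparql"

QUERY = """
SELECT ?abstract
WHERE {
  <http://dbpedia.org/resource/Civil_engineering>
      <http://dbpedia.org/ontology/abstract> ?abstract .
  FILTER langMatches(lang(?abstract), 'en')
}
"""


def build_query_url(endpoint, query):
    """Encode the query as the `query` parameter of a GET request."""
    return endpoint + "?" + urllib.parse.urlencode({"query": query})


def run_query(endpoint, query, timeout=30):
    """Run the query and return the parsed SPARQL JSON results."""
    request = urllib.request.Request(
        build_query_url(endpoint, query),
        headers={"Accept": "application/sparql-results+json"},
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)


def abstracts(results):
    """Pull the ?abstract values out of a SPARQL JSON result document."""
    return [row["abstract"]["value"] for row in results["results"]["bindings"]]
```

With network access, abstracts(run_query(ENDPOINT, QUERY)) returns the matching abstracts as a list of strings.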

The SNORQL query explorer, shown in the image below, provides a simpler interface to the DBpedia SPARQL endpoint. The image shows both the query and the result returned.

Other SPARQL endpoints such as http://demo.openlinksw.com/sparql/ (shown below) can query DBpedia by specifying the FROM NAMED clause to describe the RDF dataset. E.g.

SELECT ?abstract
FROM NAMED <http://dbpedia.org>
WHERE {
{ <http://dbpedia.org/resource/Civil_engineering> <http://dbpedia.org/ontology/abstract> ?abstract.
FILTER langMatches( lang(?abstract), 'en') }
}

Other Related DBpedia Articles

RDF as self-describing Data uses DBpedia and its SPARQL support to show how RDF is essentially ‘self-describing’: there is no need to know about traditional metadata (schemas) before exploring a data set.

Linking to DBpedia with TopBraid outlines the benefit of DBpedia in terms of providing relatively stable URIs for all relevant real-world concepts, thus making it a natural place to connect specific domain models with each other using the OWL built-in property owl:sameAs (this property indicates that two URI references actually refer to the same thing). TopBraid Composer provides support for linking domain models with DBpedia.

Querying DBpedia provides examples of using SPARQL to query DBpedia.

Adding Semantic Markup to Your Rails Application with DBpedia and ActiveRDF and Get Semantic with DBPedia and ActiveRDF describe using ActiveRDF to integrate DBpedia resources into web-based applications. ActiveRDF is a library for accessing RDF data from Ruby and Ruby on Rails programs and can perform SPARQL queries.

ISO-15926

Sunday, June 29th, 2008

The ISO-15926 standard is titled “Industrial automation systems and integration—Integration of life-cycle data for process plants including oil and gas production facilities”. One of its main requirements was that the scope of the data model cover the entire lifecycle of a facility (e.g. an oil refinery) and its components (e.g. pipes, pumps and their parts).

The data model that has evolved is an RDF/OWL ontology. Its development and evolution have set some important precedents that other engineering and construction projects, such as the development of the Common Inspection and Test Plans, can learn from. These include:

  • The use of OWL to model concepts and the potential reuse of concepts already identified by ISO-15926 and modeled in OWL.
  • The construction of OWL ontologies through community participation.
  • Public sharing of web based ontologies in order to speed up the adoption of standardized concepts.
  • The development of a Semantic Web Ontology browser.
  • Provisioning for individual companies to provide their own customizations.

Wikipedia provides overviews of both ISO-15926 and ISO-15926 WIP (Work In Progress).

15926.ORG is a wiki-based site providing a Knowledge Base dedicated to the practical implementation of, and information about, ISO 15926. It includes an ISO 15926 General Introduction.