Archive for the ‘Groovy’ Category

Linked Data and the SOA Software Development Process

Thursday, November 17th, 2011

We have quite a rigorous SOA software development process; however, the full value of the collected information is not being realized because the artifacts are stored in disconnected information silos. So far, attempts to introduce tools which could improve the situation (e.g. zAgile Teamwork and Semantic MediaWiki) have been unsuccessful, possibly because the value of a Linked Data approach is not yet fully appreciated.

To provide an example Linked Data view of the SOA services and their associated artifacts I created a prototype consisting of Sesame running on a Tomcat server, with Pubby providing the Linked Data view via the Sesame SPARQL endpoint. TopBraid was connected directly to the Sesame native store (configured via the Sesame Workbench) to create a subset of services sufficient to demonstrate the value of publishing information as Linked Data. In particular the prototype showed how easy it became to navigate from the requirements for a SOA service through to details of its implementation.

The prototype also highlighted that auto generation of the RDF graph (the data providing the Linked Data view) from the actual source artifacts would be preferable to manual entry, especially if this could be transparently integrated with the current software development process. This has become the focus of the next step: automated knowledge extraction from the source artifacts.

Artifacts

Key artifact types of our process include:

A Graph of Concepts and Instances

There is a rich graph of relationships linking the things described in the artifacts listed above. For example the business entities defined in the UML analysis model are the subject of the service and service operations defined in the Service Contracts. The service and service operations are mapped to the WSDLs, which utilize the XML Schemas that provide an XML view of business entities. The JAX-WS implementations are linked to the WSDLs and XML Schemas and deployed to the Oracle WebLogic Application Server, where the configuration files list the external dependencies. The log files and defects link back to specific parts of the code base (Subversion revisions) within the context of specific service operations. The people associated with the different artifacts can often be determined from artifact meta-data.

RDF, OWL and Linked Data are a natural fit for modelling and viewing this graph since there is a mix of concepts plus a lot of instances, many of which already have an HTTP representation. The graph also contains a number of transitive relationships (for example a WSDL may import an XML Schema which in turn imports another XML Schema, and so on), promoting the use of owl:TransitiveProperty to help obtain a full picture of all the dependencies a component may have.
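To make the transitive relationship concrete, here is a minimal Groovy sketch of the closure a reasoner would infer for a transitive "imports" property. The file names and the transitiveImports helper are made up for illustration; a real setup would leave this inference to the triple store rather than to hand-written code.

```groovy
// Hypothetical direct "imports" relationships between artifacts.
def imports = [
    'OrderService.wsdl': ['Order.xsd'],
    'Order.xsd'        : ['Customer.xsd', 'Common.xsd'],
    'Customer.xsd'     : ['Common.xsd'],
    'Common.xsd'       : []
]

// Follows the imports edges breadth-first and returns everything reachable
// from start, which is what a transitive property lets a reasoner infer.
Set<String> transitiveImports(Map<String, List<String>> graph, String start) {
    def seen = new LinkedHashSet<String>()
    def pending = new ArrayDeque<String>(graph[start] ?: [])
    while (!pending.isEmpty()) {
        def next = pending.poll()
        // add() returns false for artifacts already visited, which also guards against cycles.
        if (seen.add(next)) {
            pending.addAll(graph[next] ?: [])
        }
    }
    return seen
}

println transitiveImports(imports, 'OrderService.wsdl')
```

The WSDL only imports Order.xsd directly, but the closure also reports the two schemas it depends on indirectly.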

Knowledge Extraction

Another advantage of the RDF, OWL, Linked Data approach is the utilization of unique URIs for identifying concepts and instances. This allows information contained in one artifact, e.g. a WSDL, to be extracted as RDF triples which can later be combined with the RDF triples extracted from the JAX-WS annotations of the Java source code. The combined RDF triples tell us more about the WSDL and its Java implementation than could be derived from just one of the artifacts.
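As a rough illustration of why shared URIs matter, the sketch below merges triples from two extraction runs using plain Groovy collections. All URIs and the ex: property names are invented for the example, and a real pipeline would use an RDF library and a triple store instead of lists.

```groovy
// Triples are modelled as [subject, predicate, object] lists.
// Hypothetical triples extracted from a WSDL.
def wsdlTriples = [
    ['http://example.org/service/OrderService', 'rdf:type', 'ex:Service'],
    ['http://example.org/service/OrderService', 'ex:definedIn', 'http://example.org/wsdl/OrderService.wsdl']
]

// Hypothetical triples extracted from JAX-WS annotations in the Java source.
def javaTriples = [
    ['http://example.org/service/OrderService', 'rdf:type', 'ex:Service'],
    ['http://example.org/service/OrderService', 'ex:implementedBy', 'http://example.org/java/OrderServiceImpl']
]

// An RDF graph is a set of triples, so the duplicate rdf:type statement collapses.
def merged = (wsdlTriples + javaTriples) as LinkedHashSet

// Because both extractions use the same subject URI, the statements line up.
def bySubject = merged.groupBy { it[0] }
bySubject.each { subject, triples ->
    println subject
    triples.each { println "  ${it[1]} ${it[2]}" }
}
```

Neither extraction alone links the WSDL to the implementing class; the merged graph does, simply because both use the same URI for the service.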

We have made some progress with knowledge extraction but this is still definitely a work in progress. Sites such as ConverterToRdf, RDFizers and the Virtuoso Sponger provide tools and information on generating RDF from different artifact types. Part of the current experimentation is around finding tools that can be transparently layered over the top of the current software development process. Finding the best way to extract the full set of desired RDF triples from Microsoft Word documents is also proving problematic since some natural language processing is required.

Tools currently being evaluated include:

The Benefits of Linked Data

The prototype showed the benefits of Linked Data for navigating from the requirements for a SOA service through to details of its implementation. Looking at all the information that could be extracted leads on to a broader view of the benefits Linked Data would bring to the SOA software development process.

One specific use being planned is the creation of a Service Registry application providing the following functionality:

  • Linking the services to the implementations running in a given environment, e.g. dev, test and production. This includes linking the specific versions of the requirement, design or implementation artifacts and detailing the runtime dependencies of each service implementation.
  • Listing the consumers of each service and providing summary statistics on the performance, e.g. daily usage figures derived from audit logs.
  • Providing a list of who to contact when a service is not available. This includes notifying consumers of a service outage and also contacting providers if a service is being affected by an external component being offline, e.g. a database or an external web service.
  • Searching the services by different criteria, e.g. business entity.
  • Tracking the evolution of services and being able to assist with refactoring, e.g. answering questions such as “Are there older versions of the XML Schemas that can be deprecated?”
  • Simplifying the running of a specific soapUI test case for a service operation in a given environment.
  • Providing the equivalent of a class lookup that includes all project classes plus all required infrastructure classes and returns information such as the jar file the class is contained in and JIRA and Subversion information.

Using Groovy to Upload RDF files to the Talis Platform

Saturday, March 13th, 2010

The Talis Platform provides free stores for developers to host RDF data online. Each store has its own SPARQL endpoint for querying the RDF data.

Options for uploading individual RDF files into a store include:

A nice-to-have option would be the ability to upload all the RDF files found in a directory directly into a store using a simple command like TalisStore.load.

Groovy with its flexible scripting is a good candidate for this type of work. Code like the following makes it easy to traverse directories and list the RDF files

  • in the current directory:

    new File(".").eachFileMatch(~/.*\.rdf/) { println it }

  • or in a specific directory:

    new File("/data/rdf").eachFileMatch(~/.*\.rdf/) { println it }

Once Groovy is installed the above lines of code can be run directly in both the Groovy Shell (groovysh) and the Groovy Console (groovyConsole). For example, when run in the Groovy Shell (groovysh):

$ groovysh
Groovy Shell (1.6.4, JVM: 1.6.0_15)
Type 'help' or '\h' for help.
-------------------------------------------------------------------------------------
groovy:000> new File(".").eachFileMatch(~/.*\.rdf/) { println it }
./WO0002.rdf
./WO0003.rdf
./WO0004.rdf
./WO0005.rdf
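Printing the files is fine for a quick check, but an upload step needs them as a list. The following is a small sketch of one way to collect them; findRdfFiles is a hypothetical helper name and is not taken from the TalisStore code.

```groovy
// Hypothetical helper: returns the RDF files for a path that may be
// either a single file or a directory (sub-directories are not scanned).
List<File> findRdfFiles(File target) {
    if (target.isFile()) {
        return target.name.endsWith('.rdf') ? [target] : []
    }
    def found = []
    if (target.isDirectory()) {
        target.eachFileMatch(~/.*\.rdf/) { found << it }
    }
    // eachFileMatch makes no ordering promise, so sort by name for repeatable runs.
    return found.sort { it.name }
}

findRdfFiles(new File('.')).each { println it }
```

Returning an empty list for a missing path or a non-RDF file keeps the caller simple, since it can just iterate over whatever comes back.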

The Groovy RESTClient simplifies REST operations like POSTing (uploading) files to a web site. It is an extension of HTTPBuilder which in turn is a wrapper of Apache’s HttpClient. The main addition required for the RESTClient to upload RDF/XML files to a Talis store is an “application/rdf+xml” encoder. This is easy to create following the example provided in the article Groovy RESTClient and Putting Zip Files.

The result is the encodeRDF method shown below.

import groovyx.net.http.RESTClient
import org.apache.http.entity.FileEntity

// Constructor: TALIS_USERNAME and TALIS_PASSWORD are defined elsewhere in the class.
TalisStoreLoader() {
    talis = new RESTClient( "http://api.talis.com/" )
    talis.auth.basic TALIS_USERNAME, TALIS_PASSWORD
    // Register the RDF/XML encoder defined below.
    talis.encoder.'application/rdf+xml' = this.&encodeRDF
}

// Wraps a File in an HTTP entity carrying the RDF/XML content type.
def encodeRDF( Object data ) throws UnsupportedEncodingException {
    if ( data instanceof File ) {
        def entity = new FileEntity( (File) data, "application/rdf+xml" )
        entity.setContentType( "application/rdf+xml" )
        return entity
    } else {
        throw new IllegalArgumentException(
            "Don't know how to encode ${data.class.name} as application/rdf+xml" )
    }
}

The line talis.encoder.'application/rdf+xml' = this.&encodeRDF registers the encoder with an instance of the RESTClient.

With the RDF encoder in place a file can be uploaded to a store's metabox as follows.

def res = talis.post( path: metaboxPath, body: file, requestContentType: "application/rdf+xml" )
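For a whole directory the same call is simply repeated per file. The sketch below shows one possible shape for that loop, with timing output similar to the examples in Appendix A. It is not the actual TalisStoreLoader code: uploadAll is a hypothetical name, and the upload itself is passed in as a closure so the loop can be tried without a store or a network connection.

```groovy
// Hypothetical batch loader. The upload closure stands in for the real call, e.g.
//   { file -> talis.post( path: metaboxPath, body: file,
//                         requestContentType: "application/rdf+xml" ).status }
int uploadAll(List<File> files, Closure<Integer> upload) {
    int loaded = 0
    long batchStart = System.currentTimeMillis()
    files.each { File file ->
        long start = System.currentTimeMillis()
        int status = upload(file)
        long elapsed = System.currentTimeMillis() - start
        println "Loaded ${file.length()} bytes in ${elapsed} milliseconds. (Status: ${status})"
        // Count any 2xx status as a successful upload.
        if (status >= 200 && status < 300) {
            loaded++
        }
    }
    println "Loaded ${loaded} files in ${System.currentTimeMillis() - batchStart} milliseconds."
    return loaded
}

// Dry run with a stub uploader, so nothing is sent over the network.
def count = uploadAll([new File('WO0002.rdf'), new File('WO0003.rdf')]) { File f -> 204 }
```

Keeping the HTTP call behind a closure also makes it easy to swap in retries or a different client later without touching the loop.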

This functionality is encapsulated in the class com._3kbo.talis.TalisStoreLoader which is part of a Maven project available for download as a zip file. It includes the script TalisStore.groovy which is a simplified wrapper of com._3kbo.talis.TalisStoreLoader.

The jar file created by the project, talis-store-0.2.jar, can be downloaded separately.

The RESTClient is not bundled with the standard Groovy install. Trying to access it from the shell or console without explicitly installing it results in errors like the following:

groovy:000> import groovyx.net.http.RESTClient
ERROR org.codehaus.groovy.tools.shell.CommandException:
Invalid import definition: 'import groovyx.net.http.RESTClient';
reason: startup failed, script1266050039289.groovy:
1: unable to resolve class groovyx.net.http.RESTClient
 @ line 1, column 1. 1 error at java_lang_Runnable$run.call (Unknown Source)

Installing the RESTClient requires downloading HTTPBuilder and adding it and its dependencies (http-builder-xxx-all.zip) to the ${user.home}/.groovy/lib directory. Also add talis-store-0.2.jar to this directory. The ${user.home}/.groovy/lib directory may need to be created manually but the Groovy install should have created a file named “$GROOVY_HOME/conf/groovy-starter.conf” containing the line

load ${user.home}/.groovy/lib/*

which enables the loading of the additional jar files required by the RESTClient and com._3kbo.talis.TalisStoreLoader, i.e.:

  • http-builder-0.5.0-RC2.jar
  • httpclient-4.0.jar
  • httpcore-4.0.1.jar
  • json-lib-2.3-jdk15.jar
  • xml-resolver-1.2.jar
  • commons-collections-3.2.1.jar
  • commons-logging-1.1.1.jar
  • talis-store-0.2.jar

Using the Groovy Shell to Upload

With the RESTClient and the talis-store-0.2.jar installed the Groovy Shell (groovysh) makes it easy to run the TalisStore.groovy script and upload either individual RDF files or all the RDF files in a directory to a Talis store.

The four options for running the TalisStore.groovy script are:

  1. TalisStore.load "mystore","user","password","file_or_directory"
  2. TalisStore.load "mystore","user","password"
  3. TalisStore.load "file_or_directory"
  4. TalisStore.load()

The first and second options both explicitly set the store, user and password. The first option also nominates either a specific RDF file to upload or a directory to scan and upload all the RDF files found. The second option uploads all the RDF files found in the current directory, i.e. the directory in which the Groovy Shell (groovysh) was invoked.

The third and fourth options read the store, user and password from the configuration file TalisConfig.groovy, which must be updated for a specific store and made available on the classpath (see Appendix B).

With the configuration file TalisConfig.groovy in place, uploading a specific RDF file or a directory simplifies to TalisStore.load "file_or_directory"

Uploading the RDF files in the current directory is just TalisStore.load(), as shown in the example “Loading all RDF files from the current directory” in Appendix A.

Using the Script to Upload

Adding the line #!/usr/bin/env groovy to the TalisStore.groovy script and making the script executable allows it to be run independent of the Groovy Shell (groovysh), for example ./TalisStore.groovy /sioc/forum/WO0902.rdf explicitly loads the RDF, using the configuration file to set the store, user and password.

See the TalisStore.groovy javadoc for more details on running as an executable script.

Summary

There is a bit of configuration to set everything up but once in place the combination of Groovy, the RESTClient and the TalisStore loader code described here makes it easy to load RDF files to the Talis Platform.

My preference is to run the Groovy Shell (groovysh) and use simple commands like TalisStore.load().

Possible extensions for the future include commands like TalisStore.sparql.select etc…

Appendix A: Examples

Loading a specific file

$ groovysh
Groovy Shell (1.7.1, JVM: 1.6.0_15)
Type 'help' or '\h' for help.
-------------------------------------------------------------------------------
groovy:000> TalisStore.load "mystore","user","password","/sioc/WO0401.rdf"
Using store: mystore user password
Loading a file or directory: /sioc/WO0401.rdf
Loading /sioc/WO0401.rdf
Loaded 1565688 bytes in 58518 milliseconds. (Status: 204)

Loading all RDF files from the current directory

$ cd /scoop/forum/
$ ls -l
-rw-r--r--  1  3847192  2 Jan 12:11 WO0903.rdf
-rw-r--r--  1  2485605  2 Jan 12:11 WO0904.rdf
-rw-r--r--  1  2321233  2 Jan 12:12 WO0905.rdf
-rw-r--r--  1  2551787  2 Jan 12:12 WO0906.rdf
$ groovysh
Groovy Shell (1.7.1, JVM: 1.6.0_17)
Type 'help' or '\h' for help.
--------------------------------------------
groovy:000> TalisStore.load()
Classpath:
...
Loading RDF files from directory /scoop/forum/.
2010-03-14 11:32:31.477: Loading /scoop/forum/./WO0903.rdf
2010-03-14 11:33:49.289: Loaded 3847192 bytes in 77808 milliseconds. (Status: 204)
2010-03-14 11:33:49.304: Loading /scoop/forum/./WO0904.rdf
2010-03-14 11:34:38.288: Loaded 2485605 bytes in 48984 milliseconds. (Status: 204)
2010-03-14 11:34:38.289: Loading /scoop/forum/./WO0905.rdf
2010-03-14 11:35:25.429: Loaded 2321233 bytes in 47140 milliseconds. (Status: 204)
2010-03-14 11:35:25.43: Loading /scoop/forum/./WO0906.rdf
2010-03-14 11:36:15.952: Loaded 2551787 bytes in 50523 milliseconds. (Status: 204)
Loaded 4 files in 224488 milliseconds.
===> 4
groovy:000>

Appendix B: Adding the Groovy Configuration File to the Classpath

The structure of the config file is:

// TalisConfig.groovy
talis {
    user = "myusername"
    password = "mypassword"
    store = "mystore"
}

Once the values have been updated for a specific store the steps for adding to the classpath and also verifying that it is being read correctly are as follows:

  • Create a directory to hold property files (e.g. ${user.home}/.groovy/conf/).
  • Add a matching line to “$GROOVY_HOME/conf/groovy-starter.conf” to add the directory to the classpath, e.g. load ${user.home}/.groovy/conf/./
  • Place the Groovy configuration file TalisConfig.groovy in the directory (i.e. ${user.home}/.groovy/conf/).

ConfigSlurper is used to read the configuration file. The shell input below shows how to:

  • Check what is on the classpath using loader.URLs.each{ println it }
  • Get the config file using url = loader.getResource("TalisConfig.groovy")
  • Read the config file using def config = new ConfigSlurper().parse(url)

groovy:000> import groovyx.net.http.RESTClient
===> [import groovyx.net.http.RESTClient]
groovy:000> talis = new RESTClient( "http://api.talis.com/" )
===> groovyx.net.http.RESTClient@1798928
groovy:000> loader = talis.class.classLoader.rootLoader
===> org.codehaus.groovy.tools.RootLoader@4d20a47e
groovy:000> loader.URLs.each{ println it }
file:/Users/richardhancock/./
file:/Users/richardhancock/groovy-1.6.4/lib/ant-1.7.1.jar
file:/Users/richardhancock/groovy-1.6.4/lib/ant-junit-1.7.1.jar
file:/Users/richardhancock/groovy-1.6.4/lib/ant-launcher-1.7.1.jar
file:/Users/richardhancock/groovy-1.6.4/lib/antlr-2.7.7.jar
file:/Users/richardhancock/groovy-1.6.4/lib/asm-2.2.3.jar
file:/Users/richardhancock/groovy-1.6.4/lib/asm-analysis-2.2.3.jar
file:/Users/richardhancock/groovy-1.6.4/lib/asm-tree-2.2.3.jar
file:/Users/richardhancock/groovy-1.6.4/lib/asm-util-2.2.3.jar
file:/Users/richardhancock/groovy-1.6.4/lib/bsf-2.4.0.jar
file:/Users/richardhancock/groovy-1.6.4/lib/commons-cli-1.2.jar
file:/Users/richardhancock/groovy-1.6.4/lib/commons-logging-1.1.jar
file:/Users/richardhancock/groovy-1.6.4/lib/groovy-1.6.4.jar
file:/Users/richardhancock/groovy-1.6.4/lib/ivy-2.1.0-rc2.jar
file:/Users/richardhancock/groovy-1.6.4/lib/jline-0.9.94.jar
file:/Users/richardhancock/groovy-1.6.4/lib/jsp-api-2.0.jar
file:/Users/richardhancock/groovy-1.6.4/lib/junit-3.8.2.jar
file:/Users/richardhancock/groovy-1.6.4/lib/servlet-api-2.4.jar
file:/Users/richardhancock/groovy-1.6.4/lib/xstream-1.3.1.jar
file:/Users/richardhancock/.groovy/lib/http-builder-0.5.0-RC2.jar
file:/Users/richardhancock/.groovy/lib/httpclient-4.0.jar
file:/Users/richardhancock/.groovy/lib/httpcore-4.0.1.jar
file:/Users/richardhancock/.groovy/lib/json-lib-2.3-jdk15.jar
file:/Users/richardhancock/.groovy/lib/xml-resolver-1.2.jar
file:/Users/richardhancock/.groovy/conf/./
===> [Ljava.net.URL;@3ebc312f
groovy:000> url = loader.getResource("TalisConfig.groovy")
===> file:/Users/richardhancock/.groovy/conf/TalisConfig.groovy
groovy:000> def config = new ConfigSlurper().parse(url)
===> {talis={username=myusername, password=mypassword, store=mystore}}
groovy:000>
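The ConfigSlurper step can also be tried on its own, before touching the classpath. This is a minimal sketch that parses the same structure from a string instead of a URL; the values are the placeholder ones from the sample file above.

```groovy
// Same structure as TalisConfig.groovy, held in a string for the demo.
def source = '''
talis {
    user = "myusername"
    password = "mypassword"
    store = "mystore"
}
'''

// ConfigSlurper.parse accepts a script string as well as a URL.
def config = new ConfigSlurper().parse(source)

// Nested blocks become nested properties on the resulting ConfigObject.
println "store=${config.talis.store} user=${config.talis.user}"
```

If the values print as expected here but not from the shell, the problem is more likely the classpath entry than the file contents.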

Appendix C: Using Maven to run the Groovy Script

The TalisStore script can also be run via Maven. This approach uses the jar file dependencies defined in the Maven project and does not require the standard Groovy install. If a valid “TalisConfig.groovy” configuration file is available on the classpath, the parameters for “store”, “username” and “password” are not required. By default the pom.xml file excludes the dummy configuration file, but once it has been updated with real values it can be included by changing the exclude(s) to include(s). The TalisStore script can be run by executing command lines such as the following, which invoke the TalisStore main method (optionally with parameters).

mvn exec:java -Dexec.mainClass=TalisStore

mvn exec:java -Dexec.mainClass=TalisStore -Dexec.args="/sioc/forum/2007"

Appendix D: Authentication

The method “talis.auth.basic TALIS_USERNAME, TALIS_PASSWORD” is a bit of an anomaly since the Talis Platform uses HTTP Digest Authentication. RESTClient uses the groovyx.net.http.AuthConfig “basic” method, which works for “digest” authentication as well.