Process Discovery - Matchmaking of Semantic Behavior Descriptions

Implementation Details

We used our semi-automatic Web process acquisition method to collect descriptions of actual Web processes. At its current stage, the acquisition method cannot acquire processes with interesting behavior patterns fully automatically. To test the presented verification method, we therefore additionally synthesized process descriptions. Using given domain ontologies (which would otherwise have been created by the acquisition method), the generated semantic process descriptions comprise non-functional properties and a behavioral description with input activities, output activities, and local processes. We implemented one Java API to model process descriptions and another to model requests. The verification module receives process description and request objects together with the domain ontologies they use, which are described in OWL 2. HermiT (HermiT OWL Reasoner, http://hermit-reasoner.com) provides reasoning support during the verification of Web process descriptions.

The generated process descriptions vary in complexity, with up to 4 non-functional properties (NFPs) and up to 10 process steps, each with up to 3 arguments. We hand-crafted a classification with 20 classes and classified each process into up to 4 of them. We performed tests with 3 queries:

(Q1) a proposition about process arguments and their properties,

(Q2) a conjunction of eventually consumed inputs and eventually returned outputs, and

(Q3) the ability to log out is always available after a user has provided login credentials. We omitted the NFRs due to space constraints.

For each request, the verifier checks for each available process description whether the request is a model of the description. This is done for each property that is constrained in the request. If the request uses classes of the classification hierarchy, the matching set of processes is simply retrieved from the hierarchy, which is cached in the verification module; this speeds up verification. In an experiment, we measured the time needed to retrieve all processes for whose descriptions the request is a model. The experiment was repeated for various numbers of available Web process descriptions.
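The two retrieval paths described above can be sketched as follows. This is a minimal illustration, not the project's API: the class `ClassifiedVerifier`, the type `ProcessDescription`, and all method names are hypothetical, and a plain Java predicate stands in for the reasoner-backed check. The counter only makes visible how many expensive checks the cached class lookup avoids.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.function.Predicate;

/** Hypothetical sketch: cached class lookup versus a check per description. */
final class ClassifiedVerifier {

    /** Minimal stand-in for a process description (assumption, not the real model). */
    static final class ProcessDescription {
        final String id;
        final List<String> steps;

        ProcessDescription(String id, List<String> steps) {
            this.id = id;
            this.steps = steps;
        }
    }

    private final List<ProcessDescription> processes = new ArrayList<>();
    // Classification computed offline: class name -> ids of processes in that class.
    private final Map<String, Set<String>> classCache = new HashMap<>();
    private int modelChecks = 0;

    /** Stores a description and records the classes it was classified into. */
    void register(ProcessDescription p, String... classNames) {
        processes.add(p);
        for (String c : classNames) {
            classCache.computeIfAbsent(c, k -> new TreeSet<>()).add(p.id);
        }
    }

    /** Cheap path: the request names a class, so the cached set is returned. */
    Set<String> retrieveByClass(String className) {
        return classCache.getOrDefault(className, Collections.emptySet());
    }

    /** Expensive path: the constraint is checked against every description. */
    Set<String> retrieveByCheck(Predicate<ProcessDescription> constraint) {
        Set<String> matches = new TreeSet<>();
        for (ProcessDescription p : processes) {
            modelChecks++; // stands in for one reasoner-backed verification
            if (constraint.test(p)) {
                matches.add(p.id);
            }
        }
        return matches;
    }

    /** Number of per-description checks performed so far. */
    int modelChecks() {
        return modelChecks;
    }
}
```

In this sketch, a request that only names a class never touches the per-description loop, which is the effect the cached hierarchy is meant to have.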

The results reveal the crucial need to reduce the search space, as we did with the classification. Unsurprisingly, the use of an expressive formalism leads to high computational effort, which grows quickly with the number of formal process descriptions. We also observed that the differing complexity of the queries strongly influences the query time. This observation justifies the use of a classification, because using classes reduces query complexity by replacing model checking tasks with the retrieval of processes classified offline. Comparing the results of individual queries with and without classes reveals the substantial impact of the hierarchy on verification performance. The experiments serve as a proof of concept, since performance and scalability are not the focus of this work. Nevertheless, to achieve more usable results, we are currently experimenting with parallelizing the verification of queries against independent process descriptions and with materializing the classification in a database (instead of a 20 MB OWL ontology for 40k process descriptions).
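Because each process description is verified independently of the others, the parallelization mentioned above can be illustrated with a parallel stream. This is only a sketch under assumptions: `ParallelVerification` and `verifyAll` are hypothetical names, the predicate stands in for the actual verification step, and it must be safe to call from several threads at once (for example by using one reasoner instance per thread, which is not shown here).

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.stream.Collectors;

/** Hypothetical sketch: verifying independent descriptions in parallel. */
final class ParallelVerification {

    /**
     * Returns the descriptions accepted by the given check, in their original order.
     * The check is evaluated concurrently, so it must be thread-safe.
     */
    static <P> List<P> verifyAll(List<P> descriptions, Predicate<P> satisfiesRequest) {
        return descriptions.parallelStream()
                .filter(satisfiesRequest)
                .collect(Collectors.toList());
    }
}
```

A parallel stream keeps the example short; a dedicated executor would give more control over the number of worker threads.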

The implementation is available in the Maven repository. To gain access to the repository, send a request to junghans(at)kit.edu; we will then send you a user login.

The components can be included by adding the following dependency:

<dependency>
    <groupId>edu.kit.aifb.suprime.reasoning</groupId>
    <artifactId>suprime-process-reasoning</artifactId>
    <version>1.0</version>
</dependency>

Discovery of Service and Process Descriptions

Efficient discovery of services is a central task in the field of Service Oriented Architectures (SOA). Service discovery techniques enable users, e.g., end users and developers of a service-oriented system, to find appropriate services for their needs by matching the users' goals against available descriptions of services. The more formal the service descriptions are, the more the discovery can be automated while still keeping the discovery technique comprehensible.

Currently, only invocation-related information about Web services is described with the W3C standard WSDL, while the functionality of a service, i.e., what the service actually does, is described in natural-language documents. In suprime, we use a pi-calculus-based service and process description formalism. However, many Web services are still not annotated formally.

We have developed discovery techniques to provide users, e.g., end users, developers, and annotators, with a discovery component that is capable of considering semantic descriptions of Web services. Traditional non-semantic descriptions are used for full-text-based discovery, which draws on the Web service descriptions provided by the WSDL service description files, the Web pages describing RESTful Web services, and related documents found by a crawler.

The semantic discovery mechanism considers the functionalities of services, formally described with pre-conditions and effects. This discovery component enables users to enter a more structured goal (service classification, pre-conditions, and effects) and then uses the reasoning facilities to find the descriptions in the repository that match the goal. While keyword-based search mechanisms may scale well, reasoning over logical expressions is computationally expensive. We are therefore also developing an approach that, on the one hand, considers the functionalities of Web services and, on the other hand, has the potential to scale to a large number of Web service descriptions.

Ranking of Discovered Results

The ranking component determines an ordering of the discovered service and process descriptions, taking into account user preferences on functional and non-functional properties. It enables the automation of service-related tasks (e.g., composition) by finding the most appropriate service or process for a given query. To this end, the development of the ranking component comprises formalisms for specifying preferences on properties and a scalable fuzzy-logic-based ranking algorithm.
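To make the idea of fuzzy-logic-based ranking concrete, the following sketch scores each candidate by the degree to which it satisfies every preference and sorts by that degree. It is not the project's ranking algorithm: the class `FuzzyRanking`, the linear membership function, the property names, and the choice of the minimum as fuzzy conjunction are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.function.DoubleUnaryOperator;

/** Hypothetical sketch: ranking candidates by fuzzy preference satisfaction. */
final class FuzzyRanking {

    /** Membership is 1 up to 'ideal' and falls linearly to 0 at 'worst'. */
    static DoubleUnaryOperator lowerIsBetter(double ideal, double worst) {
        return x -> x <= ideal ? 1.0 : x >= worst ? 0.0 : (worst - x) / (worst - ideal);
    }

    /** Degree to which all preferences hold; fuzzy AND is taken as the minimum. */
    static double score(Map<String, Double> properties,
                        Map<String, DoubleUnaryOperator> preferences) {
        double degree = 1.0;
        for (Map.Entry<String, DoubleUnaryOperator> pref : preferences.entrySet()) {
            Double value = properties.get(pref.getKey());
            // A missing property counts as not satisfying the preference.
            double d = value == null ? 0.0 : pref.getValue().applyAsDouble(value);
            degree = Math.min(degree, d);
        }
        return degree;
    }

    /** Returns candidate ids ordered from best to worst score. */
    static List<String> rank(Map<String, Map<String, Double>> candidates,
                             Map<String, DoubleUnaryOperator> preferences) {
        List<String> ids = new ArrayList<>(candidates.keySet());
        ids.sort(Comparator
                .comparingDouble((String id) -> score(candidates.get(id), preferences))
                .reversed());
        return ids;
    }
}
```

Using the minimum means that a single badly violated preference dominates the score; other combination operators, such as a weighted average, would trade preferences off against each other instead.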