by Carmen Bratosin and Wil van der Aalst
The Architecture for Information Systems group of the Technische Universiteit Eindhoven (TU/e) in the Netherlands has built up extensive knowledge in the field of Workflow Management Systems and Process Mining. In 2006, the group began to apply this knowledge in a new and dynamic research area: Grid computing. Four research perspectives are currently under investigation.
Software systems are becoming increasingly complex. To cope with this, systems are often divided into a number of autonomous components whose work is coordinated: this coordination of components and services represents one of the main challenges in software engineering. Two important application fields of coordination are Grid computing and workflow management. Grid computing is mostly used in computational science while workflow management is used for business applications: we try to bridge the gap between these two areas in order to make further progress in both of them.
Over the last decade we have gathered a great deal of experience in process modelling, analysis and enactment. Our workflow patterns have become a standard way to evaluate languages and the workflow management system YAWL (Yet Another Workflow Language) is one of the most expressive and mature open-source workflow systems available today. Moreover, we specialize in process analysis. Using Petri nets as a theoretical foundation, we have been able to analyse a variety of real-life process models ranging from BPEL (Business Process Execution Language) and workflow specifications to the entire SAP reference model. In recent years, we have focused on the analysis of processes based on system logs. The ProM framework developed at TU/e provides a versatile toolset for process mining, which seems to be particularly useful in a Grid environment.
Until now, the Grid computing community has focused primarily on infrastructure: Grid software has been designed to let users submit their 'problems' to the Grid. Less work has been done on how to model such problems efficiently. In addition, most applications leave responsibility for correctness to the user.
We are therefore applying our knowledge of Petri-net modelling and analysis, workflow patterns, process mining and concrete workflow technology to Grids. This involves research in the following areas:
Modelling Grids:
Many definitions of Grids exist, and in many cases technological aspects and hyped terms hide the essence of Grids. We use a mixture of Petri nets and UML modelling to build formal/conceptual models for Grid computing. Here we emphasize the link between the distributed nature of Grids (where resources play an important role) and workflow processes. The main purpose is to formalize the concept of a Grid and to fix a particular interpretation while highlighting interesting research questions.
Analysing Grid models:
Using techniques based on Petri nets, we analyse different mechanisms used in Grid workflows, with the goal of transferring correctness notions such as soundness to them. We also try to find new properties based on the specific Grid behaviour (eg multiple instances of the same process, resource allocation, and distributed management).
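To make the notion of soundness concrete, the following is a minimal sketch of a soundness check for a small workflow net, written for illustration only. It assumes a bounded net with a single source place `i` and a single sink place `o`, and it simply enumerates the reachable markings. The Grid job net, the place and transition names, and the `limit` parameter are all hypothetical. Tools for real models rely on far more refined techniques.

```python
def fire(marking, pre, post):
    """Return the marking reached by firing a transition, or None if it is not enabled."""
    m = dict(marking)
    if any(m.get(p, 0) < n for p, n in pre.items()):
        return None
    for p, n in pre.items():
        m[p] -= n
    for p, n in post.items():
        m[p] = m.get(p, 0) + n
    # Markings are stored as frozensets of (place, tokens) so they can be dict keys.
    return frozenset((p, n) for p, n in m.items() if n > 0)


def reachability_graph(transitions, initial, limit=10_000):
    """Explore all markings reachable from `initial` (gives up beyond `limit` markings)."""
    graph, todo = {}, [initial]
    while todo:
        m = todo.pop()
        if m in graph:
            continue
        if len(graph) >= limit:
            raise RuntimeError("state space too large (the net may be unbounded)")
        graph[m] = []
        for name, (pre, post) in transitions.items():
            nxt = fire(m, pre, post)
            if nxt is not None:
                graph[m].append((name, nxt))
                todo.append(nxt)
    return graph


def is_sound(transitions, source="i", sink="o"):
    initial = frozenset({(source, 1)})
    final = frozenset({(sink, 1)})
    graph = reachability_graph(transitions, initial)

    # Option to complete: every reachable marking can still reach the final marking.
    can_finish = {final} if final in graph else set()
    changed = True
    while changed:
        changed = False
        for m, succs in graph.items():
            if m not in can_finish and any(n in can_finish for _, n in succs):
                can_finish.add(m)
                changed = True
    option_to_complete = can_finish == set(graph)

    # Proper completion: a token in the sink place implies no tokens are left elsewhere.
    proper_completion = all(m == final for m in graph if dict(m).get(sink, 0) > 0)

    # No dead transitions: every transition fires somewhere in the reachability graph.
    fired = {name for succs in graph.values() for name, _ in succs}
    return option_to_complete and proper_completion and fired == set(transitions)


# Hypothetical Grid job: data staging and CPU reservation run in parallel before the job runs.
# Each transition maps to (input places, output places) with arc weights.
SOUND_NET = {
    "submit":   ({"i": 1}, {"p_data": 1, "p_cpu": 1}),
    "stage_in": ({"p_data": 1}, {"p_staged": 1}),
    "reserve":  ({"p_cpu": 1}, {"p_reserved": 1}),
    "run":      ({"p_staged": 1, "p_reserved": 1}, {"o": 1}),
}

# Adding an abort that ignores the pending CPU reservation leaves a token behind.
UNSOUND_NET = dict(SOUND_NET, abort=({"p_data": 1}, {"o": 1}))

print(is_sound(SOUND_NET), is_sound(UNSOUND_NET))
```

In the second net the `abort` transition completes the job while the reservation branch is still active, which is the kind of resource-related error that a soundness check is meant to expose.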
Analysing Grid logs:
In a Grid environment many events are logged and the performance of the system is of the utmost importance. Process-mining techniques are therefore of interest, since they can assist in the configuration of Grids and the on-the-fly optimization of processes.
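As an illustration of what mining a Grid log can mean at its simplest, the sketch below derives a 'directly follows' relation from an event log, a basic building block of many discovery algorithms. The log format (job identifier, activity, timestamp) and all events in it are invented for this example. It does not reflect the ProM log format or any particular Grid middleware.

```python
from collections import Counter, defaultdict

# Hypothetical event log: (job id, activity, timestamp in seconds).
EVENTS = [
    ("job1", "submit", 0), ("job1", "stage_in", 5), ("job1", "run", 9), ("job1", "stage_out", 40),
    ("job2", "submit", 2), ("job2", "stage_in", 6), ("job2", "run", 20), ("job2", "stage_out", 35),
    ("job3", "submit", 3), ("job3", "run", 8), ("job3", "stage_out", 30),
]


def traces(events):
    """Group events per job and order them by timestamp."""
    by_case = defaultdict(list)
    for case, activity, _ in sorted(events, key=lambda e: (e[0], e[2])):
        by_case[case].append(activity)
    return dict(by_case)


def directly_follows(events):
    """Count how often activity a is immediately followed by activity b within a job."""
    counts = Counter()
    for trace in traces(events).values():
        counts.update(zip(trace, trace[1:]))
    return counts


DFG = directly_follows(EVENTS)
for (a, b), n in sorted(DFG.items()):
    print(f"{a} -> {b}: {n}")
```

Here the pair `submit -> run` appears once, showing that one job skipped the staging step. Observations of this kind are the raw material from which a process model can be derived.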
Building a process-aware Grid infrastructure:
Using a combination of Globus, YAWL and ProM we want to realize a more 'process-aware' Grid. By linking a fundamental enabling technology for Grids (Globus) to a powerful process engine (YAWL) and state-of-the-art analysis tools (ProM), we obtain an interesting environment for experimentation.
The figure illustrates the scope of the project. On the one hand, we analyse Grids by modelling them in terms of Petri nets. Similar models are used for the configuration of the process perspective of Grid middleware (in our case a mixture of Globus and YAWL). On the other hand, we collect event logs via the middleware layer and use these for process mining: process discovery (automatically deriving models by observing the Grid), conformance checking (checking whether 'the Grid' is behaving as expected) and model extension (eg projecting performance indicators onto a process model).
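Conformance checking and model extension can be sketched in a few lines under strong simplifying assumptions. In the example below, the 'model' is merely a set of allowed successor pairs with a start and an end activity, conformance is judged per trace, and the extension step computes mean times between consecutive activities. The model, the log and all names are hypothetical. Replay-based conformance checking on Petri nets is considerably more precise than this.

```python
from collections import defaultdict

# Hypothetical process model: allowed successor pairs plus start and end activities.
MODEL = {
    "start": "submit",
    "end": "stage_out",
    "allowed": {("submit", "stage_in"), ("stage_in", "run"), ("run", "stage_out")},
}

# Hypothetical log: job id -> list of (activity, timestamp in seconds), already ordered.
LOG = {
    "job1": [("submit", 0), ("stage_in", 5), ("run", 9), ("stage_out", 40)],
    "job2": [("submit", 2), ("run", 8), ("stage_out", 30)],  # skips stage_in
}


def conforms(trace, model):
    """A trace conforms if it starts and ends correctly and uses only allowed steps."""
    acts = [activity for activity, _ in trace]
    if not acts or acts[0] != model["start"] or acts[-1] != model["end"]:
        return False
    return all(pair in model["allowed"] for pair in zip(acts, acts[1:]))


def fitness(log, model):
    """Fraction of traces in the log that conform to the model."""
    return sum(conforms(trace, model) for trace in log.values()) / len(log)


def mean_step_times(log):
    """Model extension: mean elapsed time for each observed pair of consecutive activities."""
    samples = defaultdict(list)
    for trace in log.values():
        for (a, ta), (b, tb) in zip(trace, trace[1:]):
            samples[(a, b)].append(tb - ta)
    return {pair: sum(v) / len(v) for pair, v in samples.items()}


print("fitness:", fitness(LOG, MODEL))
for pair, seconds in mean_step_times(LOG).items():
    print(pair, seconds)
```

The mean times could then be projected onto the corresponding arcs of the process model, for instance to highlight where jobs wait longest.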
All of the aspects shown in the figure have been extensively investigated in the context of workflow management systems and service-oriented architectures using BPEL engines. For example, we have been doing conformance testing in the context of Oracle BPEL, and process discovery and process verification in the context of IBM WebSphere. We have also evaluated many process engines using the so-called workflow patterns and provided semantics and analysis techniques for process-modelling languages ranging from BPEL and YAWL to BPMN and EPCs. The next step is to apply this in a Grid environment using both Globus and YAWL.
The research is supported by the Netherlands Organization for Scientific Research (NWO) in the context of the project Workflow Management for Large Parallel and Distributed Applications. TU/e is participating in this project with the group of Professor Farhad Arbab of CWI. The project started in 2006 and its duration is four years.
Carmen Bratosin, Eindhoven University of Technology, the Netherlands
Tel.: +31 40 247 5144