by Paul Klint
The paradigm of service orientation creates new opportunities for language-processing tools and Interactive Development Environments. At CWI we have developed the ToolBus, a service-oriented architecture with application areas like software renovation and implementation of domain-specific languages.
Service-oriented architectures aim to decouple processing steps. They are usually applied in the context of heavyweight business applications where traceability and transaction management are the dominant requirements.
In language-processing tool suites and Interactive Development Environments, other requirements prevail: tight integration of tools and efficient exchange of data matter most. At CWI we have developed the ToolBus, a service-oriented architecture, to meet these requirements. It is based on process algebra as a concurrency paradigm and on ATerms as a data exchange mechanism. The ToolBus forms the foundation for a suite of language-processing tools combined in the ASF+SDF Meta-Environment. ASF+SDF is a term rewriting language that extends the Syntax Definition Formalism (SDF).
There are many urgent language-processing tasks that require a quicker answer than can be achieved by building a dedicated tool from scratch. Examples are performing a domain-specific analysis (Is the memory management API used consistently?), executing a dedicated transformation (Refactor this code to use the new API.) or building support tools for a new domain-specific language. In all these cases, the best approach is to combine existing tools with newly written ones in order to solve the problem quickly.
The tools typically available in the language engineer's tool chest are generators (for parsing, formatting and code generation) and generic tools for editing, user-interfacing and visualization. These tools are usually written in different languages and run on different platforms.
The relevance of service orientation to this domain is evident, but we apply it with some twists. Experience has taught us that the straightforward use of XML for exchanging intermediate data is inadequate: the parse trees that arise from analysing hundreds of thousands of lines of code simply become too bulky. To solve this problem we represent intermediate data as ATerms (short for Annotated Terms): terms stored as directed acyclic graphs that maximize subterm sharing and can be represented very concisely. By providing all the relevant tools with an ATerm interface, huge amounts of data can be shipped between tools while sharing is preserved.
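The effect of subterm sharing can be illustrated with a small sketch. It is not the ATerm library or its API; the class TermFactory and the helper names are hypothetical, and the technique shown is plain hash-consing: every term is built through a factory that returns an existing node whenever an identical one has been built before.

```python
class TermFactory:
    """Builds terms so that structurally equal terms are one shared node.

    Hypothetical illustration of hash-consing, not the ATerm library.
    """

    def __init__(self):
        # Maps (functor, identities of the children) to the shared node.
        self._table = {}

    def make(self, functor, *children):
        # Children must come from this factory, so equal children are
        # already the same object and can be compared by identity.
        key = (functor, tuple(id(child) for child in children))
        node = self._table.get(key)
        if node is None:
            node = (functor, children)
            self._table[key] = node  # the table keeps the node alive
        return node

    def size(self):
        """Number of distinct nodes stored (the size of the DAG)."""
        return len(self._table)


def tree_size(node):
    """Number of nodes the term would have if written out as a tree."""
    _functor, children = node
    return 1 + sum(tree_size(child) for child in children)


def build_example(factory):
    """Builds plus(mul(x, y), mul(x, y)); the two mul subterms are shared."""
    x = factory.make("x")
    y = factory.make("y")
    left = factory.make("mul", x, y)
    right = factory.make("mul", x, y)
    return factory.make("plus", left, right)
```

For the example term, tree_size reports 7 nodes while the factory stores only 4, because both occurrences of mul(x, y) are the same object. On parse trees of real programs, where identical subtrees recur frequently, the saving is correspondingly larger.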
Another issue is how to orchestrate the execution of all these tools. To this end, we connect them to the ToolBus, which is best described as a programmable, ATerm-enabled service bus. The orchestration is described in Tscript, a scripting language based on process algebra that supports parallelism, asynchronous and synchronous communication, and tool control. This allows the construction of large, heterogeneous and distributed applications. The figure illustrates the use of the ToolBus while orchestrating the tools in the Meta-Environment. Observe that variations in the Tscript lead to variations in the resulting system; product families can thus easily be supported.
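The separation between tools and orchestration can be sketched as follows. This is neither Tscript nor process algebra, and none of the names below come from the ToolBus; it is a toy bus, written under the assumption that tools are plain functions. The point it shows is that tools never call each other directly: a separate script decides which tools run, in which order, and which run in parallel, so a different script yields a different system from the same tools.

```python
from concurrent.futures import ThreadPoolExecutor


class Bus:
    """Toy service bus (hypothetical): tools are reachable only by name."""

    def __init__(self):
        self._tools = {}

    def register(self, name, tool):
        self._tools[name] = tool

    def eval(self, name, data):
        """Send data to one tool and wait for its result."""
        return self._tools[name](data)

    def eval_parallel(self, requests):
        """Run several (tool name, data) requests concurrently.

        Results are returned in the order of the requests.
        """
        with ThreadPoolExecutor() as pool:
            futures = [pool.submit(self.eval, name, data)
                       for name, data in requests]
            return [future.result() for future in futures]


def make_bus():
    """Registers three stand-in tools; real tools would be separate programs."""
    bus = Bus()
    bus.register("parse", lambda text: text.split())      # text -> tokens
    bus.register("count", lambda tokens: len(tokens))     # tokens -> number
    bus.register("format", lambda tokens: " ".join(tokens))  # tokens -> text
    return bus


def script_analyse(bus, source):
    """One orchestration: parse first, then count and format in parallel."""
    tokens = bus.eval("parse", source)
    count, text = bus.eval_parallel([("count", tokens), ("format", tokens)])
    return count, text


def script_count_only(bus, source):
    """A variant orchestration over the same tools: a smaller product."""
    return bus.eval("count", bus.eval("parse", source))
```

Calling script_analyse(make_bus(), "a  b   c") returns (3, "a b c"), while script_count_only uses the same registered tools to deliver only the count. Neither script required any change to the tools themselves.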
Recently the ToolBus entered a new phase in its development. The existing ToolBus was implemented in C; we are now about to finish a reimplementation in Java in order to profit from Java's better structuring facilities and from the direct availability of many relevant communication and (Web) service libraries. This new version will also address issues such as built-in profiling and monitoring, efficient tool-to-tool communication, and better isolation and recovery of malfunctioning tools. In addition to its existing service and networking capabilities, this new implementation will also enable execution on multi-core computers. In this way, the whole distribution spectrum - from wide area to multi-processors on a chip - can be handled. This enables applications in which high-density local computation clusters are loosely coupled via a wide-area network. We envisage that more advanced software analysis tasks will require such an infrastructure.
The ToolBus is distributed as part of the Meta-Environment, which is in use by various academic and industrial parties for software analysis, software transformation and domain-specific language development. Examples are the analysis and refactoring of the C code for ASML's lithography machines, the renovation of administrative Cobol code by Getronics, and the use of a financial domain-specific language by the Fortis bank. Current work on the ToolBus is done in cooperation with the Technical University Eindhoven and the University of Amsterdam. Within ERCIM, we cooperate with INRIA.
Paul Klint, CWI, The Netherlands
Tel: +31 20 592 4126