High-Level Protocol Engineering without Performance Penalty for Multi-Core

by Farhad Arbab, Sung-Shik Jongmans and Frank de Boer

CWI’s Reo-to-C compiler can generate code that outperforms hand-crafted code written by a competent programmer. As such, compared to conventional approaches to parallel programming, our approach has the potential to not only improve software quality but also performance.

In 2012, we described in ERCIM News a novel approach to programming interaction protocols among threads on multi-core platforms. This approach is based on the idea of using the graphical coordination language Reo, which has been subject to ongoing development by the Formal Methods group at CWI since the early 2000s, as a domain-specific language for the compositional specification of protocols. Since then, we have developed several code generation techniques and tools for Reo, including the Reo-to-Java and Reo-to-C compilers.

In terms of software engineering and software quality there are many advantages of using Reo. Reo [1] provides declarative high-level constructs for expressing interactions and thus, programmers can specify their protocols at a more suitable level of abstraction that can be achieved by using conventional languages (e.g., Java or C). These conventional languages provide only error-prone low-level synchronization primitives (e.g., locks, semaphores). Moreover, because Reo has a formal semantics, protocols expressed in Reo can be formally analyzed to improve software quality and ensure correctness (e.g., by using model checking tools). Lastly, by using Reo, protocols become tangible, explicit software artifacts which promotes their reuse, composition and maintenance.

Two years ago, we thought that this list of software engineering advantages would be the main reason for programmers to adopt our approach, so long as the performance of the code generated by our compilers proved “acceptable”. At the time, it seemed ambitious to strive for the level of performance (within an order of magnitude) that competent programmers can achieve hand-crafting code using a conventional language. After all, compiling high-level specifications into efficient lower-level implementations constitutes a significant challenge.

However, while developing our compilers, we came to realize that Reo’s declarative constructs actually give us an edge, as compared to conventional languages. Reo allows for novel compiler optimization techniques that fundamentally conventional languages cannot apply. The reason for this is that Reo’s declarative constructs preserve more of a programmer’s intentions when they specify their protocols. When a sufficiently smart compiler knows exactly what a protocol is supposed to achieve, this compiler can subsequently choose the best lower-level implementation. Using conventional languages, in which programmers write such lower-level implementations by hand, information about their intentions is either irretrievably lost or very hard to extract. Therefore, to perform optimizations at the protocol logic level a compiler needs to reconstruct those intentions, but typically, it simply cannot. Thus, by using conventional languages for implementing protocols by hand, the burden of writing efficient protocol implementations rests exclusively on the shoulders of programmers, adding even more complexity to the already difficult task of writing parallel programs. As the number of cores per chip increases, the shortcomings of conventional programming languages in writing efficient protocol implementations will amplify this issue, effectively making such languages unsuitable for programming large-scale multi-core machines.

The following example, a simple producers-consumer program, offers the first evidence that our approach can result in better performance. In this program, every one of n producers produces and sends an infinite number of data elements to the consumer, while the consumer receives and consumes an infinite number of data elements from the producers. The protocol between the producers and the consumer states that the producers send their data elements asynchronously, reliably and in rounds. In every round, each producer sends one data element in an arbitrary order. A Reo specification realizes this protocol (for three producers; see Figure 1). We also had a competent programmer hand-craft a semantically equivalent program in C and Pthreads.

Figure 1: Producers-consumer protocol specification in Reo.

We compared the scalability of the code generated by our current Reo-to-C compiler with the hand-crafted implementation (Figure 2). The generated C code runs on top of a novel runtime system for parallel programs. In this example, for this protocol, the Reo-based implementation outperformed the carefully hand-crafted code.

Figure 2: The results of a scalability comparison between the code generated by the Reo-to-C compiler (dashed line) and the hand-crafted implementation (continuous line).

The technology behind our compiler is based on Reo’s automaton semantics. Our most recent publication contains references to the relevant material [3]. All the optimization techniques used by the compiler have a strong mathematical foundation and we have formally proved their correctness (which guarantees correctness by construction).

We do not claim that code generated by our compiler will outperform every hand-crafted implementation in every protocol and in fact, know that it does not. However, we do believe that these first results are very encouraging and see several ways to further optimize our compiler in the future.

Links:
http://reo.project.cwi.nl
http://www.cwi.nl/~farhad

References:
[1] F. Arbab: “Puff, The Magic Protocol”, in “Formal Modeling: Actors, Open Systems, Biological Systems (Talcott Festschrift)”, LNCS 7000, pp. 169--206, 2011
[2] S.-S. Jongmans, S. Halle, F. Arbab: “Automata-based Optimization of Interaction Protocols for Scalable Multicore Platforms”, in “Coordination Models and Languages (Proceedings of COORDINATION 2014)”, LNCS 8459, pp. 65--82, 2014
[3] S.-S. Jongmans, F. Arbab: “Toward Sequentializing Overparallelized Protocol Code”, in “proc. of ICE 2014”, EPTCS, 2014

Please contact:
Farhad Arbab, Frank de Boer,
Sung-Shik Jongmans
CWI, The Netherlands
Tel: +31 20 592 [4056, 4139, 4241]
E-mail: [farhad, frb, jongmans]@cwi.nl

Sidebar

Contents

High-Level Protocol Engineering without Performance Penalty for Multi-Core