by Eduard Ayguadé and Jordi Torres
Self-managed middleware should be able to manage resources transparently and cost-effectively, while hiding the underlying complexity from users. Our group at the Barcelona Supercomputing Center (BSC) and Technical University of Catalonia (UPC) draws on strengths in a diverse set of research areas and, through cross-disciplinary studies, is building middleware with the additional crucial feature of intelligent power management.
Because of the escalating price of power, energy-related costs have become a major economic factor for IT infrastructure and its host data centres. Beyond the cost incentive to improve energy efficiency, companies face increasing pressure to reduce their carbon footprint, driven by EU regulations and campaigns demanding greener businesses.
The research community is therefore being challenged to rethink data centre strategies, adding energy efficiency to a list of critical operating parameters that already includes service availability, reliability and performance. While a large variety of power-saving proposals has been presented in the literature, workload consolidation and powering off spare servers are obvious and effective ways to save power. Server consolidation involves combining workloads from separate machines and different applications onto a smaller number of systems. This is done using virtualization technology, which multiplexes applications onto shared physical resources while keeping each one isolated from the others running on the same hardware. The approach brings several benefits: less hardware is required, less electricity is needed for server power and cooling, and less physical space is required.
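The consolidation idea described above can be illustrated with a minimal sketch. It is not the method used by the BSC team: it assumes that each workload is described by a single normalized CPU demand, that all servers are identical, and it uses the classic first-fit decreasing heuristic as a stand-in for a real placement policy. The names `Server` and `consolidate` are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Server:
    capacity: float                          # normalized CPU capacity (assumed unit)
    workloads: list = field(default_factory=list)

    @property
    def used(self) -> float:
        return sum(self.workloads)


def consolidate(demands, num_servers, capacity=1.0):
    """Pack workloads onto as few servers as possible (first-fit decreasing)."""
    servers = [Server(capacity) for _ in range(num_servers)]
    # Placing the largest workloads first tends to leave fewer half-empty servers.
    for demand in sorted(demands, reverse=True):
        for server in servers:
            # Small tolerance so that floating-point rounding does not reject an exact fit.
            if server.used + demand <= server.capacity + 1e-9:
                server.workloads.append(demand)
                break
        else:
            raise ValueError(f"no server can host a workload of size {demand}")
    active = [s for s in servers if s.workloads]
    idle = num_servers - len(active)         # servers that could be powered off
    return active, idle


# Seven workloads that originally ran on seven separate machines (made-up demands).
demands = [0.2, 0.5, 0.1, 0.3, 0.4, 0.25, 0.15]
active, idle = consolidate(demands, num_servers=7)
print(len(active), "servers stay on,", idle, "can be powered off")
```

A real placement policy would also have to account for memory, I/O, migration costs and service-level constraints, which this single-dimension sketch ignores.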
The success of the consolidation strategy requires that the underlying complexity be hidden from users, and this is done by building self-managed middleware that can manage the resources transparently and in the most cost-effective way. Building this middleware with more intelligent power management requires cross-disciplinary studies over a diverse set of research areas. It is a complex end-to-end problem, requiring an intricate coordination of hardware, operating systems, virtual machines, middleware and applications.
The team at the Barcelona Supercomputing Center (BSC) is working on these topics. Our research shows that energy-aware consolidation can go well beyond the use of virtualization alone. We have identified new opportunities to improve the energy efficiency of systems, reducing the resources required without negatively affecting performance or user satisfaction. For instance, request discrimination identifies and rejects those requests that consume system resources but have no value for an application (eg requests coming from content-stealing Web crawlers). Memory compression is another example: it converts CPU power into extra memory capacity, overcoming the underutilization that arises when a system runs out of memory while its processors are still partly idle. Our results show that considering these techniques during placement decisions can boost the energy savings in a data centre.
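The memory compression trade-off can be made concrete with a small worked example. This is only a sketch: the compression ratio, the CPU overhead and the host and virtual machine sizes are assumed figures chosen for illustration, not measurements from our work, and `vms_per_host` is a hypothetical helper.

```python
import math


def vms_per_host(cores, mem_gb, vm_cores, vm_mem_gb,
                 compression_ratio=1.0, compression_cores=0.0):
    """Number of identical VMs a host can take, limited by CPU or by memory.

    compression_ratio: effective memory divided by physical memory (assumed)
    compression_cores: CPU cores spent compressing memory pages (assumed)
    """
    usable_cores = cores - compression_cores
    effective_mem = mem_gb * compression_ratio
    by_cpu = int(usable_cores // vm_cores)
    by_mem = int(effective_mem // vm_mem_gb)
    return min(by_cpu, by_mem)


# A memory-bound host: 8 cores and 16 GB, hosting VMs that need 1 core and 4 GB each.
plain = vms_per_host(cores=8, mem_gb=16, vm_cores=1, vm_mem_gb=4)

# Same host, giving up one core to obtain 1.5 times the effective memory.
compressed = vms_per_host(cores=8, mem_gb=16, vm_cores=1, vm_mem_gb=4,
                          compression_ratio=1.5, compression_cores=1)

total_vms = 12
print("hosts needed without compression:", math.ceil(total_vms / plain))
print("hosts needed with compression:   ", math.ceil(total_vms / compressed))
```

Under these assumptions the uncompressed host is full at four virtual machines with half of its cores idle, whereas the compressed host takes six, so the same twelve virtual machines fit on fewer powered-on hosts.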
Our research group has also proposed ways of recovering resources by reducing waste. For example, middleware can hide and prevent some system failures or denial-of-service flooding attacks, thus avoiding the potential disruptions of unplanned outages and the associated loss of resources. An important component of this middleware is the predictive modelling made possible by self-monitoring analysis. Our current approach applies new methods and concepts from machine learning that can not only find accurate models of and explanations for system behaviours, but also predict and estimate system states and values. Currently, we are addressing the problem of predicting and managing the performance of MapReduce applications, trying to meet performance goals while considering several high-level objectives such as energy saving, and without wasting physical resources. We are also looking at multi-core processors that offer the performance and energy efficiency needed to meet these demands in years to come.
Figure 1: EMOTIVEcloud – Elastic Management of Tasks in Virtualized Environments.
BSC and the Technical University of Catalonia (UPC) are contributing to the research community with the EMOTIVE (Elastic Management of Tasks in Virtualized Environments) framework, which provides an elastic, fully customized, virtual environment in which to execute services, and which allows the development of new schedulers that take power usage into account when building the consolidation process (Figure 1). EMOTIVE abstracts a Cloud architecture using different layers and provides users with basic primitives for supporting the execution of tasks in an infrastructure. The core layer wraps each virtualized node and monitors its state, granting the application full control of its execution environment without any risk to the underlying system or the other applications. In addition, it allows both local virtual machines (ie running in the provider’s nodes) and remote virtual machines running in third-party providers such as Amazon EC2 to be managed in a federated environment. These functionalities of the EMOTIVE framework ease the development of new resource management proposals, thus contributing to innovation in this research area.
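To give an idea of what a scheduler that takes power usage into account might look like, the following sketch places each incoming task on the host whose power draw would increase the least. It does not use the EMOTIVE API: the `Host` class, the `place` function, the linear power model and the wattage figures are all assumptions made for illustration, and an unloaded host is treated as powered off.

```python
from dataclasses import dataclass


@dataclass
class Host:
    name: str
    idle_watts: float          # assumed draw when powered on but idle
    max_watts: float           # assumed draw at full load
    load: float = 0.0          # fraction of CPU in use, from 0.0 to 1.0

    def power(self, load=None):
        """Linear power model; a host with no load is assumed to be powered off."""
        load = self.load if load is None else load
        if load <= 0.0:
            return 0.0
        return self.idle_watts + (self.max_watts - self.idle_watts) * load


def place(hosts, demand):
    """Assign a task to the host whose power draw increases the least."""
    candidates = [h for h in hosts if h.load + demand <= 1.0]
    if not candidates:
        raise RuntimeError("no host has enough spare capacity")
    best = min(candidates, key=lambda h: h.power(h.load + demand) - h.power())
    best.load += demand
    return best


hosts = [Host("A", idle_watts=100, max_watts=200, load=0.5),
         Host("B", idle_watts=80, max_watts=160)]

first = place(hosts, 0.3)    # A is already on, so adding load there is cheap
second = place(hosts, 0.4)   # A no longer has room, so B has to be switched on
print(first.name, second.name)
```

Because waking a powered-off host costs its full idle power, this rule tends to fill the hosts that are already running before switching on new ones, which is the behaviour a consolidation-oriented scheduler aims for.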
The research community must continue to find ways to ensure that performance improvements are accompanied by corresponding improvements in energy efficiency. The next generation of computing systems must achieve significantly lower power needs, a higher performance-per-watt ratio, and higher dependability than ever before. This can only be achieved with a holistic approach.
UPC-BSC, Barcelona, Spain