by Thomas Lorünser (AIT), Stephan Krenn (AIT) and Roland Kammerer (Linbit HA-Solutions GmbH)

Distributed Replicated Block Device (DRBD) is the de facto standard for redundant block storage. It is deployed more than 250,000 times worldwide and is part of the official Linux kernel. DRBD4Cloud is a research project that aims to increase the applicability and functionality of DRBD in order to enter new markets and face future challenges in distributed storage.

Redundant data storage is a necessity for the business continuity of virtually any cloud service. An intuitive approach is full data replication to multiple storage nodes, which is what DRBD [L2] currently does. However, due to the large bandwidth requirements and storage overhead, this is not feasible for large-scale deployments with many mirroring nodes, i.e., typical cloud settings. Besides DRBD, Ceph is the other major block storage backend implementation and already offers integration with OpenStack. However, Ceph’s performance characteristics prohibit its deployment in certain low-latency use cases, e.g., as a backend for Oracle MySQL databases.

DRBD4Cloud [L1] will increase the performance and scalability (technical and organisational) of highly available software-defined block storage in dynamic, large-scale cloud deployments. It is based on DRBD, a high-performance, low-latency, low-level building block for block replication that offers the key functionality of such systems. During the project, DRBD will be integrated into OpenStack and DC/OS (the Distributed Cloud Operating System), the most prevalent tool suites for managing distributed computing resources. This will enable cloud backend providers to use DRBD technology for replicated block storage, an essential building block for highly reliable cloud storage offerings.

To ease the deployment and maintenance of DRBD, a collection of key components will be offered as an easy-to-use software package. A component to orchestrate and monitor a storage environment consisting of a multitude of DRBD nodes will be developed. On top of this, web-based and API-based management and monitoring solutions will be developed to ease the adoption of DRBD-based software-defined storage in cloud environments.
To guarantee availability, DRBD currently stores up to 32 full data replicas on remote storage nodes. DRBD4Cloud will allow the use of erasure coding, which splits data into a number of fragments (e.g., nine), such that only a subset (e.g., three) is needed to read the data. This significantly reduces the required storage and upstream bandwidth (e.g., by 67 %, since nine fragments of one third the block size occupy three times the data volume instead of nine times), which is important, for instance, for geo-replication with high network latency. Additionally, specific schemes called secret sharing can even guarantee that the servers do not learn anything about the plain data, without requiring cryptographic keys [1]. This will allow public cloud storage to be used without compromising confidentiality, making DRBD usable also for SMEs that do not operate their own data centres but require highly available storage.
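To make the threshold idea concrete, the following Python sketch shows a minimal 3-out-of-9 secret-sharing scheme (Shamir’s scheme over a prime field). It is purely illustrative and is not the code or the exact scheme used in DRBD4Cloud; the field size, the parameters and the treatment of a data block as a single integer are simplifying assumptions.

# Illustrative toy only; not the scheme or code used in DRBD4Cloud.
import secrets

PRIME = 2**127 - 1  # prime field large enough for this small example

def split(secret, k, n):
    """Split an integer secret into n shares; any k of them suffice to recover it."""
    # Random polynomial of degree k-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(k - 1)]
    shares = []
    for x in range(1, n + 1):
        y = 0
        for c in reversed(coeffs):   # Horner evaluation modulo PRIME
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares

def reconstruct(shares):
    """Recover the secret from any k shares via Lagrange interpolation at x = 0."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret

data_block = int.from_bytes(b"block #42", "big")  # toy stand-in for a data block
shares = split(data_block, k=3, n=9)              # 3-out-of-9, as in the example above
assert reconstruct(shares[:3]) == data_block      # any three shares recover the block
assert reconstruct([shares[0], shares[4], shares[8]]) == data_block

With such a scheme, any two of the nine shares together reveal nothing about the block, while any three allow full reconstruction; erasure codes follow the same access pattern but trade the confidentiality guarantee for fragments that are only a fraction of the block size.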

The feasibility of cloud integration of DRBD for small setups has already been demonstrated by a first proof-of-concept prototype, which showed that the main challenge is to meet the scalability requirements of large-scale deployments. Another challenge is the design and maintenance of a common shared driver core between the OpenStack and DC/OS integrations. Concerning erasure coding and secret sharing, the main challenge is minimising the impact on latency. Fortunately, in such schemes, read operations only require access to a subset of the storage nodes in order to retrieve the data, which positively affects both availability and latency. A first result on hardware acceleration of secret sharing [2] also shows the potential of low-latency implementations on modern high-bandwidth network interface cards. As an optimisation, the monitoring solution could be used to balance the global workload of the storage cluster, and, as the results in [3] show, efficient auditing mechanisms can be used to verify data integrity in the system even when erasure coding and secret sharing are applied.

Figure 1: Depending on the configuration, no individual share contains any sensitive information about the replicated data, while the data can be recovered from any two of the shares to achieve high availability.

In summary, DRBD4Cloud will develop a new and highly optimised out-of-the-box solution for multi-node software-defined storage in high-load cloud environments. The results will be delivered as software implementations, which, like DRBD, will (mostly) be published under an open source license. The extensions will allow DRBD – the de-facto standard for distributed replicated block devices – to cut the storage overhead by 50-80 % while guaranteeing practically equivalent levels of redundancy, and will allow confidential data to be securely stored on semi-trusted cloud providers. Simplified deployment, built-in monitoring and integration into cloud platforms will allow cloud providers to pick up the technology. At the end of DRBD4Cloud, all extensions will be delivered as internally tested prototypes. DRBD4Cloud is a joint effort of Linbit HA-Solutions GmbH, Pro-zeta A.s. and AIT Austrian Institute of Technology (AIT), and has received funding from the EUREKA Eurostars programme [L3].
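As a back-of-the-envelope check of such overhead reductions, the following sketch compares full replication with a (k, n) erasure code that tolerates the same number of node failures. The parameter choices are illustrative assumptions, not project figures.

def replication_overhead(f):
    """Full replication keeps one complete copy per tolerated failure, plus the original."""
    return f + 1.0

def erasure_overhead(k, f):
    """A (k, k + f) erasure code stores k + f fragments, each 1/k of the original size."""
    return (k + f) / k

for k, f in [(4, 2), (9, 8)]:   # illustrative parameters
    repl, ec = replication_overhead(f), erasure_overhead(k, f)
    saving = 100 * (1 - ec / repl)
    print(f"k={k}, {f} failures tolerated: replication {repl:.0f}x, "
          f"erasure code {ec:.2f}x, saving {saving:.0f} %")

With these assumed parameters the savings come out at roughly 50 % and 79 %, respectively, in line with the 50-80 % range stated above.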

Links:
[L1] https://www.ait.ac.at/en/research-fields/cyber-security/projects/extending-drbd-for-large-scale-cloud-deployments/
[L2] https://www.linbit.com
[L3] https://www.eurostars-eureka.eu/project/id/11450

References:
[1] T. Lorünser, A. Happe, and D. Slamanig: “ARCHISTAR: Towards Secure and Robust Cloud Based Data Sharing”, in CloudCom 2015, IEEE, https://doi.org/10.1109/CloudCom.2015.71
[2] J. Stangl, T. Lorünser, S.M. Dinakarrao: “A fast and resource efficient FPGA implementation of secret sharing for storage applications”, DATE 2018, pp. 654–659, https://doi.org/10.23919/DATE.2018.8342091
[3] D. Demirel, et al.: “Efficient and Privacy Preserving Third Party Auditing for a Distributed Storage System”, ARES 2016, pp. 88–97, https://doi.org/10.1109/ARES.2016.88

Please contact:
Stephan Krenn
AIT Austrian Institute of Technology GmbH