by Frej Drejhammar

Long-term archiving of Electronic Health Records (EHRs) is a complex task with a number of different requirements. This article describes how these requirements shape the implementation of the DIGHT distributed EHR database.

The aim of the DIGHT (Distributed Information store for Global Healthcare Technology) project is to build a scalable and highly reliable information store for the Electronic Health Records (EHRs) of the citizens of India. The project partners are the Swedish Institute of Computer Science (SICS) and the Indian Centre for Development of Advanced Computing (C-DAC). SICS is responsible for developing a trusted and reliable distributed long-term archive for EHRs, while C-DAC is working on the standardization of EHRs and front-end software. The DIGHT system is a federated system in which the participating entities are medical service providers (hospitals, clinics, etc.). These entities provide computing resources for their own use and contract with other participants to share data and gain access to geographically dispersed data storage.

SICS is responsible for developing a reliable distributed long-term archive for the health records of India's 1.2 billion citizens. Photo: iStockphoto.

EHR data must be preserved for at least the citizen's lifetime. For demographic, genealogical and other research purposes it may be desirable to preserve EHRs even longer. Long-term archiving of EHRs can be approached from a number of directions: there are legal requirements, reliability requirements, organizational aspects and software maintenance requirements to consider. In this article we describe how these aspects constrain and shape the design of the DIGHT distributed EHR database.

To ensure the availability of EHR data and robustness in the face of unexpected events such as sabotage, fires and natural catastrophes, EHR data must be replicated to avoid data loss. On the other hand, physical replication is expensive, as it requires network connectivity and the maintenance of computing equipment at several locations. As there are legal requirements on keeping and archiving health records, we have designed the system such that every piece of data has an explicit owner. The DIGHT implementation uses the owner information to guarantee that the data is permanently replicated on storage nodes controlled by the owner. The explicit ownership of data motivates a participant to pay for the upkeep of replicas, as this is the only way it can fulfill its legal obligations.
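The ownership rule can be illustrated with a minimal Python sketch. It is not the DIGHT implementation: the names `StorageNode`, `place_replicas` and the `min_owner_replicas` threshold are assumptions made for illustration. The only property taken from the text is that a record's permanent replicas must live on nodes controlled by its owner.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageNode:
    node_id: str
    controlled_by: str  # participant (e.g. a hospital) operating this node


def place_replicas(owner, nodes, min_owner_replicas=2):
    """Pick the nodes that must permanently hold a record owned by `owner`.

    Only nodes controlled by the owner qualify. The threshold is a
    hypothetical policy parameter, not something the article specifies.
    """
    owned = [n for n in nodes if n.controlled_by == owner]
    if len(owned) < min_owner_replicas:
        raise ValueError(
            f"{owner} controls {len(owned)} node(s), "
            f"but {min_owner_replicas} are required"
        )
    return owned[:min_owner_replicas]


nodes = [
    StorageNode("n1", "hospital-a"),
    StorageNode("n2", "hospital-b"),
    StorageNode("n3", "hospital-a"),
]
placement = place_replicas("hospital-a", nodes)
print([n.node_id for n in placement])  # ['n1', 'n3']
```

In this sketch a participant that does not operate enough nodes of its own cannot store records at all, which mirrors the idea that the legally responsible party also carries the cost of the replicas.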

Other legal requirements, such as protection of patient confidentiality, mandate the use of strong cryptography to protect against information disclosure. One way to guarantee patient confidentiality would be to let patients carry the key needed to decrypt their EHRs. However, a patient might lose the key or be admitted to a hospital in a state where they cannot produce it, making this method impractical. Such a scheme would also hinder research that uses patient data. To solve this problem, DIGHT uses trusted hardware to secure disk storage, and a public key infrastructure-based authorization policy and authentication system to control access to EHRs. By necessity, a healthcare professional assigned to, for example, an emergency ward is allowed access to any patient's EHR. To deter abuse of the confidence placed in such professionals by this policy, the DIGHT design uses secure logs to audit access. The log typically stores a request to access the patient's data, signed with the healthcare professional's private key. Likewise, new EHRs created by a healthcare professional are signed with that professional's private key and time-stamped by the system. The time stamp and signature are important if, for example, malpractice is suspected, since they ensure that created EHRs cannot be manipulated without detection.
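The following Python sketch shows one way an audit log can make tampering detectable. It is an illustration under stated assumptions, not the DIGHT log format: the `AuditLog` class, its field names and the hash chaining between entries are hypothetical, and an HMAC over a per-professional secret stands in for the private-key signature described above, because the Python standard library has no public-key signing.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # hash value used before the first entry


def sign(key: bytes, message: bytes) -> str:
    # Stand-in for a private-key signature (assumption for this sketch).
    return hmac.new(key, message, hashlib.sha256).hexdigest()


def entry_hash(prev_hash: str, payload: bytes, signature: str) -> str:
    return hashlib.sha256(
        prev_hash.encode() + payload + signature.encode()
    ).hexdigest()


class AuditLog:
    def __init__(self):
        self.entries = []

    def append(self, professional, patient, timestamp, key):
        """Record a signed access request, chained to the previous entry."""
        prev = self.entries[-1]["entry_hash"] if self.entries else GENESIS
        request = {"professional": professional, "patient": patient,
                   "timestamp": timestamp}
        payload = json.dumps(request, sort_keys=True).encode()
        signature = sign(key, payload)
        self.entries.append({
            "request": request,
            "signature": signature,
            "prev_hash": prev,
            "entry_hash": entry_hash(prev, payload, signature),
        })

    def verify(self, keys):
        """Return True only if every signature and chain link is intact."""
        prev = GENESIS
        for e in self.entries:
            key = keys.get(e["request"]["professional"])
            if key is None:
                return False
            payload = json.dumps(e["request"], sort_keys=True).encode()
            if not hmac.compare_digest(sign(key, payload), e["signature"]):
                return False
            if e["prev_hash"] != prev:
                return False
            if entry_hash(prev, payload, e["signature"]) != e["entry_hash"]:
                return False
            prev = e["entry_hash"]
        return True


keys = {"dr-rao": b"demo-key-1", "dr-sen": b"demo-key-2"}
log = AuditLog()
log.append("dr-rao", "patient-17", "2010-01-01T10:00:00Z", keys["dr-rao"])
log.append("dr-sen", "patient-17", "2010-01-01T11:30:00Z", keys["dr-sen"])
print(log.verify(keys))  # True

log.entries[0]["request"]["patient"] = "patient-99"  # simulated tampering
print(log.verify(keys))  # False
```

With real public-key signatures, an auditor would need only the professionals' public keys to run the same check, so the ability to verify the log would not imply the ability to forge entries.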

If the DIGHT system is successful, its design choices are likely to influence the software that handles EHRs for a very long time, possibly centuries, since converting to a newer, incompatible system is unlikely to be economically feasible. The DIGHT database is therefore designed from the beginning to support gradual upgrades as new requirements evolve.

Over time, the types of data stored in the database will inevitably change. The database supports a generic explicitly typed data format. The type information allows old data objects to be upgraded for use by newer software, and can also be used to temporarily downgrade data to allow old software to access newly created entries. To ensure data integrity and confidentiality in the face of improved cryptographic attacks, the system is designed with mechanisms for upgrading and re-certifying stored data.
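As a rough illustration of how explicit type information enables both upgrades and temporary downgrades, consider the Python sketch below. Everything in it is hypothetical: the `TypedRecord` structure, the `patient` type, its two versions and the converter tables are invented for the example and do not describe the actual DIGHT data format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TypedRecord:
    type_name: str  # explicit type tag stored with the data
    version: int    # version of that type's schema
    fields: dict


# Hypothetical schema change: version 1 stores a single "name" field,
# version 2 splits it into "given_name" and "family_name".
def _patient_1_to_2(fields):
    given, _, family = fields["name"].partition(" ")
    rest = {k: v for k, v in fields.items() if k != "name"}
    return {**rest, "given_name": given, "family_name": family}


def _patient_2_to_1(fields):
    rest = {k: v for k, v in fields.items()
            if k not in ("given_name", "family_name")}
    name = f'{fields["given_name"]} {fields["family_name"]}'.strip()
    return {**rest, "name": name}


# Converters are keyed by (type name, version they convert from).
UPGRADES = {("patient", 1): _patient_1_to_2}
DOWNGRADES = {("patient", 2): _patient_2_to_1}


def convert(record, target_version):
    """Step a record up or down one version at a time."""
    current = record
    while current.version < target_version:
        step = UPGRADES[(current.type_name, current.version)]
        current = TypedRecord(current.type_name, current.version + 1,
                              step(current.fields))
    while current.version > target_version:
        step = DOWNGRADES[(current.type_name, current.version)]
        current = TypedRecord(current.type_name, current.version - 1,
                              step(current.fields))
    return current


old = TypedRecord("patient", 1, {"name": "Asha Rao", "blood_group": "O+"})
new = convert(old, 2)
print(new.fields)
print(convert(new, 1) == old)  # True
```

Chaining single-version steps keeps the number of converters linear in the number of schema versions, at the cost of running several steps for very old records. Note that a downgrade is not always lossless in general; this example round-trips only because the two versions carry the same information.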

From a software maintenance perspective, we support gradual updates by structuring the system as a set of cooperating services communicating over documented platform- and implementation-neutral interfaces. This allows us to upgrade and replace parts of the system, while maintaining its availability, to accommodate new storage technologies and upgrade legacy hardware and software. For example, a storage node is upgraded online using bootstrap and handover protocols that ensure the new node has replicated the old node's data before it takes over from the old node.
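The ordering constraint in the storage node example can be sketched in Python as follows. This is a toy, single-process model under stated assumptions: the `Node` class and the `bootstrap` and `handover` functions are hypothetical names, and a real protocol would also have to handle writes that arrive during the copy, network failures and authentication, none of which is modelled here.

```python
class Node:
    def __init__(self, name):
        self.name = name
        self.records = {}     # record id -> stored bytes
        self.serving = False  # whether clients are directed to this node


def bootstrap(new_node, old_node):
    """Copy every record from the old node to the new node."""
    for record_id, data in old_node.records.items():
        new_node.records[record_id] = data


def handover(old_node, new_node):
    """Switch service to the new node, but only once it holds all data."""
    for record_id, data in old_node.records.items():
        if new_node.records.get(record_id) != data:
            raise RuntimeError(
                f"{new_node.name} has not replicated {record_id!r}"
            )
    # The new node starts serving before the old one stops,
    # so there is no moment at which neither node is serving.
    new_node.serving = True
    old_node.serving = False


old = Node("storage-old")
old.records = {"ehr-1": b"...", "ehr-2": b"..."}
old.serving = True

new = Node("storage-new")
bootstrap(new, old)
handover(old, new)
print(new.serving, old.serving)  # True False
```

Separating the copy from the switch makes the safety condition explicit: calling `handover` before `bootstrap` has completed raises an error instead of silently retiring a node that holds the only copy of some records.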

To summarize, the guiding principles behind our design of the DIGHT distributed EHR database are to make legal and operational responsibilities coincide; to support gradual online updates of software and hardware; to choose a flexible data model which can be extended; and to avoid vendor and technology lock-in by using open and documented interfaces.


Please contact:
Frej Drejhammar
SICS, Sweden
Tel: +46 8 633 1617
