by Emmanouil G. Spanakis (FORTH-ICS)

The vision of the SpeechXRays project [L1] is to provide a user recognition system combining voice biometrics, which are convenient and cost effective, with video, which can help to improve accuracy, and to introduce superior anti-spoofing capabilities. SpeechXRays aims to outperform other state-of-the-art solutions in the areas of security, privacy, usability and cost-effectiveness.

The project takes a new scientific approach to voice biometrics by using human voice physiology, which produces precise acoustic cues that are unique to each individual speaker. The precise vocal tract physiology is directly derived from the feature analysis of the speech spectrogram. We use this information to model the human voice quality characteristics that are used by the human auditory system when identifying a speaker’s voice. The project models acoustic cues of voice physiology and detects them in the first pass of a speaker (voice) authentication system in a deterministic discrete time signal processing architecture. Multi-channel biometrics further enhance the system’s performance.
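To illustrate how a second channel can enhance performance, the following is a minimal sketch of score-level fusion, one common way of combining biometric channels. It is not confirmed to be the method used in SpeechXRays. The function name, the weight and the assumption that both match scores lie in [0, 1] are hypothetical.

```python
def fuse_scores(voice_score: float, face_score: float, voice_weight: float = 0.6) -> float:
    """Combine two match scores with a weighted sum (score-level fusion).

    Assumption: both scores are already normalised to [0, 1]; the default
    weight of 0.6 is purely illustrative, not a project parameter.
    """
    if not 0.0 <= voice_weight <= 1.0:
        raise ValueError("voice_weight must be between 0 and 1")
    return voice_weight * voice_score + (1.0 - voice_weight) * face_score


# A weak voice match can be compensated by a strong face match, and vice versa.
fused = fuse_scores(voice_score=0.55, face_score=0.90)
```

In practice the weight would be tuned on enrolment data, and a channel recorded in poor conditions (for example, a noisy room) could be given a lower weight.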

The solution combines voice acoustic analysis (rather than models based purely on statistics) with dynamic face recognition (including lip movement and facial analysis). The technology is packaged as a unified service that can run the speaker recognition process either locally on the device or remotely via a secure cloud connection. In the local case, a cancellable biometric template is created by binding keys with biometric data and is stored securely on the device, for example on the SIM card. In the remote case, the cancellable biometric template is stored securely on a private cloud under the responsibility of the data subject, and not on the service provider’s servers.
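The idea of a cancellable template bound to a key can be sketched as follows. This uses a simplified BioHashing-style random projection, a well-known technique chosen here for illustration only; the article does not state which key-binding scheme SpeechXRays uses. The function names, feature values and keys are all hypothetical.

```python
import hashlib
import random
from typing import List, Sequence


def cancellable_template(features: Sequence[float], user_key: bytes, n_bits: int = 16) -> List[int]:
    """Project features onto key-seeded random directions and keep the sign bits.

    Revoking the key and issuing a new one yields a new template from the
    same biometric features. Illustrative only: Python's `random` module is
    not cryptographically secure, and the directions are not orthonormalised
    as they would be in full BioHashing.
    """
    seed = int.from_bytes(hashlib.sha256(user_key).digest(), "big")
    rng = random.Random(seed)
    bits = []
    for _ in range(n_bits):
        direction = [rng.gauss(0.0, 1.0) for _ in features]
        projection = sum(f * d for f, d in zip(features, direction))
        bits.append(1 if projection >= 0 else 0)
    return bits


def hamming_distance(a: Sequence[int], b: Sequence[int]) -> int:
    """Count the positions at which two equal-length bit templates differ."""
    return sum(x != y for x, y in zip(a, b))


features = [0.12, -0.53, 0.88, 0.05, -0.31]           # hypothetical feature vector
enrolled = cancellable_template(features, b"key-v1")  # stored template
probe = cancellable_template(features, b"key-v1")     # same key, same features
reissued = cancellable_template(features, b"key-v2")  # template after key revocation
```

Verification would compare `hamming_distance(enrolled, probe)` against a threshold. If a stored template is compromised, the key is replaced and a new template is issued without the user's biometric traits being permanently exposed.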

The project will test the solution in three real-life use cases requiring various degrees of security: a consumer use case (low security), an eHealth use case (medium security) and a workforce use case (high security). All scenarios will demonstrate authentication over a secure broadband network giving access to specific services. The technology will be deployed to 2,000 users across the three pilots. The eHealth pilot is discussed below.

FORTH is responsible for the eHealth use case, deployed in Crete. The SpeechXRays eHealth pilot will test the security, privacy, usability and cost-effectiveness of the security platform. In particular, the scenario will test the context-dependent feature that allows administrators to adjust the trade-off between the false acceptance rate (FAR) and the false rejection rate (FRR), in order to reduce the risk of false rejects for low-security data (e.g., physical examination) and the risk of false accepts for high-security data (e.g., MRI/CT scans).
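The FAR/FRR trade-off can be made concrete with a small sketch. The scores, threshold values and sensitivity labels below are invented for illustration and are not taken from the pilot. The sketch assumes that a higher score means a better match and that a user is accepted when the score reaches the threshold.

```python
from typing import Sequence, Tuple


def far_frr(genuine: Sequence[float], impostor: Sequence[float], threshold: float) -> Tuple[float, float]:
    """Return (FAR, FRR) when every score >= threshold is accepted."""
    false_accepts = sum(1 for s in impostor if s >= threshold)
    false_rejects = sum(1 for s in genuine if s < threshold)
    return false_accepts / len(impostor), false_rejects / len(genuine)


# Hypothetical per-context thresholds an administrator might configure.
THRESHOLDS = {"low": 0.40, "high": 0.80}


def is_accepted(score: float, sensitivity: str) -> bool:
    """Accept or reject a match score for data of the given sensitivity."""
    return score >= THRESHOLDS[sensitivity]


# Invented match scores for genuine users and impostors.
genuine = [0.91, 0.85, 0.62, 0.78, 0.55]
impostor = [0.10, 0.35, 0.45, 0.82, 0.20]

low = far_frr(genuine, impostor, THRESHOLDS["low"])    # (0.4, 0.0)
high = far_frr(genuine, impostor, THRESHOLDS["high"])  # (0.2, 0.6)
```

With these made-up scores, the low threshold rejects no genuine users but accepts two of the five impostors, while the high threshold halves the false accepts at the cost of rejecting three of the five genuine users. That is the trade-off an administrator would tune per data category.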

FORTH-ICS is responsible for running the eHealth use case pilot in compliance with the General Data Protection Regulation (GDPR), to allow patient monitoring and medical expert collaboration. The pilot includes biometric data acquisition and an assessment of the user identification services provided, enabling secure access to an eHealth collaboration platform for different stakeholders. Once the SpeechXRays security layer has granted a user access to the platform, the user can access personal health data over 3G/4G or WLAN from a mobile device (laptop, tablet or smartphone). Patients and doctors will use the remote biometrics solution to access a collaboration platform developed by FORTH to support the prevention and management of a chronic condition (osteoarthritis). Patients will be able to report health data such as activity level and pain remotely and securely, while general practitioners and specialists will be able to access the patient journals for decision support.

The SpeechXRays platform will thus be used to study how to optimally design a modular biometric platform for use in the eHealth domain. Our efforts focus on identifying the benefits of deploying biometric tools, which can offer greater security, convenience and accountability than other authentication methods (PINs, passwords, etc.).

Figure 1: SpeechXRays eHealth pilot study flow diagram for user enrollment and verification/authentication.

Figure 2: SpeechXRays workflow for eHealth use case for doctor/patient authentication to access medical data.



Please contact:
Emmanouil G. Spanakis, FORTH-ICS, Greece
