by Péter Földesy, Imre Jánoki, Ákos Zarándy (SZTAKI and Péter Pázmány Catholic University, Budapest)

A camera and machine learning based system, developed at SZTAKI and Péter Pázmány Catholic University, Budapest [L1], enables continuous non-contact measurement of the respiration and pulse of premature infants. It also performs high-precision monitoring, issues immediate apnoea warnings, and logs motion activity and caring events.

Hospitals need to monitor vital signs, such as the heart rate and respiration of newborn infants, continuously and reliably, particularly in neonatal intensive care units (NICUs). The heart and respiration rates can be extracted from the electrocardiogram (ECG), which, despite being a non-invasive technique, still relies on direct contact with the body. The self-adhesive electrodes are relatively expensive and, more importantly, they can easily damage the sensitive skin of preterm infants. Therefore, non-contact monitoring is a daily need in NICUs.

Recent studies have shown that non-contact visual vital-sign monitoring is a reliable and accurate technique [1], although, like traditional contact monitoring methods (e.g. ECG, pulse oximeter), it suffers from motion artefacts. During periods of caring (e.g. the baby is removed, skin-to-skin contact with a parent is visible, cleaning, nurses changing the feeding tube) or intense activity, the measurements are inaccurate, so these situations need to be treated separately. Our system can detect and handle common situations, such as infant self-motion, phototherapy treatment and low-light conditions with infrared illumination, with a high confidence level. Unlike other systems, it provides continuous monitoring that is not limited to motionless periods. The respiration waveform and rate are calculated directly from the chest and abdomen movements, which gives a physiologically and computationally more reliable and more established result than extracting them from remote photoplethysmography (rPPG), as some existing algorithms do.

Within the signal and rate extraction framework, a top-level classifier, consisting of feature extraction followed by a neural network, distinguishes the events and the status of the scene. This classifier can detect an empty incubator, an active or passive infant, caring and other motion-related situations with 98% precision in real-life clinical practice. Whenever the infant is detected, the heart and respiration rates are extracted from the video feed as described below.
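
As an illustration of how such a scene status might gate the rate extraction, the following minimal Python sketch runs a rate estimator only when an infant is visible and no caring event is in progress. The class labels, the `gate_measurement` helper and the choice to suppress output during caring are assumptions made for this example, not details of the described system.

```python
from enum import Enum


class Scene(Enum):
    # Hypothetical labels, loosely following the situations named in the text.
    EMPTY = "empty incubator"
    PASSIVE = "passive infant"
    ACTIVE = "active infant"
    CARING = "caring"


# Assumption: rates are only reported for these scene states.
MEASURABLE = {Scene.PASSIVE, Scene.ACTIVE}


def gate_measurement(scene, estimate_rates):
    """Call the rate estimator only when the scene state allows a measurement.

    `estimate_rates` is any callable returning (heart_bpm, resp_bpm).
    """
    if scene in MEASURABLE:
        heart_bpm, resp_bpm = estimate_rates()
        return {"scene": scene.value, "heart_bpm": heart_bpm, "resp_bpm": resp_bpm}
    # No infant in view, or a caring event: report the state without rates.
    return {"scene": scene.value, "heart_bpm": None, "resp_bpm": None}
```

A caller would pass the classifier output for the current time window together with a function wrapping the heart and respiration rate estimators.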

The heart rate calculation consists of an ensemble of two networks: (i) the signal extractor network, which derives the pulse-signal from the video input; (ii) the rate estimator network, which calculates the heart rate value from the signal. For the former, the PhysNet architecture [2] is applied and the rate estimator is our own network, named RateEstNet. These networks are fused and trained together after the pre-training of PhysNet. We have developed a novel augmentation technique, called frequency augmentation, which produces a uniform heart rate distribution that results in unbiased training (i.e., the network is not biased towards the average heart rate value).
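
The text does not spell out how frequency augmentation is implemented. One plausible reading, sketched below under that assumption, is to resample a training clip in time so that its apparent heart rate moves to a target drawn uniformly from a chosen range. The sketch works on a 1-D signal for brevity; the function names and the `bpm_range` values are illustrative only.

```python
import random


def resample_linear(signal, factor):
    """Resample a 1-D sequence by linear interpolation.

    factor > 1 plays the sequence faster (fewer output samples), which
    scales every frequency in it, including the pulse, by `factor`.
    """
    n_out = int((len(signal) - 1) / factor) + 1
    out = []
    for i in range(n_out):
        pos = i * factor
        lo = int(pos)
        hi = min(lo + 1, len(signal) - 1)
        frac = pos - lo
        out.append((1.0 - frac) * signal[lo] + frac * signal[hi])
    return out


def frequency_augment(signal, source_bpm, rng, bpm_range=(90.0, 210.0)):
    """Shift the apparent heart rate to a uniformly drawn target.

    Assumption: `bpm_range` is an illustrative range, not a value
    taken from the article.
    """
    target_bpm = rng.uniform(*bpm_range)
    factor = target_bpm / source_bpm
    return resample_linear(signal, factor), target_bpm
```

Drawing the target rate uniformly, rather than keeping the rates present in the recordings, is what would flatten the label distribution seen by the network. For video input the same resampling would be applied along the frame axis.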

The respiration rate calculation combines traditional image processing based on optical flow with a machine learning approach. A dense optical flow algorithm extracts the movements from a series of grayscale images in a time window of about six seconds. The resulting differential sequence is then masked using a U-Net machine learning architecture to keep only the abdomen and chest [3]. The masked images are summed and processed to obtain the respiration waveform. In the last step, a neural network consisting of one-dimensional convolutional layers with a fully connected part at the end produces the respiration rate.
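
The masking and summation steps can be sketched in a few lines of Python. The example below assumes that the vertical optical flow fields and a binary chest/abdomen mask are already available as nested lists, and it replaces the final one-dimensional convolutional network with a simple spectral peak search, so it is a simplified stand-in rather than the described method. All function names and the 20 to 100 breaths per minute search range are assumptions.

```python
import math


def masked_motion(flow_y, mask):
    """Sum the vertical flow over pixels where the mask is 1."""
    return sum(v * m
               for row_v, row_m in zip(flow_y, mask)
               for v, m in zip(row_v, row_m))


def respiration_waveform(flow_fields, mask):
    """Integrate per-frame masked motion into a zero-mean waveform."""
    wave, total = [], 0.0
    for flow_y in flow_fields:
        total += masked_motion(flow_y, mask)
        wave.append(total)
    mean = sum(wave) / len(wave)
    return [w - mean for w in wave]


def dominant_rate_bpm(wave, fps, lo_bpm=20.0, hi_bpm=100.0, step_bpm=1.0):
    """Return the candidate rate (breaths/min) with the largest DFT power.

    Stand-in for the rate network: scans candidate frequencies directly.
    """
    best_bpm, best_power = lo_bpm, -1.0
    bpm = lo_bpm
    while bpm <= hi_bpm:
        omega = 2.0 * math.pi * (bpm / 60.0) / fps
        re = sum(w * math.cos(omega * i) for i, w in enumerate(wave))
        im = sum(w * math.sin(omega * i) for i, w in enumerate(wave))
        power = re * re + im * im
        if power > best_power:
            best_bpm, best_power = bpm, power
        bpm += step_bpm
    return best_bpm
```

With a six-second window the spectral resolution is coarse (about 10 breaths per minute between independent frequency bins), which is one reason a learned rate estimator may be preferable to a plain peak search.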

Our system, built from an ensemble of hierarchical neural networks, can estimate pulse and respiration rates and can handle medical intervention and heavy motion scenarios (see Figure 1). In physical form, the system is being integrated into an open incubator pilot product of a leading Hungarian incubator manufacturer, including medically safe night vision illumination and hardware acceleration of the neural networks by an NVIDIA Jetson Nano module.

The method’s performance is being evaluated in real time and on a carefully annotated database collected at the First Department of Neonatology of Paediatrics, Department of Obstetrics and Gynaecology, Semmelweis University, Budapest, Hungary [L2]. The project started in early 2018 and is still running, with further product integration and R&D on extended night vision capabilities, closed incubators, behavioural studies and sleep quality evaluation.


[1] K. Gibson, et al: “Non-contact heart and respiratory rate monitoring of preterm infants based on a computer vision system: a method comparison study”, Pediatric research 86.6, pp. 738-741, 2019.
[2] Z. Yu, L. Xiaobai, Z. Guoying: “Remote photoplethysmograph signal measurement from facial videos using spatio-temporal networks”, Proc. of BMVC, pp. 1-15, 2019.
[3] R. Janssen, W. Wang, A. Moço, G. de Haan: “Video-based respiration monitoring with automatic region of interest detection”, Physiological Measurement, vol. 37, no. 1, pp. 100-114, 2015.

Please contact:
Péter Földesy, SZTAKI, Hungary
+36 1 279 6000/7182