Qiang Zhu

and 5 more

Recent studies have shown that subtle changes in human face color caused by heartbeats can be captured by regular RGB digital video cameras. It is possible, though challenging, to track a person’s pulse rate (PR) when the video contains significant body motion by the subject, as in a fitness setting. The robustness gains in recently proposed systems are often achieved by adding or changing certain modules in the system’s pipeline. Most existing works, however, evaluate pulse rate estimation only at the system level for particular pipeline configurations, so the contribution of each module remains unclear. To gain a better understanding of performance at the module level and to facilitate future research in explainable learning and artificial intelligence (AI) for physiological monitoring, this paper conducts an in-depth comparative study of video-based pulse rate tracking algorithms, with a special focus on challenging fitness scenarios involving significant movement. Representative efforts in the field over the past decade are reviewed, and on that basis a reconfigurable remote photoplethysmography (rPPG) framework/pipeline comprising the major processing modules is constructed. For performance attribution, different candidates for each module are evaluated while the remaining modules are held fixed. The performance evaluation is based on one signal quality metric and four pulse-rate estimation metrics, and uses simultaneously recorded ECG-based heart rate measurements as the reference. Experimental results on a challenging fitness dataset reveal the synergy between pulse color mapping and adaptive motion filtering in obtaining accurate pulse rate estimates. The results also suggest the importance of robust frequency tracking for accurate PR estimation in low signal-to-noise-ratio fitness scenarios.
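As a rough illustration of the frequency-estimation stage that closes such a pipeline, the sketch below picks the strongest spectral peak of a pulse trace within a plausible heart-rate band. It is a simple baseline, not the tracking method evaluated in the paper; the function name `estimate_pulse_rate_bpm`, the 0.7 to 4.0 Hz band, and the synthetic trace are assumptions made only for this example.

```python
import numpy as np

def estimate_pulse_rate_bpm(trace, fps, lo_hz=0.7, hi_hz=4.0):
    """Return the dominant frequency of `trace` inside [lo_hz, hi_hz], in beats per minute.

    Hypothetical helper: a plain periodogram peak, with no motion filtering
    and no frame-to-frame tracking.
    """
    x = np.asarray(trace, dtype=float)
    x = x - x.mean()                      # remove the DC level
    window = np.hanning(len(x))           # reduce spectral leakage
    n_fft = 8 * len(x)                    # zero-pad for a finer frequency grid
    spectrum = np.abs(np.fft.rfft(x * window, n=n_fft)) ** 2
    freqs = np.fft.rfftfreq(n_fft, d=1.0 / fps)
    band = (freqs >= lo_hz) & (freqs <= hi_hz)
    peak_hz = freqs[band][np.argmax(spectrum[band])]
    return 60.0 * peak_hz

# Synthetic 20 s trace at 30 frames/s: a 1.5 Hz (90 bpm) pulse plus noise.
fps = 30.0
t = np.arange(0, 20, 1 / fps)
rng = np.random.default_rng(0)
trace = 0.5 * np.sin(2 * np.pi * 1.5 * t) + 0.2 * rng.standard_normal(t.size)
bpm = estimate_pulse_rate_bpm(trace, fps)
print(f"estimated pulse rate: {bpm:.1f} bpm")
```

A single-window peak pick like this is the kind of estimator that tends to fail when motion artifacts dominate the spectrum, which is the situation where the abstract points to robust frequency tracking.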

Xin Tian

and 3 more

Blood oxygen saturation (SpO2) is an important indicator of pulmonary and respiratory function. Clinical findings on COVID-19 show that many patients had dangerously low blood oxygen levels shortly before their conditions worsened. Regular monitoring of the blood oxygen level is therefore recommended as a precaution, especially for vulnerable populations. Recent works have investigated how ubiquitous smartphone cameras can be used to infer SpO2. Most of these works are contact-based, requiring users to cover a phone’s camera and its nearby light source with a finger to capture the light re-emitted from the illuminated tissue. Contact-based methods may lead to skin irritation and sanitary concerns, especially during a pandemic. In this paper, we propose a noncontact method for SpO2 monitoring using hand videos acquired by smartphones. Considering the optically broadband nature of the red (R), green (G), and blue (B) color channels of smartphone cameras, we exploit all three channels of RGB sensing to distill SpO2 information beyond the traditional ratio-of-ratios (RoR) method, which uses only two wavelengths. To further improve the accuracy of the SpO2 prediction, we design adaptive narrow bandpass filters, centered on an accurately estimated heart rate, to isolate the AC component most closely related to cardiac activity in each color channel. Experimental results show that our proposed blood oxygen estimation method can reach a mean absolute error of 1.26% when a pulse oximeter is used as a reference, outperforming the traditional RoR method by 25%.
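The traditional two-channel RoR baseline mentioned above, combined with a narrow bandpass centered on the heart rate, can be sketched as follows. This is a minimal sketch under stated assumptions: the filter is a simple FFT mask rather than the authors' filter design, the names `narrow_bandpass` and `ratio_of_ratios` are hypothetical, the ±0.1 Hz half-width and the synthetic channel values are chosen only for illustration, and the calibration that maps the ratio to an SpO2 percentage is deliberately left out because the abstract does not specify it.

```python
import numpy as np

def narrow_bandpass(x, fps, center_hz, half_width_hz=0.1):
    """Keep only spectral content within +/- half_width_hz of center_hz (FFT mask)."""
    x = np.asarray(x, dtype=float)
    spec = np.fft.rfft(x - x.mean())
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fps)
    spec[np.abs(freqs - center_hz) > half_width_hz] = 0.0
    return np.fft.irfft(spec, n=len(x))

def ratio_of_ratios(ch_a, ch_b, fps, hr_hz):
    """Return (AC_a / DC_a) / (AC_b / DC_b), with AC taken around the heart rate."""
    def ac_over_dc(ch):
        ch = np.asarray(ch, dtype=float)
        ac = narrow_bandpass(ch, fps, hr_hz).std()   # pulsatile (AC) strength
        return ac / ch.mean()                        # normalized by the DC level
    return ac_over_dc(ch_a) / ac_over_dc(ch_b)

# Synthetic channel averages: 20 s at 30 frames/s, heart rate 1.2 Hz (72 bpm).
fps, hr_hz = 30.0, 1.2
t = np.arange(0, 20, 1 / fps)
rng = np.random.default_rng(1)
pulse = np.sin(2 * np.pi * hr_hz * t)
red = 100.0 * (1 + 0.010 * pulse) + 0.05 * rng.standard_normal(t.size)
blue = 60.0 * (1 + 0.020 * pulse) + 0.05 * rng.standard_normal(t.size)

ror = ratio_of_ratios(red, blue, fps, hr_hz)
print(f"ratio of ratios: {ror:.3f}")
```

Because the passband follows the estimated heart rate, noise outside a narrow band around the pulse frequency does not inflate the AC estimate; the same per-channel AC/DC features could be computed for all three RGB channels.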

Ravi Garg

and 2 more

The Electric Network Frequency (ENF) is a signature of power distribution networks that can be captured by multimedia recordings made in areas where there is electrical activity. This has led to the emergence of several forensic applications based on the ENF signature. Examples include estimating or verifying the time of recording of a media signal and inferring the power grid associated with the location in which the media signal was recorded. In this paper, we carry out a feasibility study to examine whether embedded ENF traces can be used to pinpoint the location of recording of a signal within a power grid. We demonstrate that it is possible to pinpoint the location of recording to a certain geographical resolution using power signal recordings containing strong ENF traces. To this end, a high-pass-filtered version of the ENF signal is extracted, and it is demonstrated that the correlation between two such signals, extracted from recordings made at different geographical locations within the same grid, decreases as the distance between the recording locations increases. We harness this property of the correlation between ENF signals to propose trilateration-based localization methods, which pinpoint the unknown location of a recording using some known recording locations as anchors. We also discuss the challenges that need to be overcome in order to extend this work to ENF traces in noisier audio/video recordings for such fine localization purposes.
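The trilateration step can be illustrated with a generic linearized least-squares solver. This is a sketch, not the localization method proposed in the paper: it assumes that each anchor's correlation value has already been converted into a distance estimate by some calibrated mapping, which the abstract does not describe, and the function name `trilaterate`, the anchor coordinates, and the test location are hypothetical.

```python
import numpy as np

def trilaterate(anchors, distances):
    """Estimate a 2-D position from anchor positions and distance estimates.

    Each anchor i gives |x - p_i|^2 = d_i^2. Subtracting the equation of
    anchor 0 removes the |x|^2 term and leaves a linear system:
        2 (p_i - p_0) . x = |p_i|^2 - |p_0|^2 - (d_i^2 - d_0^2)
    which is solved in the least-squares sense.
    """
    p = np.asarray(anchors, dtype=float)
    d = np.asarray(distances, dtype=float)
    A = 2.0 * (p[1:] - p[0])
    b = (np.sum(p[1:] ** 2, axis=1) - np.sum(p[0] ** 2)) - (d[1:] ** 2 - d[0] ** 2)
    est, *_ = np.linalg.lstsq(A, b, rcond=None)
    return est

# Hypothetical anchors (arbitrary units) and an unknown recording location.
anchors = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0], [10.0, 10.0]])
true_location = np.array([3.0, 4.0])

# Stand-in for distances inferred from ENF correlation: exact ranges here.
distances = np.linalg.norm(anchors - true_location, axis=1)

estimate = trilaterate(anchors, distances)
print("estimated location:", estimate)
```

With noisy distance estimates, using more than three anchors lets the least-squares solution average out part of the error, at the cost of weighting all anchors equally.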