Diego L. Guarin

and 6 more

Background: Automatic facial landmark localization is an essential component in many computer vision applications, including video-based detection of neurological diseases. Machine learning models for facial landmark localization are typically trained on faces of healthy individuals, and we found that model performance is inferior when applied to faces of people with neurological diseases. Fine-tuning pre-trained models with representative images significantly improves performance on clinical populations. However, questions remain about the characteristics of the database used to fine-tune the model and about the clinical impact of the improved model. Methods: We employed the Toronto NeuroFace dataset (videos of healthy controls (HC), individuals post-stroke, and individuals with amyotrophic lateral sclerosis performing speech and non-speech tasks, with thousands of manually annotated frames) to fine-tune a well-known deep learning-based facial landmark localization model. The pre-trained and fine-tuned models were used to extract landmark-based facial features from videos, and the facial features were used to discriminate clinical groups from HC. Results: Fine-tuning a facial landmark localization model with a diverse database that includes HC and individuals with neurological disorders resulted in significantly improved performance for all groups. Our results also showed that fine-tuning the model with representative data greatly improved the ability of the subsequent classifier to distinguish clinical groups from HC in videos. Conclusions: Using a diverse database for model fine-tuning might result in better model performance for HC and clinical groups. We demonstrated that fine-tuning a model for landmark localization with representative data results in improved detection of neurological diseases.

Allan Kember J

and 10 more

Objective: To build a computer vision model that can automatically detect sleeping position in the third trimester under real-world conditions. Design: This study used data from an ongoing observational study and a previous cross-sectional study. Setting: Participants’ homes. Sample: Pregnant participants in the third trimester and their bed partners. Methods: Real-world overnight video recordings were collected from an ongoing, Canada-wide, prospective, four-night, home sleep apnea study, and controlled-setting video recordings were taken from a previous study. Images were extracted from the videos and body positions were annotated. Five-fold cross-validation was used to train, validate, and test a model using state-of-the-art deep convolutional neural networks. Main Outcome Measures: Precision and recall of the model for detecting thirteen pre-defined body positions. Results: The dataset contained 39 pregnant participants, 13 bed partners, 12,930 images, and 47,001 annotations. The model was trained to detect pillows, twelve sleeping positions, and a sitting position in both the pregnant person and their bed partner simultaneously. The model significantly outperformed a similar previous model for the three most commonly occurring natural sleeping positions in pregnant and non-pregnant adults, with an 82-to-89% average probability of correctly detecting them and a 15-to-19% chance of failing to detect them when any one of them is present. Conclusions: The model holds potential to answer as yet unanswered research and clinical questions regarding the relationship between sleeping position and pregnancy outcomes.
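
As an illustration of the outcome measures named above, the following is a minimal sketch of per-class precision and recall, where the "chance of failing to detect" a position that is present is read as one minus recall. It is not the authors' evaluation code: it assumes a simplified setting with exactly one true and one predicted label per example (a detection model would first need to match predicted and annotated boxes), and the function name and the position labels are hypothetical.

```python
from collections import Counter


def per_class_precision_recall(true_labels, predicted_labels):
    """Return {label: (precision, recall)} for paired label sequences.

    Simplifying assumption: one true label and one predicted label per
    example, so every mismatch is a false positive for the predicted
    class and a false negative for the true class.
    """
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")

    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(true_labels, predicted_labels):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1
            fn[truth] += 1

    scores = {}
    for label in set(true_labels) | set(predicted_labels):
        n_predicted = tp[label] + fp[label]  # times the class was predicted
        n_actual = tp[label] + fn[label]     # times the class was present
        precision = tp[label] / n_predicted if n_predicted else 0.0
        recall = tp[label] / n_actual if n_actual else 0.0
        scores[label] = (precision, recall)
    return scores


# Hypothetical labels, for illustration only.
truth = ["supine", "supine", "left", "left", "right"]
preds = ["supine", "left", "left", "left", "right"]
scores = per_class_precision_recall(truth, preds)

for label, (precision, recall) in sorted(scores.items()):
    # Miss rate: chance of failing to detect the class when it is present.
    print(f"{label}: precision={precision:.2f} recall={recall:.2f} "
          f"miss_rate={1 - recall:.2f}")
```

Under this reading, a 15-to-19% chance of missing a position that is present would correspond to a recall of roughly 81-to-85%.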