Ambient Monitoring of Gait and Machine Learning Models for Dynamic and Short-Term Falls Risk Assessment in People With Dementia

Falls are a leading cause of morbidity and mortality in older adults with dementia residing in long-term care. Having access to a frequently updated and accurate estimate of the likelihood of a fall over a short time frame for each resident will enable care staff to provide targeted interventions to prevent falls and resulting injuries. To this end, machine learning models to estimate and frequently update the risk of a fall within the next 4 weeks were trained on longitudinal data from 54 older adult participants with dementia. Data from each participant included baseline clinical assessments of gait, mobility, and fall risk at the time of admission, daily medication intake in three medication categories, and frequent assessments of gait performed via a computer vision-based ambient monitoring system. Systematic ablations investigated the effects of various hyperparameters and feature sets and experimentally identified differential contributions from baseline clinical assessments, ambient gait analysis, and daily medication intake. In leave-one-subject-out cross-validation, the best-performing model predicted the likelihood of a fall over the next 4 weeks with a sensitivity and specificity of 72.8 and 73.2, respectively, and achieved an area under the receiver operating characteristic curve (AUROC) of 76.2. By contrast, the best model excluding ambient gait features achieved an AUROC of 56.2 with a sensitivity and specificity of 51.9 and 54.0, respectively. Future research will focus on externally validating these findings to prepare for the implementation of this technology to reduce falls and fall-related injuries in long-term care.

I. INTRODUCTION
Falls are one of the leading causes of injury, loss of independence, and mortality in older adults [1]. In particular, the annual incidence of falls in older adults with dementia is 70%-80% [2], approximately twice the incidence of falls in cognitively healthy older adults [3], [4]. In long-term care (LTC) homes, where the majority of residents have dementia [5], about 25% of falls result in injuries, 15% severe enough to require medical care [6]. Since falls have a major impact on functional ability, caregiver burden, and quality of life, fall prevention has become an important public health goal in high-risk populations such as people with dementia and residents of LTC homes [7].
An important part of a falls prevention plan is determining the risk of falling. One of the shortcomings of many existing approaches is that their risk evaluation is solely based on a single cross-sectional assessment, which provides an estimate of fall risk over time frames of months to years [8]. By this standard, all LTC residents are categorized as having a high risk of falling, making it difficult for care staff to act upon this information. While it may be useful to be aware of someone's long-term risk of falls, acute changes in gait and dynamic stability may be predictive of falls in the short-term, i.e., within days to weeks. For instance, new medications, failing health, or declines in cognition may detectably alter falls risk in the days immediately preceding a fall. To improve upon static assessments and to identify individuals at imminent risk of falling, one needs to incorporate frequent (daily or weekly) measurements that may correlate with or cause falls in the short term, e.g. over the next few days or weeks.
There are many intrinsic and extrinsic factors that contribute to falls [9]. Extrinsic or environmental fall risk factors include slippery surfaces, tripping hazards, and dim lighting. Examples of intrinsic fall risk factors include gait and balance problems, poor vision, poor reaction time, and muscle weakness. In the absence of full awareness of all intrinsic and extrinsic factors contributing to falls, it is impossible to fully and accurately predict future falls. It is, however, still very useful to try to build and validate techniques that assess fall risk with moderate accuracy, as LTC care staff can take this information into account in targeted interventions. For example, if a fall risk assessment system could identify LTC residents with a rapid rise in their estimated risk of falling (based on a combination of specific intrinsic and/or extrinsic risk factors), clinicians and care staff could attend to those residents and either assess their health status and recent changes in medications as possible contributing factors or implement harm reduction strategies such as hip protectors. Such targeted interventions may improve the efficiency of LTC staff in providing timely care and reducing fall risk.
Previous research on gait and fall risk in dementia has relied on cross-sectional assessments in laboratory settings or controlled environments (such as on flat instrumented walkways), which do not account for the cognitive and physical demands of navigating in the real world. Because of variations in the individuals' motivation, adherence, and perception of the task, standardized clinical measures of fall risk are difficult to obtain in people with dementia [10]. These measurements are typically insensitive to change [11]. Thus, there is a need for an objective, reliable, valid, automatic, and responsive fall risk assessment system in both research and clinical practice. Despite the potential of technology to enhance mobility assessment in dementia, its clinical use may be constrained by factors like obtrusiveness, participant compliance, and the expense of equipment and software. These challenges are addressed in this paper. The objective of this study was to develop a dynamic fall risk assessment tool (i.e., one that updates its risk estimate over time) for older adults with dementia. To this end, we applied a machine learning approach and leave-one-subject-out cross-validation (LOSOCV) to examine the predictive power of various combinations of predictors, including static baseline assessments, medication exposures over time, and gait parameters measured via frequent ambient monitoring.
In summary, the main contributions of our paper are: 1) We present a solution that addresses the challenges mentioned above via an inexpensive, vision-based, single-camera model that predicts the short-term risk of falls using longitudinal data captured from natural walking bouts in a real-world setting. 2) Our model is the first short-term fall risk prediction model specifically designed for a population of older adults with dementia, who have different gait patterns than healthy people and are more prone to falls resulting from gait disorders. 3) Our work shows, for the first time, that fall risk predictions in older adults with dementia can be improved using repeated ambient monitoring of gait. The remainder of the paper is organized as follows: a brief literature review of existing research and approaches for fall prediction and prevention is presented in Section II; the proposed machine learning-based fall risk prediction model is presented in Section III; performance evaluation and an ablation study of the proposed model are discussed in Section IV; and, finally, conclusions and future work directions are provided in Section V.

II. RELATED WORK
Current approaches to fall prevention in LTC settings involve some combination of baseline risk factor assessment and multicomponent, individualized falls and injury prevention plans [13]. There are multiple concerns regarding the feasibility of these approaches in the LTC population [10], [14], [15]. For example, over 60% of LTC residents have dementia [5] and many have difficulty following the instructions and steps required to complete clinical fall risk assessments. It is therefore desirable to develop and validate methods that could unobtrusively assess fall risk in this population. Moreover, dementia is associated with cognitive decline and a high incidence of gait disorders [16]. This makes existing fall risk assessments, which were developed for community-dwelling older adults (or healthy people in general), difficult to apply to this population. Thus, there is a need to develop an automated monitoring system that can capture and learn the dynamics of gait in older adults with dementia and determine its correlation with the short-term risk of falls.
Wearable technologies have been used in multiple studies to assess fall risk in older adults [8], [17], [18], [19], [20]. While prospective fall-risk prediction via wearable sensors is potentially very useful for cognitively healthy older adults, compliance and adherence challenges limit the use of sensors in individuals with dementia [21], [22], [23]. Ambient monitoring offers an opportunity to frequently monitor gait parameters for older adults with dementia, bypassing the need for explicit adherence. There are also commercially available fall detection systems and numerous research studies on the development and evaluation of such technologies [24]. These devices and technologies are useful and potentially life-saving, as they can detect once a fall has happened and decrease clinical response time, but they will not help prevent falls or fall-related injuries.
In a previous analysis of the dataset used in this study, baseline gait variables measured during the first two weeks after admission could predict the number of falls experienced by an older adult with dementia during the remainder of their hospital stay [32]. In single variable analyses, the estimated lateral margin of stability, step width, and step time variability were significantly associated with number of future falls. Importantly, in multivariate analysis controlling for clinical and demographic variables, the lateral margin of stability remained associated with the number of future falls. When Cox proportional hazards regression analysis was used to build a prognostic model to determine fall-free survival probabilities, gait stability (in addition to fall history) was shown to be a statistically significant predictor of time to fall in this population [35]. These studies, however, only analysed the first two weeks of gait data to estimate the number of future falls or to estimate fall-free survival probabilities. This limits the ability of developed models to dynamically update fall risk when new gait data is captured.
There are also some contradictory findings on the relationship between gait measures and falls [28]. For instance, while some studies showed that gait variability was able to discriminate between fallers and non-fallers [25], [31], others have not [36]. Another study [37] concluded that cadence (number of steps per minute) was the only gait metric predictive of falls. These inconsistencies motivate further research in establishing features that are consistent predictors of falls, especially in high-risk populations such as individuals with dementia.
In addition, while existing studies have linked baseline gait assessments with future falls over the long-term, estimating the short-term dynamic falling risk provides care staff with more actionable information to intervene. To improve upon static risk assessments, it is necessary to establish whether intrinsic fall risk varies over time in a measurable way, allowing for frequently updating the falling risk for each individual, e.g. based on recent changes in their gait [38]. Augmenting this information with static fall risk scores and available extrinsic risk factors (e.g. medication intake) could potentially lead to accurate dynamic fall risk assessment.
There are many vision-based studies that use machine learning algorithms to study fall detection (e.g., [39], [40], [41]), although they sometimes refer to this task as fall prediction (i.e., predicting the onset of a fall). While early detection of a fall in progress is useful, it does not allow for fall prevention interventions. Other studies use vision-based systems for clinical gait analysis ([41], [42]) without extending to predictions about future falls. By contrast, the purpose of this paper is to develop a fall risk assessment tool that allows for planning fall intervention strategies to minimize the risk of falling and fall-related injuries over time. The vast majority of existing studies are lab-based or simulated in non-clinical populations. Other studies collect clinical and/or gait variables at baseline and construct machine learning models to predict the likelihood of a fall over the next 6-12 months [43], [44]. Fall risk predictions over this time frame are less clinically useful, and these models do not dynamically respond to changes in risk factors over time. A few studies on Parkinson's disease have used machine learning with clinical data (not gait data) to predict falls, or have examined the statistical relationship between gait data and the number of fall events, but do not indicate the time frame over which falls are predicted [45]. To the best of our knowledge, based on our literature searches and recent reviews [41], [46] on deep learning and video-based gait analyses, this paper is the first to propose a markerless, non-intrusive, vision-based fall risk assessment tool that predicts the short-term risk of falls in older adults with dementia and that has been trained and tested with real-world clinical data.

A. Participants
The participants in this study were inpatients in the Specialized Dementia Unit at the Toronto Rehabilitation Institute, an 18-bed tertiary facility which admits older patients with behavioral and psychological symptoms of dementia. A cognitive assessment (Severe Impairment Battery (SIB) [48]) and an assessment of functional impairment (Katz [49]) were completed for all participants; consistent with the usual population on this unit, participants had moderate to severe dementia. There were two inclusion criteria: first, a diagnosis of major neurocognitive disorder (dementia) based on a geriatric psychiatric assessment, and second, the ability to walk independently over at least 20 meters. There were no exclusion criteria. The Research Ethics Board of the University Health Network approved this study, and substitute decision makers provided written informed consent for all participants. The IRB protocol number of this study is 15-9693.10, and it was approved on Dec 21, 2015. For every assessment, participants were engaged in the study only if they provided their assent, and were excluded when showing signs of dissent. Longitudinal data collection from each participant began soon after their admission (after obtaining informed consent) and continued until the end of their stay at the unit or until they were no longer able to walk independently over at least 20 meters.

B. Data Collection
Data collected from participants consists of longitudinal and baseline measures. Longitudinal measures include gait parameters from patients' walking bouts, collected via ambient monitoring, and daily medication intake, including both prescribed and as-needed (pro re nata) medication dosages. Baseline measures include assessments of gait, balance and fall risk administered at the time of admission. In addition, the dataset includes the date and time of all participants' falls that were noted during their stay at the hospital.
1) Measures of Gait: Gait measurements were collected via an ambient monitoring system (Fig. 1) consisting of a depth sensor (Microsoft Kinect for Windows v2) and radio-frequency identification (RFID) to identify study participants [50]. The sensor was mounted in a hallway and captured residents' gait as they walked naturally. The RFID system consists of a reader (UHF Long Range from FEIG Electronics, Duluth, Georgia, USA) and two circular polarized UHF antennas (Times-7, Wellington, New Zealand). It allows for automatically tagging participants' ID numbers to their gait recordings and also (for privacy reasons) for turning off the sensor when staff or nonparticipating residents are within view. The Kinect sensor tracks human pose and motion within its field of view in real time [51], and previous work has established the validity of gait parameters extracted from the Kinect skeletal tracking sequences [52]. A total of 3 gait parameters were extracted from the 3D skeletal tracking of each walking bout: cadence, estimated margin of stability (eMOS), and an estimated parkinsonism score, quantified via the MDS-UPDRS part III gait score. The first 2 were chosen from a longer list of available features based on a recent study which demonstrated that these 2 gait features, measured during the first two weeks of stay, were individually predictive of the number of future falls during the remaining period of hospital stay in this population [33]. Details of how these measures were calculated from tracked skeletal sequences can be found elsewhere [33]. The third gait feature (UPDRS-III-gait) was estimated from tracked skeletal sequences using a spatial-temporal graph convolutional neural network (ST-GCN), via a pre-trained and previously validated model [53]. This measure was included as previous studies have linked parkinsonism and fall risk in individuals with dementia [54], [55].
2) Medications: Participants' medication intake was represented with 3 features, each corresponding to the daily exposure to a standardized dosage in one of the antidepressant (AD), antipsychotic (AP), and benzodiazepine (BE) categories. Using the WHO Defined Daily Doses for each medication to define equivalence, all medication doses within each class were standardized to a Citalopram, Risperidone, or Lorazepam equivalent dose, and the total exposure over a 24 h period was calculated.
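The standardization described above can be sketched as follows. This is a minimal illustration, not the study's implementation: it assumes equivalence is defined by the ratio of Defined Daily Doses, and the DDD values, drug list, and function names are illustrative placeholders that should be checked against the WHO ATC/DDD index.

```python
# Illustrative DDD values in mg (assumed for this sketch; verify against the WHO ATC/DDD index).
DDD_MG = {
    "citalopram": 20.0,
    "sertraline": 50.0,
    "risperidone": 5.0,
    "quetiapine": 400.0,
    "lorazepam": 2.5,
    "diazepam": 10.0,
}

# Medication class of each drug and the reference drug of each class (as named in the text).
DRUG_CLASS = {
    "citalopram": "AD", "sertraline": "AD",
    "risperidone": "AP", "quetiapine": "AP",
    "lorazepam": "BE", "diazepam": "BE",
}
REFERENCE = {"AD": "citalopram", "AP": "risperidone", "BE": "lorazepam"}


def equivalent_dose(drug, dose_mg):
    """Convert a dose to the reference drug of its class via the DDD ratio (assumed rule)."""
    reference = REFERENCE[DRUG_CLASS[drug]]
    return dose_mg / DDD_MG[drug] * DDD_MG[reference]


def daily_exposure(administrations):
    """Sum reference-equivalent doses per class over one 24 h period."""
    totals = {"AD": 0.0, "AP": 0.0, "BE": 0.0}
    for drug, dose_mg in administrations:
        totals[DRUG_CLASS[drug]] += equivalent_dose(drug, dose_mg)
    return totals


# Hypothetical administrations for one participant on one day (scheduled and as-needed doses).
day = [("quetiapine", 100.0), ("risperidone", 1.0), ("lorazepam", 0.5), ("sertraline", 50.0)]
exposure = daily_exposure(day)
print(exposure)
```

Each day then contributes three numbers (AD, AP, BE) to the longitudinal feature set.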
Drug-related hypotension has been shown to be associated with falls in frail older adults [55]. Previous studies have also linked medications with central nervous system effects (including antipsychotics and antidepressants) to an increased fall risk in older adults with dementia in a dose dependent way [56]. Therefore, it was expected that including medication exposure features would result in improved fall prediction performance.
3) Baseline Assessments: Baseline assessments included in this analysis were scores of the following clinical assessments at the time of admission: the St. Thomas Risk Assessment Tool in Falling elderly inpatients (STRATIFY) [57] and the Tinetti Performance Oriented Mobility Assessment (POMA) [47] to assess gait (POMA-gait) and balance (POMA-balance). Both STRATIFY and POMA scores have been previously linked to falls in older adults [58], [59], [60] and also specifically in older adults with dementia [32]. Despite implementation challenges that prevent regular evaluations using these clinical assessment tools [10], [15], it is often feasible to administer them once at the time of admission to LTC. It was hypothesized that the baseline values of these assessments would act as a prior and might improve the predictive power of machine learning models for estimating dynamic fall risk.
4) Falls: Falls were recorded prospectively by gathering details at daily safety huddles with staff members as well as reviewing incident reports and falls documentation in charts.
For the analysis in this paper, we focused on falls which took place while standing or walking. We thus excluded any falls from chairs or beds. We also excluded any falls where there was a clear external factor leading to the fall (for example, being pushed or tripping on an obstacle). Unwitnessed falls where the cause was unknown were included.

C. Fall Risk Prediction
A combination of longitudinal and baseline features was fed to a binary classification model trained to predict whether or not a person would fall over the subsequent few weeks. The main model presented in this work used the combination of 3 gait features, AP medication, and STRATIFY. Using a sliding window over the length of stay, the model was trained to predict falls over the subsequent 4 weeks. The effects of various other feature combinations, and the effect of changing the extent of prediction (EoP) from 4 weeks to shorter or longer periods (from 1 to 11 weeks), were experimentally investigated.
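The sliding-window labelling can be illustrated with a short sketch. It assumes that a prediction day is labelled positive when at least one included fall occurs strictly after that day and no later than EoP weeks afterwards. The exact window boundaries, and how days near the end of a stay are handled, are not specified in the text, and the function name is hypothetical.

```python
from datetime import date, timedelta


def label_prediction_days(prediction_days, fall_dates, eop_weeks=4):
    """Return 1 for each prediction day followed by a fall within eop_weeks, else 0."""
    horizon = timedelta(weeks=eop_weeks)
    labels = []
    for day in prediction_days:
        # Assumed window: (day, day + horizon], i.e. falls on the prediction day itself are ignored.
        fell = any(day < fall <= day + horizon for fall in fall_dates)
        labels.append(1 if fell else 0)
    return labels


# Hypothetical participant with three prediction days and one fall.
days = [date(2020, 1, 1), date(2020, 1, 10), date(2020, 2, 20)]
falls = [date(2020, 1, 29)]

labels_4w = label_prediction_days(days, falls, eop_weeks=4)
labels_1w = label_prediction_days(days, falls, eop_weeks=1)
print(labels_4w, labels_1w)
```

Shortening the EoP makes positive labels rarer, which is consistent with the class imbalance discussed in the results.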
For longitudinal features (gait and medication), the average values over the most recent k days and over the first (baseline) k days were used (Fig. 2). Including features from the baseline k days allows the model to also incorporate information about long-term changes in gait and medication into its decision making. The effect of including each feature set (i.e., recent k days and baseline k days) was examined experimentally. Primary results included here are with k = 5, but analyses with other values of k, ranging from 1 to 7, are also included.
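A minimal sketch of the two averaging windows follows. It assumes one record per day of stay, with missing measurements stored as None and skipped when averaging, and it takes the "recent" window to be the k days immediately before the prediction day. The data layout and names are hypothetical.

```python
def mean_ignoring_missing(values):
    """Average the available values; return None if nothing was recorded."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None


def window_features(daily, day, k=5):
    """Build baseline and recent k-day averages for a prediction made on `day`.

    daily: list of dicts, one per day of stay (index 0 = first day of recording).
    day:   index of the prediction day; the recent window is days day-k .. day-1.
    """
    baseline = daily[:k]
    recent = daily[max(0, day - k):day]
    features = {}
    for name in daily[0].keys():
        features[f"baseline_{name}"] = mean_ignoring_missing([row[name] for row in baseline])
        features[f"recent_{name}"] = mean_ignoring_missing([row[name] for row in recent])
    return features


# Hypothetical 12-day record with two longitudinal features.
cadence = [100, 102, None, 98, 100, 96, 95, None, 94, 93, 92, 90]
daily = [{"cadence": c, "ap_dose": 1.0} for c in cadence]

feats = window_features(daily, day=12, k=5)
print(feats)
```

With the paper's four longitudinal features (three gait features and AP dose), the same construction yields eight values, to which the static STRATIFY score is appended.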
A Multi-Layer Perceptron (MLP) with Adam optimization and L2 regularization was used as a binary classifier to distinguish participants who fell once or more during the subsequent four weeks vs. those who did not. Other classification models (SVM with a linear and a radial basis function (RBF) kernel, logistic regression, and random forest) yielded worse results in preliminary experiments and, for brevity, are not presented. Positive (falls) and negative (non-fall) cases were weighted proportionally to the inverse of their respective frequencies.
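As an illustration of inverse-frequency weighting, the sketch below uses the common "balanced" normalization, n_samples / (n_classes × class_count). The text only states that weights were proportional to inverse class frequency, so the scaling constant here is an assumption.

```python
from collections import Counter


def inverse_frequency_weights(labels):
    """Weight each class proportionally to the inverse of its frequency."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * count) for cls, count in counts.items()}


# Hypothetical label vector with 20% positive (fall) cases.
labels = [1] * 2 + [0] * 8
weights = inverse_frequency_weights(labels)
print(weights)
```

With this normalization, the total weight of the positive class equals the total weight of the negative class, so neither dominates the training loss.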
For each participant, a fall risk prediction was made on every day for which there was at least one gait recording in the preceding k days. The output, via Platt scaling, is a number between 0 and 1, which can be interpreted as the participant's risk (probability) of falling in the next four weeks.
Fig. 2. An overview of the fall prediction framework. Gait and medication features averaged over the most recent k days and the first (baseline) k days, along with the STRATIFY clinical assessment score at the time of admission, were used to predict falls over the next few weeks. Data augmentation is performed by over-sampling the minority class (input features labeled as fallers) using the SMOTE algorithm [12]. (LOSOCV: leave-one-subject-out cross-validation.)
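Platt scaling maps a raw classifier score s to a probability through a fitted sigmoid, P(fall | s) = 1 / (1 + exp(A·s + B)). The sketch below only applies that mapping. The parameters A and B are normally fitted by maximum likelihood on held-out scores, and the values used here are hypothetical.

```python
import math


def platt_probability(score, a, b):
    """Platt's sigmoid: P(y = 1 | score) = 1 / (1 + exp(a * score + b))."""
    return 1.0 / (1.0 + math.exp(a * score + b))


# Hypothetical fitted parameters; a < 0 means higher scores map to higher risk.
A, B = -2.0, 0.0

risks = [platt_probability(s, A, B) for s in (-1.0, 0.0, 1.0)]
print(risks)
```

The resulting value can be read as the estimated probability of a fall within the prediction window.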

D. Train, Test, and Data Preprocessing
Data was divided into training and test sets in nested cross-validation so that test performance could be reported on the entire dataset. Leave-one-subject-out cross-validation was used, ensuring that data from each test participant did not appear in the training set.
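A minimal sketch of leave-one-subject-out splitting is given below, assuming each sample carries the identifier of the participant it came from. The identifiers are hypothetical.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, train_indices, test_indices) for each unique subject."""
    for held_out in sorted(set(subject_ids)):
        train = [i for i, s in enumerate(subject_ids) if s != held_out]
        test = [i for i, s in enumerate(subject_ids) if s == held_out]
        yield held_out, train, test


# Hypothetical sample-to-participant assignment (one entry per prediction day).
subjects = ["P01", "P01", "P02", "P03", "P03", "P03"]
splits = list(leave_one_subject_out(subjects))

for held_out, train, test in splits:
    print(held_out, train, test)
```

Grouping by participant, rather than splitting individual days at random, is what prevents correlated samples from the same person from appearing on both sides of a split. scikit-learn's `LeaveOneGroupOut` provides the same behaviour for array inputs.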
Z-score normalization was used to normalize gait features in each data split, according to the distribution of values in the training set. Missing gait features, which arose from Kinect tracking failures or an insufficient number of detected steps, were imputed. The imputation was performed under three constraints: 1) the test data should not be used; 2) the imputation should be done separately for each person; and 3) for test data, the imputation value should not be based on future data. Missing values were therefore filled separately for the training and test sets, and separately for each person, using past data only. Since the number of falls was limited in our data (as shown in Table I), the Synthetic Minority Oversampling TEchnique (SMOTE) [12] was used to address the inherent class imbalance by augmenting the training set.
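The leakage constraints above can be sketched as follows. This illustration assumes that "using past data" means carrying the most recent earlier observation forward for each person, so gaps before the first observation stay missing, and that normalization statistics come from the training data only. Function names and values are hypothetical.

```python
import statistics


def forward_fill(series):
    """Fill each missing value (None) with the most recent earlier observation."""
    filled, last = [], None
    for value in series:
        if value is not None:
            last = value
        filled.append(last)  # stays None until the first observation is seen
    return filled


def fit_zscore(train_values):
    """Estimate mean and standard deviation from training data only."""
    mean = statistics.fmean(train_values)
    std = statistics.pstdev(train_values)
    return mean, (std if std > 0 else 1.0)


def apply_zscore(values, mean, std):
    """Normalize values with previously fitted statistics; keep missing entries missing."""
    return [None if v is None else (v - mean) / std for v in values]


# Hypothetical cadence series for one training participant and one test participant.
train_filled = forward_fill([100.0, None, 104.0])
test_filled = forward_fill([None, 90.0, None, 96.0])

mean, std = fit_zscore(train_filled)          # statistics never see the test participant
train_z = apply_zscore(train_filled, mean, std)
test_z = apply_zscore(test_filled, mean, std)
print(train_z, test_z)
```

Because each series is filled on its own and only from earlier days, no value from the test participant or from the future influences an imputed entry.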

E. Experimental Settings
A two-layered MLP network with a hidden size of floor(#features/2 + #features/3) and ReLU activation was used as the binary classifier model. A maximum of 2K training steps with the Adam optimizer was performed to train this network, with a constant learning rate of 0.01, a batch size of 64, and L2 regularization with a weight of 0.01. The tolerance for optimization was set to 1e-4; convergence was deemed to have been reached, and training was stopped, when the loss did not improve by at least 1e-4 for 10 consecutive iterations. This was done to prevent overfitting and to ensure that the model reached a stable solution. The hyperparameters were selected through a number of experiments on a subset of the data and were kept the same for all the experiments in the ablation study. A grid search was performed, exploring the following ranges to find the best set of hyperparameters: )), (12,7), (12,7,3)] The ranges of the hyperparameters k and EoP were also searched, and their impact on model performance is discussed further in the results section.
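The settings above can be collected into a configuration sketch. Two assumptions are made here: the hidden-size expression is read as floor(#features/2 + #features/3), and "two-layered" is taken to mean a single hidden layer followed by the output layer. The dictionary keys follow the parameter names of scikit-learn's `MLPClassifier`; the text does not state which library was used, and `max_iter` counts epochs for stochastic solvers there, so it only approximates "training steps".

```python
import math


def hidden_size(n_features):
    """Hidden-layer width, assuming the formula floor(n/2 + n/3)."""
    return math.floor(n_features / 2 + n_features / 3)


def mlp_settings(n_features):
    """Keyword arguments mirroring the reported training settings (a sketch, not the authors' code)."""
    return {
        "hidden_layer_sizes": (hidden_size(n_features),),  # one hidden layer (assumed)
        "activation": "relu",
        "solver": "adam",
        "alpha": 0.01,               # L2 penalty weight
        "batch_size": 64,
        "learning_rate_init": 0.01,  # kept constant during training
        "max_iter": 2000,
        "tol": 1e-4,
        "n_iter_no_change": 10,      # stop after 10 iterations without sufficient improvement
    }


settings = mlp_settings(n_features=9)
print(settings["hidden_layer_sizes"])
```

These keyword arguments could be passed as `MLPClassifier(**settings)` if scikit-learn is used. For the 9-feature main model, this reading of the formula gives a hidden layer of 7 units.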

F. Performance Evaluation
To investigate the effect of k, the effect of the fall prediction extent, and the contribution of each set of features to the overall performance, multiple models were trained using various combinations of features and hyperparameters. The primary metric used to compare model performance was the area under the receiver operating characteristic curve (AUROC), but the area under the precision-recall curve (average precision), as well as precision, recall, specificity, and the F1-score are also reported.
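For reference, the threshold-based metrics and the AUROC can be computed as in the sketch below. The AUROC is written in its rank form, i.e. the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The labels, scores, and the 0.5 decision threshold are hypothetical.

```python
def threshold_metrics(y_true, y_pred):
    """Recall (sensitivity), specificity, precision, and F1 from binary predictions."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    recall = tp / (tp + fn)
    specificity = tn / (tn + fp)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {"recall": recall, "specificity": specificity, "precision": precision, "f1": f1}


def auroc(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = fall within the window) and predicted risks.
y_true = [1, 1, 0, 0, 0, 1, 0, 0]
scores = [0.9, 0.6, 0.7, 0.2, 0.1, 0.4, 0.3, 0.5]
y_pred = [1 if s >= 0.5 else 0 for s in scores]

metrics = threshold_metrics(y_true, y_pred)
area = auroc(y_true, scores)
print(metrics, area)
```

Unlike the other metrics, the AUROC does not depend on the chosen threshold, which is one reason it is convenient as a primary comparison metric across feature sets.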

IV. RESULTS AND DISCUSSION
During ∼22 months of data collection, 64 individuals with dementia were recruited to participate in this study. One person withdrew their consent shortly after entry and 4 others were withdrawn due to changes in health status such that they could no longer walk independently over at least 20 meters before any gait recordings were completed. Of the remaining 59 participants, 5 with no recorded walks were excluded and the remaining 54 participants were included in the analysis.
There were a total of 81 falls during the period of observation. Among these, 19 were caused by external factors (being pushed etc.) and 8 occurred while transferring from a sitting or lying position, of which 1 overlapped with the 19 caused by external factors. As such, there were a total of (81 − 19 − 8 + 1 =) 55 falls included in the analysis. Twenty-five (25) of the 54 participants were fallers. The gait recording period (from the first ambient gait recording until the last) among all 54 participants was 56.8 days on average. A total of 4851 walking bouts were collected over the participants' length of stay. The recorded number of walking bouts per patient varied from a minimum of 11 to a maximum of 323. Table I summarizes the demographic information of study participants. Finally, a total of 17,114 medication administration events were collected over the participants' length of stay.
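The fall count above follows from inclusion-exclusion, since one fall met both exclusion criteria and must not be subtracted twice. A short check using the reported figures:

```python
total_falls = 81
external_cause = 19    # e.g. being pushed
during_transfer = 8    # from a sitting or lying position
both_criteria = 1      # counted in both groups above

# Size of the union of the two exclusion groups.
excluded = external_cause + during_transfer - both_criteria
included = total_falls - excluded
print(excluded, included)
```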
The fall risk prediction model achieved an AUROC of 76.2%, recall (sensitivity) of 72.8%, specificity of 73.2%, and precision (positive predictive value) of 44.1%, amounting to an F1-score of 54.9%. This model was based on 3 longitudinal gait features (cadence, eMOS, UPDRS-III-gait), the daily standardized dosage of AP medication, and baseline STRATIFY scores. The values of k and EoP were 5 days and 4 weeks, respectively. For longitudinal features, the average values over the recent 5 days and the baseline 5 days were used. This amounted to using a total of (3 + 1) × 2 + 1 = 9 features. Table II shows the effect of varying k from 1 to 7 days. The best AUROC was achieved at k = 5, but changes in the AUROC are relatively small when different values of k are used. Specifically, AUROC = 76.2 at k = 5 and AUROC = 74.1 at k = 6, and almost all values of k ranging from 1 to 7 resulted in AUROC values over 70. Table III shows the effect of various combinations of longitudinal data from the most recent 5 days (RECENT) and the baseline 5 days (BASELINE). The best AUROC was obtained when data from both the recent and the baseline five days were used. Including only the baseline groups degraded performance by a large amount (AUROC 76.2 vs. 66.1). Removing the baseline group and only using the recent groups also greatly degraded performance, to an AUROC of 68.3. This implies that changes in gait over multiple weeks (from the first until the final week) are better predictors of fall risk than the most recent features of gait (over the recent 5 days). Table IV shows the model's performance when the extent of prediction ranges from 1 to 11 weeks. In all the experiments, replacing STRATIFY with POMA scores resulted in poorer performance. Similarly, including both STRATIFY and POMA scores did not improve performance and in some cases even slightly lowered performance, perhaps due to overfitting. For brevity, only results with STRATIFY are reported here.
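The feature count and the F1-score reported for the main model can be checked directly from the stated quantities. This is a consistency check only; the variable names are illustrative.

```python
# Feature count: (gait + medication features) x (recent and baseline windows) + STRATIFY.
n_gait, n_medication, n_windows, n_static = 3, 1, 2, 1
n_features = (n_gait + n_medication) * n_windows + n_static

# F1 is the harmonic mean of the reported precision and recall.
precision, recall = 0.441, 0.728
f1 = 2 * precision * recall / (precision + recall)
print(n_features, round(f1, 3))
```

The harmonic mean of 44.1% and 72.8% rounds to 54.9%, in agreement with the reported F1-score.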
From the results presented in Table IV, it can be seen that the best AUROC of 77.4 is achieved with EoP = 5 weeks and that the AUROC remained above 72 for EoP ranging from 3 to 7 weeks. Performance dropped below an AUROC of 70 when EoP was set to 1 or 2 weeks. While the best AUROC is achieved at EoP = 5 weeks, the model chosen for further analysis in this study used EoP = 4 weeks, i.e., the smallest EoP with an AUROC greater than 75.0. The difference in AUROC when changing EoP from 5 to 4 weeks is relatively small (77.4 vs. 76.2), and a shorter-term fall risk score provides clinicians and care staff with more actionable knowledge and a better chance to intervene.
Depending on the EoP, only 8-21% of the data points belong to the positive class (falls). The second column of Table IV shows the percentage of positive cases for each EoP. Without balancing class weights, the AUROC of the model with EoP = 4 drops from 76.2% to 63.8%. This drop highlights the importance of applying class weights to put more emphasis on the minority class. In addition, the large drop in AUROC when EoP ≤ 2 corresponds with a drop in the percentage of positive cases to below 15%, potentially because the reduced diversity of positive-class data points inhibits the model's generalizability. Table V shows the effect of removing each feature group on fall prediction performance. Fig. 3 plots the ROC curve for the best model with EoP = 4 (top row in Table V), the model with the same feature set excluding AP (2nd row in Table V), the model based only on gait features (7th row in Table V), and the model based only on STRATIFY (last row in Table V).
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
The results in Table V show that the weakest classifier of all feature combinations, with an AUROC of 51.7, understandably belongs to the model trained solely on the static STRATIFY score administered at the time of admission. This model performs virtually the same as chance predictions, for which the AUROC would be 50.0. As presented in Table V, adding longitudinal feature groups (gait or medication) improved the classifiers' performance, and the highest AUROC of 76.2 is achieved when Gait, AP, and STRATIFY are used. Removing antipsychotic medication (AP) information only slightly degrades model performance, to an AUROC of 72.4 (2nd row in Table V). Removing Gait features results in the largest drop, to an AUROC of 56.2, when only STRATIFY and AP features are used (3rd row in Table V). Removing STRATIFY and using only the longitudinal features results in a larger drop than removing AP, to an AUROC of 68.9 (5th row in Table V). That is, while STRATIFY alone is not a good predictor of short-term falls, it serves as an important prior augmenting the longitudinal features.
Consistent with previous findings [33], the combination of Cadence and eMOS was most effective in identifying those at a high risk of a fall.
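To make the ablation comparison concrete, the following sketch computes the AUROC drop caused by removing each feature group from the best model. It uses only the AUROC values quoted in the text; the dictionary keys and variable names are illustrative.

```python
# AUROC of the best model (Gait + AP + STRATIFY), as reported in the text.
full_auroc = 76.2

# AUROC after removing one feature group, as reported in the text.
ablations = {
    "without AP": 72.4,
    "without Gait": 56.2,
    "without STRATIFY": 68.9,
}

# Drop in AUROC attributable to each removed feature group.
drops = {name: round(full_auroc - auroc, 1) for name, auroc in ablations.items()}

# Rank feature groups by how much their removal hurts performance.
ranked = sorted(drops.items(), key=lambda kv: kv[1], reverse=True)

for name, drop in ranked:
    print(f"{name}: AUROC drop of {drop}")
```

The ranking places Gait first, followed by STRATIFY and then AP, matching the ordering described in the text.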
Finally, Table VII shows the effect of using longitudinal features from other medication groups (BE and AD) or the combination of all three medication groups (AP, BE, and AD).
Including all three types of medications in the feature set resulted in a large drop in performance (AUROC = 60.6 when all 3 medication groups are used vs. 76.2 when only AP medications are fed to the predictive model). This is likely due to overfitting when all three medication types are used. Replacing AP with either BE or AD information also lowered performance, with AUROCs of 64.1 and 60.6, respectively. Individually, AD and BE features performed worse than chance prediction (AUROC < 50.0), and AP alone resulted in an AUROC of 53.2.
As the fall risk assessment model is being trained and evaluated towards the development and deployment of an automated alert system for a fall prevention program, it is important to be mindful of potential disparate impact. It is, therefore, important to investigate model performance stratified by population group, e.g., by sex or ethnicity. While ethnicity/race information is not available for the study participants, it is possible to break performance down by sex. Among the 30 male participants, 14 (46.7%) were fallers. Among the 24 female participants, 11 (45.8%) were fallers. Table VIII presents the performance of the model separately for men and women in the cohort. As Table VIII shows, the model obtains similar AUROC levels (75.8 and 76.4) for the two groups.
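A stratified evaluation of this kind can be sketched as follows. The AUROC is computed with the standard pairwise (Mann-Whitney) definition: the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. The function names and the toy labels, scores, and group assignments are hypothetical and are not the study's data.

```python
def auroc(y_true, y_score):
    """Pairwise (Mann-Whitney) AUROC; ties between scores count as 0.5."""
    pos = [s for y, s in zip(y_true, y_score) if y == 1]
    neg = [s for y, s in zip(y_true, y_score) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC requires at least one positive and one negative")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def auroc_by_group(y_true, y_score, groups):
    """Compute AUROC separately within each group label."""
    result = {}
    for g in sorted(set(groups)):
        idx = [i for i, v in enumerate(groups) if v == g]
        result[g] = auroc([y_true[i] for i in idx], [y_score[i] for i in idx])
    return result


# Hypothetical toy data: 1 = fall within the horizon, 0 = no fall.
y_true = [1, 0, 1, 0, 1, 0, 0, 1]
y_score = [0.9, 0.2, 0.6, 0.7, 0.8, 0.1, 0.3, 0.4]
groups = ["M", "M", "M", "M", "F", "F", "F", "F"]

by_group = auroc_by_group(y_true, y_score, groups)
```

Comparing the per-group values against the pooled AUROC is one simple way to flag a performance gap between subgroups before deployment.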
The accuracy of fall risk scores provided by this model can be beneficial in a long-term care setting to trigger investigation and potential interventions. Clinicians can interpret the predictions of the model as a fall risk score and use it to triage their attention to residents who are assigned a high risk of falling in the near future. That is, rather than relying on the system to correctly detect each future faller, the system will help clinicians and caregivers to monitor and reduce falling risk more efficiently by focusing their attention on those who are most likely to benefit from a thorough clinical examination. Performance of the model should also be viewed within the context and complexity of the task. Falls are caused by a combination of intrinsic and extrinsic factors, and model predictions here are based only on the regular monitoring of gait/balance and medication intake, without knowledge of other factors such as dizziness, poor vision (e.g., forgetting one's glasses), muscle weakness, being tired, dim lighting, obstacles, and other falling hazards. Given this, it is remarkable that the model could identify short-term fallers with an AUROC of 76.2 and with a sensitivity and specificity of 72.8 and 73.2, respectively.
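For reference, the sensitivity and specificity quoted above depend on the threshold at which a risk score is turned into an alert. The sketch below shows that computation on hypothetical scores; the threshold of 0.5 and the toy data are assumptions for illustration and do not reflect the operating point used in the study.

```python
def sensitivity_specificity(y_true, y_score, threshold):
    """Return (sensitivity, specificity) when scores >= threshold are flagged."""
    tp = fn = tn = fp = 0
    for y, s in zip(y_true, y_score):
        flagged = s >= threshold
        if y == 1 and flagged:
            tp += 1  # faller correctly flagged as high risk
        elif y == 1:
            fn += 1  # faller missed
        elif flagged:
            fp += 1  # non-faller flagged as high risk
        else:
            tn += 1  # non-faller correctly left unflagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical toy data: 1 = fall within the horizon, 0 = no fall.
y_true = [1, 1, 1, 0, 0, 0, 0]
y_score = [0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]

sens, spec = sensitivity_specificity(y_true, y_score, threshold=0.5)
```

Lowering the threshold flags more residents, which raises sensitivity at the cost of specificity; in a triage setting the threshold would be chosen according to how much follow-up assessment the care staff can absorb.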
This study demonstrated the possibility of dynamic fall risk assessment in older adults with dementia via ambient monitoring. While the findings are important and novel, the current study has several limitations. First, while the number of ambient gait assessments in the study is substantial (4851), the number of participants is limited to 54, thus limiting the variability in the dataset. Second, while the average number of walks per participant was relatively high (89.9, see Table I), the range was wide, from as low as 11 to as high as 323. This further limits the variability captured in the dataset, as individuals with more recorded walks and a longer length of stay are over-represented. Third, most participants had a short length of stay. The setting where data were collected was a specialized dementia unit which admits older adults with behavioural and psychological symptoms of dementia. The average length of stay in this unit is significantly shorter than that of LTC facilities (2-3 months vs. years). The camera was installed in one corridor of a large unit and required the participants to choose to walk down that corridor to capture walking data, which meant that daily walks were not necessarily captured. Fourth, because the unit typically admits individuals with severe behavioural and psychological symptoms who have a higher risk of falling, the rate of falls in this unit is higher than in a typical LTC facility. As a result, it is important to externally validate these findings on longer-term longitudinal data collected in LTC facilities where the fall rate is expected to be lower.

V. CONCLUSION AND FUTURE WORK
People living with dementia are at a high risk of falls and the negative consequences of falls, including injury and death. Many falls could be prevented if clinicians could determine a person's risk of falling and offer an intervention to reduce the risk. To provide a clear opportunity to intervene and reduce the risk of falling, we need a tool that can detect subtle changes in falls risk and can predict falls in the short-term, i.e. days or weeks. Such a tool would also be useful in research studies of different interventions that aim to decrease falls risk, where at present, an improvement in falls risk can be difficult to detect.
This study is the first to present a machine learning model capable of dynamically estimating short-term fall risk for older adults with dementia. The model used baseline fall risk scores, gait measurements from baseline, and the daily dose of antipsychotic medication in the past 5 days. The AUROCs of the model (76.2 for predicting falls over the next 4 weeks, and 77.4 for predicting falls over the next 5 weeks) are close to clinically acceptable levels, motivating further research to improve and prepare the technology for deployment.
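As an illustration of how such a data point could be assembled, the sketch below labels an assessment day by whether a fall occurs within the following 4 weeks and collects the daily doses from the preceding 5 days. This is a sketch under stated assumptions, not the study's pipeline: the function names, the choice to exclude the assessment day from both windows, and the filling of missing days with 0.0 are all hypothetical.

```python
from datetime import date, timedelta


def fall_within_horizon(assessment_day, fall_days, horizon_weeks=4):
    """Return 1 if any fall occurs in (assessment_day, assessment_day + horizon]."""
    end = assessment_day + timedelta(weeks=horizon_weeks)
    return int(any(assessment_day < d <= end for d in fall_days))


def recent_doses(daily_dose, assessment_day, lookback_days=5):
    """Daily doses for the lookback_days before assessment_day, oldest first.

    Assumption: days with no recorded dose are treated as 0.0.
    """
    return [
        daily_dose.get(assessment_day - timedelta(days=k), 0.0)
        for k in range(lookback_days, 0, -1)
    ]


# Hypothetical resident record.
day = date(2020, 1, 10)
label_near = fall_within_horizon(day, [date(2020, 2, 5)])  # fall inside 4 weeks
label_far = fall_within_horizon(day, [date(2020, 2, 8)])   # fall just outside
doses = recent_doses({date(2020, 1, 8): 2.0, date(2020, 1, 9): 1.0}, day)
```

Changing `horizon_weeks` to 5 would produce labels for the longer prediction window mentioned above.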
Ongoing work includes longitudinal data collection from multiple ambient monitoring systems in other long-term care facilities to externally validate the developed machine learning models on data collected at other sites. Ongoing work is also investigating how accurately clinicians, e.g., physiotherapists, are able to identify falling risk by observing residents' gait. Comparison between machine learning model and clinician predictions will shed light on the factors by which each estimates fall risk and on whether predictions could be improved by combining these factors. Future work will involve deploying the dynamic fall risk assessment technology in LTC facilities to alert the care staff when a large increase in probability of a fall is detected for a resident, and to investigate the effect of this technology on reducing falls and fall-related injuries in older adults with dementia.
The future clinical application of this research involves deploying dynamic fall risk analytics in LTC facility video surveillance systems to alert the care staff when a large increase in probability of a fall is detected for a resident. This will allow for reassessment of residents and evaluation of the need for any further intervention. This technology would also allow clinicians to evaluate the impact of a fall prevention intervention (such as an exercise program or medication change) on the predicted risk of falling. Controlled trials will be required to determine whether a vision-based quantification of falls risk has an impact on clinically important outcomes such as reducing the number of falls and fall-related injuries in older adults with dementia. Given the high rate of falls and fall-related morbidity and mortality in LTC residents, and the substantial healthcare costs and utilization associated with falls, LTC residents are an important target population for this technology.
Further work is needed to test the generalizability of this model to different populations of older adults, such as those living in retirement homes or in the community. With the widespread use of video surveillance systems in assisted living environments, this type of gait monitoring and falls risk prediction technology has the potential for widespread use and impact.