Automatic gait analysis during steady and unsteady walking using a smartphone

Human gait analysis is routinely used in healthcare applications for diagnosis, monitoring disease progression and evaluating the effectiveness of interventions. In this study, a new smartphone-based automatic gait analysis system is presented. The system employs an adaptive algorithm to automatically recognise step events and to extract clinically relevant temporal gait parameters, including cadence, stride time, swing to stance ratio, double support time, intra step variability and inter step asymmetry. The performance and generalisation ability of the system are evaluated during steady and unsteady walking on a large group of human participants with varying demographics and degrees of gait impairment. Such large-scale testing presents a significant departure from previous studies, which mostly focused on gait analysis during steady walking involving more homogeneous datasets. Compared to ground truth data, the system recognises gait events with an accuracy > 98% and estimates gait parameters with high precision (mean absolute error < 50 ms) and high agreement (inter-class correlation coefficient > 0.8), except for gait variability and symmetry. In addition, the system is sufficiently reliable to differentiate between steady and unsteady walking as well as normal and impaired gait, producing gait assessments that are in line with expert opinion. Overall, the system's performance is on par with the state-of-the-art, and this inexpensive and accessible solution offers a step forward towards personalised, continuous gait monitoring in free-living environments.


I. INTRODUCTION
Human gait is a key biomarker which correlates positively with health, independence and quality of life. In healthcare applications, gait analysis is used as an assessment tool to measure the presence or progress of a specific condition and as an outcome measure to evaluate the effectiveness of an intervention. Gait analysis also has a prognostic value, for instance, while predicting falls, cognitive decline or life expectancy.
The two most common methods for measuring gait are observer-rated clinical evaluations, which are routinely used in daily practice, and 3D motion capture systems, often found in human biomechanics laboratories. The former is fast and effective: scores are generated based on standardised functional assessment scales (e.g., functional gait assessment and dynamic gait index) [1]. Yet these relatively simple scores are qualitative, prone to subjective bias, and do not provide detailed gait characterisation. The latter is capable of measuring full body spatial and temporal gait parameters with high precision and resolution, and is considered to be the gold standard in the field. However, motion capture systems are rather expensive, require a fixed setup with multiple cameras and have a limited capture volume, and the gait measurements are laborious and time consuming. In recent years, there has been growing evidence that laboratory-based gait measurements are not necessarily representative of real gait performance during daily living [2]. Within the scope of continuous real world gait monitoring, there have been significant efforts to develop portable systems. These systems vary from external sensors such as depth cameras (e.g., Kinect) and instrumented walking mats (e.g., GaitRite) to body-worn (i.e., wearable) devices made from inertial sensors and pressure sensitive insoles [3]-[6].

A. Related work
Custom-built wearable gait measurement systems usually deploy multiple sensors distributed around the body [7]-[9]. These systems have the capacity to provide multi-dimensional gait data by tracking multiple body parts (e.g., neck, lower back, limbs and feet). However, they may be less preferable in real-world applications because of human factors (e.g., convenience, appearance, comfort or fear of social stigma). An alternative, less obtrusive approach is to write software applications for smartphones [10], [11].
Smartphones come with onboard motion sensors (i.e., accelerometer, gyroscope, magnetometer and GPS) that can be used in gait analysis [12] as well as falls prediction [13] and activity monitoring in the community [14]. These devices are becoming an integral part of human life and their accessibility and acceptance rate are expected to be higher (although phone/watch usage and technological literacy in older users still lag behind those of younger users [15]).
Phone-based gait measurements are limited to one specific area of the body. To what extent single-point motion measurements can provide accurate gait characterisation and how the position of a device affects performance are open research questions that have been tackled by different research groups in recent years. Publications from these groups clearly show that step-related temporal gait parameters (e.g., cadence, step time and swing to stance ratio) can be measured accurately by a phone or an inertial sensor, usually placed at the trunk close to the centre of gravity line, at the lumbar spine (lower back) between the L2 and L5 vertebrae [16]-[18].
While walking, every step creates peaks and troughs in the sensor data due to translational and rotational motions of the trunk. Gait analysis typically starts with detecting these local minima and maxima to differentiate between right and left steps and to extract initial and final foot contact time points; in this study, we will refer to them as heel strikes and toe offs, respectively. Temporal gait parameters are calculated from heel strike and toe off time points [16], [19]. In addition, it is possible to estimate step length indirectly using biomechanical models [20].
So far, great contributions have been made in developing new signal processing algorithms to elevate the performance of phone-based gait measurements to gold standard levels, and studying gait in different people and conditions. Performance is often quantified by comparing the phone predictions to measurements by an independent gold standard method (e.g., motion capture systems, walking mat, video camera and a pair of smart shoes). A brief review of the state-of-the-art is provided below. In these studies, unless stated otherwise, the wearable sensor was placed at the lower back.
a) Steady walking: Bugane et al. [21] showed that stride length, stride time, cadence and walking speed can be detected with an accuracy up to 98% using a single accelerometer. The experimental data included 22 healthy participants (ages 20-35) during 10-meter steady walking. Avvenuti et al. [18] extended the analysis to 40 m steady walking (measuring three participants; ages 22, 32 and 61) and evaluated the changes in performance when the accelerometer was placed in the pocket. The error between estimated and actual stride, swing and stance times was less than 50 ms. Del Din et al. [16] used an accelerometer to measure 30 early stage Parkinson's volunteers (with minor gait impairments) and 30 age-matched controls (age > 60) during 10-meter steady walking. They estimated stride, swing and stance times with less than 14 ms error.
De Ridder et al. [22] evaluated the validity of the commercial BTS G-Walk sensor system in detecting cadence, stride length and time, single and double support times, and swing and stance times. The BTS G-Walk combines information from inertial sensors and GPS to estimate gait parameters. The experimental data included 30 healthy participants (age < 60) walking steadily at a comfortable pace (each participant repeated the experiment five times). The system had good test-retest reliability: the inter-class correlation coefficient (ICC hereinafter) was 0.85 or higher and concurrent validity was between 0.88 and 0.97. Fujiwara et al. [17] also evaluated the test-retest reliability of an accelerometer in 36 healthy participants (age < 50) who were tested twice (with at least a one month break between the tests). The study reports ICCs of 0.80, 0.79 and 0.78 for stride time, step time and cadence, respectively. Manor et al. [10] used a mobile phone in the pocket to detect stride times in 14 healthy individuals (ages 18-35). The system was tested both in the lab and in participants' homes during 45 s normal and dual-task steady walking. The error was less than 17 ms.
Zhong & Rau [11] studied the gait parameters of 148 community dwelling adults (age > 60) during normal, dual-task and fast walking. Buckley et al. [23] analysed gait asymmetry in 25 stroke participants (age > 60) with mild and moderate gait deficits during 10-meter walking. Zijlstra et al. [20] compared slow, medium and fast walking in 41 healthy participants: 26 young (ages 19-27) and 15 elderly (ages 62-89). Seo et al. [5] used the BTS G-Walk system to differentiate between stroke and healthy participants as well as the affected and unaffected sides in stroke participants. González et al. [24] studied differences in temporal gait parameters between five frail elderly (age > 80) and five young adults (age < 30). Esser et al. [25] presented a comparative gait analysis on people with various gait impairments, showing that step length calculations need to be corrected for each gait disorder type.
b) Unsteady walking: The majority of existing work on phone-based gait analysis has focused on steady walking along a straight path. However, activities of daily living often involve unsteady walking, for instance while navigating in the house, walking in the garden or shopping. During these activities, people often speed up, slow down and change direction, hence the amplitude and timing of peaks/troughs (corresponding to heel strike and toe off time points) in the wearable data vary considerably. These variations make step detection and gait analysis more challenging.
Silsupadol et al. [19] studied unsteady walking in 24 healthy participants while turning, accelerating and decelerating both in laboratory and outside environments. Although the authors were able to estimate velocity, step length, step time and cadence with high accuracy (Pearson correlation coefficient varied between 0.5 and 1), the accuracy of their gait symmetry estimations was reduced significantly (correlation coefficient < 0.5).
In recent years, researchers have started to analyse unsteady gait parameters during the Timed Up and Go (TUG hereinafter) test, which is a standard test used to evaluate mobility and balance. Traditionally, the test is performed under supervision and scored by the completion time, with high scores correlating with fall risk [1]. It has been shown that multiple inertial sensors distributed around the body could provide a more in-depth analysis of the test by quantifying postural and gait transitions, arm swing and lower limb movement patterns as well as gait characterisation [26], [27]. However, similar performance levels have yet to be achieved using a single phone [28]-[30].

B. Research objectives
In this study, our objective is to design a smartphone-based gait analysis system that generalises to different walking styles and population groups. We propose a novel peak detection algorithm with adaptive thresholding and an auxiliary post-processing algorithm, which are used together to extract gait parameters accurately and reliably. The robustness and generalisation abilities of the system were evaluated during steady and unsteady walking on a large group of participants with varying demographics and degrees of gait impairment. Further performance analysis was carried out to evaluate the sensitivity of the overall system in differentiating between normal and impaired gait, recognising intra-individual gait variability depending on medication use or walking aid, and producing gait assessments that are in line with expert reports.
The remainder of the paper is organised as follows. Section II describes the proposed smartphone system, including a data recording mobile app and a novel gait analysis method. It also describes the participants involved in the study and the experimental procedures. Section III presents results on evaluating the proposed system's performance. Section IV highlights the novel contributions and main findings of the presented study. It also discusses the current limitations of the system. Section V concludes the paper with a brief summary and further discussion on future work.

A. The wearable system
A custom-built motion recording app was designed in Android Studio. The app runs on a standard smartphone and tracks lower limb motions by reading data from the inertial sensors found on the phone. When the app is downloaded for the first time, it detects all available sensors on the phone and lists them for the user. The user has the option to select one or more sensors for recording. At the beginning of each recording, the app prompts the user to enter the participant ID and demographic information including age, height and weight, and to select a physical activity, trial number and duration of recording from drop-down menus. The user starts recording manually. At the end of the recording, the inertial sensor data and metadata are collated into a single JSON file on the phone and uploaded to a secure online server maintained by the Department of Computer Science at Aberystwyth University. If the phone does not have an internet connection, the JSON file is stored temporarily in the phone's local file store. Once the internet connection is re-established, the file is transferred to the server.
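As an illustration of how a recording and its metadata might be collated into a single JSON file, the following Python sketch builds one record and round-trips it through JSON. The key names (`participant_id`, `demographics`, `sensors` and so on), the row layout of the sensor samples and all example values are hypothetical; the app's actual schema is not specified in the text.

```python
import json


def build_recording(participant_id, age, height, weight, activity, trial,
                    duration_s, sensor_samples):
    """Collate metadata and sensor samples into one JSON-serialisable dict.

    All key names are hypothetical; they only mirror the items listed in the text.
    """
    return {
        "participant_id": participant_id,
        "demographics": {"age": age, "height": height, "weight": weight},
        "activity": activity,
        "trial": trial,
        "duration_s": duration_s,
        # Assumed layout: one list of [timestamp, x, y, z] rows per selected sensor.
        "sensors": sensor_samples,
    }


# Made-up example values, for illustration only.
record = build_recording(
    participant_id="P001", age=67, height=1.65, weight=60.0,
    activity="TUG", trial=1, duration_s=20,
    sensor_samples={
        "accelerometer": [[0.0, 0.1, 9.8, 0.2]],
        "gyroscope": [[0.0, 0.01, 0.0, -0.02]],
    },
)
payload = json.dumps(record)    # text that would be written to the file
restored = json.loads(payload)  # reading the file back yields the same structure
```

Keeping the metadata next to the samples in one file means a recording stays self-describing if it has to wait in local storage before upload.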

B. Participants
The study had two datasets. Dataset 1 consisted of 98 volunteers (44 male and 54 female) including students and staff members studying/working at Aberystwyth University and community dwelling volunteers residing in Ceredigion county. In dataset 1, participants did not have any apparent neurological or chronic physical condition that could affect their gait. Dataset 2 included 26 volunteers (12 male and 14 female) who were considered frail and/or had gait impairments caused by stroke or Parkinson's disease (PD hereinafter). The classification of frailty was based on slow walking speed (< 0.8 m/s) and increased completion time in the TUG test (> 10 s) [31].

C. Walking tests
Both steady and unsteady walking tests were carried out under supervision in a controlled environment using standard functional fitness tests which are routinely used in clinical settings. Both tests were performed at the maximum speed at which participants could walk comfortably and safely.
a) Steady walking: Participants walked 10 meters in a straight line at a constant pace. The actual walking distance was 14 meters, allowing participants to accelerate/decelerate at the beginning/end of the walk.
b) Unsteady walking: Participants performed the TUG test, which consists of a sequence of motions: rising from a chair, walking three meters, turning 180 degrees, walking back to the chair and sitting down while turning 180 degrees.

D. Data collection
All participants provided their informed consent before participating in the study, all experimental protocols were approved by the Aberystwyth University and NHS Ethical Committees, and all experiments were conducted according to the guidelines provided in the Declaration of Helsinki. The experiments took place on the University's Penglais campus in the Human Biomechanics Laboratory in the Carwyn James building at the Institute of Biology, Environmental and Rural Sciences. The experimental data was collected over two years, between 2018 and 2019. For dataset 1, students and staff members were tested gradually whenever the lab was available, whereas the community dwelling volunteers were tested during the University's annual Health and Fitness MOT Test Day for the over 60's in two consecutive years (28th of June 2018 and 27th of March 2019). For dataset 2, the frail participants were tested during the same MOT event days, whereas participants with stroke and PD were invited to the lab on separate days to allocate more time for data collection.
a) Experiments: The majority of the participants, except for those who attended the MOT events, completed at least one steady walking and one unsteady walking test. The MOT participants only completed the unsteady walking test due to the limited time available. In addition, there were a few students/staff members who did not complete the unsteady walking test. Participants were standing still before starting the steady walking test and sitting on the chair before starting the unsteady walking test. Experiments with two PD participants were performed twice to study whether pharmacological treatment or using a walking aid affected gait. The first participant, PD1, who had cardinal features on the left, was tested half an hour before and after taking medication (i.e., OFF versus ON). The second participant, PD2, who also had cardinal features on the left, was tested with and without holding a walking cane in the right hand.
Experiments with the stroke participant, who had right-side paresis and foot drop, were also repeated twice to compare walking performance while wearing two different orthotic devices (i.e., splint and functional electrical stimulation).
b) Wearable measurements: The data were recorded from a phone (Google Pixel 2) with an average sampling rate of 405 Hz. During experiments, participants wore a fixation belt with a phone holder, which was positioned around the lower back close to the L3 vertebra. The phone was placed horizontally in the phone holder such that the accelerometer z-axis aligned with the anterior-posterior axis and the accelerometer y-axis pointed along the lateral-medial axis (from right to left). To reduce motion artefacts in the phone data, the belt was tightened as much as possible without causing discomfort to the participants. The experimenter started and stopped the data recording manually, on average five seconds before the start and after the completion of each test, respectively. Once the mobile data were uploaded to the server, they were visually inspected for quality control.
c) External camera measurements: Experiments were also recorded from a side view using a GoPro Hero 6 (frame rate: 240 frames/s, resolution: 1920 x 1080 pixels). For consistency, the camera was mounted on a tripod 56 cm high and its field of view covered the entire walking distance during each test. The GoPro data was used for two purposes. First, all videos (from both datasets 1 and 2) were analysed frame by frame to label steps (right versus left) and extract heel strike and toe off time points manually. This ground-truth data was then used to evaluate the performance of the peak detection algorithm proposed in the study. Second, videos from dataset 2 were analysed by two experts to provide a qualitative description of the gait impairments observed in each participant. The experts were Dr Federico Villagra Povina, a Lecturer in Exercise and Physiology, and Mr David Langford, a Chartered Health and Exercise Practitioner; both have more than 20 years' experience working in neurorehabilitation.
Their descriptions provided a reference point while interpreting the results obtained by the wearable system (a detailed comparison is provided in Section IV).

E. Wearable data processing pipeline
The presented study focuses on extracting lower extremity gait parameters. The data processing pipeline started with detecting heel strike and toe off time points as well as recognising left versus right steps. This information was then used to derive clinically relevant temporal gait parameters including cadence, stride, step, stance, swing and double support times as well as intra step variability (i.e., step time variance on the same foot) and inter step asymmetry (i.e., difference between right and left step times).
Heel strike and toe off time points aligned with clear peaks in the phone accelerometer data in the direction of walking (anterior-posterior axis): heel strikes corresponded to negative peaks (i.e., when maximum deceleration occurred) and toe offs corresponded to positive peaks (i.e., when maximum acceleration occurred). Similarly, right and left steps could be detected from trunk rotations measured by the phone gyroscope around the longitudinal (vertical) axis: right steps corresponded to negative peaks (i.e., maximum angular velocity in the anticlockwise direction) and left steps corresponded to positive peaks (i.e., maximum angular velocity in the clockwise direction).
The data processing pipeline had four steps: i) pre-processing the raw data to filter out noise, ii) peak detection for heel strike, toe off, and right/left step detection, iii) estimation of temporal gait parameters and iv) post-processing the data to detect missing steps.
a) Pre-processing: All raw data was filtered using a second order low pass Butterworth filter with zero phase delay and a 10 Hz cut-off frequency. Given the cyclic nature of walking and assuming that on average each step takes less than a second [32], a 10 Hz cut-off frequency preserved the dominant frequency and its harmonics in the data, hence allowing more detailed analysis in gait characterisation (for instance, see [33] for harmonic gait analysis).
b) Peak detection algorithm: The peak detection algorithm ran two methods in tandem. The first method detected all the peaks in the data which were above an amplitude threshold. It achieved this by simply iterating through the data and comparing three consecutive data points at any given time. The data point in the centre was regarded as a peak if its amplitude was above the threshold and was larger than the amplitudes of the adjacent left and right points. On average, 30% of the peaks returned by the first method were false positives corresponding to other gait features (e.g., mid swing). The amplitudes of these peaks were smaller than those of the true positives (i.e., actual heel strike and toe off time points). To select the actual peaks, the second method computed the time elapsed between the peaks and removed those that were too close to the local maximum peaks (i.e., if the time difference between two peaks was smaller than the time threshold, the peak with the smaller amplitude was removed). The pseudo code for the peak detection algorithm is presented in Algorithm 1.
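The two methods described in b) can be sketched in Python as follows. This is a minimal illustration rather than a reproduction of Algorithm 1: the function names, the greedy left-to-right pruning order and the toy signal are assumptions made for this example.

```python
from typing import List, Sequence


def find_candidate_peaks(signal: Sequence[float], amp_threshold: float) -> List[int]:
    """Method 1: indices of samples above the threshold and larger than both neighbours."""
    peaks = []
    for i in range(1, len(signal) - 1):
        centre = signal[i]
        if centre > amp_threshold and centre > signal[i - 1] and centre > signal[i + 1]:
            peaks.append(i)
    return peaks


def prune_close_peaks(signal: Sequence[float], times: Sequence[float],
                      peaks: Sequence[int], time_threshold: float) -> List[int]:
    """Method 2: when two peaks are closer than time_threshold, keep the larger one."""
    kept: List[int] = []
    for idx in peaks:
        if kept and times[idx] - times[kept[-1]] < time_threshold:
            # Too close to the previously kept peak: retain whichever is larger.
            if signal[idx] > signal[kept[-1]]:
                kept[-1] = idx
        else:
            kept.append(idx)
    return kept


def detect_peaks(signal, times, amp_threshold, time_threshold):
    """Run both methods in sequence and return the surviving peak indices."""
    candidates = find_candidate_peaks(signal, amp_threshold)
    return prune_close_peaks(signal, times, candidates, time_threshold)


# Toy signal sampled every 0.1 s: two large peaks and one smaller peak close to the first.
toy_signal = [0.0, 1.0, 0.0, 0.6, 0.0, 0.0, 0.0, 1.2, 0.0, 0.2, 0.0]
toy_times = [i / 10 for i in range(len(toy_signal))]
print(detect_peaks(toy_signal, toy_times, amp_threshold=0.5, time_threshold=0.3))  # -> [1, 7]
```

In the toy signal, the sample at index 3 passes the amplitude test but lies only 0.2 s after the larger peak at index 1, so the second method discards it.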
c) Adaptive threshold selection: The peak detection algorithm relied on two thresholds, an amplitude threshold and a time threshold, which were determined automatically from the input data. The amplitude threshold was set to the mean ± one standard deviation of the input data amplitudes. The time threshold was based on half of the average step time, which was estimated from the dominant frequency using a Fast Fourier Transform. These threshold selection methods made the peak detection algorithm more adaptive to individual datasets, reducing the risk of missing true peaks or detecting false positives. For instance, in a dataset where the participant walked slowly (i.e., with longer step times and smaller acceleration and deceleration phases), the method chose a smaller amplitude threshold and a bigger time threshold.
The peak detection algorithm was applied in four passes. First, the phone accelerometer data was entered into the peak detection algorithm to detect toe off time points. Second, the accelerometer data was multiplied by -1 and entered into the peak detection algorithm again to detect heel strike time points. Third, the phone gyroscope data was entered into the peak detection algorithm to detect left steps. Fourth, the gyroscope data was multiplied by -1 and entered into the peak detection algorithm again to detect right steps. In addition, a simple check was incorporated in the algorithm to ensure that a toe off followed a heel strike (or vice versa) and that a right step was followed by a left step (or vice versa).
e) Estimating temporal gait parameters: Cadence (number of steps per minute) was calculated by dividing the number of steps by the duration of walking and scaling this number to one minute. While calculating cadence during unsteady walking, the initial (sit to stand) and final (stand to sit) sections of the TUG test were omitted from the analysis.
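The adaptive threshold selection in c) can be illustrated with the short Python sketch below. Several choices are assumptions made for this example: the threshold is taken as the mean plus one standard deviation of the (possibly sign-flipped) input, the population standard deviation is used, the dominant frequency is assumed to be the step frequency, and a naive discrete Fourier transform stands in for the Fast Fourier Transform so that the example needs only the standard library.

```python
import cmath
import math
from statistics import mean, pstdev


def amplitude_threshold(signal):
    """Mean plus one standard deviation of the input (assumed interpretation)."""
    return mean(signal) + pstdev(signal)


def dominant_frequency(signal, fs):
    """Frequency (Hz) of the largest non-DC component, via a naive O(n^2) DFT."""
    n = len(signal)
    offset = mean(signal)
    centred = [x - offset for x in signal]  # remove the DC component
    best_k, best_mag = 1, 0.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                    for i, x in enumerate(centred))
        if abs(coeff) > best_mag:
            best_k, best_mag = k, abs(coeff)
    return best_k * fs / n


def time_threshold(signal, fs):
    """Half of the average step time, with step time = 1 / dominant frequency."""
    return 0.5 / dominant_frequency(signal, fs)


# Synthetic example: a 2 Hz sinusoid (two "steps" per second) sampled at 50 Hz for 4 s.
fs = 50.0
walk = [math.sin(2 * math.pi * 2.0 * i / fs) for i in range(200)]
print(dominant_frequency(walk, fs))  # 2.0 Hz
print(time_threshold(walk, fs))      # 0.25 s
```

With a real recording, an FFT routine from a numerical library would replace the naive transform; the logic for deriving the two thresholds would stay the same.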
Stride time was defined as the time between two consecutive heel strikes of the same foot (whole gait cycle), whereas step time was defined as the time between right and left or left and right heel strikes. Swing time was defined as the time between the toe off and heel strike of the same foot, whereas stance time was defined as the time between the heel strike and toe off of the same foot. Both swing and stance times were reported as percentages of stride time. Double support time was defined as the time during which both feet were on the ground and was again reported as a percentage of stride time. Intra step variability was measured as the standard deviation of the step times of the same foot; the higher the standard deviation, the higher the variability. Inter step asymmetry was measured as the step time difference between the right and left foot, normalised to the sum of the right and left foot step times. The asymmetry score varied between 0 and 1, corresponding to fully symmetric and fully asymmetric gait, respectively. For each foot, step, swing and stance times and intra step variability were evaluated separately.
f) Post-processing: A post-processing algorithm was devised to catch instances when the peak detection algorithm failed to recognise a step (e.g., when a participant took a small step while turning around the landmark). For each foot, the algorithm sorted the step times and calculated the difference between the maximum and average step times. If the difference was bigger than 50% of the average step time, the algorithm raised a flag for a potentially missing step. Here, it was assumed that walking was not interrupted (e.g., by a participant stopping and starting to walk again) and that the maximum intra step variability was less than 50% of the average step time.
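The flagging rule above can be written in a few lines of Python. The function names and the example step times are illustrative assumptions; only the 50% criterion comes from the text.

```python
from statistics import mean


def flag_missing_step(step_times, tolerance=0.5):
    """Return True if the longest step time exceeds the average by more than
    `tolerance` times the average (0.5 corresponds to the 50% rule)."""
    average = mean(step_times)
    return (max(step_times) - average) > tolerance * average


def longest_interval_index(step_times):
    """Index of the longest step time, i.e. the segment to re-examine."""
    return max(range(len(step_times)), key=lambda i: step_times[i])


# Made-up step times (s) for one foot; the last interval is about twice as long,
# as would happen if one step in between had been missed.
one_foot = [0.50, 0.52, 0.49, 1.02]
print(flag_missing_step(one_foot))       # True
print(longest_interval_index(one_foot))  # 3
```

Because a single missed step roughly doubles one interval, it stands out clearly against the average as long as walking is otherwise uninterrupted.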
When a flag was raised, the peak detection algorithm was rerun on the specific data segment (during which a step was missed) using a smaller amplitude threshold (with 10% decrements). This iterative process continued until all step times were within 50% of the average step time or the amplitude threshold was reduced to 50% of its original value.
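To make the temporal parameter definitions in e) concrete, the following Python sketch computes several of them from lists of heel strike times. The function names, the use of mean step times and an absolute difference in the asymmetry score, and the use of the sample standard deviation are assumptions for this illustration; the event times are made up.

```python
from statistics import mean, stdev


def stride_times(heel_strikes):
    """Time between consecutive heel strikes of the same foot."""
    return [b - a for a, b in zip(heel_strikes, heel_strikes[1:])]


def step_times(this_foot, other_foot):
    """Time from the latest heel strike of the other foot to each heel strike of this foot."""
    out = []
    for t in this_foot:
        earlier = [u for u in other_foot if u < t]
        if earlier:
            out.append(t - max(earlier))
    return out


def step_variability(steps):
    """Intra step variability: standard deviation of one foot's step times."""
    return stdev(steps)


def asymmetry(right_steps, left_steps):
    """Inter step asymmetry in [0, 1]: |R - L| / (R + L) on mean step times."""
    r, l = mean(right_steps), mean(left_steps)
    return abs(r - l) / (r + l)


def stance_percent(heel_strike, toe_off, next_heel_strike):
    """Stance time of one foot as a percentage of its stride time."""
    return 100.0 * (toe_off - heel_strike) / (next_heel_strike - heel_strike)


def cadence(n_steps, duration_s):
    """Steps per minute."""
    return 60.0 * n_steps / duration_s


# Made-up heel strike times (s): the left foot lands late, so right steps are shorter.
right_hs = [0.0, 1.0, 2.0, 3.0]
left_hs = [0.6, 1.6, 2.6]
right_steps = step_times(right_hs, left_hs)  # about 0.4 s each
left_steps = step_times(left_hs, right_hs)   # about 0.6 s each
print(round(asymmetry(right_steps, left_steps), 3))  # 0.2
print(cadence(6, 3.0))                               # 120.0
```

Swing percentage follows as 100 minus the stance percentage of the same stride, since the two phases together make up one gait cycle.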
F. Statistical analysis
a) Validation of the wearable system: For each dataset, the percentage of steps detected accurately was calculated. For each gait parameter, the agreement between the phone and camera was evaluated by calculating the mean absolute error (MAE) and the inter-class correlation coefficient (ICC) with 95% lower and upper bound confidence intervals (CI).
b) Comparison between steady versus unsteady walking as well as normal versus pathological gait: Dataset 1 was divided into three groups: 1) young adults (age < 60) walking steadily, 2) young adults walking unsteadily and 3) older adults (age > 60) walking unsteadily. Changes in gait parameters were tested for statistical significance using one way analysis of variance (ANOVA). To reduce the chance of type-1 statistical error, the significance level for the ANOVAs was reduced to 0.005 using a Bonferroni correction.
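For illustration, the MAE and the Bonferroni-corrected significance level can be computed as in the Python sketch below. The function names and the example values are made up. The number of comparisons is not stated in the text; 10 is used here purely as an assumption, chosen because 0.05 / 10 reproduces the reported level of 0.005.

```python
def mean_absolute_error(phone, camera):
    """Average absolute difference between paired phone and camera estimates."""
    if len(phone) != len(camera):
        raise ValueError("phone and camera must contain the same number of values")
    return sum(abs(p - c) for p, c in zip(phone, camera)) / len(phone)


def bonferroni_alpha(alpha, n_comparisons):
    """Per-test significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Made-up stride times (s) from the phone and from the camera ground truth.
phone_stride = [1.00, 1.10, 0.95]
camera_stride = [1.02, 1.05, 1.00]
print(round(mean_absolute_error(phone_stride, camera_stride), 3))  # 0.04 s, i.e. 40 ms

# Assumed: 10 comparisons at an uncorrected level of 0.05.
print(round(bonferroni_alpha(0.05, 10), 4))  # 0.005
```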
To evaluate whether gait parameters during unsteady walking differed between datasets 1 and 2, one-sample t-tests at the significance level of 0.005 (Bonferroni corrected) were performed. In addition, a logistic regression model was trained to differentiate between the two datasets. The classification performance of the model was evaluated using standard metrics: precision, recall and f-measure.
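The three classification metrics can be computed from true and predicted labels as in the following Python sketch. The function name, the label coding (1 for impaired gait, 0 for normal gait) and the example labels are assumptions for illustration; this is not the evaluation code used in the study.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and f-measure for the chosen positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    return precision, recall, 2 * precision * recall / (precision + recall)


# Made-up labels: 1 = impaired gait, 0 = normal gait.
truth = [1, 1, 1, 0, 0]
predicted = [1, 1, 0, 1, 0]
print(precision_recall_f1(truth, predicted))  # all three are 2/3 here

# A classifier that makes no mistakes scores 1 on all three metrics.
print(precision_recall_f1(truth, truth))  # (1.0, 1.0, 1.0)
```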

III. RESULTS
All data analysis was performed offline using custom-built Python scripts on a standard laptop (MSI, Intel Core i7 processor and 16 GB RAM). The average wall clock time for processing a 10 s walking trial was 0.044 ± 0.009 s. In total, 2457 steps were included in the analysis. Table I presents the participant information and total number of steps analysed for each group within the two datasets.
a) The new system was reliable and accurate in detecting steps during steady and unsteady walking: Overall, the system detected 98.8% of the steps successfully; weighted average of 99% (dataset 1) and 98.5% (dataset 2). Two examples from each dataset (one steady and one unsteady walking) are shown in Figure 1. The system only missed a few steps while participants were taking quicker steps or dragging their feet during turning. Similarly, the performance was slightly worse if participants had an asymmetric gait (peaks on the weaker side were less prominent). In addition, there were a few instances where the algorithm detected false positives when there was a long pause between standing up and starting to walk or when a participant used a walking aid which interfered with the actual steps (by creating extra peaks in the data).
The system also had very good validity in estimating temporal gait parameters, except for step time variability and asymmetry (Table II). In dataset 1, MAEs and ICCs for cadence were < 4 steps/min and > 0.95, for stride and step times were < 30 ms and > 0.88, and for stance, swing and double support times were < 4% and > 0.6. In dataset 2, MAEs and ICCs for cadence were < 4 steps/min and > 0.95, for stride and step times were < 45 ms and > 0.88, and for stance, swing and double support times were < 6% and > 0.88. The ICC of step asymmetry was also significant in dataset 2.
In dataset 1, unsteady gait parameters were different from steady gait parameters but they did not vary significantly between the two age groups (Table III, repeated ANOVAs). During unsteady walking, participants took more and faster steps (cadence was 17.4% higher and stride time was 15.1% lower), and had increased stance (3.2%) and double support times (15.7%). In addition, step time variability and asymmetry were higher during unsteady walking (9.3% and 48.7%, respectively).
c) Impaired gait was different from normal gait: A simple logistic regression model was able to differentiate between normal and impaired gait during unsteady walking with perfect precision, recall and f-measure (i.e., 1). In the model, the contributions of the gait parameters cadence, stride, step and double support times and step variability and asymmetry were significant (p < 0.01). Further pairwise comparison of gait parameters between the two datasets (Table III, repeated Student t-tests) showed that participants with impaired gait had lower cadence (24.5%), higher stride and step times (23.6% and 25.7%), and increased stance (7.7%) and double support times (23.2%). Although not significant, these participants also had higher step time variability and asymmetry.
Figure 1. Four examples of filtered accelerometer data from (a, b) a healthy subject (female, age = 67, BMI = 16.9 kg/m²) and (c, d) a stroke subject (male, age = 66, BMI = 24.7 kg/m²). For each data set, toe off and heel strike time points are shown on the top and bottom, respectively (phone, cyan; camera, magenta). Amplitude thresholds (horizontal dashed lines) and average stride times (horizontal grey line between two peaks) are also shown. In (b), lowering the amplitude threshold after post-processing (horizontal dashed line, orange) enabled the phone system to detect smaller steps during turning (orange). In (a, c), arrows pointing upwards indicate start (S) and finish (F) time points, respectively. In (b, d), the time intervals of the TUG sub-activities (i.e., sit to stand, turn 1 and 2, and stand to sit) are shown with horizontal solid lines at the bottom (grey). In (c, d), + indicates steps on the paretic side. In (d), ∧ and ∨ indicate false positive toe off and heel strike time points, respectively. There is also one missed step, indicated by a shaded vertical bar (grey).
d) Individual gait parameters changed depending on the use of medication or walking aid: Studying intra-individual variability in three case studies (one stroke and two PD participants) showed clear differences in gait parameters (Table IV). The stroke participant was able to walk 4.1% faster with FES than with the splint. This was due to improved cadence (9.1%) and swing time (3.3%), as well as lower stride (8.3%) and double support times (11.9%). However, the gait became less symmetric (62.5%) due to longer steps on the hemiparetic side. These measurements were in line with the expert notes: "Improved foot clearance and reduced circumduction with FES compared to splint which." The expert also mentioned that: "Asymmetric step length and cadence did not change." This comment was noteworthy as the phone estimations suggested otherwise (more on this in the discussion).
In PD1, the use of a cane improved cadence (5.7%) and gait asymmetry (60%). The expert report confirmed this by stating that: "The participant continued to display the same Parkinsonian gait characteristics with improved symmetry." In PD2, there were also changes in the gait parameters before and after taking medication. While ON, PD2's walking speed increased by 66.7% due to longer steps. Their swing and double support times also went back to normal (16% and 32% change). The expert noted: "While OFF, no reciprocal arm swing, forward flexion and rigidity in lower back and left shoulder. They also had postural malalignment to the left side which I suspect was the affected side. Short, regular steps with no apparent gait asymmetry. While ON, noticeable dyskinesia in trunk but improved knee flexion. Reciprocal arm swings restored. Walking was much faster with longer steps. Gait retained a symmetrical rhythm throughout the trial."

IV. DISCUSSIONS
This study complements previous studies in several ways:
a) Rich testing data: The system was tested on a large group of participants with varying age and degree of gait impairment during steady and unsteady walking. Overall, the system's performance in identifying steps and estimating temporal gait parameters was comparable to the state-of-the-art systems, which were often tested on a smaller and more homogeneous group of participants [10], [17], [34].
b) Adaptive threshold selection during peak detection: In these two datasets, step induced peaks in the accelerometer data varied greatly between participants in intensity (0.09 to 5.26 m s⁻²) and in time (0.41 to 0.99 s). The adaptive threshold selection accounted for this variance by automatically adjusting the amplitude (0.42 to 1.18 m s⁻²) and time thresholds (0.22 to 0.40 s) for each data set. This improved the performance of step detection by 23% compared to using a fixed time threshold (0.30 s). In line with our findings, recent studies [35], [36] discussed the benefits of the adaptive thresholding approach while analysing walking patterns across varying speeds and participant populations.
c) Catching missed steps using a post-processing algorithm: The post-processing algorithm ensured that the right and left step times followed a consistent pattern. If two right (or left) steps were detected in a row or if an estimated step time was too long (i.e., if there was an outlier in the step time distributions), a flag was raised and the data was reanalysed using slightly more lenient amplitude thresholds. This approach allowed us to catch most of the missing peaks without introducing false positives.
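The flag-and-reanalyse idea can be illustrated with a minimal Python sketch. It is not the implementation used in this study: the function names, the outlier rule (a step time longer than 1.5 times the median), the relaxation factor (0.8) and the synthetic signal are all assumptions chosen for illustration. The left/right alternation check is omitted because it depends on how step sides are labelled.

```python
from statistics import median


def find_peaks(signal, fs, amp_thr, min_gap_s):
    """Indices of local maxima >= amp_thr that are at least min_gap_s apart."""
    min_gap = int(round(min_gap_s * fs))
    peaks = []
    for i in range(1, len(signal) - 1):
        is_max = signal[i] > signal[i - 1] and signal[i] >= signal[i + 1]
        if not (is_max and signal[i] >= amp_thr):
            continue
        if peaks and i - peaks[-1] < min_gap:
            # Too close to the previous peak: keep the larger of the two.
            if signal[i] > signal[peaks[-1]]:
                peaks[-1] = i
        else:
            peaks.append(i)
    return peaks


def has_long_step(peaks, fs, factor=1.5):
    """Flag a step time that exceeds `factor` times the median step time."""
    steps = [(b - a) / fs for a, b in zip(peaks, peaks[1:])]
    if len(steps) < 2:
        return False
    return max(steps) > factor * median(steps)


def detect_with_retry(signal, fs, amp_thr, min_gap_s, relax=0.8, max_retries=3):
    """Re-run detection with a more lenient amplitude threshold while flagged."""
    peaks = find_peaks(signal, fs, amp_thr, min_gap_s)
    for _ in range(max_retries):
        if not has_long_step(peaks, fs):
            break
        amp_thr *= relax  # hypothetical relaxation step
        peaks = find_peaks(signal, fs, amp_thr, min_gap_s)
    return peaks, amp_thr


# Synthetic 100 Hz trace: five steps 0.5 s apart, the third one weaker.
fs = 100
signal = [0.0] * 300
for idx, amp in zip([50, 100, 150, 200, 250], [1.0, 1.0, 0.5, 1.0, 1.0]):
    signal[idx - 1] = signal[idx + 1] = amp / 2
    signal[idx] = amp

peaks, final_thr = detect_with_retry(signal, fs, amp_thr=0.6, min_gap_s=0.3)
print(peaks, round(final_thr, 2))
```

In this example the first pass misses the weaker third step, the resulting long step time raises the flag, and one relaxation of the threshold recovers it. A practical version would also need a lower bound on the threshold so that repeated relaxation does not admit noise peaks.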

B. Potential applications
The study shows that the proposed phone-based technology was sensitive enough to separate steady from unsteady walking, differentiate between normal and impaired gait and recognise changes in the impaired gait depending on the use of medication or walking aid.
The technology is poised to provide objective outcome measures, which can be used in a variety of personalised applications including reassuring patients that an intervention has a positive impact on their gait, progress monitoring, recommending the medication dose or frequency of administration [37] and screening new orthotic devices. It can also pick up on subtle changes in gait which may go undetected in clinical observations [38]. For instance, during their initial evaluation the expert did not notice any changes in gait asymmetry and cadence when the stroke participant switched over to using FES. After reviewing phone measurements, the expert watched the participant's videos again and stated: "This is why these technologies can be useful in the future. I agree with the phone measurements. Now that I had a second look at the videos, there were some changes which I did not notice at the beginning. This was partially due to the camera angle. It remains to be seen, however, how much these changes impact outcome measures such as independence and quality of life".

C. Current limitations
The proposed algorithm has several limitations. First, it presumes that the heel strike and toe off time points align with the most prominent peaks in the accelerometer data. Although this was the case for the majority of the datasets, there were a few exceptions; for instance, when a frail participant was walking with a four-wheel rollator. In this participant's data, each gait cycle had multiple peaks of similar amplitude, and it was not straightforward to differentiate between the peaks generated by the actual steps and the peaks resulting from the motion of the assistive device.
Second, the performance of the peak detection algorithm depends on the selection of the appropriate amplitude and time thresholds. This was harder while analysing the unsteady walking data. In some participants, the initial (i.e., rising from a chair) and final phases (i.e., sitting down while turning 180 degrees) of the TUG test created large amplitude, low frequency peaks which changed the statistical properties of the accelerometer data. As a result, the accelerometer data had a relatively larger mean amplitude and standard deviation as well as a lower dominant frequency. These changes increased the amplitude threshold and decreased the time threshold more than necessary, causing errors during step detection; i.e., missing steps because of the higher amplitude threshold and having false positives because of the lower time threshold.
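The effect on the amplitude threshold can be illustrated with a small Python sketch. The threshold rule used here (signal mean plus one standard deviation) is a hypothetical stand-in, since the exact rule is not restated in this section, and the signals are synthetic. The time threshold is not modelled.

```python
from statistics import mean, pstdev


def amplitude_threshold(signal, k=1.0):
    """Hypothetical adaptive rule: mean + k * standard deviation of the signal."""
    return mean(signal) + k * pstdev(signal)


def synthetic_walk(n_samples, step_indices, step_amps):
    """Flat trace with a small triangular peak at each step index."""
    signal = [0.0] * n_samples
    for idx, amp in zip(step_indices, step_amps):
        signal[idx - 1] = signal[idx + 1] = amp / 2
        signal[idx] = amp
    return signal


# Steady walking: four normal steps (1.0) and one small step (0.5).
steady = synthetic_walk(300, [50, 100, 150, 200, 250], [1.0, 1.0, 0.5, 1.0, 1.0])

# TUG-like trace: same steps, with a large, slow transient before and after
# (standing up, sitting down), modelled crudely as 20-sample plateaus.
tug = [2.0] * 20 + steady + [2.0] * 20

thr_steady = amplitude_threshold(steady)
thr_tug = amplitude_threshold(tug)
print(f"steady: {thr_steady:.2f}, with transients: {thr_tug:.2f}")
```

Under these assumptions the threshold stays below the small 0.5 step for the steady trace but rises above it once the transients are included, so that step would be missed. This is one reason to segment out the sit-to-stand and stand-to-sit phases before estimating thresholds.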

V. CONCLUSION
A new phone-based gait assessment system has been presented and tested against steady and unsteady walking. It employs an adaptive peak detection and post-processing algorithm to recognise left and right steps and to estimate heel strike and toe off time points. With this development, clinically relevant temporal gait parameters such as cadence, intra step variability and inter step asymmetry have been determined. Systematic empirical evaluation tests with a diverse group of human participants have demonstrated that the system successfully performs the challenging task of differentiating between steady versus unsteady and normal versus pathological gait. The system is also able to produce impaired gait assessments that are in line with the expert opinion. Compared to the state-of-the-art high resolution motion tracking systems used in controlled laboratory settings, this inexpensive and portable solution can be utilised for continuous gait monitoring in real world applications.

A. Future work
We are currently working to improve the performance of the system in measuring step variability and asymmetry. We are also working toward creating an automatic data segmentation algorithm to differentiate walking data from other activities. In addition, while analysing data from participants with asymmetric gait, it may be beneficial to use separate amplitude and time thresholds for the affected side during peak detection. Similarly, threshold values can be adapted to walking speed (e.g., using progressively lower time thresholds as participants slow down) or to the gait disorder (e.g., again using lower time thresholds to detect small, shuffling steps in Parkinsonian gait).
The long term goal is to improve the quality and robustness of the wearable gait monitoring technology so that it can be used in community and clinical settings to assist social and health care professionals. We have recently proposed a framework for automatic gait analysis in free-living environments [39]. We have also started evaluating the performance of the system in a clinical study which quantifies the recovery of chronic stroke patients during a longitudinal, intensive exercise rehabilitation program.

DATA
Data will be made publicly available on our group's website (www.aber.ac.uk) upon publication of the manuscript.

ACKNOWLEDGMENTS
We thank Keith Hughes, Director of Excel Electronics Ltd, who helped us collect data from one PD participant. We also thank Stefani Todorova Dimitrova, Bishnu Paudel, Vera Akpokodje and Megan Taylor Bunker for commenting on the manuscript.

AUTHORS' CONTRIBUTIONS
AS, FVP and OA designed the study. ED, LIL and DM developed the mobile app and online server. MA and DL organised the Health MOT days. MA, DL and HT recruited the participants. AS, ED, LIL, MA, AH, FVP, DL and OA collected the data. AS and OA developed the gait analysis methods. AS and OA analysed the data. FVP and DL provided expert reports on participants' gait impairments. AS and OA wrote the manuscript. DM, MS, AH, MA, QS, RZ, DL and FVP provided feedback on the manuscript.