A Digital Camera-Based Eye Movement Assessment Method for NeuroEye Examination

The ability to perform quantitative and automated neurological assessment could enhance diagnosis and treatment in the pre-hospital setting, such as during telemedicine or emergency medical services (EMS) encounters. Such a tool could be developed by adapting clinically significant information such as symmetry of eye movement or conjugate eye movement. Here we describe a digital camera-based eye tracking method, “NeuroGaze,” to capture the symmetry of eye movement while performing neurological eye examination. The proposed method was developed based on detecting the center of the pupil for both eyes from a given video and measuring eye conjugacy by transforming the pupil center coordinates to relative gaze. The method was tested on healthy volunteers while performing three neurological eye examinations. We also compared our proposed approach to state-of-the-art digital camera-based eye-tracking methods and commercial off-the-shelf (COTS) eye trackers. NeuroGaze outperformed digital camera-based eye tracking methods by reporting a mean Spearman rank-order correlation coefficient of 0.86 for the H-test, 0.87 for the Dot-test, and 0.56 for the OKN-Test, and showed similar trends in the relative gaze trajectories, with a noticeable offset in the scale of the relative gaze angle, compared to the COTS eye tracker (see Fig. 1). The study demonstrates that by using a pupil-center-based eye-tracking method, a digital camera can measure clinically relevant information regarding eye movement.


I. INTRODUCTION
Abnormal eye alignment and motion can indicate the presence of neurological diseases. Abnormal eye movements are often indicators of an underlying neurological disease, such as Alzheimer's disease and stroke. Alzheimer's disease affects about 910,000 adults aged 65 or older per year [1], and stroke is a leading cause of death and disability globally [2]. As a result, there is considerable interest in implementing eye-tracking technology to evaluate these and other neurological diseases [3], [4], [5].
Recent developments [6], [7], [8], [9] in eye-tracking technology have shown that commercially available eye trackers can be used successfully to detect saccades and quick phases during eye movements. However, this approach typically requires expensive equipment and laboratory setups to achieve the accuracy and precision needed to track eye gaze over a screen, making it difficult to scale up. While low-cost eye trackers [6], [10], [11] have been developed to address cost issues, the need for a specific device (such as a wearable or binocular camera) limits their clinical application. Machine learning techniques are being applied to make eye-tracking more practical and clinically useful. Several research groups have attempted to track eye movements using digital cameras found in laptops and smartphones [12], [13]. While these attempts have shown promise with relatively accurate gaze coordinate predictions compared to commercial eye trackers [14], [15], [16], more development is needed to understand their strengths and limitations relative to commercial eye tracking equipment and overall clinical utility.
Given machine learning advancements in eye tracking for neurological diseases, these newer approaches [9], [17], [18], [19] attempt to replicate gaze coordinates. Previously, we demonstrated [20] that a non-calibrated eye tracker can be used to assess eye movement symmetry and variability for posterior circulation stroke patients by evaluating how the eyes move in tandem, which is an important component of a typical neurological exam. How the eyes move in tandem is known as conjugacy. With a few congenital exceptions, humans have near-perfect eye conjugacy as a characteristic of normal physiology (see Fig. 1). Conversely, a significant deviation or acquired loss of eye conjugacy is most often considered pathological [20]. For example, when the right eye can no longer move outward toward the temple, the eyes lose conjugacy as the person looks to the right, since the left eye continues to move while the right eye remains still. In this scenario, the differential diagnosis for a neurologist would include a brainstem stroke.
With the clinician's perspective in mind, we aim to determine the feasibility of digital camera-based eye trackers to measure eye conjugacy in healthy individuals by comparing their performance to COTS eye tracking equipment [21]. The remainder of this paper is organized as follows: Section II discusses the related work in eye tracking. Section III presents the proposed method for digital camera-based eye movement assessment. Section IV presents the experimental setup used to perform the analysis. Sections V and VI present the results and discussion of the digital camera-based eye movement assessment for NeuroEye. Section VII presents the concluding remarks of the study.

II. RELATED WORK
In general, video-based eye tracking systems can be categorized into model-based approaches and appearance-based approaches [22].

A. Model-Based Tracking Systems
Model-based approaches often use one or more digital cameras to perform gaze estimation by using specific eye characteristics such as iris or pupil parameters to construct the 3D eye geometry model and determine the gaze points [23]. The most popular model-based methods use infrared sensors for imaging, including several available COTS eye trackers [6], [24]. For example, medical research has used model-based eye trackers in several fields, such as ophthalmology [25], [26], psychiatry [27], psychology [28], psychopharmacology [29], and neurology [9], [17], [18], [19].
COTS eye trackers have demonstrated use in analyzing pathological nystagmus [30], investigating gaze characteristics associated with autism [31], and classifying various visual field defects [26]. However, model-based methods are complex and expensive to set up (e.g., infrared cameras). A common element of these setups is a dedicated calibration procedure that must be performed prior to gaze tracking and estimation for each study participant. Multiple studies utilizing model-based methods in neurological disease research [9], [26], [32], [33], [34], [35], [36], [37] had to exclude patients who could not complete the calibration procedure. Excluding these patients severely limits the clinical generalizability of results and greatly reduces the practical use of these setups in clinical care.

B. Appearance-Based Tracking Systems
Appearance-based methods often perform gaze estimation using a single digital camera. These methods directly model a mapping relationship between pupil positions from eye images and gaze points on the screen using learning-based techniques, such as linear regression [38] and neural networks [15], [39]. MPIIGaze is among the state-of-the-art gaze estimation methods [15], [40]. It learns from eye images and head orientation to predict gaze angles. The model is trained on the MPIIGaze dataset, which is annotated with screen coordinates.
Krafka et al. [16] proposed a different estimation method that uses a mobile phone/tablet camera. The process was based on a convolutional neural network (CNN) learning method using eye and face images to predict the point of gaze on the screen. Similarly, [14], [41], [42] also predicted the point of gaze on the screen by using annotated screen coordinates. Huang et al. [43] proposed a regression-based learning method that used hand-crafted features from eye images. The features and the annotated screen coordinates are the input data for training. Park et al. [44] suggested a two-step learning process that learns from eye images to map gaze and its direction using a regression-based CNN. This method uses the eye images and the annotated screen coordinates as learning parameters to perform gaze direction predictions.
Detecting the pupil center to estimate screen coordinates is an alternative method that uses annotated screen coordinates; we list recent implementations here. Ahmed et al. [45] implemented an iris center localization approach by combining the circular gradient intensity with a CNN. This resulted in gaze estimation from the center of the iris. Arvin et al. [46] reported an approach that uses a contour detector to fit ellipsoids on 300 × 300 eye images for pupil center estimation. DeepLabCut [41], [47] has been used to annotate and detect pupil centers using deep learning models such as ResNet50. George et al. [48] implemented a geometrical eye center localization method dependent on fast convolution and ellipse fitting. Fabbian et al. [49] applied an eye center detection method by fitting a bounding box over the eyes. The localization of the gradient vector detected the pupil center within the bounding box. Similarly, Pauly et al. [50] used a bounding box extracted from the Haar cascade classifier and eye localization from histogram-oriented gradient features.
Although using annotated screen coordinates as a learning parameter may be suitable for fixation-based eye movement tasks in behavioral studies and general gaze estimation, it may not be suitable for examining eye movement pathology. This is because participants may not be able to focus on the on-screen stimulus that is used for annotating the screen coordinates, particularly if they experience involuntary eye movements such as catch-up saccades of low gain while performing smooth pursuit eye movement tasks [51], [52]. These involuntary eye movements can affect the consistency of the screen coordinate annotation. Additionally, relying on user feedback and participant effort to capture gaze through annotating screen coordinates is a potential limitation. Multiple studies suggest that participants may have varying attention spans and not consistently follow the stimuli [53], [54], [55].

III. METHODS
To address the issues identified in the previous section, we developed a pupil detection method that utilizes annotated pupil center coordinates. The development of the proposed method has three main components, described in the following subsections: 1) RoADIE, the apparatus used to acquire the data; 2) NeuroEye, a computer adaptation of standard bedside clinical tests; and 3) NeuroGaze, the proposed method to quantify the conjugate eye movements. We hypothesized that a method developed for digital camera-based gaze estimation could demonstrate similar performance to a COTS eye-tracking device. Our proposed method uses a pupil detector to detect the center of the pupil for both eyes in a given video and then determines the conjugacy of the eye movement.

A. RoADIE
RoADIE (Rolling Apparatus for the Detection and Identification of Eye Movements) is a mobile rig we constructed to acquire data with the NeuroEye examination. The examination consists of three digitally adapted clinical neuro-ocular bedside tests. RoADIE has a HIPAA-compliant computer to run our custom-built data acquisition software and to store the data. RoADIE has two digital cameras (RealSense camera [56]) embedded with RGB and infrared sensors. RoADIE captures the gaze coordinates of the eyes using the "Tobii" Pro Fusion Eye Tracker [21]. The NeuroEye examination is displayed on a screen 0.6 meters from the participants. Data acquisition for all sensor modalities is triggered globally with the activation of the examination session.

B. NeuroEye
The NeuroEye examination comprises three tests to examine various motor functionalities of the eye. The three tests are the "Dot-Test", "H-Test", and "OKN-Test", which are computer adaptations of standard bedside clinical tests [57]. The Dot-Test is designed to assess the quality of eye coordination for each saccade. A clinician observes the movement of both eyes as the patient shifts their gaze from target to target (see Fig. 4(a)). The clinician looks for the eyes to move suddenly to the next visual target and stop accurately at their destination. Abnormal signs include the eyes stopping short of or too far past the target, known as under- or over-shooting, respectively, or the eyes failing to initiate a saccade.
The H-Test is designed to assess the quality of eye movement at a constant slow pace as the eyes track a visual target (smooth pursuit) in all directions. The "H" pattern (see Fig. 4(b)) ensures that both eyes track the target in all directions to their end ranges. Abnormal signs consist of the eyes lacking motion, or its initiation, in any direction, or the eyes utilizing saccades to catch up to the moving visual target.
The optokinetic nystagmus (OKN) test measures the participant's ability to switch from smooth pursuit to a saccade in order to fixate on the next visual target after the first target disappears. The visual stimulus typically comprises vertical bars with a high contrast to the background. The bars move at a quick and constant pace from right to left (see Fig. 4(c)). The test is then repeated in the opposite direction. OKN typically remains preserved in individuals with occipital lobe infarcts, although impairment of a visual field may limit the amplitude of saccadic fixation. Asymmetry in performance of the eyes or poor ability to generate saccades during the OKN test may suggest damage to the ocular, brainstem, or cerebellar nuclei or tracts.

C. NeuroGaze
NeuroGaze is our proposed method to quantify eye movements during the NeuroEye examination (see Fig. 2). The purpose of NeuroGaze is to provide physicians and healthcare workers with easy-to-interpret information regarding the patient's ability to perform conjugate eye movement. Our proposed method contains three main components: (1) pre-processing, (2) pupil detection, and (3) conjugate gaze estimation.
1) Pre-Processing: The RGB video stream from the RealSense camera was configured to capture video at 1280 × 720 resolution and 30 fps. The stimuli of the H-test last for 24 seconds, so 720 uncompressed frames are captured during the examination. We used the "dlib" face detector [58] to extract facial landmarks and to perform landmark and intensity normalization that removes translation, rotation, and scale variations. Upon normalization, the left and right eyes were cropped from the face as separate images of 40 × 20 resolution using the facial landmarks. The facial landmarks were also used to remove eye blink images. Blinking results in movement of the eyelid that occludes the pupil. The ratio of the vertical length to the horizontal length of the eye is used to detect blinks, which affect conjugate eye movement estimation by 0.7%.
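As an illustration of the blink filter and the frame-count arithmetic described above, the following is a minimal sketch rather than the implementation used in the study. It assumes six eye-contour points per eye ordered as in dlib's 68-point landmark layout (corner, two upper-lid points, corner, two lower-lid points); the function names and the 0.2 threshold are hypothetical and are not taken from the paper.

```python
import math

FPS = 30              # camera frame rate stated in the text
H_TEST_SECONDS = 24   # H-test stimulus duration stated in the text
expected_frames = FPS * H_TEST_SECONDS  # 30 * 24 = 720 frames


def eye_aspect_ratio(eye):
    """Ratio of vertical to horizontal eye opening.

    `eye` holds six (x, y) points: corner, upper, upper, corner, lower, lower.
    """
    p1, p2, p3, p4, p5, p6 = eye
    vertical = (math.dist(p2, p6) + math.dist(p3, p5)) / 2.0
    horizontal = math.dist(p1, p4)
    return vertical / horizontal


def non_blink_frames(eyes_per_frame, threshold=0.2):
    """Return indices of frames whose ratio stays at or above the threshold.

    The threshold is a hypothetical value chosen only for this example.
    """
    return [i for i, eye in enumerate(eyes_per_frame)
            if eye_aspect_ratio(eye) >= threshold]


# Toy landmarks: an open eye (ratio 8/30) and a nearly closed eye (ratio 2/30).
open_eye = [(0, 0), (10, -4), (20, -4), (30, 0), (20, 4), (10, 4)]
closed_eye = [(0, 0), (10, -1), (20, -1), (30, 0), (20, 1), (10, 1)]
kept = non_blink_frames([open_eye, closed_eye, open_eye])
```

In this toy sequence the middle frame is treated as a blink and dropped, so only the first and last frames would be passed on to pupil detection.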
2) Pupil Detection: Clinicians often follow the eye's iris and pupil center to assess the eye's ability to perform conjugate movement. To mimic a clinician's perspective, we developed our pupil detector to locate the center of the pupil. The pupil center detector is parameterized by a convolutional neural network (CNN) and trained by minimizing the mean square error (MSE) between predicted pupil centers and ground truth pupil centers. The network consists (see Fig. 5) of two convolution layers (with kernel size 3 × 3) followed by two fully-connected layers. The last fully connected layer has two output nodes that are used for predicting the pupil center's x coordinate and y coordinate. We use the normalized coordinates to compute the MSE loss. The normalization is performed by dividing the original coordinate in the 20 × 40 coordinate system by 40 so that the normalized value falls in the [0, 1] range. The network uses ReLU as the activation function.
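The coordinate normalization and the MSE loss described above can be illustrated with a short, framework-free sketch. It is not the training code used in the study: the function names are hypothetical, and averaging the squared error over both coordinates of every sample is an assumption about how the MSE is reduced.

```python
SCALE = 40.0  # both coordinates are divided by 40, as stated in the text


def normalize_center(x, y):
    """Map a pupil center from the 20 x 40 pixel grid into [0, 1]."""
    return x / SCALE, y / SCALE


def denormalize_center(nx, ny):
    """Map a normalized prediction back to pixel units."""
    return nx * SCALE, ny * SCALE


def mse_loss(predicted, target):
    """Mean squared error over all x and y coordinates (assumed reduction)."""
    total = 0.0
    for (px, py), (tx, ty) in zip(predicted, target):
        total += (px - tx) ** 2 + (py - ty) ** 2
    return total / (2 * len(predicted))


# A ground-truth center at pixel (20, 10) and a prediction 10 px to its right.
target = [normalize_center(20, 10)]      # (0.5, 0.25)
prediction = [normalize_center(30, 10)]  # (0.75, 0.25)
loss = mse_loss(prediction, target)      # (0.25 ** 2 + 0) / 2
```

Because the crop is 40 pixels wide and 20 pixels high, dividing by 40 keeps x in [0, 1] and y in [0, 0.5], so both outputs stay inside the stated range.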
We [MAH, XY, and YZ] annotated and verified 5400 pupil centers, referred to from here onwards as the "NeuroEye dataset" (see Fig. 6(a) for examples). The NeuroEye dataset is made available at the project website. The annotated pupil center images were accumulated by randomly selecting the left and right eye images for each participant for all three NeuroEye examinations. An Adam optimizer [59] with a learning rate of 5e-4 was used for optimizing the model. To validate the effectiveness of the proposed pupil detection, we conducted a leave-one-out cross-validation (leaving one patient's data out for validation and using the other patients' data for training) and computed the $\ell_1$ distance between the predicted and ground truth pupil centers to quantify the prediction error, using

$$\text{error} = \frac{1}{N}\sum_{i=1}^{N}\left\lVert s^{i}_{gth} - s^{i}_{pred}\right\rVert_{1}$$

where $N$ is the total number of test eye images, $s^{i}_{gth}$ is the ground truth pupil center location, and $s^{i}_{pred}$ is the predicted pupil center location. The mean error for 18 subjects is 0.805 pixels, with a standard deviation of 0.128. Fig. 6 shows pupil center ground truth annotations and test predictions.
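The validation protocol above can be sketched as two small helpers: one that enumerates leave-one-subject-out splits and one that averages the l1 distance between predicted and ground truth centers over N test images. This is an illustrative sketch with hypothetical function names and toy values, not the evaluation code used in the study.

```python
def leave_one_subject_out(subject_ids):
    """Yield (held_out_subject, training_subjects) for every unique subject."""
    unique = sorted(set(subject_ids))
    for held_out in unique:
        yield held_out, [s for s in unique if s != held_out]


def mean_l1_error(predicted, ground_truth):
    """Average l1 distance, in pixels, between paired (x, y) pupil centers."""
    total = 0.0
    for (px, py), (gx, gy) in zip(predicted, ground_truth):
        total += abs(px - gx) + abs(py - gy)
    return total / len(predicted)


# Toy data: three subjects, and two test images for one held-out subject.
splits = list(leave_one_subject_out(["s1", "s2", "s3", "s1"]))
error = mean_l1_error([(10.0, 5.0), (12.0, 6.0)],
                      [(10.5, 5.0), (12.0, 7.0)])  # (0.5 + 1.0) / 2
```

In a full run, the per-subject errors from each split would be averaged to obtain a mean and standard deviation across subjects.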
3) Conjugate Gaze Estimation: We estimated the ability to perform conjugate eye movement by calculating the Spearman correlation coefficient $r$ between the relative eye positions of the left eye and the right eye using (1), where $d$ is the distance between two observed ranks and $n$ is the number of observations.

$$r = 1 - \frac{6\sum d^{2}}{n\left(n^{2}-1\right)} \tag{1}$$

The relative eye position is estimated from the "x" and "y" coordinates of the pupil detector. We calculated the relative position of the eye using (2) to (4), where $\theta_x$ and $\theta_y$ are the relative eye orientations on the x- and y-axis, $p_x$ and $p_y$ are the normalized pupil center coordinates on the x- and y-axis, $a$ and $b$ are the horizontal and vertical pixel ranges of the display screen (see Fig. 3(b)), $\kappa$ is the size of a pixel on the screen, $d$ is the distance from the screen, and $g$ is the relative eye position.
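To make the conjugacy measure concrete, the sketch below computes a Spearman rank correlation between left-eye and right-eye relative positions. It is an illustrative implementation with hypothetical function names: it computes the Pearson correlation of average ranks, which coincides with the rank-difference form of the Spearman coefficient when there are no tied values and remains usable when ties occur. It assumes neither series is constant.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2.0 + 1.0  # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(left, right):
    """Spearman correlation as the Pearson correlation of the ranks."""
    rl, rr = average_ranks(left), average_ranks(right)
    n = len(rl)
    ml, mr = sum(rl) / n, sum(rr) / n
    cov = sum((a - ml) * (b - mr) for a, b in zip(rl, rr))
    var_l = sum((a - ml) ** 2 for a in rl)
    var_r = sum((b - mr) ** 2 for b in rr)
    return cov / math.sqrt(var_l * var_r)


# Toy relative-gaze series: the right eye swaps two pairs of neighboring ranks.
left_eye = [1.0, 2.0, 3.0, 4.0, 5.0]
right_eye = [2.0, 1.0, 4.0, 3.0, 5.0]
r = spearman(left_eye, right_eye)  # rank differences 1, 1, 1, 1, 0 give 0.8
```

Perfectly conjugate toy series (any strictly increasing relationship between the two eyes) give a coefficient of 1 under this measure, and values closer to 0 indicate that the two eyes' positions are less consistently ordered.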

IV. EXPERIMENT SETUP
We designed the experiment to capture calibrated and non-calibrated gaze data using the Tobii eye tracker during separate sessions (please refer to [20] for details of the main experiment). The University of Virginia's Institutional Review Board approved the study protocol. As such, the protocol complies with all national ethical research standards in accordance with the Declaration of Helsinki. Written informed consent was obtained prior to subject enrollment and testing. Nineteen healthy controls participated in the calibration study. The mean age was 40 years; 79% of the participants were female and 21% were male. The racial distribution of the controls was 81% White, 14% Asian, and 5% American Indian or Alaska Native. One participant did not follow the instructions during the experiment and was excluded from the study.

A. Data Synchronisation
During this study, we used the gaze data and the synchronized video from the RGB sensor of the RealSense Camera (1), which was acquired during the non-calibrated session. Further detail regarding non-calibrated gaze estimation using the Tobii eye tracker can be found in [20]. The Tobii eye tracker acquired data at 120 Hz, the digital camera acquired data at 30 Hz, and the NeuroEye simulations were set at 240 Hz. For a fair comparison, the data from the three sources were synchronized at 30 Hz during this study (see Fig. 7).
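One simple way to bring the three streams to a common 30 Hz timeline is nearest-timestamp resampling, sketched below. The paper does not specify the synchronization procedure, so this approach, the function names, and the toy data are all assumptions made for illustration.

```python
from bisect import bisect_left


def nearest_sample(timestamps, values, t):
    """Return the value whose timestamp is closest to t (timestamps sorted)."""
    i = bisect_left(timestamps, t)
    if i == 0:
        return values[0]
    if i == len(timestamps):
        return values[-1]
    before, after = timestamps[i - 1], timestamps[i]
    return values[i] if (after - t) < (t - before) else values[i - 1]


def resample_to_rate(timestamps, values, rate_hz, duration_s):
    """Sample a stream at k / rate_hz seconds for k = 0 .. duration * rate - 1."""
    n = int(round(duration_s * rate_hz))
    return [nearest_sample(timestamps, values, k / rate_hz) for k in range(n)]


# One second of toy data: sample indices stand in for gaze or stimulus values.
tracker_t = [k / 120 for k in range(120)]   # 120 Hz eye tracker
stimulus_t = [k / 240 for k in range(240)]  # 240 Hz stimulus
tracker_30 = resample_to_rate(tracker_t, list(range(120)), 30, 1.0)
stimulus_30 = resample_to_rate(stimulus_t, list(range(240)), 30, 1.0)
```

With these rates the 30 Hz timeline picks every 4th tracker sample and every 8th stimulus sample, so all three streams end up with one value per camera frame.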

B. Method Comparison
We evaluated the state-of-the-art digital camera-based methods and our NeuroGaze prototype on the acquired NeuroEye examination video and compared them to the "Tobii" eye tracker. We used the video feeds during the NeuroEye examination as "input" for gaze estimation. The gaze estimations of the relative position of the left and right eyes were the "output". We substituted the Tobii gaze coordinates, the centroid of the Bounding Box coordinates, and the GazeML [60] screen coordinates into (1) to (4) to estimate the relative eye position for the left and right eye. We evaluated pre-trained implementations of MPIIGaze and GazeML on the NeuroEye dataset since these two methods fundamentally predict the gaze points from annotated screen coordinates. The NeuroEye dataset does not use labels/annotations for gaze points as screen coordinates. The screen coordinates of GazeML were accounted for and normalized for the relative gaze estimation. We substituted the gaze orientation output from the MPIIGaze [15], [40] implementation into (4) to compute the relative gaze from the gaze orientation, and used (1) to compute the conjugacy of the eye movement.
We performed a cross-dataset validation for a fair evaluation of NeuroGaze against the state-of-the-art methods above. Here we annotated pupil centers on a subset of the MPIIGaze dataset [61] and used the data in training the detection model of NeuroGaze. We [MAH, XY, and YZ] annotated and verified 2250 pupil center images by randomly selecting images of the left eye and right eye from each participant for each day. The implementations of NeuroGaze are as follows: "NeuroGaze (A)" trained on NeuroEye data and tested on NeuroEye data, "NeuroGaze (B)" trained on MPIIGaze-pupil center data and tested on NeuroEye data, and "NeuroGaze (C)" trained on NeuroEye and MPIIGaze-pupil center data and tested on NeuroEye data (see Table I).
The methods [49], [50] developed for pupil center detection using a bounding box were dependent on non-generalized local variables and failed to perform effectively on the NeuroEye data. Therefore, we developed a Bounding Box approach using a similar CNN architecture to NeuroGaze pupil center detection. The model consists of two convolution layers followed by two ReLU activation layers of dimension 3 × 20 × 40 and 32 × 10 × 20. A total of 5400 bounding boxes around pupil centers were annotated to train and validate the Bounding Box pupil detector. The annotated pupil center images were accumulated from segments of 50 random images of the left and right eye for each participant performing the NeuroEye examinations: H-Test, Dot-Test, and OKN-Test. We trained the model to minimize the mean square error between the predicted and ground truth coordinates. An Adam optimizer [59] with a learning rate of 5e-4 was used for model optimization.
"NeuroGaze (D)" was developed from the pupil center detector using DeepLabCut [47] to track the movement of the pupil in the NeuroEye dataset. We fine-tuned a pre-trained ResNet50 architecture on our labeled data to optimize for the task of pupil center detection. We used a batch size of 8, a momentum of 0.9, and trained for 500,000 iterations using SGD with a learning rate of 0.0001. To validate the accuracy of the model, we used a leave-one-patient-out validation approach. The model achieved an average MSE of 0.8 pixels on the validation dataset.

C. Eye Conjugacy
We utilized the Spearman correlation coefficient to quantify the conjugacy of left and right gaze estimations obtained from the digital camera-based methods above. We chose the correlation coefficient because movement between the eyes in humans has near "perfect coordination" [62]. This means that both eyes move at the same velocity in all directions. This is a de facto constant rate of change between the left and right eye, which results in a linear relationship. Mathematically, a correlation coefficient represents this ocular physiology. Since the distribution of gaze coordinates is non-Gaussian, the Spearman coefficient was preferred over Pearson.

V. RESULTS
Table II presents the overall performance of the camera-based eye-tracking methods for conjugate gaze estimation. For this analysis, we assumed that healthy participants could perform conjugate eye movement in all directions. This assumption was confirmed by the high Spearman rank correlation coefficient reported by the Tobii eye tracker.
Among the digital camera-based methods, NeuroGaze (D) reported the best conjugate gaze estimation with the highest mean correlation coefficient (see Table II). The variations of the NeuroGaze method with a shallow CNN, "NeuroGaze (A)", "NeuroGaze (B)", and "NeuroGaze (C)", also reported improved eye tracking performance compared to the state-of-the-art eye tracking methods. The deep ResNet50 network improved 2% over the shallow CNN model used in "NeuroGaze (A)". The GazeML method reported the lowest correlation coefficient mean for all three NeuroEye examinations. In general, digital camera-based eye-tracking methods reported a higher variability in eye tracking for different participants compared to the Tobii eye tracker (see Fig. 8).
The H-Test stimulates a motion across all the quadrants of the screen. NeuroGaze (D) reported a correlation coefficient mean of 0.86 with a 95% confidence interval ranging from 0.71 to 0.93. The Bounding Box method reported a correlation coefficient mean of 0.72 with a 95% confidence interval ranging from 0.47 to 0.90. MPIIGaze reported a correlation coefficient mean of 0.60 with a 95% confidence interval ranging from 0.33 to 0.81. The overall performance of digital camera-based eye tracking methods showed a 40% to 45% variability in conjugate gaze estimation (see Fig. 8).
The relative gaze plot for the H-Test shows (see Fig. 9) a highly conjugate eye movement between the left and right eye for Tobii. NeuroGaze showed the most conjugate eye movement while tracking with a similar trajectory to the relative gaze trajectory of Tobii. The Bounding Box approach also showed a similar trajectory to the relative gaze trajectory of Tobii. However, the Bounding Box approach did not consistently capture the conjugacy of the eye movement since there was an offset in pupil center detection for both eyes. MPIIGaze also showed evidence of following a similar trajectory to the relative gaze trajectory of Tobii. However, the MPIIGaze trajectory was inconsistent and did not capture the eye movement's conjugacy.
The Dot-Test stimulates a series of saccade and fixation eye movement events across all the quadrants of the screen canvas. NeuroGaze (D) outperformed the state-of-the-art digital camera-based methods with a correlation coefficient mean of 0.87 with a 95% confidence interval ranging from 0.65 to 0.94 (see Table II). The Bounding Box approach reported a correlation coefficient mean of 0.72 with a 95% confidence interval ranging from 0.27 to 0.86. MPIIGaze reported a correlation coefficient mean of 0.55 with a 95% confidence interval ranging from 0.29 to 0.84. The pupil center-based eye tracking methods reported a tighter eye tracking variability of 20% to 25% (see Fig. 8).
The relative gaze plot for the Dot-Test shows (see Fig. 9) the relative eye movement during periods of fixation (when the relative gaze is flat) and saccade (when the relative gaze has a steep rise). NeuroGaze captured the most conjugate eye movement while tracking the fixation and saccade eye movement. NeuroGaze showed a similar trajectory to the relative gaze trajectory of Tobii. The Bounding Box approach did not consistently capture the left and right eye movement. The conjugacy of eye movement was also reasonable during fixation periods. However, the ability to capture eye movement conjugacy decreased during periods of saccades.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
The OKN-Test stimulates a series of alternating smooth pursuit and saccade eye movements, resulting in a consistent movement of the eyes horizontally at two different speeds. Tobii reported its lowest correlation coefficient mean among the three NeuroEye examinations, 0.84, for the OKN test. At the same time, Tobii eye tracking presented a variability of 25% among the healthy participants from the study. Among digital camera-based eye tracking methods, NeuroGaze (D) reported the highest performance with a correlation coefficient mean of 0.56 with a 95% confidence interval ranging from 0.17 to 0.61 (see Table II). The Bounding Box approach reported a correlation coefficient mean of 0.38 with a 95% confidence interval ranging from 0.09 to 0.6. MPIIGaze reported a correlation coefficient mean of 0.38 with a 95% confidence interval ranging from 0.13 to 0.61. Both COTS and digital camera-based eye tracker performance decreased during the OKN-Test.
The relative gaze plot for the OKN-Test shows (see Fig. 9) the eye movement during periods of smooth pursuit (when the relative gaze has a linear rise) and saccades (when the relative gaze has a steep rise). NeuroGaze outperformed other digital camera-based eye-tracking methods by capturing the most conjugate eye movement while tracking the smooth pursuit and saccade eye movement. NeuroGaze also showed a similarity in trajectory to the relative gaze trajectory of Tobii. However, the Bounding Box approach showed a less similar trajectory to the relative gaze trajectory of Tobii. This could also result from the Bounding Box approach's inability to capture the left and right eye movement consistently.

VI. DISCUSSION
In this preliminary study, we have demonstrated the potential of digital camera-based methods for estimating eye movement and conjugacy, which is a clinically relevant characteristic used by neurologists and ophthalmologists to determine the presence of neuropathology. To achieve this, we developed a novel method to investigate the performance of digital camera-based eye tracking during computer-adapted neurological eye examinations. We also emphasized the importance of eye-tracking methods that use the pupil center to track the eyes for neuro-ophthalmology clinical studies. Furthermore, we explored the impact of relative gaze use and the generalizability of the pupil-center detection approach. Our findings demonstrate the feasibility of using digital camera-based gaze estimation to measure clinically relevant eye-tracking information. Results suggest that eye tracking has the potential to be incorporated into existing clinical workflows, such as EMS triage systems and telemedicine, without the need for a dedicated device, such as a wearable or binocular camera eye tracker.

A. Impact of Neurological Eye Examinations
Previous implementations of eye tracking with a digital camera mainly measured gaze points, fixation sequence, heat maps, and area of interest [63], [64]. Neurological eye examinations stimulate or elicit fixation patterns, smooth pursuit, and saccadic eye movements. These clinically relevant measurements are widely different compared to conventional gaze measurements [64]. Eye conjugacy is one clinically relevant measurement estimated from binocular eye movements. The COTS and NeuroGaze eye conjugacy estimations were mainly affected during the OKN-test by continuous eye motion stimulated by smooth pursuit and saccadic eye movements. The low image acquisition speed of Tobii at 120 Hz and NeuroGaze at 30 Hz compared to the speed of saccadic movements may have impacted the quantification of eye conjugacy measurements during the OKN-test. The impact of image acquisition speed for NeuroGaze may be reflected in the variability in conjugacy measurements from the H-test and Dot-test (see Fig. 8). NeuroGaze reported a high variability during the H-test, in which the eyes continuously moved at a physiologically slow speed, whereas low variability is present in the Dot-test, in which the eyes switch between fixation and saccadic phases. The "stop" and "go" pattern of eye movement (Dot-test) is more suitable for NeuroGaze compared to continuous eye movement elicited during the H-test.

B. Pupil Center
We demonstrated that a digital camera could measure clinically relevant information using a pupil-center-based eye-tracking method. The method comprises multiple steps, including facial landmark alignment, normalization, and appearance-based pupil center detection. We showed that it is more advantageous to develop a model by training on pupil center coordinates compared to screen coordinates. Comparing the performance of the pre-trained MPIIGaze and GazeML on the NeuroEye dataset was unfair since the models were not trained on this dataset. For a fairer evaluation, we trained NeuroGaze (B) on the MPIIGaze dataset, then tested it on the NeuroEye dataset (see Fig. 10). NeuroGaze (B) still outperformed the MPIIGaze method, which further emphasizes appearance-based pupil detection's benefit in measuring eye

C. Relative Gaze Trajectory
In addition to quantifying the conjugacy, combining the relative gaze trajectory/path with eye movement detection methods [7], [65] enables the quantification of eye movement events such as the number of saccades performed during the OKN-test. This correlative information about the left and right eye may overcome the limitations caused by the slow image acquisition speeds of the digital camera, as shown in the relative gaze plots in Figs. 1 and 11.
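As a rough illustration of how saccade events could be counted from a relative gaze trajectory, the sketch below applies a simple velocity threshold to consecutive samples. This is not the detection method of [7] or [65]; the function name, the 30 deg/s threshold, and the toy traces are hypothetical choices made only for this example.

```python
def count_saccades(gaze_deg, rate_hz, velocity_threshold=30.0):
    """Count runs of consecutive samples whose speed exceeds the threshold.

    gaze_deg: relative gaze angle per sample, in degrees.
    rate_hz: sampling rate of the trace.
    velocity_threshold: hypothetical cut-off in degrees per second.
    """
    count = 0
    in_saccade = False
    for prev, curr in zip(gaze_deg, gaze_deg[1:]):
        speed = abs(curr - prev) * rate_hz  # degrees per second
        fast = speed > velocity_threshold
        if fast and not in_saccade:
            count += 1  # a new above-threshold run starts here
        in_saccade = fast
    return count


# Toy 30 Hz traces: two abrupt 10-degree jumps versus a slow 15 deg/s drift.
jumps = [0.0, 0.0, 0.0, 10.0, 10.0, 10.0, 20.0, 20.0]
drift = [0.0, 0.5, 1.0, 1.5]
n_jumps = count_saccades(jumps, rate_hz=30)
n_drift = count_saccades(drift, rate_hz=30)
```

At 30 Hz each jump spans a single frame interval, which illustrates why a low frame rate leaves little information about the shape of a fast eye movement even when the event itself can still be counted.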

D. NeuroGaze Vs. Tobii
The relative gaze plots comparing Tobii and NeuroGaze (see Figs. 1, 11, and 12) show similar trends among the trajectories. The path plots also highlight the capability of NeuroGaze to capture fixation, smooth pursuit, and saccadic eye movements. However, there is a clear difference in the scale of the relative gaze angle. The dissimilarity is most likely due to the different working principles of the COTS eye tracker "Tobii" and NeuroGaze. NeuroGaze identifies the pupil center of each eye and uses the eye's optical axis to estimate relative gaze. Tobii, on the other hand, works on the principle of pupil center corneal reflection (PCCR) to estimate relative gaze [66]. PCCR reconstructs the optical axis to derive the visual axis and then estimates gaze using the visual axis. The difference between the optical and visual axes is known as the Kappa angle. We therefore suspect that the Kappa angle is the most likely cause of the difference in scale of the relative gaze angle.

E. Limitations
There are a few limitations to this study. First, as a proof-of-concept, our study included a small sample size and lacked racial/ethnic diversity. Establishing how well a gaze correlation coefficient discriminates between normal and abnormal eye movements will require further research with larger and more diverse samples and a greater range of neuro-ocular deficits. Despite the small sample size, it may be reasonable to assume that the healthy participants in this cohort represent the distribution of normal eye movements in a healthy adult population. We are currently collecting data on consecutive patients presenting with acute vestibular syndrome and posterior circulation stroke to better assess the feasibility of NeuroGaze in a population with pathological eye movements.
The relative gaze plot of Tobii for the H-test (see Fig. 11) shows catch-up saccades of low gain during smooth pursuit eye movements. These small catch-up saccades are normal during smooth pursuit [67]. However, we noticed that the digital camera-based eye-tracking methods had limited ability to capture these small saccades consistently. We hypothesize that the slow image acquisition speed is why these methods miss small saccades.
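The sampling-rate hypothesis can be illustrated with simple arithmetic: the number of frames captured during an eye movement is its duration multiplied by the frame rate. In the sketch below, the 30 ms saccade duration is a purely illustrative assumption rather than a value measured in this study, and the function name `frames_during_event` is hypothetical.

```python
def frames_during_event(duration_ms, sample_rate_hz):
    """Expected number of frames captured during an event of the given duration."""
    return duration_ms * sample_rate_hz / 1000.0


# Illustrative assumption: a small catch-up saccade lasting about 30 ms.
assumed_saccade_ms = 30.0
frames_at_30_hz = frames_during_event(assumed_saccade_ms, 30)    # digital camera
frames_at_120_hz = frames_during_event(assumed_saccade_ms, 120)  # Tobii
```

Under this assumption, a 30 Hz camera captures on average fewer than one frame during the saccade, while a 120 Hz tracker captures several, which is consistent with small saccades appearing in the Tobii plots but being missed by the camera-based methods.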

VII. CONCLUSION
We investigated the feasibility of using a digital camera to capture eye movements during digitally adapted clinical eye exams (NeuroEye). We presented the performance of a novel method, NeuroGaze, along with other state-of-the-art digital camera-based eye-tracking methods, and compared them with each other and with a COTS eye tracker. NeuroGaze estimated eye conjugacy consistently better than other state-of-the-art methods under fair comparison conditions: its conjugacy estimates were the most similar to the "Tobii" reference and showed less variability. Specifically, NeuroGaze was accurate for most participants in the H-test and Dot-test, and for a few participants in the OKN-test. This is promising since NeuroGaze can be further improved to capture conjugate eye movements in these less accurate cases by (1) increasing the number of training samples by enrolling more participants and (2) improving the architecture of the pupil detector to measure the pupil center more accurately.
In conclusion, we present preliminary evidence that digital cameras can be used with machine learning techniques to estimate eye conjugacy for future clinical applications. This is clinically significant because the approach overcomes the limitations of complex and expensive setups (e.g., infrared cameras). More importantly, it removes the need for a calibration procedure, which has caused prior studies to exclude participants, potentially introducing selection bias and limiting generalizability. Our feasibility study suggests that this technology could be deployed in the clinic or in pre-hospital settings, including telemedicine or emergency medical services (EMS) encounters, to detect neurological injuries or diseases that cause neuro-ocular deficits, such as stroke. This underscores the need for further research on this approach in populations with neurological disease, which is the focus of our continued research.

Fig. 3. Rolling Apparatus to Detect Impairment of the Eyes (RoADIE). (a) Side view of the RoADIE mobile rig. (b) Illustration of the screen view, including the hardware components for RoADIE.

Fig. 4. Illustration of the NeuroEye examination. (a) The Dot-Test. The red dot with the higher contrast represents the current position of the dot, and the red dot with the lower contrast represents the past or future position of the dot. (b) The H-Test. The red dot with the higher contrast represents the current position of the dot, and the red dot with the lower contrast represents the past or future position of the dot. (c) Illustration of the OKN-Test.

Fig. 7. Illustration of the data synchronization for RoADIE. The yellow guideline indicates the time point at which the data was extracted.

Fig. 8. Box-Whisker plot representation of the Spearman rank-order correlation coefficient summary of all participants for COTS eye tracker and state-of-the-art web camera-based eye-tracking methods.

TABLE I. NEUROGAZE PUPIL DETECTOR MODEL DESCRIPTION

TABLE II. MEAN CORRELATION AND THE 95% CONFIDENCE INTERVAL (CI) COMPARISON FOR THE SPEARMAN RANK-ORDER CORRELATION COEFFICIENT OF THE PROPOSED NEUROGAZE, COTS EYE TRACKER, AND STATE-OF-THE-ART DIGITAL CAMERA-BASED EYE-TRACKING METHODS

performing convolutions with a kernel of 3 × 3 and stride of 2. The third ReLU activation layer is between two fully connected layers of dimension 3200 × 128 and 128 × 2. The final fully connected layer is connected to the regression layer that outputs four nodes, used respectively for the prediction of x, y, h, and w of the bounding box around the pupil center. The pupil center is estimated as the centroid of the bounding box.
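The centroid step can be sketched in a few lines. The example assumes that the predicted (x, y) is the top-left corner of the bounding box in image coordinates, which the text does not state; if (x, y) were instead the box center, no conversion would be needed. The function name `pupil_center_from_box` is hypothetical.

```python
def pupil_center_from_box(x, y, w, h):
    """Return the centroid of a bounding box.

    Assumes (x, y) is the top-left corner and (w, h) are the width and
    height, all in pixels. This convention is an assumption for the sketch.
    """
    return (x + w / 2.0, y + h / 2.0)


# Hypothetical regression output for one eye image.
center_x, center_y = pupil_center_from_box(x=10, y=20, w=8, h=6)
```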