Associations between depression symptom severity and daily-life gait characteristics derived from long-term acceleration signals in real-world settings

Gait is an essential manifestation of depression. Laboratory gait characteristics have been found to be closely associated with depression. However, the gait characteristics of daily walking in real-world scenarios and their relationships with depression are yet to be fully explored. This study aimed to explore associations between depression symptom severity and daily-life gait characteristics derived from acceleration signals in real-world settings. We used two ambulatory datasets: a public dataset with 71 older adults' 3-day acceleration signals collected by a wearable device, and a subset of an EU longitudinal depression study with 215 participants and their phone-collected acceleration signals (average 463 hours per participant). We detected participants' gait cycles and force from acceleration signals and extracted 20 statistics-based daily-life gait features to describe the distribution and variance of gait cadence and force over a long-term period corresponding to the self-reported depression score. The gait cadence of faster steps (75th percentile) over a long-term period had a significant negative association with the depression symptom severity of this period in both datasets. Daily-life gait features significantly improved the goodness of fit of models evaluating depression severity relative to laboratory gait patterns and demographics, as assessed by likelihood-ratio tests in both datasets. This study indicated that the significant links between daily-life walking characteristics and depression symptom severity could be captured by both wearable devices and mobile phones. The gait cadence of faster steps in daily-life walking has the potential to be a biomarker for evaluating depression severity, which may contribute to clinical tools to remotely monitor mental health in real-world settings.


Introduction
Depression affects over 300 million people's lives worldwide [1] and is associated with many adverse outcomes, including decreased quality of life, loss of occupational function, disability, premature mortality, and suicide [2][3][4][5]. While early treatment can be effective and prevent more serious adverse outcomes [6], more than half of depressed people do not receive timely treatment [7,8]. Current questionnaire-based depression assessments may be affected by recall bias and may not be able to collect dynamic information [9,10]. Therefore, several recent studies have attempted to explore the associations between depression and changes in individuals' behaviors using mobile technologies [11].
Changes in gait are essential manifestations of depression [12,13]. The main hypothesis linking gait with depression is a bi-directional interaction between the brain motor system and cortical and subcortical structures, which are related to emotions and cognitive functions [14][15][16]. Many studies have explored the relationships between depression and gait characteristics based on "gold standard" laboratory walking tests.
Laboratory tests are hard to apply in real-world settings because they require expensive equipment (e.g., video cameras and force plates) and specialized laboratories, and because wearing sensors on, for example, the knees and ankles is inconvenient [14,26]. Further, studies have found that laboratory and daily-life gait measurements do not correlate perfectly, probably because of subjective psychological factors, laboratory-controlled conditions, and the complex situations of daily-life walking [27,28]. Some researchers have suggested that people's daily-life activity characteristics should have stronger links to their health conditions than laboratory tests [27]. Therefore, it is necessary to monitor and evaluate daily-life walking using an efficient method.
With the development of sensor technology, steps in daily-life walking can be detected by mobile phones and convenient wearable devices, which provide a cost-efficient, continuous, and unobtrusive means to quantitatively monitor daily-life gait characteristics. In recent years, several studies have used mobile phones or wearable devices for long-term monitoring of daily-life gait characteristics and explored their links with fall risks [28,29] and neurological disorders [30]. Weiss et al found that accelerometer-derived measures based on long-term monitoring improved the identification of fall risks compared with laboratory tests [29]. A recent review stated that free-living monitoring using an accelerometer can confer advantages over clinical assessments in Parkinson's disease [30]. However, for the relationships between depression and daily-life walking, to the best of our knowledge, only the number of steps has been investigated [31][32][33]. The number of steps is more focused on reflecting levels of individuals' mobility and physical activity, rather than gait characteristics. Gait characteristics of daily-life walking, such as gait cadence, gait force, and variance in gait, and their associations with depression are yet to be fully explored.
To fill this gap, this study aimed to explore the associations between depression symptom severity and daily-life gait characteristics derived from mobile technologies.
Specifically, we extracted several features related to gait cadence [34] and gait force [35] which could be extracted from acceleration signals to represent characteristics of daily-life walking over a long-term period, and assessed their associations with corresponding self-reported depression scores of this period. We also tested whether daily-life gait characteristics could provide additional value for evaluating depression symptom severity relative to laboratory gait patterns or demographics. To explore whether associations between daily-life walking and depression could be captured by different accelerometer devices, we performed our analyses on two ambulatory datasets whose acceleration signals were collected by a wearable device and mobile phone, respectively [29,36].

Datasets
Long Term Movement Monitoring dataset
The Long Term Movement Monitoring (LTMM) dataset includes 71 elderly adults' demographics (age and gender), depression scores (the 15-item Geriatric Depression Scale [GDS-15]), and acceleration signals of laboratory walking tests and 3-day activities [29], and can be downloaded from PhysioNet [37]. None of the participants in the LTMM dataset showed signs of cognitive impairment (according to the Mini-Mental State Examination) or any gait or balance disorders [29]. At the enrollment session, each participant's depression symptom severity was estimated using the GDS-15, whose total score ranges from 0 to 15 (increasing depression symptom severity), with a cutoff score of ≥ 5 indicating probable depressive disorders [38]. Participants were asked to walk at a self-selected and comfortable speed for 1 minute in the laboratory while wearing a 3-axis accelerometer on their lower back [29]. After the laboratory walking test, all participants were asked to wear the accelerometer for the next 3 consecutive days to record daily activities. All acceleration signals were recorded at 100 Hz [29].

RADAR-MDD-KCL dataset
The EU research program Remote Assessment of Disease and Relapse - Major Depressive Disorder (RADAR-MDD) aimed to investigate the utility of mobile technologies for long-term monitoring of participants with depression in real-world settings [36,39]. In this paper, we used a subset of RADAR-MDD that was collected from a study site in the United Kingdom (King's College London [KCL]), because the KCL site was the only site to acquire ethical approval for collecting the phone's acceleration signals of participants' daily-life walking. We denote this subset as the RADAR-MDD-KCL dataset for reading convenience. The phone's acceleration signals were collected at 50 Hz and uploaded to an open-source platform, RADAR-base [40].
The participant's depression symptom severity was assessed by the 8-item Patient Health Questionnaire (PHQ-8) conducted through mobile phones every 2 weeks. The total score of the PHQ-8 ranges from 0 to 24 (increasing severity) with a cutoff score ≥ 10 for clinically significant depression symptoms [41]. Participants' demographics (age and gender) and the number of comorbidities (Supplement Table 1) were considered as covariates in this study to control confounding factors that may affect gait characteristics. A patient advisory board comprising service users co-developed the study. They were involved in the choice of measures, the timing, and issues of engagement and have also been involved in developing the analysis plan.

Step detection algorithm
Because steps had to be detected in acceleration signals collected by both wearable devices and mobile phones, we applied the step detection algorithm presented in [42], which was developed for mobile phones (Figure 1). Given a segment of 3-axis acceleration signals $(a_x, a_y, a_z)$, the magnitude of the acceleration was first calculated to combine the 3-dimensional signals into a single series $a$, where $a = \sqrt{a_x^2 + a_y^2 + a_z^2}$. The magnitude of the acceleration signals does not depend on the orientation and tilt of the mobile phone during walking [42].
Then, $a$ was filtered by a weighted moving average filter to remove noise (Equation 1, with the filter parameter set to 150). Next, the mean of the filtered series $\bar{a}$ was subtracted from $\bar{a}$ to make it symmetric about the x-axis. We calculated two new series, $B_1$ and $B_2$, based on two thresholds to detect the walking swing phase and stance phase, respectively.
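The preprocessing described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the study: the weights of the moving average filter and the two thresholds are not given in this text, so the sketch uses a plain (unweighted) centered moving average with a hypothetical `half_window` parameter and stops before the thresholding step. All function names are our own.

```python
import math

def magnitude(x, y, z):
    """Orientation-independent magnitude of 3-axis acceleration samples."""
    return [math.sqrt(xi * xi + yi * yi + zi * zi) for xi, yi, zi in zip(x, y, z)]

def moving_average(signal, half_window):
    """Centered, unweighted moving average (assumption: the study used a
    weighted filter whose weights are not reproduced here)."""
    n = len(signal)
    smoothed = []
    for i in range(n):
        lo = max(0, i - half_window)
        hi = min(n, i + half_window + 1)  # window shrinks at the edges
        smoothed.append(sum(signal[lo:hi]) / (hi - lo))
    return smoothed

def remove_mean(signal):
    """Subtract the mean so the series oscillates around zero."""
    mean = sum(signal) / len(signal)
    return [s - mean for s in signal]

# Toy signal: gravity on the z-axis plus a 2 Hz oscillation, sampled at 50 Hz.
fs = 50
t = [i / fs for i in range(fs * 4)]
x = [0.0] * len(t)
y = [0.0] * len(t)
z = [9.8 + 2.0 * math.sin(2 * math.pi * 2.0 * ti) for ti in t]

mag = magnitude(x, y, z)
centered = remove_mean(moving_average(mag, half_window=2))
```

Because the magnitude discards direction, rotating the toy signal onto another axis would give the same `mag` series, which is the property the algorithm relies on.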

Gait cycles and gait force
Then, the gait cycle series was derived by calculating the time intervals between consecutive steps. During each gait cycle, the peak-to-valley amplitude of the magnitude of the acceleration signals was used to reflect the gait force of that step. The forces of all steps in the given acceleration signal formed the gait force series.
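A small sketch of these two derivations is given below, assuming that step detection has already produced the sample indices and times of the detected steps. The function names and the toy values are hypothetical and only illustrate the definitions.

```python
def gait_cycles(step_times):
    """Time intervals (in seconds) between consecutive detected steps."""
    return [t1 - t0 for t0, t1 in zip(step_times, step_times[1:])]

def gait_forces(magnitude, step_indices):
    """Peak-to-valley amplitude of the acceleration magnitude in each gait cycle."""
    forces = []
    for start, end in zip(step_indices, step_indices[1:]):
        cycle = magnitude[start:end]
        forces.append(max(cycle) - min(cycle))
    return forces

# Toy example: three detected steps, half a second apart.
step_indices = [0, 4, 8]
step_times = [0.0, 0.5, 1.0]
magnitude = [9.8, 11.0, 9.0, 9.5, 9.8, 12.0, 8.5, 9.6, 9.8]

cycles = gait_cycles(step_times)                 # [0.5, 0.5]
forces = gait_forces(magnitude, step_indices)    # [2.0, 3.5]
```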

Figure 1. Step detection algorithm. ACC is the 3-axis acceleration signal, B1 and B2 are two series calculated by thresholds to detect the walking swing and stance phases, respectively, and pink dashed lines represent detected steps.

Feature extraction
Since some gait metrics, such as stride length and body sway, are hard to extract precisely from acceleration signals, the gait features extracted in this study were based on gait cadence and gait force. Gait cadence is the rate at which the individual's feet contact the ground [34], and it changes over time during daily-life walking. Therefore, we used not only the step count in one minute (the measurement of gait cadence in laboratory gait tests [17]) but also the median of gait cycles and parameters in the frequency domain to describe gait cadence in this paper. Gait force reflects the ground reaction force during walking [35].
We extracted two categories of gait features: short-term (laboratory) gait features and daily-life gait features. Five short-term gait features were extracted from 1-minute laboratory walking tests of the LTMM dataset or 1-minute continuous walking segments (defined later) in both datasets to reflect the average gait cadence and gait force in that minute. Twenty daily-life gait features were extracted from 3 days after the enrollment for the LTMM dataset and 14 days before each PHQ-8 for the RADAR-MDD-KCL dataset to describe the distribution and variance of gait cadence and gait force over the period. Table 1 summarizes all gait features extracted in this paper.

Short-term (laboratory) gait features
For 1-minute acceleration signals, we first applied the step detection algorithm to obtain the gait cycle series and the gait force series. The median of the gait cycle series and the number of steps were used to reflect the gait cadence of this minute in the time domain, and were denoted as Median_Cycle and Step_Count, respectively.
To assess the gait cadence in the frequency domain, the power spectral density (PSD) of walking was obtained by applying the fast Fourier transformation (FFT) to the filtered magnitude of the acceleration signals. The peak frequency [43] and mean frequency [44] of the 0.5-3 Hz band [29] of the PSD were used to reflect the main rhythm and average rhythm of steps in the frequency domain, and were denoted as Peak_Freq and Mean_Freq, respectively. For gait force, we calculated the median of the gait force series (Median_Force) to represent the average power of all steps in one minute.
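The two frequency-domain features can be illustrated with the sketch below. It assumes a plain periodogram as the PSD estimate and a power-weighted average as the mean frequency; the exact estimator used in the study is not specified in this text. A direct discrete Fourier transform over the 0.5-3 Hz bins is used instead of an FFT library call so that the example has no dependencies, and the function names are our own.

```python
import cmath
import math

def band_periodogram(signal, fs, f_low=0.5, f_high=3.0):
    """Periodogram values for the DFT bins that fall inside [f_low, f_high] Hz."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [s - mean for s in signal]
    freqs, power = [], []
    for k in range(1, n // 2 + 1):
        f = k * fs / n
        if f_low <= f <= f_high:
            coeff = sum(c * cmath.exp(-2j * math.pi * k * i / n)
                        for i, c in enumerate(centered))
            freqs.append(f)
            power.append(abs(coeff) ** 2 / (fs * n))
    return freqs, power

def peak_frequency(freqs, power):
    """Frequency of the largest PSD value in the band (main rhythm)."""
    return freqs[power.index(max(power))]

def mean_frequency(freqs, power):
    """Power-weighted average frequency in the band (average rhythm)."""
    return sum(f * p for f, p in zip(freqs, power)) / sum(power)

# Toy stand-in for the filtered magnitude: 10 s of a 2 Hz rhythm sampled at 50 Hz.
fs = 50
signal = [math.sin(2 * math.pi * 2.0 * i / fs) for i in range(fs * 10)]
freqs, power = band_periodogram(signal, fs)
peak = peak_frequency(freqs, power)
mean_f = mean_frequency(freqs, power)
```

For a pure 2 Hz rhythm both features coincide at 2 Hz; in real walking data the mean frequency is pulled away from the peak by power at neighboring frequencies.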

Daily-life gait features
Extracting daily-life gait characteristics from acceleration signals covering a relatively long-term period (3 days for LTMM or 14 days for RADAR-MDD-KCL) can be divided into 2 steps: (1) detection of continuous walking segments and (2) extraction of daily-life gait features from the detected continuous walking segments. A schematic diagram of daily-life gait feature extraction is shown in Figure 2. For the first step, the size of the walking segments was chosen to be 1 minute [45]. We applied the step detection algorithm to every minute of the long-term acceleration signals. The walking time of a minute (the sum of all gait cycles in that minute) reflects whether the participant was walking continuously during the minute. Intermittent walking (such as walking in a crowded environment or a walking-rest transition) with a short walking time in a minute may not fully reflect a participant's normal walking patterns. Therefore, we set 50 seconds as the threshold for selecting continuous walking segments; that is, segments with more than 50 seconds of walking time were selected for further analysis. In this step, depression score records with no continuous walking segments detected in the corresponding period were discarded.
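The selection rule in the first step can be sketched as below, assuming each 1-minute segment is represented by the list of gait cycles (in seconds) detected in it. The function name and the toy data are hypothetical.

```python
def select_continuous_segments(minute_cycles, min_walking_seconds=50.0):
    """Return the indices of 1-minute segments whose walking time
    (sum of gait cycles) exceeds the threshold."""
    return [i for i, cycles in enumerate(minute_cycles)
            if sum(cycles) > min_walking_seconds]

# Toy data: gait cycles (seconds) detected in four consecutive minutes.
minute_cycles = [
    [0.5] * 110,  # 55 s of walking: kept
    [0.5] * 40,   # 20 s of walking: intermittent, dropped
    [],           # no steps detected: dropped
    [0.6] * 90,   # about 54 s of walking: kept
]
selected = select_continuous_segments(minute_cycles)  # [0, 3]
```

A depression score record whose whole period yields an empty `selected` list would be discarded under the rule described above.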
In the second step, we first extracted 5 short-term (laboratory) gait features (described above) from each detected continuous walking segment. The 25th percentile, median, 75th percentile, and standard deviation of each short-term feature across all walking segments then gave the 20 daily-life gait features listed in Table 1.

Table 1. Gait features extracted in this paper.
Feature | Description
Short-term (laboratory) gait features
Median_Cycle | The median of gait cycles in the 1-minute walking.
Step_Count | The number of steps detected in the 1-minute walking.
Peak_Freq | The peak frequency in the PSD(a) of the magnitude of 1-minute acceleration signals.
Mean_Freq | The mean frequency in the PSD of the magnitude of 1-minute acceleration signals.
Median_Force | The median of gait force in the 1-minute walking.
Daily-life gait features
Median_Cycle_25 | The 25th percentile of median gait cycle values of all walking segments(b).
Median_Cycle_50 | The median of median gait cycle values of all walking segments.
Median_Cycle_75 | The 75th percentile of median gait cycle values of all walking segments.
Median_Cycle_Std | The standard deviation of median gait cycle values of all walking segments.
Step_Count_25 | The 25th percentile of step count values of all walking segments.
Step_Count_50 | The median of step count values of all walking segments.
Step_Count_75 | The 75th percentile of step count values of all walking segments.
Step_Count_Std | The standard deviation of step count values of all walking segments.
Peak_Freq_25 | The 25th percentile of peak frequency values of all walking segments.
Peak_Freq_50 | The median of peak frequency values of all walking segments.
Peak_Freq_75 | The 75th percentile of peak frequency values of all walking segments.
Peak_Freq_Std | The standard deviation of peak frequency values of all walking segments.
Mean_Freq_25 | The 25th percentile of mean frequency values of all walking segments.
Mean_Freq_50 | The median of mean frequency values of all walking segments.
Mean_Freq_75 | The 75th percentile of mean frequency values of all walking segments.
Mean_Freq_Std | The standard deviation of mean frequency values of all walking segments.
Median_Force_25 | The 25th percentile of median gait force values of all walking segments.
Median_Force_50 | The median of median gait force values of all walking segments.
Median_Force_75 | The 75th percentile of median gait force values of all walking segments.
Median_Force_Std | The standard deviation of median gait force values of all walking segments.
(a) PSD: power spectral density from 0.5 Hz to 3 Hz.
(b) All 1-minute continuous walking segments (defined in the Method section) in a specified time window (3 days for the Long Term Movement Monitoring dataset and 14 days for the RADAR-MDD-KCL dataset, a subset of the Remote Assessment of Disease and Relapse - Major Depressive Disorder dataset collected at King's College London, United Kingdom).
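The aggregation from per-segment features to the 20 daily-life features can be sketched as follows. Two details are assumptions on our part, since they are not stated in this text: percentiles are computed with linear interpolation between order statistics, and the standard deviation is the sample standard deviation. The toy segment values are invented for illustration only.

```python
import statistics

SHORT_TERM_FEATURES = ["Median_Cycle", "Step_Count", "Peak_Freq",
                       "Mean_Freq", "Median_Force"]

def percentile(values, q):
    """q-th percentile with linear interpolation (assumed definition)."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

def daily_life_features(segments):
    """segments: one dict of short-term features per continuous walking segment."""
    features = {}
    for name in SHORT_TERM_FEATURES:
        values = [seg[name] for seg in segments]
        features[name + "_25"] = percentile(values, 25)
        features[name + "_50"] = percentile(values, 50)
        features[name + "_75"] = percentile(values, 75)
        # Sample standard deviation (assumed); undefined for a single segment.
        features[name + "_Std"] = statistics.stdev(values) if len(values) > 1 else 0.0
    return features

# Toy data: five continuous walking segments with invented feature values.
segments = [
    {"Median_Cycle": 0.50 + 0.01 * i, "Step_Count": 100 + 4 * i,
     "Peak_Freq": 1.80 + 0.05 * i, "Mean_Freq": 1.70 + 0.05 * i,
     "Median_Force": 3.0 + 0.2 * i}
    for i in range(5)
]
features = daily_life_features(segments)
```

With these toy values, `features["Step_Count_75"]` is 112 steps per minute, which corresponds to the "faster steps" summary discussed in the results.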

Associations analyses
For the LTMM dataset, Spearman's correlation coefficients [46] were calculated to assess associations between the GDS-15 score and gait features (5 laboratory gait features and 20 daily-life features). As the data in the RADAR-MDD-KCL dataset are longitudinal (repeated PHQ-8 measurements for each participant), a series of pairwise linear mixed-effect regression models [47] with random participant intercepts was fitted to explore the association between the PHQ-8 score and each of the 20 daily-life gait features (there were no laboratory tests in the RADAR-MDD-KCL dataset). Age, gender, and the number of comorbidities were considered as covariates. The Benjamini-Hochberg method was used for multiple comparison corrections in both datasets [48]. … [49] before being included in the regression model with daily-life gait features. To indicate the proportion of variance explained by the regression models, we calculated $R^2$ and adjusted $R^2$ for multiple linear regression models. For linear mixed-effect regression models, we calculated marginal $R^2$ to indicate the variance explained by fixed effects and conditional $R^2$ to represent the variance explained by both fixed effects and random effects [50]. Likelihood ratio tests [51] were performed to test whether the models with daily-life gait features fit the depression score significantly better than the models without daily-life gait features.
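As an illustration of the multiple comparison correction, the sketch below computes Benjamini-Hochberg adjusted P values for a list of raw P values. The input values are invented and the function name is our own; the study's actual P values are not reproduced here.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Invented raw p-values for four hypothetical feature-score associations.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
```

In this toy case only the first association remains significant at the 0.05 level after correction, although three of the raw P values are below 0.05.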

Data summary
The 71 participants in the LTMM dataset have a mean age of 78.36 years; the characteristics of both datasets are summarized in Table 2.

Associations between gait features and the GDS-15 in the LTMM dataset
The significant Spearman correlations between the GDS-15 score and gait features in the LTMM dataset are shown in Table 3.

RADAR-MDD-KCL dataset
The pairwise linear mixed-effect models performed on the RADAR-MDD-KCL dataset revealed that 3 of 20 daily-life gait features extracted from 14-day acceleration signals were significantly associated with the PHQ-8 score (Table 4).

Multivariate linear mixed-effect regression models and likelihood-ratio test in the RADAR-MDD-KCL dataset
The results of 2 nested multivariate linear mixed-effect regression models with and without daily-life gait features of the RADAR-MDD-KCL dataset are displayed in

Principal findings
This study explored the associations between depression symptom severity and daily-life gait characteristics in real-world settings using two separate datasets (LTMM and RADAR-MDD-KCL) from different populations, assessed by different depression questionnaires and accelerometer devices. To the best of our knowledge, our study is the first to investigate associations between depression symptom severity and daily-life gait characteristics derived from acceleration signals in real-world scenarios. We extracted 20 daily-life gait features to describe the distribution (25th percentile, median, and 75th percentile) and variance of gait cadence and force over a long-term period corresponding to a self-reported depression score. The main findings of this paper are that 1) the gait cadence of faster steps (75th percentile) over a long-term period has a significant negative association with the depression symptom severity of that period, 2) daily-life gait features could provide additional value for evaluating depression symptom severity relative to laboratory gait characteristics and demographics, and 3) wearable devices and mobile phones both have the potential to capture the associations between daily gait and depression.

Associations between depression symptom severity and gait features
The results of Spearman correlations between laboratory gait features and the GDS-15 score in the LTMM dataset are consistent with previous studies [17][18][19][20][21][22][23][24][25]; that is, participants with more severe depression symptoms were likely to have a slower gait cadence (a longer median gait cycle and lower gait frequency) and smaller gait force in laboratory walking tests. For daily-life gait features, the gait cadence of faster steps (75th percentile) over a long-term period had a significant negative association with the depression symptom severity of that period. Specifically, if a participant has severe depression symptoms, the frequency of their faster steps (over a specific period) is lower (or the median cycle of these steps is longer). This finding is consistent across the LTMM and the RADAR-MDD-KCL datasets. As the situations in daily-life walking are complex (such as walking during the day or at night, walking under fatigue or after rest, and walking to a destination or navigating a crowded supermarket), the performance of participants during walking also differed [28]. Therefore, from the main finding of this paper, we speculate that the faster steps over a long-term period could represent the optimal performance of a participant, which could be closely associated with their depression status. A previous study in another field (fall risks) also showed that the extreme values (10th and 90th percentiles) of gait characteristics could reflect physical or mental conditions better than the median value of gait characteristics [28].
For gait features related to the variance of gait over a long-term period, associations with the depression score were inconsistent across the two datasets. In the LTMM dataset, we found that features of the variance of gait over 3 days were significantly and negatively associated with depression symptom severity (Table 3).

Additional value of daily-life gait characteristics for evaluating the depression symptom severity
The results of likelihood ratio tests for regression models with and without daily-life gait features in the LTMM and the RADAR-MDD-KCL datasets (Table 5 and Table 6) both indicated that daily-life gait characteristics could provide additional value for evaluating depression symptom severity relative to laboratory gait features (only in LTMM) and demographics in both datasets.
From the results of multiple linear regression models in the LTMM dataset (

The LTMM and RADAR-MDD-KCL datasets
Populations, depression questionnaires, and accelerometer devices all differ between the LTMM and the RADAR-MDD-KCL datasets. Since the LTMM dataset was designed for exploring elderly people's fall risks, the participants' depression status was not considered in the study protocol. The proportion of participants with potential depressive disorders was less than one-third (25.35%), which may lead to statistical bias. To the best of our knowledge, existing studies on the LTMM dataset were all related to fall risks; therefore, this paper is the first to use the LTMM dataset to explore the associations between depression and gait. Compared with the LTMM dataset, the RADAR-MDD-KCL dataset has a population with a wider age distribution and a longer follow-up. Participants in the RADAR-MDD-KCL dataset have had at least one diagnosis of depression in the last 2 years, which may be the reason that more than half of the PHQ-8 records in the RADAR-MDD-KCL dataset indicate potential severe depression symptoms (50.08%; Table 2). Furthermore, as the RADAR-MDD-KCL dataset has multilevel data, the results indicated that the significant associations between daily-life gait patterns and depression exist at both the individual and population levels. The consistent results in two separate ambulatory datasets suggest the generalizability of our findings and provide additional confidence in the associations between depression and daily-life gait and in the features identified in this paper.

Limitations
In this work, we did not fully explore all the gait characteristics of daily-life walking, because some gait metrics, such as step length, body posture, and body swing, are hard to extract from acceleration signals [14]. Gait features used in this paper were only related to gait cadence and gait force. Our aim was to explore information from daily-life walking that can improve the evaluation of depression symptom severity, rather than to replace the "gold standard" laboratory gait assessment. Therefore, we focused on illustrative statistics-based features demonstrating the importance of long-term gait features for depression monitoring. More complex features such as nonlinear features will be considered in future research.
In our previous studies on sleep and Bluetooth data in the RADAR-MDD dataset [53,54], each participant had an average of 8 PHQ-8 records with sleep or Bluetooth data. However, each participant had an average of only 3.09 PHQ-8 records with detected continuous walking segments (Table 2). Possible reasons include the high battery consumption and network traffic required to upload the raw acceleration signals, the Android operating system's moderation of resources, and variability in accelerometer sensor performance across the range of Android phones. We also found that many PHQ-8 records with acceleration signals had no continuous walking segment detected (Supplement Table 2). Potential reasons are that several participants had low mobility because of physical morbidities and depression [55], and that some participants may not have carried their phone while walking.
Although missing acceleration signals caused several steps to go undetected and affected the distribution and variance of steps, the remaining detected continuous walking segments can still partially reflect the gait cadence and force of the participant. Because we were measuring gait characteristics rather than mobility and activity, we included PHQ-8 records with at least one continuous walking segment in the data analyses. The results also showed that the associations between daily-life walking patterns and the depression score could still be captured by the mobile phone's acceleration signals.
According to the findings in this paper, future long-term monitoring research could consider uploading gait cycles instead of raw acceleration signals, which may consume less battery and reduce the missingness of step information. This is not difficult to implement, as most current smartphones have real-time step detection functions or apps [56,57].
Gait characteristics could be affected by some physical diseases, neurological disorders, and age [58][59][60]. Although we considered the number of comorbidities (Supplement Table 1) and demographics as covariates to control for confounding factors, physical comorbidities and other comorbidities may have different impacts on gait characteristics. We will consider a wider range of comorbidities and investigate them further in future research.
We considered a 1-minute acceleration segment with more than 50 seconds of walking as a continuous walking segment and included it in our analysis. The 1-minute segment size was suggested by previous studies [29,45], and the 50-second threshold for continuous walking was specified manually, based on our experience, to exclude abnormal steps in conditions such as walking in a crowded environment and intermittent walking indoors, which may not reflect normal walking patterns. Another past study used a 10-second time window for step detection and short-term feature extraction [28], but this may reduce the accuracy of step detection and include abnormal steps. Therefore, we will explore the optimal window size for detecting steps and the optimal threshold for continuous walking in future research.

Conclusion
We found significant links between depression symptom severity and daily-life gait characteristics, and these associations could be captured by the acceleration signals of both wearable devices and mobile phones. These daily-life walking patterns could provide additional value for understanding depression manifestations relative to gait patterns in laboratory walking tests and demographics. The gait cadence of faster steps in daily-life walking over a long-term period has a significant negative association with the depression symptom severity of that period, and it has the potential to be a biomarker for detecting depression. This study illustrated that long-term gait monitoring for depression may contribute to clinical tools to remotely monitor mental health in real-world settings.