This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR mHealth and uHealth, is properly cited. The complete bibliographic information, a link to the original publication on http://mhealth.jmir.org/, as well as this copyright and license information must be included.
Freezing of gait (FoG) is one of the most disturbing and least understood symptoms in Parkinson disease (PD). Although most existing assistive systems assume accurate detection of FoG episodes, detection itself remains an open problem. A distinctive characteristic of FoG is its dependence on the patient's context, such as the current location or activity. Knowing the patient's context might therefore improve FoG detection. One of the main technical challenges that must be solved before contextual information can be used for FoG detection is accurate estimation of the patient's position and orientation relative to key elements of his or her indoor environment.
The objectives of this paper are to (1) present the concept of the monitoring system, based on wearable and ambient sensors, which is designed to detect FoG using the spatial context of the user, (2) establish a set of requirements for the application of position and orientation tracking in FoG detection, (3) evaluate the accuracy of the position estimation for the tracking system, and (4) evaluate two different methods for human orientation estimation.
We developed a prototype system to localize humans and track their orientation, as an important prerequisite for a context-based FoG monitoring system. To set up the system for experiments with real PD patients, the accuracy of the position and orientation tracking was assessed under laboratory conditions in 12 participants. To collect the data, the participants were asked to wear a smartphone around the waist, in both known and a priori unknown orientations, while walking along a predefined path in a marked area captured by two Kinect cameras with non-overlapping fields of view.
We used the root mean square error (RMSE) as the main performance measure. The vision-based position tracking algorithm achieved an RMSE of 0.16 m in position estimation for upright standing people. The experimental results for the proposed human orientation estimation methods demonstrated adaptivity and robustness to changes in the smartphone attachment position when the fusion of both vision and inertial information was used.
The system achieves satisfactory accuracy in indoor position tracking for use in a FoG detection application with spatial context. The combination of inertial and vision information has the potential to estimate the patient's heading correctly even when the wearable inertial sensor device is placed in an a priori unknown position.
Freezing of Gait (FoG) is a temporary, involuntary inability to initiate or continue movement lasting just a few seconds, or on some occasions, several minutes [
In cognitive psychology, attention set-shifting is defined as the ability to move back and forth between tasks, operations, or mental sets in response to changing internal goals or to changes in the environment perceived through the senses. According to Naismith et al [
The usual pharmacological way of treating FoG is the same as the general treatment of PD. Research has shown that dopamine treatment helps in reducing the number of occurrences of the symptom, but that it cannot eliminate the symptom completely [
Active monitoring technology has the potential to alleviate FoG through timely episode detection and sensory stimulation. Timely detection is based on online data acquisition of motor symptoms of PD. The usual approach is to use wearable inertial sensors in order to obtain kinematic parameters of the movements of body segments. As already mentioned, gait alterations like short shuffling steps and festinations are characteristic of FoG. Therefore, the analysis of gait parameters is a good indicator of the patient’s state. The foundation of the work on adaptive systems for ambulatory monitoring of FoG was established with the offline detection algorithm based on frequency analysis of leg movements proposed by Moore et al [
So far, there has been no consensus on the best inertial sensor combination/position for the ambulatory analysis of human gait. The system for FoG detection and gait unfreezing presented by Jovanov et al [
In the area of assistive technology, user acceptance of technological solutions is crucial. It has been proposed that using one light inertial measurement unit (IMU) fixed on the lateral side of the waist of the user is the most user-friendly position, which also gives satisfying results in gait analysis [
The amount of information that can be extracted from a single sensor device is finite, and it is reasonable to expect that the overall detection accuracy of such a system cannot exceed that of a system composed of multiple sensors. Furthermore, the wearable inertial sensors currently used for gait analysis sense only the physical context of the user, while it is known that the onset of a FoG episode can also be directly influenced by other types of context (situation, location, and/or cognitive load). Ideally, the ease of use of a gait monitoring system composed of only one wearable sensor would be retained while its reliability is enhanced. One way to achieve this improvement is through the use of spatial context, a term we use to describe the combination of the patient's location and his or her relation to the FoG-triggering elements in the environment.
In their home environment, PD patients are likely to encounter narrow passages such as doorways or dynamically changing spaces created by the presence of other people and movable objects such as chairs. When PD patients perceive the space as too narrow for the dimensions of their body, adaptive postural changes during locomotion may be needed to achieve collision-free passage [
When inferring spatial context in FoG, we are primarily interested in the locomotion behavior of the patients. Examples from literature show that a two-dimensional (2D) point representation on a floor map is sufficient for this kind of task [
One of the main objectives of our research is to discover if spatial context and the principle of direct geometric correlation can effectively be used to improve automatic detection of FoG in a home environment. This objective requires design and development of a technical system that is able to observe people and their environment, along with the ability to apply correct contextual rules using the observed data. The hypothesis of the direct correlation between geometry of the surrounding environment and FoG episodes has so far been tested only in controlled laboratory environments. There is a need for behavioral data of FoG patients from homes, because a clinical environment is not perceived by people in the same way as their natural environment. As a result, freezing episodes do not happen in the artificial setting in the same way that they would happen at home. The lack of domestic behavioral and environmental data of FoG patients is a significant obstacle that must be taken into account during the analysis of requirements for the design of a future context-aware system.
We divided the development process of the system into two principal stages. The goal of the first stage is to establish a people-tracking system for the collection of behavioral data in the homes of people with FoG. The collected data will be used to build the needed contextual model of FoG. In the second stage, the contextual inference part will be added to the existing tracking system, with the goal of testing the finalized system through long-term deployment. During the first stage, short-term, day-long experimental sessions are expected in both clinical settings and home environments. Because of this, the position and orientation tracking system being developed needs to have the properties of a portable system, allowing fast installation and setup. Besides reliability and accuracy in tracking people's position and orientation, the system also needs to be modular, allowing scalability in the coverage of an indoor space. Additional requirements for permanent deployment are the ability to identify the FoG patient among the members of a household, usability on a daily basis, and, ultimately, affordability.
Taking into account the above requirements, we have designed a solution for an improved, pervasive, context-aware home-based system for PD patients based on distributed sensing. In the development process, we have completed the first stage, obtaining a prototype of the indoor position and orientation tracking system. The prototype consists of a network of Microsoft Kinect [
The main objective of this paper is to present a functional and architectural solution for the ubiquitous context-aware system for FoG detection, with special attention given to the accuracy evaluation of the developed prototype system for indoor position and orientation tracking.
In the concept of a ubiquitous FoG monitoring system [
Block diagram for the concept of the ubiquitous monitoring system. The wearable system independently detects FoG based on inertial data (blue rectangle). Gait-based detection is complemented by the user's spatial context from the vision sensor system (red rectangle) in the areas of the home where such a system is present.
Video cameras and video processing are often used in smart environments for event detection and context inference. Cameras enable the observation of changes in the environment and, at the same time, are able to provide sub-meter accuracy of indoor localization. Limitations of the usual color (RGB) camera system are its sensitivity to changing lighting conditions, shadows, and occlusions. Active range cameras, such as the Kinect's depth sensor, can be applied to overcome the drawbacks of color cameras. Furthermore, a single depth sensor is enough to retrieve three-dimensional (3D) information about the environment, whereas a setup of multiple calibrated color cameras is usually required for the same task.
To achieve the maximum spatial coverage for each Kinect sensor in an indoor environment with normal ceiling height, we decided to use these sensors in an overhead mounting position. Also, to achieve the most effective coverage inside a home with a minimum number of vision sensors, we decided to use non-overlapping scene coverage with only one or two Kinect sensors per room. An example of the intended spatial coverage is given in
Multiple person tracking and identification should be included in the system since the majority of PD patients live with at least one other person (see
Example of a test bed with two scenes being independently covered by Kinect sensors. Mock-up of a living room on the left and a dining room on the right. Images in the top row depict the point-of-view of the cameras when they are mounted in the overhead position. The bottom row displays colored point clouds of scenes that are obtained from depth sensing. Green trapezoid indicates the area in which it is possible to track people.
The workflow diagram of the system is given in
Independent elements of the process include 2D position tracking and 2D scene map calculation using RGB-D image, 3D orientation calculation using inertial data from the wearable sensor, and gait-based detection of FoG from inertial signals. These elements have to work independently, so that FoG detection can be achieved using the wearable sensor even when the patient is not in front of the camera.
The main prerequisite for position tracking is background subtraction in each frame. Background subtraction is based solely on the depth image. The background model for subtraction is set by periodic updates of the 3D point cloud of the whole observed scene. These periodic updates are done every few minutes on occasions when no tracked objects are present in the field of view. Furthermore, this background model is used to build the 2D map of the scene, which is used as one of the inputs for spatial context inference.
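To make the background subtraction step concrete, the following is a minimal Python/NumPy sketch, assuming depth frames are given as 2D arrays in meters and that a 0.10 m margin is a reasonable tolerance for Kinect depth noise; the paper does not state the exact parameters or implementation details:

```python
import numpy as np

DEPTH_NOISE_MARGIN_M = 0.10  # hypothetical tolerance for Kinect depth noise


def update_background(depth_frames):
    """Build a background depth model from frames recorded while no tracked
    objects are present (the median suppresses transient noise)."""
    stack = np.stack(depth_frames, axis=0).astype(np.float32)
    stack[stack == 0] = np.nan          # Kinect reports 0 for invalid pixels
    return np.nanmedian(stack, axis=0)  # per-pixel background depth in meters


def subtract_background(depth, background):
    """Return a boolean foreground mask: pixels measurably in front of the
    stored background model."""
    valid = (depth > 0) & np.isfinite(background)
    return valid & (depth < background - DEPTH_NOISE_MARGIN_M)
```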
The foreground image obtained after background subtraction is used to build point clouds for updating the positions of the persons being tracked and to detect any new person in front of the camera. After the detection of new persons, positions of all tracked persons are updated. We are only interested in the position of the patient. If the track of the patient is not identified, the process of matching all known track histories against inertial sensor data is executed. If the match is successful and the patient's track is known, the position of the matched track is used in the calculation of the patient's pose. If none of the tracks in front of the camera are identified as the patient, the camera data is excluded from FoG detection.
Pose calculation involves a combination of the position obtained from the vision tracker and the 2D heading obtained from the wearable sensor. The estimated 2D pose is combined with the 2D map information and history of FoG detections to infer contextual probability of a FoG episode. This probability is published over a wireless network and read by the FoG State Interpreter (FSI) module running on a smartphone device. The FSI module conducts a high level probabilistic fusion of spatial context and gait detector outputs and produces the final system output which can be used to activate cueing.
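The paper does not specify the fusion rule used by the FSI module, so the following sketch only illustrates one plausible form of high-level probabilistic fusion, a weighted log-odds combination; the weights and function names are our assumptions:

```python
import math


def fuse_fog_probabilities(p_gait, p_context, w_gait=0.7, w_context=0.3):
    """Illustrative log-odds fusion of the gait detector output and the
    spatial-context probability. The weights are hypothetical; the actual
    FSI fusion rule is not given in the paper."""
    def logit(p):
        p = min(max(p, 1e-6), 1 - 1e-6)  # clamp to avoid infinities
        return math.log(p / (1 - p))

    fused = w_gait * logit(p_gait) + w_context * logit(p_context)
    return 1 / (1 + math.exp(-fused))   # back to a probability

# e.g., a moderately confident gait detection near a known trigger location:
# fuse_fog_probabilities(0.6, 0.9) -> ~0.72
```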
Workflow diagram for FoG detection using the distributed sensor system.
The hardware prototype of our distributed sensing system consists of two static Kinect devices and one Samsung Galaxy Nexus smartphone, worn by the user. Each Kinect is connected to its own notebook computer, which acts as a processing unit for data acquisition and also runs one instance of the vision tracking algorithm. The notebooks are connected in a dedicated local area network (LAN) and they are synchronized with respect to time. Each Kinect acquires a depth and color image of resolution 640×480 pixels at a frequency of 30 Hz. The smartphone has connectivity with the dedicated wired LAN over a software access point running on one of the notebook computers. The smartphone reads data from its internal inertial sensors, three-axial accelerometers, gyroscopes, and magnetometers with the frequency of 100 Hz.
After the investigation of available middleware systems for intelligent environments, we chose an open source, community-supported middleware from the robotics domain to develop our distributed sensor system. The Robot Operating System (ROS) [
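As a purely hypothetical illustration of the ROS-based data exchange described above, a per-camera tracking node could publish its contextual FoG probability on a topic that the FSI module subscribes to; the node and topic names below are ours, not from the paper:

```python
#!/usr/bin/env python
# Minimal rospy sketch of how a per-camera tracking node might publish the
# contextual FoG probability. Node and topic names are hypothetical.
import rospy
from std_msgs.msg import Float32

rospy.init_node('kinect_tracker_1')
pub = rospy.Publisher('fog/context_probability', Float32, queue_size=10)

rate = rospy.Rate(30)  # matches the 30 Hz Kinect frame rate
while not rospy.is_shutdown():
    p_context = 0.0  # replace with the output of the spatial-context inference
    pub.publish(Float32(p_context))
    rate.sleep()
```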
Although the nominal operation range of the Kinect depth sensor is 0.8-3.5 m, our goal was to apply the sensor in an extended range of up to 6 m, which more than doubles the area coverage. At distances greater than 3 m, the quality of Kinect depth sensor data degrades due to noise and the low resolution of measurements [
Plan-view tracking is a computer vision approach that uses 3D data as input and combines geometric analysis, appearance models, and probabilistic methods to track people on the 2D floor plane [
For FoG detection, it is sufficient to track people while they are standing, which is a mitigating circumstance under real-world conditions. When detecting standing people, the system only needs to observe the 3D environment above a certain height. Setting a height cut-off threshold at around 1.0 m solves two frequent problems in indoor tracking: static occlusions by furniture such as chairs and tables, and background updates. Using such a threshold means that changes in the scene below the threshold height have no influence, which results in a more robust tracking algorithm.
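A minimal NumPy sketch of the height cut-off, assuming the point cloud has already been transformed into a floor-referenced frame with the z-axis pointing up (the 1.0 m threshold is from the text; the frame convention is our assumption):

```python
import numpy as np

HEIGHT_CUTOFF_M = 1.0  # threshold from the paper: ignore the scene below ~1.0 m


def filter_points_above_cutoff(points_world):
    """Keep only 3D points above the height cut-off, expressed in a
    floor-referenced frame (z up, floor at z = 0). Chairs, tables, and other
    low clutter fall below the threshold and are ignored."""
    points_world = np.asarray(points_world)
    return points_world[points_world[:, 2] > HEIGHT_CUTOFF_M]
```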
The combination of the accelerometer, gyroscope, and magnetometer signals from the smartphone allows the estimation of the absolute 3D orientation of the device toward the fixed global coordinate system defined by directions of gravity and magnetic North. The focus of our work in people orientation estimation is not on the development of new fusion algorithms for inertial devices, but it is on the development of methods for the use of existing inertial fusion orientation algorithms in the context of our distributed system.
There are two reasons why the measured orientation of the device cannot be used in our tracking system without adaptation. First, the estimation of the user's (patient's) orientation is needed in the distributed system only when the user is viewed by one of the RGB-D cameras. Each camera in the system has its own coordinate system. Therefore, the orientation of the user at a given moment needs to be expressed as an angle in the coordinate system of the camera performing the tracking, instead of the global magnetic-North-referenced world frame. Second, we must strictly differentiate between the orientation of the inertial device and the orientation of the user, and emphasize that they cannot be considered equal. When an inertial device estimating orientation in reference to the global frame is fixed on the user's body, its orientation in reference to the body must be exactly known in order to correctly calculate the user's orientation toward the global frame of reference. In a real-world, everyday scenario, there are no means to know the device orientation in reference to the user exactly, even if the sensor is fixed in the correct position. When the user places the smartphone in a horizontal belt case, there is additional uncertainty because the device is not fixed directly on the user, and the belt case could be positioned anywhere on the belt around the waist.
We have developed two methods for transforming the orientation of the inertial device into the 2D heading of the user, expressed in the referent camera coordinate system. In our methods, we build on the well-proven device orientation estimation algorithm introduced by Madgwick [
The first method we developed for person orientation estimation uses data only from the wearable inertial sensor. The method employs
We defined the user's orientation as a vector along his dorsoventral axis with the direction from the dorsal to the ventral side of the body. As the predetermined position for placing the smartphone, we chose the left hip. As the reference coordinate system orientation for the smartphone, we set the x-axis facing upward along the longitudinal (craniocaudal) axis of the body, the y-axis parallel to the dorsoventral axis, and the z-axis facing left from the body along the left-right axis. The expected smartphone positioning is depicted in
When the smartphone is in the expected ideal position and orientation on the user's body, the vector of gravity will be along its negative x-axis, while y-axis and z-axis define the plane parallel with the floor (see
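For this ideal-placement case, the person's heading can be read off by projecting the device's dorsoventral (y) axis onto the floor plane. The sketch below assumes the orientation filter outputs a unit quaternion (w, x, y, z) rotating device-frame vectors into a global frame with z pointing up; this convention and the function names are our assumptions, not the paper's notation:

```python
import numpy as np


def quat_to_matrix(q):
    """Rotation matrix for a unit quaternion q = (w, x, y, z), mapping
    device-frame vectors into the global frame."""
    w, x, y, z = q
    return np.array([
        [1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
        [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
        [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)],
    ])


def heading_from_device_quat(q):
    """Ideal-placement sketch: the device y-axis lies along the dorsoventral
    axis, so its projection onto the floor plane gives the person's heading.
    Returns the angle in degrees in a global frame with z pointing up."""
    y_axis_world = quat_to_matrix(q) @ np.array([0.0, 1.0, 0.0])
    return np.degrees(np.arctan2(y_axis_world[1], y_axis_world[0]))
```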
Our second person orientation estimation method uses wearable inertial sensor data in combination with a classification of the person's orientation conducted in the vision tracking system. The goal of the method is to eliminate the set of assumptions used in the first method, making it more robust and applicable for use in uncontrolled home environments. The method uses the previously introduced
The implemented vision-based orientation classifier was inspired by the work of Harville [
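Plan-view height templates of the kind used by this classifier can be computed by binning the person's foreground points into a small horizontal grid and keeping the maximum point height per cell. The grid size and resolution below are illustrative assumptions; the paper does not state the template dimensions fed to the neural network:

```python
import numpy as np

CELL_SIZE_M = 0.05   # hypothetical plan-view resolution
TEMPLATE_CELLS = 16  # hypothetical template width/height in cells


def height_template(person_points, center_xy):
    """Build a plan-view height template: foreground 3D points around the
    tracked person are binned into a small horizontal grid, and each cell
    stores the maximum point height (floor-referenced coordinates, z up)."""
    half = TEMPLATE_CELLS * CELL_SIZE_M / 2.0
    template = np.zeros((TEMPLATE_CELLS, TEMPLATE_CELLS), dtype=np.float32)
    for x, y, z in person_points:
        col = int((x - center_xy[0] + half) / CELL_SIZE_M)
        row = int((y - center_xy[1] + half) / CELL_SIZE_M)
        if 0 <= row < TEMPLATE_CELLS and 0 <= col < TEMPLATE_CELLS:
            template[row, col] = max(template[row, col], z)
    return template  # fed to the neural network as a flattened feature vector
```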
The classification accuracy test on 100 height templates yielded 92% correct classifications. During testing under real-world circumstances (ie, when the movement paths and poses of people did not strictly conform to the eight trained orientations), a considerably higher rate of incorrect classifications was observed. Errors were noticed in classification between opposite directions, and also in the classification of body poses that differed too much from upright standing. This is a possible source of error in the heading reference.
When the classifier proposes the orientation reference for the wearable system, its accuracy needs to be ensured. A high confidence level for the heading reference can be achieved with the use of two additional sources of information: the quality score of the classification result, and the position history of the person. The quality score of the classification result is calculated using the values at the output neurons. An eight-class neural network has eight output neurons, and the output class of the whole classifier is the one assigned to the neuron with the maximum probability. This output neuron has a high value when the user's height template is similar to a training template, so its probability can be used as the quality indicator for the classification. A high confidence level using the classification quality score is achieved through a temporal process, in which the classifier output is tracked for consistency and must remain above a certain threshold over several consecutive frames. When this consistency holds, the orientation angle represented by the class can be taken as the person's heading proposition. We call this angle
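The temporal consistency check can be sketched as a small gating class: a vision-classified heading is accepted as the external reference only after the same class wins with a sufficient quality score over several consecutive frames. The threshold values here are assumptions, since the paper does not report them:

```python
from collections import deque

QUALITY_THRESHOLD = 0.8  # hypothetical minimum winning-neuron probability
CONSISTENT_FRAMES = 10   # hypothetical number of consecutive frames required


class HeadingReferenceGate:
    """Accept a vision-classified heading only when the same class wins with
    a high quality score over several consecutive frames."""

    def __init__(self):
        self.history = deque(maxlen=CONSISTENT_FRAMES)

    def update(self, class_id, quality, class_angle_deg):
        """Feed one classifier output per frame; returns the accepted heading
        proposition in degrees, or None while the output is still
        inconsistent or of low quality."""
        self.history.append(class_id if quality >= QUALITY_THRESHOLD else None)
        if (len(self.history) == CONSISTENT_FRAMES
                and all(c is not None and c == class_id for c in self.history)):
            return class_angle_deg
        return None
```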
When the person is upright and wears the smartphone in the belt case, one of the axes of the device points approximately along the gravity vector, while the other two axes span a plane that is almost parallel with the floor. This can be seen in
The external heading reference angle
In subsequent frames, when no external heading reference is available and only the inertial orientation estimate can be relied upon, angle
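The essence of this fusion can be sketched as follows: at the moment an external heading reference is accepted, the offset between it and the inertial heading is stored, and in subsequent frames this offset corrects every inertial estimate. The variable names are ours, not the paper's angle notation:

```python
def wrap_deg(angle):
    """Wrap an angle to the interval [-180, 180) degrees."""
    return (angle + 180.0) % 360.0 - 180.0


class CorrectedHeadingEstimator:
    """Sketch of the vision-inertial fusion idea: store the offset between
    the external heading reference and the inertial heading, then apply it
    to every inertial estimate until the next reference arrives."""

    def __init__(self):
        self.correction_deg = 0.0

    def on_heading_reference(self, reference_deg, inertial_deg):
        # The correction absorbs both the unknown device-on-body rotation
        # and the camera-versus-magnetic-North frame difference.
        self.correction_deg = wrap_deg(reference_deg - inertial_deg)

    def heading(self, inertial_deg):
        # Person heading in the camera frame between reference updates.
        return wrap_deg(inertial_deg + self.correction_deg)
```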
Frame definitions. a) Smartphone reference axes. b) Smartphone in the correct predetermined orientation at the expected position on the waist. c) Smartphone in an unexpected position and orientation on the waist. There is an angle of error in the transverse body plane between the device's real (green arrow) and expected (yellow arrow) orientation.
Overhead view of the relations between the different frames in the system.
The top row shows eight headings for one person at the same position in reference to the camera. The bottom row contains examples of the related height templates used in orientation classification with the neural network.
Coordinate frames in the process of fusion of vision and inertial information for orientation estimation. a) The moment in time when the external heading reference is available. b) Using the calculated correction angle to get the person's heading at times when only the inertial orientation estimation is available.
The purpose of the experiment was to confirm the functionality of the position and orientation tracking system for different users and to collect sufficient data for a statistical analysis of the system's accuracy. Additionally, we wanted to show that the user's position can be estimated continuously within certain statistical error limits, irrespective of distance from the camera and orientation. We chose an approach with known static ground truths for position and orientation to enable an evaluation based on comparison with known reference values. The smartphone position on the waist of the participant was taken as a parameter in this experiment, with the objective of assessing how each of the two heading estimation methods adapts to a change in the sensor attachment position.
The experiment had 12 participants (9 male, 3 female), who were recruited from among the staff and graduate students of the Industrial Design Department of Eindhoven University of Technology. The average height of the participants was 174.2 ± 8.8 cm. None of the participants had gait problems. The area used for walking had dimensions 8×5 m, and it was covered with a green carpet which had a visible grid of squares of size 0.5×0.5 m. Two Kinect devices were set at a height of 2.25 m facing downward with a pitch angle of approximately 25°. The devices were placed to cover the walking area in a non-overlapping manner. A unique world frame for the experiment was set at the corner of the walking area, with its orientation equal to the base frame orientation of Kinect 1. To confirm the uniformity of the magnetic field in the walking area, we executed control measurements of its quality at approximate waist height (1.0 m) before and after the experiment.
On the green carpet surface, markers were placed to indicate the points on the floor where the participants were supposed to stop in predefined orientations (see
The experimental condition was the sensor attachment position, with two possibilities: Position1, the expected placement on the left hip, and Position2, a changed, a priori unknown placement on the waist.
During the experiment, color images and depth data of each Kinect were recorded, along with the data from the smartphone, which encompassed raw acceleration, orientation, and magnetometer measurements, and the calculated orientations for
The vision-based position tracking algorithm gives a new estimation of the position for each frame. With a 30 Hz frame rate, approximately 30 position estimations were available to calculate the average value of the
The experiment venue. Markers on the floor indicate the start and end points and numbered reference points for standing in a predefined orientation. Additional markers also show which part of the area is covered by which Kinect device.
Schematic of marker positions and numbering for walks starting from the left side.
Schematic of marker positions and numbering for walks starting from the right side.
Calculated position values from all test walks were aggregated on a per-point basis to enable comparison with reference values. Statistical results (see
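For reference, a per-point aggregation of this kind reduces to computing the mean error and RMSE of all estimates collected at a marker; for the orientation tables, the differences additionally need to be wrapped so that, for example, 359° versus 0° counts as -1° rather than 359°. A minimal sketch, with our own helper names:

```python
import numpy as np


def wrap_deg(a):
    """Wrap angular differences to the interval [-180, 180) degrees."""
    return (np.asarray(a, dtype=float) + 180.0) % 360.0 - 180.0


def point_statistics(estimates, reference, angular=False):
    """Mean error, RMSE, and maximum absolute error of all estimates
    collected at one marker point, against its reference value."""
    err = np.asarray(estimates, dtype=float) - reference
    if angular:
        err = wrap_deg(err)  # avoid spurious 360-degree jumps
    return {
        'avg': reference + np.mean(err),   # average estimate (wrap-safe)
        'mean_error': np.mean(err),
        'rmse': np.sqrt(np.mean(err ** 2)),
        'max_error': np.max(np.abs(err)),
    }
```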
The person orientation estimates closest to the ground truth were expected for the tests with the sensor in Position1, when all assumptions needed for a correct result were satisfied. The results for Method1 (
The average error values do not point to the existence of any specific bias. We took the highest observed value of the RMSE as the reference for error. Statistically, an average error of 17° can be expected if the initially assumed conditions about smartphone placement and upright walking posture hold.
Evaluation results of the person orientation using Method2 (see
We expected Method2 to be able to compensate for an unknown change in the orientation of the smartphone attachment point. The adaptive nature of the method is visible in
Statistical results for position measurements of reference points.
Point ID | Coordinate | Ref. value [m] | Avg. value [m] | Mean error [m] | RMSE [m] |
1 | x | 2.25 | 2.21 | -0.04 | 0.07
1 | y | 1.75 | 1.74 | -0.01 | 0.06
2 | x | 3.25 | 3.14 | -0.11 | 0.16
2 | y | 1.75 | 1.66 | -0.09 | 0.13
3 | x | 5.75 | 5.65 | -0.10 | 0.15
3 | y | 3.00 | 3.02 | 0.02 | 0.05
4 | x | 3.25 | 3.23 | -0.02 | 0.09
4 | y | 4.25 | 4.19 | -0.06 | 0.10
5 | x | 1.25 | 1.22 | -0.03 | 0.06
5 | y | 4.25 | 4.31 | 0.06 | 0.10
6 | x | 1.75 | 1.64 | -0.11 | 0.16
6 | y | 2.75 | 2.77 | 0.02 | 0.06
7 | x | 6.25 | 6.20 | -0.05 | 0.07
7 | y | 1.75 | 1.89 | 0.14 | 0.20
8 | x | 4.75 | 4.79 | 0.04 | 0.08
8 | y | 1.75 | 1.93 | 0.18 | 0.25
9 | x | 1.75 | 1.73 | -0.02 | 0.07
9 | y | 2.75 | 2.72 | -0.03 | 0.06
10 | x | 1.25 | 1.14 | -0.11 | 0.16
10 | y | 4.25 | 4.21 | -0.04 | 0.08
11 | x | 3.25 | 3.17 | -0.08 | 0.13
11 | y | 4.25 | 4.19 | -0.06 | 0.10
12 | x | 6.75 | 6.71 | -0.04 | 0.08
12 | y | 2.25 | 2.27 | 0.02 | 0.06
Statistical results aggregated per marker point for the person orientation estimation method using
Point ID | Ref. angle [°] | Avg. angle [°] | Avg. error [°] | RMSE [°] | Max. error [°]
1 | 270 | 278 | 8 | 11 | 24 |
2 | 0 | -2 | -2 | 8 | 20 |
3 | 30 | 37 | 7 | 13 | 24 |
4 | 180 | 181 | 1 | 7 | 13 |
5 | 225 | 231 | 6 | 10 | 19 |
6 | 330 | 333 | 3 | 7 | 13 |
7 | 270 | 269 | -1 | 10 | 26 |
8 | 180 | 181 | 1 | 9 | 16 |
9 | 150 | 150 | 0 | 9 | 17 |
10 | 45 | 41 | -4 | 12 | 22 |
11 | 0 | -8 | -8 | 13 | 23 |
12 | 330 | 331 | 1 | 12 | 22 |
Statistical results aggregated per participant for the orientation estimation method using
Participant | Position1 Avg. error [°] | Position1 RMSE [°] | Position2 Avg. error [°] | Position2 RMSE [°]
1 | -8 | 9 | -66 | 66 |
2 | -7 | 13 | -41 | 42 |
3 | 3 | 8 | -60 | 62 |
4 | 3 | 5 | -60 | 60 |
5 | 3 | 5 | -43 | 43 |
6 | 7 | 8 | -55 | 55 |
7 | -8 | 8 | -62 | 63 |
8 | -6 | 14 | -57 | 57 |
9 | 8 | 8 | -50 | 50 |
10 | 11 | 11 | -47 | 47 |
11 | 13 | 15 | -58 | 58 |
12 | 5 | 7 | -39 | 40 |
Statistical results aggregated per marker point for orientation estimation using vision-based classification and the
Point ID | Ref. angle [°] | Avg. angle [°] | Avg. error [°] | RMSE [°] | Max. error [°]
1 | 270 | 276 | 6 | 15 | 47
2 | 0 | 2 | 2 | 15 | 44
3 | 30 | 50 | 20 | 21 | 32
4 | 180 | 188 | 8 | 10 | 15
5 | 225 | 236 | 11 | 17 | 37
6 | 330 | 334 | 4 | 13 | 33
7 | 270 | 272 | 2 | 14 | 27
8 | 180 | 187 | 7 | 16 | 35
9 | 150 | 143 | -7 | 24 | 32
10 | 45 | 40 | -5 | 17 | 32
11 | 0 | -6 | -6 | 13 | 22
12 | 330 | 313 | -17 | 18 | 28
Statistical results aggregated per participant for the person orientation estimation method using vision-based classification and
Participant | Position1 Avg. error [°] | Position1 RMSE [°] | Position2 Avg. error [°] | Position2 RMSE [°]
1 | 11 | 28 | 5 | 13 |
2 | -2 | 19 | 3 | 14 |
3 | -4 | 20 | 4 | 21 |
4 | 10 | 14 | 13 | 22 |
5 | -3 | 15 | 4 | 14 |
6 | 4 | 17 | 6 | 16 |
7 | 0 | 13 | 1 | 14 |
8 | 4 | 12 | 11 | 17 |
9 | -3 | 12 | 0 | 13 |
10 | 0 | 13 | -6 | 18 |
11 | 0 | 14 | 13 | 12 |
12 | 8 | 12 | 9 | 16 |
The final goal of the experimental measurements of the position and orientation tracking subsystem is to properly model its output as a virtual sensor that senses 2D poses and has known accuracy and noise characteristics. This will enable the output of the patient localization subsystem to be combined with environment mapping data using probabilistic principles similar to those already developed in robotics [
The position estimation errors in
The RMSE is equal to or less than 0.16 m for all the measurement points in
The comparison of the average orientation errors for the same points across
For FoG detection based on location, it is of great importance to achieve sufficient accuracy when measuring the distance between the patient and an obstacle. For the case when the system needs to observe the patient passing through a door frame, the necessary accuracy of location sensing is in the range of several decimeters. The same holds when the patient is standing next to an object, such as a chair. Proximity to an object in a congested space can easily be inferred when the person is standing at a very short distance (<0.4-0.5 m). To set the criteria for sufficient accuracy, we can use the literature on the minimal distances people keep from objects during locomotion. According to Weidmann [
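A deliberately minimal sketch of such a proximity rule on the 2D floor map, using the midpoint of the distance range discussed above as a hypothetical threshold:

```python
import math

PROXIMITY_THRESHOLD_M = 0.45  # middle of the 0.4-0.5 m range discussed above


def is_close_to_object(person_xy, object_xy):
    """Infer proximity on the 2D floor map: the person counts as standing
    next to a mapped object (chair, door post, ...) when the planar distance
    falls below the threshold."""
    dist = math.hypot(person_xy[0] - object_xy[0],
                      person_xy[1] - object_xy[1])
    return dist < PROXIMITY_THRESHOLD_M, dist
```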
The heading of the patient should be observed with the goal of inferring whether he is facing a specific landmark on the map. When observing the patient's relation to a landmark, such as intending to go through a door or facing a kitchen sink, a heading error of 15-20 degrees left or right from the true angle is acceptable, because such an error does not change the perception that the patient is generally directed toward the object. As the indicator of the orientation accuracy for each method, we took the worst RMSE value in its related table (
In conclusion, for the orientation data collection from patients in controlled conditions, the recommendation is to use the smartphone and
We had a chance to deploy the prototype of the tracking system in its current form in a clinical environment, where we observed a rehabilitation session of one 80-year-old PD patient with a 13-year history of PD and a strong tendency toward FoG. The purpose of the test was to confirm that the system can be used as a portable system and to assess its applicability and value for clinical rehabilitation. Two Kinects were set on special 2.5 m high tripods and placed in the corners of two rooms (both 4.5×4.5 m) in the rehabilitation facility. The time necessary to set up the system was around 15 minutes. The patient wore the smartphone at the hip position. First, the usual therapy protocol, which included warm-up, the Get-up and Go exercise, and walking with visual and audio cues inside one of the rooms, was observed. Our first addition to this protocol was an exercise in which the patient performed quarter turns on a marked position in front of the camera. The second addition was an exercise in which the patient started by sitting on a chair in one room and then had to walk to a chair in the other room, passing two doorways and a hallway between the rooms. Each Kinect covered a part of one room with a doorway and a chair.
The quarter turns exercise gave us the opportunity to observe the influence of the patient's stooped posture on positional tracking and visual orientation classification. We were aware that a change in posture might influence the final tracking output, although we use a non-articulated tracking model. Initial qualitative results indicate that the stooped posture has a minor influence on positional tracking, while its influence on orientation classification is higher than expected, manifested as an increase in the rate of incorrect classifications.
The exercise with sitting and walking between the rooms was a combination of the Get-up and Go exercise, door passing, and on-the-spot turning, and it was very demanding for the observed patient, who experienced multiple FoG episodes. During the exercise, the system was able to track the patient while he was standing, walking, and sitting. Taking this into account, we envisage the use of this tracking system in the clinical setting. We base the exploitation possibilities of the system on the idea of quantitatively assessing the effectiveness of therapeutic tests, in order to monitor the long-term progress of the patient. Since the system uses 3D data, it can measure the height of the patient and produce his temporal height profile. Height is useful during a sit-to-stand test for measuring posture transition times. Furthermore, the system can collect positional and velocity data to compare walking with and without applied visual and audio cues, and to objectively measure the effect that cues have on the patient when walking straight. By tracking orientation, the same approach can also be applied to the evaluation of the patient's response to cueing during turns.
There were several limitations to the presented experimental study of the accuracy of this system. The main limitation is that the study was conducted with healthy people, who were able to maintain an upright posture. The usual posture of PD patients is stooped, and the future system should account for that. This is especially important for template-based recognition of static orientation, and the next iteration of the prototype will take this fact into account. A benefit of developing a new specialized classifier is that, by being able to detect the stooped posture, the system will have additional information for inferring the general PD state of the patient. Moreover, this information could be used to improve the identification of the patient during long-term system deployment. Additional data from real PD patients will be needed to perform a quantitative, statistical evaluation of the influence of posture change on system accuracy.
Position accuracy was measured only for persons who were not occluded. Position measurement with multiple persons would offer better insight into position errors caused by partial occlusions. Furthermore, positions and orientations were only analyzed in static cases. Analysis of the dynamic properties would offer better insight into the characteristics of the system. For this kind of experiment, it would be necessary to have a tracking system with higher accuracy and adequate spatial coverage. A comparison with the Vicon system [
In the presented orientation tracking methods, the assumption of upright posture needs to be upheld to obtain accurate results. The final orientation algorithm should be aware of the current posture of the person. This calls for the development of an even more contextually aware system.
In this work, we presented a solution that rethinks the problem of FoG detection and monitoring from the standpoint of technology that could be offered in the context-aware homes of the future. The most interesting novelty from the medical perspective is that we established the technical prerequisites for collecting patient data about FoG that take into account external contextual factors of the symptom and the patient's relation to the environment.
We proposed using a combination of two technologies, 3D vision and wearable sensing with smartphones, both of which have developed a strong commercial presence in recent years. This trend is expected to continue, with wearable sensing offering smaller and more energy-efficient devices, and 3D vision cameras offering better resolutions and smaller form factors.
The study of the characteristics of the system prototype showed that we currently have a system with sufficient position tracking accuracy for use in the intended FoG-monitoring application. The study of the orientation algorithms gave us the necessary insight into the properties of smartphones for indoor orientation tracking in the context of FoG. The proposed data fusion method for orientation tracking not only showed how usability can be improved, but also revealed the factors that still need improvement. Future work will focus on improving the current system prototype toward home deployment and pilot experiments with PD patients. To enable deployment of the system in real homes with multiple people, long-term identification based on matching inertial and vision sensor data needs to be implemented. In addition, to collect the data for contextual modeling, preparations are being made to record the daily activities of people with FoG in their homes using the current prototype of the system.
An example scenario for multiple people tracking and re-identification. One camera covers the living room space, while the other is installed in the dining room. The scenario depicts a caretaker and a patient at home with two visitors. The goal is to sustain identities for all the subjects in spite of short term occlusions, pose changes, and changes between cameras.
An example of pose tracking during one test walk. Orientation is estimated using the proposed camera and wearable data fusion algorithm. The red arrow shows the estimated orientation of the person. The purple arrow shows what the estimated orientation would be without the correction from the vision-based system.
2D: two-dimensional
3D: three-dimensional
AOE: Absolute Orientation Estimation
FoG: Freezing of Gait
FSI: FoG State Interpreter
GROE: Gravity Relative Orientation Estimation
IMU: Inertial Measurement Unit
LAN: local area network
MARG: Magnetic Angular Rate and Gravity
PD: Parkinson disease
RGB: red green blue
RGB-D: red green blue and depth
RMSE: root mean square error
ROS: Robot Operating System
SQL: Structured Query Language
This work was supported in part by the Erasmus Mundus Joint Doctorate in Interactive and Cognitive Environments, which is funded by the Education, Audiovisual and Culture Executive Agency under FPA 2010-0012. The authors would also like to thank JNA Brown for his help in improving the literary style of the paper.
Conflicts of Interest: None declared.