This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR mHealth and uHealth, is properly cited. The complete bibliographic information, a link to the original publication on https://mhealth.jmir.org/, as well as this copyright and license information must be included.
This systematic review aimed to evaluate the validity and utility of wearable devices for monitoring hospitalized patients.
This review involved a comprehensive search of 7 databases and included articles that met the following criteria: inpatients must be aged >18 years, the wearable devices studied must be used to continuously monitor patients, and the wearables must monitor biomarkers other than physical activity alone (eg, heart rate, respiratory rate, and blood pressure). Only English-language studies were included. From each study, we extracted basic demographic information along with the characteristics of the intervention. We assessed the risk of bias for studies that validated their wearable readings by using a modification of the Consensus-Based Standards for the Selection of Health Status Measurement Instruments.
Of the 2012 articles that were screened, 14 studies met the selection criteria. All included articles were observational in design. In total, 9 different commercial wearables for various body locations were examined in this review. The devices collectively measured 7 different health parameters across all studies (heart rate, sleep duration, respiratory rate, oxygen saturation, skin temperature, blood pressure, and fall risk). Only 6 studies validated their results against a reference device or standard, and there was a considerable risk of bias in these studies because most (4/6, 67%) enrolled small numbers of patients. Many studies that validated their results found that certain variables were inaccurate and had wide limits of agreement. Heart rate and sleep were the parameters with the most evidence of validity for in-hospital monitoring. Overall, the mean patient completion rate across all 14 studies was >90%.
The included studies suggested that wearable devices show promise for monitoring the heart rate and sleep of patients in hospitals. Many devices were not validated in inpatient settings, and the readings from most of the devices that were validated in such settings had wide limits of agreement when compared to gold standards. Even some medical-grade devices were found to perform poorly in inpatient settings. Further research is needed to determine the accuracy of hospitalized patients’ digital biomarker readings and eventually determine whether these wearable devices improve health outcomes.
Most physiologic parameters, such as vital signs or activity, are routinely monitored a few times each day in hospital ward settings [
The rapid uptake of affordable wearables, such as fitness bands, may provide a method for continuously measuring sleep; activity; and vital signs, such as heart rate [
Although there have been systematic reviews of the monitoring of patients’ physical activity in hospitals [
For the purposes of this review, a wearable was considered to be any electronic device that has at least 1 sensor and can be worn on the body [
A comprehensive search strategy was developed to identify articles on the three main concepts of our question—wearables, monitoring, and inpatients. The initial search strategy was developed for Ovid MEDLINE by using a combination of database-specific subject headings and text words (
Searches of the following databases were executed on August 16, 2018: Ovid MEDLINE, Ovid MEDLINE Epub Ahead of Print and In-Process & Other Non-Indexed Citations, Cochrane Database of Systematic Reviews, Cochrane Central Register of Controlled Trials, Health Technology Assessment database (Ovid), and CINAHL with Full Text. The search in Ovid Embase was not executed until September 5, 2018, due to issues with the vendor’s August database reload. Additional search methods included reviewing the cited references of eligible studies via Web of Science (May 6, 2019) and the reference lists of eligible studies. There were no restrictions on publication period. Limits were imposed to ensure that only English-language studies and those with adult populations were included in this review. No other limits were applied to the literature search.
Records were screened by two reviewers (VP and RW) independently. For selected studies, full-text articles were obtained and evaluated for eligibility [
Medical or surgical inpatients aged >18 years
Device studied in the article must be a wearable (such as a watch, vest, pendant, jewelry, headset, and wristband)
Articles must describe an element of continuous monitoring for at least 24 hours
Articles must describe the measurement of 1 or more digital biomarkers other than physical activity alone or standard hospital telemetry for heart rate recording
We excluded articles that were not considered original research, such as letters to the editor, comments, and reviews. We also excluded articles that monitored fewer than 3 patients, described the monitoring of a very specialized system in the body (eg, insole devices, ventricular assistive devices, and cochlear implants), involved the monitoring of patients in rehabilitation hospitals, or used wearables as tools for therapy (eg, insulin delivery).
Two reviewers (RW and VP) independently extracted the data and resolved any disagreements by discussing the findings and making a collective decision. The data extracted for each article included the year of publication, study setting and design, number of participants, gender ratio, mean age of participants, digital biomarkers measured in the study, average and maximum duration that the wearable was worn by participants in each study, and patient completion rate (the proportion of patients that wore the wearable for the minimum monitoring duration that was set by the study authors). For studies that used a reference standard, any participants who were missing data from the wearable or the standard were determined to be incomplete measurement pairs and were omitted from the final count of patients who completed the study. Furthermore, we extracted the types of wearables that were worn by the participants in each study along with the placement sites on the body. Devices were classified as medical grade (approved or cleared by the US Food and Drug Administration), research grade (typically used in research settings only), and consumer grade (used by general consumers).
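The completion-rate bookkeeping described above can be sketched as follows. This is a minimal illustration under stated assumptions; the function name, patient hours, and thresholds are hypothetical and not drawn from any included study:

```python
# Sketch of the completion-rate calculation: the proportion of patients who
# wore the device for at least the study-defined minimum duration, after
# omitting patients with incomplete measurement pairs (missing wearable or
# reference readings). All numbers below are illustrative.
def completion_rate(hours_worn, minimum_hours, incomplete_pairs=0):
    """Share of enrolled patients counted as completing the study."""
    completers = sum(1 for h in hours_worn if h >= minimum_hours)
    return (completers - incomplete_pairs) / len(hours_worn)

hours = [30, 48, 10, 72, 24]  # hours each of 5 enrolled patients wore the device
rate = completion_rate(hours, minimum_hours=24, incomplete_pairs=1)
print(rate)  # 4 patients met the 24-hour minimum; 1 omitted for missing data -> 0.6
```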
Validation data were also collected for each article by assessing whether the authors compared the accuracy of their digital readings to a reference standard. To determine the validity of measures that were compared to a reference standard, correlation coefficients, mean differences, and limits of agreement were extracted from each study.
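The limits-of-agreement statistic extracted from the validation studies can be sketched with a minimal Bland-Altman computation. The paired heart rate values below are invented for illustration and do not come from any included study:

```python
import numpy as np

def bland_altman_limits(wearable, reference):
    """Mean difference (bias) and 95% limits of agreement between paired
    wearable and reference readings (Bland-Altman analysis)."""
    wearable = np.asarray(wearable, dtype=float)
    reference = np.asarray(reference, dtype=float)
    diffs = wearable - reference
    bias = diffs.mean()
    sd = diffs.std(ddof=1)  # sample SD of the paired differences
    return bias, (bias - 1.96 * sd, bias + 1.96 * sd)

# Hypothetical paired heart rate readings (beats per minute)
hr_wearable = [72, 80, 65, 90, 77, 84]
hr_ecg = [70, 83, 66, 88, 75, 86]
bias, (lower, upper) = bland_altman_limits(hr_wearable, hr_ecg)
print(f"bias={bias:.1f} BPM, LoA=({lower:.1f}, {upper:.1f}) BPM")
# -> bias=0.0 BPM, LoA=(-4.5, 4.5) BPM
```

Wide limits of agreement, as reported for several devices in this review, mean that an individual wearable reading can plausibly differ from the reference by the full half-width of this interval.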
All articles that validated their wearable readings were independently assessed for risk of bias by two reviewers (VP and RW) using a modification of the validation subscale from a checklist for assessing the methodological quality of studies on the measurement properties of health status measurement instruments (Consensus-Based Standards for the Selection of Health Status Measurement Instruments [COSMIN]) [
Risk of bias assessment for studies that validated their wearable readings.
Study | Mean or % difference | Correlation | LOAa | Percentage of missing data | Missing data management | Adequate sample size (patients) | Adequate sample size (measurements) | Acceptable reference comparison | Other methodological flaws | Acceptable accuracy analyses
Bloch et al [ | No | No | No | Excellent | Excellent | Poor | Poor | Excellent | No | Poor
Breteler et al [ | Yes | No | Yes | Excellent | Excellent | Poor | Excellent | Excellent | No | Excellent
Gallo and Lee [ | No | Yes | No | Excellent | Excellent | Fair | Fair | Fair | No | Excellent
Kroll et al [ | Yes | Yes | Yes | Excellent | Excellent | Good | Excellent | Excellent | No | Excellent
Steinhubl et al [ | No | Yes | No | Excellent | Excellent | Poor | Excellent | Excellent | No | Excellent
Weenk et al [ | Yes | No | Yes | Excellent | Excellent | Poor | Good | Excellent | No | Excellent
aLOA: limits of agreement.
Our literature search identified 2754 article citations. After excluding duplicate records, 2012 records were deemed eligible for screening. A total of 83 studies were selected based on abstracts and underwent full-text review. After applying our inclusion and exclusion criteria, 15 articles that described 14 studies were selected for this review (
PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) flow diagram of included and excluded studies.
All of the articles included were prospective cohort studies (
Collectively, the mean patient completion rate across all 14 studies was over 90%. Of the 8 articles that included a qualitative analysis as part of their methodology, 7 reported that wearables were well received by patients, clinicians, or both.
Of the 14 studies, 6 validated wearable measurements against another standard device or measure (
Summary of included studies.
Study | Year published | Setting (ward) | Methodology | Patients, N | Male:female ratio (%), mean age (years) | Days device was worn, average (maximum) | Variables measured | Patient completion rate, %
Lee and Lee [ | 2007 | Obstetric | Prospective cohort | 21 | Females only, 32 | 2a | Sleep | 100
Gallo and Lee [ | 2008 | Obstetric | Prospective cohort | 39 | Females only, 29 | 2 (2) | Sleep | 100
Bloch et al [ | 2011 | Geriatric | Prospective cohort | 10 | Males and femalesb, 83 | 21a | Falls | 90
Chiu et al [ | 2013 | Neurosurgery | Prospective cohort | 60 | 65:35, 35 | 7a | Sleep | 87
Watkins et al [ | 2015 | Medicine and surgical | Prospective cohort | 236 | Males and femalesb,c | 3 (3) | HRd, RRe, SpO2f, and BPg | 100
Jeffs et al [ | 2016 | Medicine | Prospective cohort | 208 | 72:28c | (14)h | HR, RR, SpO2, temperature, and accelerometry | 32
Steinhubl et al [ | 2016 | Medicine | Prospective cohort | 26 | 65:35, 33 | 3 (3) | HR, RR, and temperature | 100
Razjouyan et al [ | 2017 | Hematology and | Prospective cohort | 35 | 45:55, 55 | 1a | HR and fall risk | 94
Weenk et al [ | 2017 | General internal medicine and surgical | Prospective cohort | 20 | 65:35, 50 | 2.5 (3) | HR, RR, BP, SpO2, and temperature | 100
Kroll et al [ | 2017 | Intensive care unit | Prospective cohort | 50 | 52:48, 64 | 1a | HR and sleep | 96
Weller et al [ | 2017 | Neurology and | Prospective cohort | 736 | 54:46c | 1.7 (9) | HR, RR, SpO2, and BP | 100
Breteler et al [ | 2018 | Surgical | Prospective cohort | 33 | 72:28, 63 | 2.6 (3) | HR and RR | 76
Yang et al [ | 2018 | Oncology | Prospective cohort | 11 | 64:36c | 16a | Sleep | 91
Duus et al [ | 2018 | General | Prospective cohort | 50 | 58:42, 71 | 3.1 (4) | HR, RR, and SpO2 | 100
aThe maximum number of days was not reported in the study.
bThe study included both male and female participants but did not report a ratio.
cMean age was not reported in the study.
dHR: heart rate.
eRR: respiratory rate.
fSpO2: oxygen saturation.
gBP: blood pressure.
hThe average number of days was not reported in the study.
Distribution of the health variables that were assessed for accuracy in each study.
Study | Device, manufacturer | FDAa clearance or approval | Heart rate | Sleep | Respiratory rate | SpO2b | Skin temperature | Blood pressure | Fall risk
Gallo and Lee [ | Wrist Actigraph, Ambulatory Monitoring Inc | Yes | —c | — | — | — | — | — |
Lee and Lee [ | Mini-Motionlogger Actigraphy, Ambulatory Monitoring Inc | Yes | — | Not validated | — | — | — | — | —
Chiu et al [ | ActiGraph GT1M, Actigraph LLC | Yes | — | Not validated | — | — | — | — | —
Yang et al [ | Actigraph GT3X+ watch, Actigraph LLC | Yes | — | Not validated | — | — | — | — | —
Kroll et al [ | Fitbit Charge HR, Fitbit Inc | — | LoAd (sinus): −23.9 to 21.9 beats per minute | — | — | — | — | — |
Breteler et al [ | HealthPatch, VitalConnect | Yes | LoA: −8.8 to 6.5 beats per minute | — | LoA: −15.8 to 11.2 breaths per minute | — | Not validated | — | Not validated
Jeffs et al [ | Hidalgo EQ02, Equivital | Yes | Not validated | — | Not validated | Not validated | Not validated | — | —
Duus et al [ | LifeTouch, Isansys Lifecare | Yes | Not validated | — | Not validated | Not validated | — | — | —
Steinhubl et al [ | MultiSense patch, Rhythm Diagnostic Systems | — | — | — | — | — |
Bloch et al [ | Vigi'Fall, Vigilio Telemedical | — | — | — | — | — | — | — | Sensitivity: 37.5%
Weenk et al [ | ViSi Mobile, Sotera Wireless | Yes | LoA: −11.1 to 10.7 beats per minute | — | LoA: −5.5 to 7.9 breaths per minute | LoA: −3.1% to 3.3% | Not validated | SBPf: −23 to 24 mm Hg; DBPg: −27.5 to 11.5 mm Hg | —
Weenk et al [ | HealthPatch, VitalConnect | Yes | LoA: −12.6 to 9.5 beats per minute | — | LoA: −10.3 to 9.0 breaths per minute | Not validated | Not validated | Not validated | —
Weller et al [ | ViSi Mobile, Sotera Wireless | Yes | Not validated | — | Not validated | Not validated | Not validated | Not validated | —
Watkins et al [ | ViSi Mobile, Sotera Wireless | Yes | Not validated | — | Not validated | Not validated | — | Not validated | —
Razjouyan et al [ | Zephyr BioPatch, Medtronic | Yes | Not validated | — | — | — | — | — | Not validated
aFDA: US Food and Drug Administration.
bSpO2: oxygen saturation.
cNot available.
dLOA: limits of agreement.
eSteinhubl et al [
fSBP: systolic blood pressure.
gDBP: diastolic blood pressure.
Illustration of the types of wearable devices used and their placement sites on the body.
Of the 6 studies in the risk of bias assessment, 4 were ranked as poor due to a small sample size (participants: N<30). The study conducted by Gallo and Lee [
A total of 5 studies assessed heart rate accuracy. Breteler et al [
A total of 6 studies used wearables to assess sleep, of which 2 assessed whether wearable readings were reliable. Gallo and Lee [
Of the 8 articles that used different wearables to measure the respiratory rate of patients, 3 assessed the wearables’ accuracy. Breteler et al [
Only 1 study, which was conducted by Weenk et al [
We conducted a systematic review that evaluated the utility of wearable technology in continuously monitoring hospitalized patients for a wide variety of health parameters. Our review focused on the breadth of devices used and the signals measured in hospitalized patients and included consumer, research, and medical-grade devices. There was evidence to support the use of Fitbit, ViSi Mobile, and the HealthPatch to measure heart rate [
Of the various health parameters, the best evidence of validity was in the monitoring of heart rate in hospitalized patients. We also found that, in hospital settings, limits of agreement for medical-grade devices ranged from 16.4 to 21.8 BPM, whereas the limit for a Fitbit consumer device that uses photoplethysmography signals was 24 BPM. Further, when Fitbit readings were compared against continuous electrocardiogram monitoring, 73% of the readings were within 5 BPM of the electrocardiogram readings. In a systematic review of 158 studies that measured heart rate by using consumer wearable devices, 71% and 51% of Apple Watch (Apple Inc) readings (used in 49 studies) and Fitbit readings (used in 71 studies), respectively, were within 3% of electrocardiogram readings in controlled settings [
We found that wearable sleep readings had only a moderate correlation with sleep survey results in inpatient settings for both research- and consumer-grade devices. A recent systematic review of Fitbit-based sleep assessments found that readings from more recently developed devices correlated well with polysomnography readings for assessing sleep episodes [
There are a few limitations that should be noted for our systematic review. There is a considerable risk of bias, as the number of participants in the studies was low. Further, the included studies were observational in design and had a high degree of heterogeneity in terms of their objectives, populations, and reported outcomes. Thus, the data analysis methods were limited to broad categorization and the extraction of the common themes and trends that emerged from the results. Reports of wearable monitoring from individual studies should be viewed in light of their methodological limitations. Although patient adherence has been found to correlate well with patients' acceptance of wearable devices in inpatient settings, we recognize that studying factors such as data loss, the duration of data gaps, and qualitative feedback from nurses and patients would further strengthen the generalizability of the results. Finally, it is important to note that wearable studies are being performed at an increasing rate, and more relevant articles will continue to become available.
This review also identifies gaps in knowledge that still exist within the literature and provides information about what is required of further research. Specifically, further validation of digital biomarkers against gold standard comparators, such as polysomnography for assessing sleep and continuous electrocardiogram monitoring for assessing heart rate, is required. Ideally, large participant sample sizes and large numbers of measurement pairs within a population of interest should be used to assess parameters such as vital signs. The use of 2 reference standards to validate each health parameter, such as heart rate, has also been recommended [
Overall, the assessment of studies in this review suggested that wearable devices show promise for monitoring the heart rate and sleep of patients in hospitals. The results show that many devices were not validated in inpatient settings, and the readings from most of the devices that were validated in such settings had wide limits of agreement. Further research is needed to determine the accuracy of the digital biomarker readings of hospitalized patients and to eventually determine whether wearable devices improve the health outcomes of hospitalized patients.
Search methods.
Modified Consensus-Based Standards for the Selection of Health Status Measurement Instruments (COSMIN) criteria used for the risk of bias assessment.
BPM: beats per minute
COSMIN: Consensus-Based Standards for the Selection of Health Status Measurement Instruments
RW is supported by an award from the Mak Pak Chiu and Mak-Soo Lai Hing Chair of General Internal Medicine, University of Toronto. Funding for this study was kindly provided by the University Health Network Foundation Complex Care Fund.
VP, AOC, and RW designed and planned the review. AOC conducted the search strategy. VP and RW screened the articles and conducted the data analysis. VP, AOC, and RW wrote and revised the manuscript.
None declared.