Published in Vol 9, No 1 (2021): January

Wrist-Worn Activity Trackers in Laboratory and Free-Living Settings for Patients With Chronic Pain: Criterion Validity Study


Original Paper

1School of Education, Health and Social Studies, Dalarna University, Falun, Sweden

2Military Academy Karlberg, Swedish Armed Forces, Solna, Sweden

3Department of Neurobiology, Care Sciences and Society, Division of Physiotherapy, Karolinska Institutet, Huddinge, Sweden

4Academic Primary Health Care Centre, Region Stockholm, Stockholm, Sweden

5Center for Clinical Research Dalarna, Uppsala University, Region Dalarna, Falun, Sweden

Corresponding Author:

Veronica Sjöberg, MSc

School of Education, Health and Social Studies

Dalarna University

Högskolegatan 2

Falun, SE-791 88


Phone: 46 23 77 87 57


Background: Physical activity is evidently a crucial part of the rehabilitation process for patients with chronic pain. Modern wrist-worn activity tracking devices seemingly have great potential to provide objective feedback and assist in the adoption of healthy physical activity behavior by supplying data on energy expenditure expressed as metabolic equivalent of task units (MET). However, no studies have examined the criterion validity of any wrist-worn activity tracking device in estimating energy expenditure, heart rate, or step count in patients with chronic pain.

Objective: The aim was to determine the criterion validity of wrist-worn activity tracking devices for estimations of energy expenditure, heart rate, and step count in a controlled laboratory setting and free-living settings for patients with chronic pain.

Methods: In this combined laboratory and field validation study, energy expenditure, heart rate, and step count were simultaneously estimated by a wrist-worn activity tracker (Fitbit Versa), indirect calorimetry (Jaeger Oxycon Pro), and a research-grade hip-worn accelerometer (ActiGraph GT3X) during treadmill walking at 3 speeds (3.0 km/h, 4.5 km/h, and 6.0 km/h) in the laboratory setting. Energy expenditure and step count were also estimated by the wrist-worn activity tracker in free-living settings for 72 hours. The criterion validity of each measure was determined using intraclass and Spearman correlation, Bland-Altman plots, and mean absolute percentage error. An analysis of variance was used to determine whether there were any significant systematic differences between estimations.

Results: A total of 42 patients (age: 25-66 years; male: 10/42, 24%; female: 32/42, 76%), living with chronic pain (duration, in years: mean 9, SD 6.72) were included. At baseline, their mean pain intensity was 3.5 (SD 1.1) out of 6 (Multidimensional Pain Inventory, Swedish version). Results showed that the wrist-worn activity tracking device (Fitbit Versa) systematically overestimated energy expenditure when compared to the criterion standard (Jaeger Oxycon Pro) and the relative criterion standard (ActiGraph GT3X). Poor agreement and poor correlation were shown between Fitbit Versa and both Jaeger Oxycon Pro and ActiGraph GT3X for estimated energy expenditure at all treadmill speeds. Estimations of heart rate demonstrated poor to fair agreement during laboratory-based treadmill walks. For step count, the wrist-worn device showed fair agreement and fair correlation at most treadmill speeds. In free-living settings, however, the agreement for step count between the wrist-worn device and waist-worn accelerometer was good, and the correlation was excellent.

Conclusions: The wrist-worn device systematically overestimated energy expenditure and showed poor agreement and correlation compared to the criterion standard (Jaeger Oxycon Pro) and the relative criterion standard (ActiGraph GT3X), which needs to be considered when used clinically. Step count measured with a wrist-worn device, however, seemed to be a valid estimation, suggesting that future guidelines could include such variables in this group with chronic pain.

JMIR Mhealth Uhealth 2021;9(1):e24806



Chronic pain is defined as “pain that persists past normal healing time and hence lacks the acute warning function of physiological nociception” [1,2] and is a major public health problem internationally due to its effects on physical, social, and emotional functions [3]. Physical activity is a central part of chronic pain rehabilitation due to the evident health benefits, which include improved cardiovascular health, prolonged lifespan [4,5], positive effects on pain intensity, health-related quality of life, and both physical and psychological functions [6]. The American Heart Association has provided guidelines regarding sufficient weekly amounts of physical activity to reap health benefits for a healthy population, as well as for populations with chronic conditions [4,7]. For patients with chronic pain, recommendations are to spend ≥150 minutes/week engaged in moderate-to-vigorous physical activity (MVPA). Moderate physical activity is defined as equal to or more than 3 and less than 6 metabolic equivalent of task units (MET) [8]. One MET is defined as the resting metabolic rate obtained when quietly seated [8]. Despite clear guidelines, inadequate physical activity levels seem to be common among patients with chronic pain, which can lead to an increased risk of physical and mental illness [5]. In rehabilitation settings, objective estimations of physical activity are rarely used. Instead, subjective measures are common practice due to their high degree of acceptance, cost effectiveness, and relatively low administrative burden [9]. However, despite their perceived benefits, subjective estimations of physical activity domains are subject to estimation biases, such as recall bias and reactivity bias [9]. Several studies [10-14] have indicated the potential of wrist-worn activity tracking devices as tools that can facilitate behavior change and increase the degree to which patients follow individually modulated physical activity levels designed to improve health.
Wearable devices for physical activity tracking have received increased interest from both the research community and consumers aiming to quantify domains of physical activity (eg, frequency and duration) in order to optimize health behaviors [10,15,16]; however, before the clinical use of these devices can be introduced, the validity of each device needs to be established [17]. In the past decade, there has been an increasing number of studies [18-22] assessing the validity of wrist-worn tracking devices that measure energy expenditure by comparison to a criterion standard such as indirect calorimetry or accelerometry. The majority of these validation studies were conducted among healthy adult participants [17,23], with somewhat conflicting findings: both overestimation [20,23] and underestimation [19,24,25] by Fitbit devices have been reported. A recent systematic review [23] investigating the accuracy of Fitbit devices reported that 49% (43 of 88 comparisons) overestimated energy expenditure, particularly during physical activity. In an earlier systematic review of the field, Evenson et al [17] reported a high validity of different brands of wearable activity tracking devices regarding step count when compared to various criterion standards in laboratory settings [26,27]. Regarding the validity of heart rate estimations by wrist-worn activity tracking devices, one study [28] has shown that the agreement between the true heart rate and the rate estimated by a wrist-worn device is higher during rest than during MVPA in healthy subjects. To our knowledge, there has been no prior research examining wrist-worn activity tracking device criterion validity in estimating energy expenditure (using MET), step count, or heart rate among patients with chronic pain. This lack of research constitutes a substantial knowledge gap given how important it is for patients with chronic pain to achieve adequate amounts of weekly physical activity.
Therefore, the aim of this study was to evaluate the criterion validity of each of these measures estimated by a wrist-worn activity tracking device for patients with chronic musculoskeletal pain in both laboratory and free-living settings.

Study Design

We conducted a laboratory and field validation study. Data were collected between March 2019 and June 2020 at the Health and Sports laboratory, Dalarna University. The sample size calculation was based on intraclass correlation (ICC), the primary statistic in the study. In order to achieve 80% power to detect an ICC of 0.80 (excellent agreement) with a 95% distribution (lower limit 0.6), calculation based on published recommendations [29] showed a requirement of 26 to 49 participants. This study was approved by the Swedish Ethical Review Authority (registration number 2018-307).

Recruitment and Study Sample

The inclusion criteria were age between 18 and 67 years; chronic (>3 months) musculoskeletal (neck or low back) pain or widespread pain; currently undergoing assessment or treatment for chronic pain in a primary or specialized health care clinic; and the ability to understand information in Swedish. The exclusion criteria were having given birth within the previous 3 months, being pregnant in the second or third trimester, requiring a walking aid indoors, currently undergoing heart assessment or investigation, having pain caused by malignancy or systemic disease, or having a known allergy to plaster or adhesive tape. Participants were recruited from 8 primary and specialized health care clinics in Region Dalarna. Patients who matched the study criteria (age, duration of pain, language) were asked by clinicians for consent to be contacted by a study representative, who conducted additional screening for eligibility. At the test site, for safety reasons, all participants declared whether they had been diagnosed with or experienced a heart condition, chest pain, dizziness, high or low blood pressure, any respiratory disorder, or diabetes before any tests were performed. Participants’ height and weight were manually measured using a stadiometer (Holtain Limited) and a weighing scale (Sartorius AG). A self-rated questionnaire captured date of birth; biological sex; education level; work status; years lived with pain; and pharmaceutical, caffeine, and nicotine consumption in the previous 24 hours. Participants also completed the Swedish National Board of Health and Welfare’s questionnaire on physical activity level (minutes per week spent in exercise and in physical activity) [30,31]. In addition, participants completed the Multidimensional Pain Inventory (in Swedish) to describe psychosocial and behavioral consequences of pain [32].


The experimental device was a wrist-worn activity tracker (Fitbit Versa, Fitbit Inc), chosen for its high degree of user-friendliness, because it can be used with web or smartphone apps, and because it is suitable for water activities. The Fitbit Versa estimates movement (eg, active minutes) using a triaxial accelerometer and MET/minutes based on a combination of basal metabolic rate (adjusted for sex, age, height, and weight), accelerometry-based activity counts, and heart rate measured through optical sensors [33,34].

The criterion standard (gold standard) for energy expenditure in our laboratory setting was indirect calorimetry from pulmonary gas exchange. Oxygen uptake (VO2) and carbon dioxide production (VCO2) were measured using a mixing-chamber system (Jaeger Oxycon Pro) that measures respiratory gas exchange through a mouthpiece and tube [35]. Jaeger Oxycon Pro provides an assessment of resting energy expenditure and activity-related energy expenditure based on the type and amount of substrate oxidized and the amount of energy produced by biological oxidation. MET values are based on the equation: 1 MET = 3.5 mL/min/kg VO2 [8]. Before the start of the testing protocol, ambient conditions were recorded, and automatic volume and gas calibration was performed using a high-precision gas mixture (Air Liquide AB). The Jaeger Oxycon Pro has been validated by comparison to the Douglas bag method and has been found to be a reliable criterion standard for indirect calorimetry [36]. Real-time VO2 and heart rate data were recorded throughout the entire laboratory protocol.
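The conversion from measured oxygen uptake to MET stated above can be illustrated with a minimal sketch. The function name, the input units (absolute VO2 in mL/min plus body mass in kg), and the example participant are assumptions made for illustration; they do not describe the Jaeger Oxycon Pro software.

```python
# Convention stated in the text: 1 MET = 3.5 mL O2 per kg body mass per minute.
ML_O2_PER_KG_MIN_PER_MET = 3.5


def vo2_to_met(vo2_ml_min: float, body_mass_kg: float) -> float:
    """Convert absolute oxygen uptake (mL/min) to MET for one participant."""
    if body_mass_kg <= 0:
        raise ValueError("body mass must be positive")
    relative_vo2 = vo2_ml_min / body_mass_kg  # mL/min/kg
    return relative_vo2 / ML_O2_PER_KG_MIN_PER_MET


# Hypothetical participant (not study data): 80 kg, VO2 of 980 mL/min.
# 980 / 80 = 12.25 mL/min/kg, and 12.25 / 3.5 = 3.5 MET (moderate intensity).
example_met = vo2_to_met(980.0, 80.0)
```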

The relative criterion standard was a research-grade hip-worn accelerometer, the ActiGraph GT3X-BL (ActiGraph LLC), with its accompanying software Actilife (version 6.13.3; ActiGraph LLC). The ActiGraph GT3X is a research-grade triaxial accelerometer commonly used as a criterion standard in both free-living and laboratory settings and within various populations, as it is a valid and reliable tool to quantify physical activity [11,37,38].


According to current guidelines [39,40], in investigations aiming to evaluate the criterion validity of a wrist-worn activity tracker, data collection should be conducted in laboratory and free-living settings. In the laboratory setting, energy expenditure data were concurrently collected from Jaeger Oxycon Pro and Fitbit Versa during rest (sitting quietly for 10 minutes) and during treadmill walking (18 minutes). Heart rate data were also collected with a chest band (Polar HR10). Step count was estimated by ActiGraph GT3X and Fitbit Versa. Only the last 2 minutes of each activity (rest and each treadmill speed) were included in the data analysis, so that the data reflected steady-state conditions [41]. During rest, participants were seated (wearing the facemask with tube) in an inclined chair with supported arms, under a blanket to avoid feeling cold. The room temperature was set at 20 °C, and the laboratory was kept quiet during the resting period. The treadmill walk protocol consisted of 6 minutes at each speed of 3.0 km/h, 4.5 km/h, and 6.0 km/h. At the end of each 6 minutes, participants rated perceived exertion on the Borg Rating of Perceived Exertion scale (ratings from 6 to 20) [42], and after the third and final speed, pain intensity was also assessed using a visual analog scale (0 mm to 100 mm) [43]. In the free-living setting, step count was concurrently estimated by Fitbit Versa and ActiGraph GT3X for the subsequent 72 hours after the laboratory testing [39]. Participants were instructed to wear the devices simultaneously for at least 10 hours each day, to remove the devices for sleeping, showering, and bathing, and to record their wear-time in a logbook. Data collection started once participants left the laboratory. A schematic overview of the study procedure is shown in Table 1.
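The steady-state selection described above (only the last 2 minutes of each 6-minute treadmill stage enter the analysis) can be sketched as follows. This is an illustrative sketch assuming one value per minute in chronological order for a complete 18-minute walk; the function name and the sample values are hypothetical, and participants who stopped early would need separate handling.

```python
from statistics import mean

STAGE_MINUTES = 6             # each treadmill speed lasted 6 minutes
STEADY_MINUTES = 2            # only the last 2 minutes of a stage are analysed
SPEEDS_KMH = (3.0, 4.5, 6.0)  # protocol order


def steady_state_means(per_minute_values):
    """Return {speed: mean of the last 2 minutes of that stage}."""
    expected = STAGE_MINUTES * len(SPEEDS_KMH)
    if len(per_minute_values) != expected:
        raise ValueError(f"expected {expected} per-minute values")
    result = {}
    for index, speed in enumerate(SPEEDS_KMH):
        start = index * STAGE_MINUTES
        stage = per_minute_values[start:start + STAGE_MINUTES]
        result[speed] = mean(stage[-STEADY_MINUTES:])
    return result


# Hypothetical per-minute MET values for one participant (not study data).
walk = [2.0, 2.4, 2.6, 2.7, 2.7, 2.9,
        3.1, 3.3, 3.4, 3.5, 3.4, 3.6,
        4.2, 4.6, 4.9, 5.0, 5.0, 5.2]
stage_means = steady_state_means(walk)  # about 2.8, 3.5, and 5.1 MET
```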

Table 1. A schematic overview of the study procedure, measurements, and outcomes in both settings.
| Activity | Duration | Instruments and devices | Outcomes |
|---|---|---|---|
| Laboratory setting | | | |
| Baseline measurements | N/Aa | Sartorius weighing scale | Weight |
| | | Holtain stadiometer | Height |
| Questionnaires | N/A | Multidimensional Pain Inventory | Personal characteristics; pain characteristics |
| | | National Board of Health and Welfare questions for physical activity level | Physical activity level |
| Seated rest measurements | 10 minutes | Jaeger Oxycon Pro | Energy expenditure, heart rate |
| | | ActiGraph GT3X | Energy expenditure |
| | | Fitbit Versa | Energy expenditure, heart rate |
| Treadmill walk measurements | 18 minutes (6 minutes at 3.0, 4.5, 6.0 km/h each) | Jaeger Oxycon Pro | Energy expenditure, heart rate |
| | | ActiGraph GT3X | Energy expenditure, step count |
| | | Fitbit Versa | Energy expenditure, heart rate, step count |
| | | Borg's RPE scale (6-20) | Perceived exertion |
| | | Visual Analogue Scale (0-100) | Pain intensity post–treadmill walk |
| Free-living setting | | | |
| Free-living activities | 72 hours | ActiGraph GT3X | Energy expenditure, step count |
| | | Fitbit Versa | Energy expenditure, step count |
| | | Logbook | Wear-time |

aN/A: not applicable.

Experimental Measurement

The Fitbit Versa was initialized, and participants’ age, height, weight, and biological sex were registered. The device was synchronized to its app (Fitbit Dashboard) and fitted on participants’ nondominant wrist according to the manufacturer’s recommendations. To retrieve data (energy expenditure, step count, heart rate), we used a web-based application programming interface [44] with assistance from an experienced computer programmer. Through such a script, Fitbit allows users to download defined data at minute resolution. After the devices were returned, they were resynchronized before data were downloaded.

Criterion Standard

Participants’ biological sex, height, and weight were entered into the software. Data (energy expenditure, heart rate) retrieved from Jaeger Oxycon Pro and Polar HR10 were manually aggregated to minute resolution (from 15 s to 60 s) to correspond with Fitbit Versa and ActiGraph GT3X data output.
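The aggregation from 15-second to 60-second resolution was done manually in the study. A scripted equivalent could look like the sketch below, which assumes that each minute is represented by the arithmetic mean of its four 15-second samples; the averaging rule, the function name, and the sample values are assumptions, since the text does not state how values within a minute were combined.

```python
from statistics import mean


def aggregate_to_minutes(samples_15s, samples_per_minute=4):
    """Average consecutive 15-s samples into 60-s epochs.

    A trailing partial minute (fewer than 4 samples) is dropped so that
    every output value covers a full 60 seconds.
    """
    full_minutes = len(samples_15s) // samples_per_minute
    return [
        mean(samples_15s[i * samples_per_minute:(i + 1) * samples_per_minute])
        for i in range(full_minutes)
    ]


# Hypothetical heart rate samples in bpm (not study data): two full minutes
# followed by one stray 15-s sample that is discarded.
heart_rate_15s = [70.0, 72.0, 74.0, 76.0, 80.0, 80.0, 82.0, 82.0, 90.0]
heart_rate_1min = aggregate_to_minutes(heart_rate_15s)  # [73.0, 81.0]
```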

Relative Criterion Measurement

ActiGraph GT3X was initialized at a 30 Hz sample rate, and participants’ date of birth, height, weight, and biological sex were entered. The device was fitted on participants’ waists, to the right of the spine, using an elasticated belt. Data (counts per axis, step count) were downloaded in epochs of 60 seconds, which is commonly used in corresponding research [45]. After download, we applied an algorithm in Actilife software (version 6.13.4; ActiGraph LLC) that combines the Work-Energy Theorem and the Freedson equation to calculate energy expenditure [46]. Actilife calculates MET values based on brand-specific activity counts and chosen cut points. We applied the Freedson cut-point to score MET per minute [47].

Data Management and Statistics

Frequency analysis of data was performed to identify potential errors. Manual checking of random samples (20% of the data) was carried out and deemed satisfactory, with a <3% error rate. Descriptive statistics were used to describe participant characteristics. The Shapiro-Wilk test was used to determine whether data were normally distributed. Criterion validity was determined by assessing both agreement and correlation between estimations and measurements of the primary outcomes (energy expenditure, heart rate, and step count) in laboratory and free-living settings [39,40]. Agreement was assessed with ICC coefficient analysis (2-way random, average measures, 95% CI, absolute agreement) [48,49]. An ICC below 0.4 was considered poor, an ICC between 0.4 and 0.59 fair, an ICC between 0.6 and 0.74 good, and an ICC above 0.75 excellent [50]. Analysis of variance (ANOVA) was used to determine any significant systematic differences between estimations. To visualize the absolute, unscaled agreement [48,51], Bland-Altman plots with 95% limits of agreement (LOA) were constructed. Values beyond ±3 SD were identified as outliers and were excluded from analysis after sensitivity analysis. To determine correlation between estimations of energy expenditure, step count, and heart rate, Spearman (ρ) bivariate correlation analysis was used; ρ<0.2 was considered poor, 0.2≤ρ<0.6 fair, 0.6≤ρ<0.8 moderate, 0.8≤ρ<0.9 very strong, and 0.9≤ρ<1 perfect [52,53]. In addition, mean absolute percentage error (MAPE) was calculated as a measure of accuracy for energy expenditure, step count, and heart rate: the mean difference between estimations of the wrist-worn activity tracker and estimations of the criterion measurement (Jaeger Oxycon Pro or ActiGraph GT3X), multiplied by 100 and divided by the mean of the criterion measurement (Jaeger Oxycon Pro or ActiGraph GT3X) [27].
A MAPE value <1% was considered acceptable in the laboratory context [28,54], and a MAPE <10% of the criterion value was considered an acceptable rate of error in the free-living setting [9]. Missing data analysis was performed as recommended by Fox-Wasylyshyn [55] to evaluate any significant association between missing data and participant characteristics at baseline. Our predetermined significance level for P values was .05.
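To make the agreement statistic concrete, the sketch below computes an ICC for a 2-way random-effects model with absolute agreement and average measures from the usual ANOVA mean squares, and maps it onto the interpretive bands listed above. This is a generic textbook formulation, not the statistical software output used in the study; the function names and the example readings are hypothetical, and values between the stated bands (eg, between 0.59 and 0.6) are assigned by treating the bands as contiguous, which is an assumption.

```python
from statistics import mean


def icc_absolute_average(ratings):
    """ICC, 2-way random effects, absolute agreement, average measures.

    `ratings` holds one row per participant and one column per device.
    """
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2 or any(len(row) != k for row in ratings):
        raise ValueError("need >=2 participants, >=2 devices, equal row lengths")
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]
    # Mean squares for participants (rows), devices (columns), and residual.
    ms_rows = k * sum((m - grand) ** 2 for m in row_means) / (n - 1)
    ms_cols = n * sum((m - grand) ** 2 for m in col_means) / (k - 1)
    ss_err = sum((ratings[i][j] - row_means[i] - col_means[j] + grand) ** 2
                 for i in range(n) for j in range(k))
    ms_err = ss_err / ((n - 1) * (k - 1))
    denominator = ms_rows + (ms_cols - ms_err) / n
    if denominator == 0:
        raise ValueError("ICC is undefined when the data have no variance")
    return (ms_rows - ms_err) / denominator


def label_icc(icc):
    """Map an ICC onto the bands used in the text (treated as contiguous)."""
    if icc < 0.4:
        return "poor"
    if icc < 0.6:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


# Hypothetical MET readings (not study data): the tracker ranks participants
# exactly like the criterion but reads 3 MET higher for everyone.
criterion = [2.5, 3.0, 3.5, 4.0, 4.5]
tracker = [value + 3.0 for value in criterion]
icc = icc_absolute_average([list(pair) for pair in zip(criterion, tracker)])
print(round(icc, 2), label_icc(icc))  # roughly 0.22, "poor"
```

The example shows why a purely systematic offset lowers an absolute-agreement ICC even when the two devices are perfectly correlated.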


A total of 42 patients (female: 32/42, 76%; male: 10/42, 24%) participated in the study, but only 41 participants completed the protocol due to the malfunction of 1 device. The participants’ mean age was 43.8 years (SD 11.8). Participants’ mean BMI was 29.4 (SD 5.8), 66% of participants (27/41) were working/studying at the time of the study, and 49% (20/41) stated that they were physically active 150 minutes/week or more (Table 2). Most participants (36/41, 88%) completed all 3 treadmill speeds, while the remaining participants (5/41, 12%) discontinued the treadmill test at the highest speed due to high physical exertion or increased pain. Missing data analysis revealed 1 significant result: all participants who discontinued the treadmill walk at the highest speed reported being physically active <150 minutes/week at baseline, while 44% (16/36) of those who completed all 3 treadmill speeds reported <150 minutes/week (P=.05). The mean ratings of perceived exertion at the end of each treadmill walk were 9 (SD 2), 12 (SD 2), and 14 (SD 2) for 3.0 km/h, 4.5 km/h, and 6.0 km/h, respectively. Pain intensity on the visual analog scale after completion of the treadmill walk ranged from 1 mm to 96 mm (mean 43 mm, SD 29 mm). Within the 24 hours prior to testing, 15 of the 41 participants (37%) used analgesics, and 2 (<5%) used beta blockers. Because 3 participants did not return their logbooks, data from 38 participants were included in the free-living analyses. The mean wear-time of the devices during the free-living period was 31 hours and 23 minutes (SD 6 hours and 21 minutes).

Table 2. Personal and pain characteristics of participants.
| Characteristic | Value (n=41) |
|---|---|
| Demographic characteristic | |
| Sex, n (%) | |
| Female | 31 (76) |
| Male | 10 (24) |
| Age (years), mean (SD) | 43.8 (11.8) |
| BMI, mean (SD) | 29.4 (5.8) |
| Education level, n (%) | |
| Elementary | 1 (2) |
| Secondary | 28 (68) |
| University | 12 (29) |
| Other unspecified | 1 (2) |
| Working/studying, n (%) | |
| Yes | 27 (66) |
| No | 14 (34) |
| Treatment, n (%) | |
| Primary health care | 33 (80) |
| Specialized | 8 (20) |
| Pain characteristics | |
| Multidimensional pain inventory, Swedish version (0-6), part 1, mean (SD) | |
| Pain intensity | 3.6 (1.1) |
| Pain interference | 3.7 (0.8) |
| Life control | 3.4 (1.0) |
| Affective distress | 2.9 (0.9) |
| Social support | 3.6 (1.3) |
| Number of pain locations (0-36) | 14.0 (9.5) |
| Years lived with pain, n (%) | |
| 0-5a | 20 (49) |
| 6-10 | 5 (12) |
| >10 | 15 (37) |
| Pharmaceutical consumption last 24 hours, n (%) | |
| Analgesics | 15 (37) |
| Beta blockers | 2 (<5) |
| Physical activity levelb | |
| Exercisec (minutes/week), n (%) | |
| 0-30 | 15 (37) |
| 31-90 | 11 (27) |
| 91-120 | 11 (27) |
| >120 | 4 (10) |
| Physical activityd (minutes/week), n (%) | |
| 0-60 | 8 (20) |
| 61-150 | 13 (32) |
| 151-300 | 7 (17) |
| >300 | 13 (32) |

aAll participants had experienced pain >3 months.

bNational Board of Health and Welfare's questions for physical activity level.

cStructured physical activity requiring physical effort and aims to improve health and fitness.

dAny bodily movement produced by skeletal muscles that requires energy expenditure.

Criterion Validity

The mean energy expenditure, heart rate, and step count of the criterion standard (Jaeger Oxycon Pro), the relative criterion measure (ActiGraph GT3X), and the experimental measure (Fitbit Versa) are presented in Table 3. The ICC (95% CI), mean difference with upper and lower LOA, Spearman correlation, and MAPE for all statistical calculations are presented in Table 4 and Table 5. The Bland-Altman plots for energy expenditure, step count, and heart rate are shown in Figures 1-5.

Table 3. Energy expenditure, heart rate, and step count during treadmill walking and in free-living setting.
| Measure | Seated rest | Treadmill walk, 3.0 km/h | Treadmill walk, 4.5 km/h | Treadmill walk, 6.0 km/h | Treadmill walk, overall | Free-living setting |
|---|---|---|---|---|---|---|
| Energy expenditure | | | | | | |
| MET/minute, mean (SD) | | | | | | |
| Jaeger Oxycon Pro | 0.73 (0.17)a | 2.76 (0.36) | 3.50 (0.37) | 5.10 (0.42)b | 3.80 (0.33)b | N/Ac |
| ActiGraph GT3X | 1.00 (0.00) | 1.31 (0.40) | 3.91 (1.40) | 5.48 (1.07)b | 3.56 (0.84)b | 2.53 (0.52)d |
| Fitbit Versa | 1.0 (0.02) | 5.73 (0.56)b | 6.41 (0.58)a | 7.56 (1.17)b | 6.56 (0.64)e | 3.73 (0.86)d |
| P value | | | | | | |
| Jaeger Oxycon Pro–Fitbit Versa | <.001 | <.001 | <.001 | <.001 | <.001 | N/A |
| ActiGraph GT3X–Fitbit Versa | N/A | <.001 | <.001 | <.001 | <.001 | <.001 |
| Heart rate | | | | | | |
| bpm, mean (SD) | | | | | | |
| Jaeger Oxycon Pro | 71.25 (9.82)a | 99.15 (15.54)a | 109.61 (14.89)a | 132.14 (19.48)b | 113.53 (16.14)b | N/A |
| Fitbit Versa | 72.11 (11.55) | 101.83 (10.92)a | 108.67 (8.18) | 121.95 (9.63)b | 110.85 (7.05)b | N/A |
| P value | | | | | | |
| Jaeger Oxycon Pro–Fitbit Versa | | | | | | |
| Step count | | | | | | |
| ActiGraph GT3X | N/A | 92.70 (8.78)a | 110.83 (7.08) | 124.93 (6.52)e | 108.61 (6.11)e | 18.64 (8.51)d |
| Fitbit Versa | N/A | 91.89 (8.75) | 106.72 (7.05) | 114.20 (11.09)e | 103.95 (5.62)e | 11.34 (5.87)f |
| P value | | | | | | |
| ActiGraph GT3X–Fitbit Versa | N/A | .53 | <.001 | <.001 | <.001 | <.001 |



cN/A: not applicable.




Table 4. Comparison between experimental measurement (Fitbit Versa) and the criterion standard (Jaeger Oxycon Pro) in the laboratory setting.
Jaeger Oxycon Pro vs Fitbit Versa (n=41)

| Test and measure | Energy expenditure–energy expenditure | Heart rate–heart rate | Energy expenditure–heart rate | Energy expenditure–step count |
|---|---|---|---|---|
| Seated resta | | | | |
| ICCb (95% CI) | 0.003 (–0.16 to 0.20) | 0.99 (0.98 to 0.99) | N/Ac | N/A |
| Mean difference (LOAd) | 0.27 (–0.07 to 0.61) | 0.09 (–4.35 to 4.52) | N/A | N/A |
| ρ (P value) | –0.03 (.86) | 0.96 (<.001) | 0.27 (.09) | |
| Treadmill walk | | | | |
| 3.0 km/hf | | | | |
| ICC (95% CI) | 0.01 (–0.04 to 0.04) | 0.09 (–0.72 to 0.52) | N/A | N/A |
| Mean difference (LOA) | 2.97 (1.61 to 4.34) | –2.68 (–39.01 to 33.65) | N/A | N/A |
| ρ (P value) | –0.14 (.39) | 0.24 (.14) | 0.10 (.53) | –0.04 (.82) |
| 4.5 km/hg | | | | |
| ICC (95% CI) | –0.03 (–0.09 to –0.07) | 0.20 (–0.55 to 0.58) | N/A | N/A |
| Mean difference (LOA) | 2.91 (1.39 to 4.43) | 0.75 (–30.73 to 32.22) | N/A | N/A |
| ρ (P value) | –0.31 (.05) | 0.16 (.33) | –0.11 (.51) | –0.19 (.24) |
| 6.0 km/hh | | | | |
| ICC (95% CI) | –0.05 (–0.19 to –0.16) | 0.40 (–0.09 to 0.68) | N/A | N/A |
| Mean difference (LOA) | 2.46 (–0.11 to 5.03) | 10.19 (–25.56 to 45.95) | N/A | N/A |
| ρ (P value) | –0.11 (.51) | 0.44 (.01) | 0.17 (.34) | 0.01 (.97) |
| Overall speedsi | | | | |
| ICC (95% CI) | –0.03 (–0.08 to 0.08) | 0.19 (–0.58 to 0.59) | N/A | N/A |
| Mean difference (LOA) | 2.76 (1.21 to 4.31) | –2.68 (–35.30 to 29.95) | N/A | N/A |
| ρ (P value) | –0.22 (.20) | 0.23 (.17) | –0.05 (.75) | 0.07 (.71) |



bICC: intraclass correlation.

cN/A: not applicable.

dLOA: limits of agreement.

eMAPE: mean absolute percent error.

fn=39 for energy expenditure–energy expenditure; n=40 for heart rate–heart rate and energy expenditure–heart rate.

gn=40 for energy expenditure–energy expenditure and heart rate–heart rate.

hn=36 for energy expenditure–energy expenditure; heart rate–heart rate, and energy expenditure–heart rate; n=39 for energy expenditure–step count.

in=35 for energy expenditure–energy expenditure and energy expenditure–step count; n=36 for energy expenditure–heart rate and heart rate–heart rate.

Table 5. Comparison between the experimental measurement (Fitbit Versa) and the relative criterion measurement (ActiGraph GT3X) in both settings (laboratory and free-living).
ActiGraph GT3X vs Fitbit Versa (n=41)

| Test and measure | Energy expenditure–energy expenditure | Step count–step count | Energy expenditure (ActiGraph GT3X)–step count (Fitbit Versa) | Step count (ActiGraph GT3X)–energy expenditure (Fitbit Versa) |
|---|---|---|---|---|
| Seated rest | | | | |
| ICCa (95% CI) | N/Ab | N/A | N/A | N/A |
| Mean difference (LOAc) | 0.00 (–0.33 to 0.34) | N/A | N/A | N/A |
| ρ (P value) | N/A | N/A | N/A | N/A |
| Treadmill walk | | | | |
| 3.0 km/he | | | | |
| ICC (95% CI) | –0.01 (–0.03 to 0.04) | 0.71 (0.44 to 0.84) | N/A | N/A |
| Mean difference (LOA) | 4.43 (2.92 to 5.94) | 0.84 (–15.73 to 17.40) | N/A | N/A |
| ρ (P value) | –0.26 (.11) | 0.66 (<.001) | –0.08 (.60) | 0.42 (.01) |
| 4.5 km/hf | | | | |
| ICC (95% CI) | 0.02 (–0.12 to 0.21) | 0.69 (0.29 to 0.85) | N/A | N/A |
| Mean difference (LOA) | 2.55 (–0.36 to 5.45) | 4.11 (–8.13 to 16.35) | N/A | N/A |
| ρ (P value) | 0.11 (.51) | 0.66 (<.001) | 0.07 (.66) | 0.60 (<.001) |
| 6.0 km/hg | | | | |
| ICC (95% CI) | –0.14 (–0.50 to 0.24) | 0.05 (–0.51 to 0.35) | N/A | N/A |
| Mean difference (LOA) | 2.08 (–1.30 to 5.45) | 10.79 (–15.02 to 36.61) | N/A | N/A |
| ρ (P value) | –0.07 (.67) | 0.09 (.60) | 0.23 (.18) | 0.37 (.03) |
| Overall speedsh | | | | |
| ICC (95% CI) | –0.04 (–0.13 to 0.12) | 0.60 (0.03 to 0.82) | N/A | N/A |
| Mean difference (LOA) | 3.02 (0.78 to 5.26) | –4.98 (–16.68 to 6.71) | N/A | N/A |
| ρ (P value) | –0.10 (.56) | 0.51 (<.002) | 0.28 (.11) | 0.31 (.07) |
| Overall daysi | | | | |
| ICC (95% CI) | 0.46 (–0.16 to 0.80) | 0.70 (–0.21 to 0.91) | N/A | N/A |
| Mean difference (LOA) | 1.20 (0.17 to 2.24) | –7.12 (–16.25 to 2.00) | N/A | N/A |
| ρ (P value) | 0.79 (<.001) | 0.87 (<.001) | 0.41 (.01) | 0.55 (<.001) |


aICC: intraclass correlation.

bN/A: not applicable.

cLOA: limits of agreement.

dMAPE: mean absolute percent error.

en=39 for energy expenditure–energy expenditure; n=40 for step count–step count, n=38 for energy expenditure–step count.

fn=40 for energy expenditure–energy expenditure and step count–energy expenditure.

gn=36 for energy expenditure–energy expenditure; n=34 for step count–step count; n=35 for energy expenditure–step count and step count–energy expenditure.

hn=35 for energy expenditure–energy expenditure; n=34 for step count–step count; n=35 for energy expenditure–step count; n=34 for step count–energy expenditure.

in=38 for energy expenditure–energy expenditure and step count–energy expenditure; n=37 for step count–step count and energy expenditure–step count.

Figure 1. Bland-Altman plot visualizing agreement of energy expenditure (MET) estimated by Fitbit Versa and criterion measurement Jaeger Oxycon Pro during overall treadmill walk. The middle green line shows the mean difference (bias) between devices. The dashed lines indicate upper (+1.96 SD) and lower (–1.96 SD) limits of agreement and the black line represents the regression line illustrating association between estimations.
Figure 2. Bland-Altman plot visualizing agreement of heart rate estimated by Fitbit Versa and criterion measurement Jaeger Oxycon Pro during overall treadmill walk. The middle green line shows the mean difference (bias) between devices. The dashed lines indicate upper (+1.96 SD) and lower (–1.96 SD) limits of agreement and the black line represents the regression line illustrating association between estimations.
Figure 3. Bland-Altman plot visualizing agreement of energy expenditure (MET) estimated by Fitbit Versa and relative criterion measurement ActiGraph GT3X during overall treadmill walk. The middle green line shows the mean difference (bias) between devices. The dashed lines indicate upper (+1.96 SD) and lower (–1.96 SD) limits of agreement and the black line represents the regression line illustrating association between estimations.
Figure 4. Bland-Altman plot visualizing agreement of step count estimated by Fitbit Versa and relative criterion measurement ActiGraph GT3X during overall treadmill walk. The middle green line shows the mean difference (bias) between devices. The dashed lines indicate upper (+1.96 SD) and lower (–1.96 SD) limits of agreement and the black line represents the regression line illustrating association between estimations.
Figure 5. Bland-Altman plot visualizing agreement of step count estimated by Fitbit Versa and relative criterion measurement ActiGraph GT3X during free-living. The middle green line shows the mean difference (bias) between devices. The dashed lines indicate upper (+1.96 SD) and lower (–1.96 SD) limits of agreement and the black line represents the regression line illustrating association between estimations.

Fitbit Versa versus Jaeger Oxycon Pro

In the laboratory setting, we found that Fitbit Versa showed poor agreement of estimated energy expenditure with corresponding estimations by Jaeger Oxycon Pro in the overall treadmill walk (ICC –0.03, 95% CI –0.08 to 0.08). There were also significant systematic differences between estimations at all treadmill speeds as well as in the overall treadmill walk (P≤.001). In addition, the Bland-Altman plot showed a broad range for energy expenditure estimation and indicated overestimation, with a mean difference of 2.76 MET (LOA 1.21 to 4.31) for overall speeds (Table 4, Figure 1). A smaller mean difference was found during rest (0.27 MET, LOA –0.07 to 0.61). In addition, the correlation of energy expenditure estimated by Fitbit Versa and Jaeger Oxycon Pro was weak at all measured timepoints. The MAPE for energy expenditure across the overall treadmill walk was 35.39 and ranged from 31.59 at 6.0 km/h to 51.52 at 3.0 km/h.
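For readers who want to reproduce this kind of summary on their own paired data, the sketch below computes the Bland-Altman bias with 95% limits of agreement (mean difference ± 1.96 SD of the differences) and a MAPE. The MAPE here uses the conventional per-observation definition (mean of absolute errors relative to the criterion), which is an assumption and may differ from the exact computation behind the values reported above; the function names and the paired values are hypothetical.

```python
from statistics import mean, stdev


def bland_altman(test, criterion):
    """Return (bias, lower LOA, upper LOA) for paired measurements."""
    if len(test) != len(criterion) or len(test) < 2:
        raise ValueError("need at least 2 paired observations")
    differences = [t - c for t, c in zip(test, criterion)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # sample SD of the differences
    return bias, bias - spread, bias + spread


def mape(test, criterion):
    """Mean absolute percentage error relative to the criterion values."""
    if len(test) != len(criterion) or not test:
        raise ValueError("need paired, non-empty observations")
    errors = [abs(t - c) / abs(c) for t, c in zip(test, criterion)]
    return 100 * mean(errors)


# Hypothetical paired MET values (not study data).
criterion_met = [3.0, 4.0, 5.0, 4.0]
tracker_met = [6.0, 6.0, 8.0, 7.0]

bias, lower, upper = bland_altman(tracker_met, criterion_met)
# bias = 2.75 MET, limits of agreement = 1.77 to 3.73 MET
error_pct = mape(tracker_met, criterion_met)  # 71.25
```

A positive bias with limits of agreement that stay above zero, as in this toy example, is the pattern that indicates systematic overestimation by the tested device.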

There was poor agreement between heart rate estimated by Fitbit Versa and by Jaeger Oxycon Pro for the overall treadmill walk (ICC 0.19, 95% CI –0.58 to 0.59). At the specific treadmill speeds, ICC ranged from poor (ICC 0.09, 95% CI –0.72 to 0.52) at 3.0 km/h to fair (ICC 0.40, 95% CI –0.09 to 0.68) at the final treadmill speed (6 km/h). However, during seated rest, agreement of estimations was excellent (ICC 0.99, 95% CI 0.98 to 0.99) and correlation very strong (ρ=.96, P≤.001). ANOVA results showed no systematic differences between estimations of heart rate during rest (P=.81), at 3 km/h (P=.37), at 4.5 km/h (P=.77), or during the overall treadmill walk (P=.34). This was also confirmed by the Bland-Altman plot; the mean difference of heart rate estimation during the overall treadmill walk was –2.68 bpm, LOA –35.30 to 29.95 bpm. It ranged from –2.68 bpm, LOA –39.01 to 33.65 at 3.0 km/h to 10.19 bpm, LOA –25.45 to 46.57 at 6 km/h (Table 4, Figure 2). Corresponding MAPE ranged from 2.24 at seated rest to 12.10 at 6 km/h, with the overall treadmill walk at 10.51.

We found only weak correlations between energy expenditure by Jaeger Oxycon Pro and heart rate by Fitbit Versa, and between energy expenditure by Jaeger Oxycon Pro and step count by Fitbit Versa, during both seated rest and during all treadmill speeds (Table 4).

In accordance with the poor agreement between estimations of energy expenditure by Fitbit Versa and by the criterion measurement (Jaeger Oxycon Pro), we also found poor agreement between corresponding estimations by Fitbit Versa and the relative criterion measurement ActiGraph GT3X at all treadmill speeds (Table 4). For the overall treadmill walk, the agreement was poor (ICC –0.04, 95% CI –0.13 to 0.12), as it also was at specific treadmill speeds (Table 5).

Fitbit Versa versus ActiGraph GT3X

Because the data showed zero variation, ICC for energy expenditure estimated by ActiGraph GT3X and Fitbit Versa during seated rest could not be calculated. The Bland-Altman plot provided a mean difference of 0.00 MET, LOA –0.33 to 0.34, for seated rest, indicating high agreement in estimations of energy expenditure between the devices (Table 4). Individual differences between measurements were minimal during rest (MAPE 0.36) and greatest at 3.0 km/h (MAPE 76.86); they decreased as treadmill speed increased (MAPE 39.44 at 4.5 km/h, MAPE 25.01 at 6 km/h) (Table 5).

Findings suggest a fair agreement (ICC 0.54, 95% CI 0.02 to 0.78) and a strong significant correlation (ρ=0.51, P≤.001) of step count estimations by Fitbit Versa and ActiGraph GT3X at the overall treadmill level (Table 5, Figure 3). At specific treadmill speeds, the agreement was good at both 3.0 km/h (ICC 0.71, 95% CI 0.44 to 0.84) and at 4.5 km/h (ICC 0.69, 95% CI 0.29 to 0.85), but decreased at 6 km/h (ICC 0.05, 95% CI –0.51 to 0.35) (Table 5).
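The ICC values above can be read more easily with a small worked example. The sketch below computes a single-measure, absolute-agreement ICC for two devices from two-way ANOVA mean squares. This section does not state which ICC form was used, so that choice is an assumption here, and the function name and example values are hypothetical.

```python
def icc_absolute_agreement(x, y):
    """Single-measure, absolute-agreement ICC for two paired devices.

    Uses the two-way ANOVA mean squares (rows = participants, columns = devices).
    The ICC form is an assumption; the study's exact model is not stated here.
    """
    x, y = list(x), list(y)
    n, k = len(x), 2
    grand = (sum(x) + sum(y)) / (n * k)
    row_means = [(a + b) / k for a, b in zip(x, y)]
    col_means = [sum(x) / n, sum(y) / n]

    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for v in x + y)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)                 # between-participant variance
    ms_cols = ss_cols / (k - 1)                 # between-device (systematic) variance
    ms_error = ss_error / ((n - 1) * (k - 1))   # residual variance
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


# Hypothetical step counts: the second device reads a constant 2 steps higher
device_a = [1, 2, 3, 4]
device_b = [3, 4, 5, 6]
icc_offset = icc_absolute_agreement(device_a, device_b)
icc_identical = icc_absolute_agreement(device_a, device_a)
print(round(icc_offset, 3), round(icc_identical, 3))
```

Under this form, a constant offset between devices lowers the ICC even when the two series rise and fall together, which is one way in which agreement and correlation can diverge.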

There was fair and significant correlation in step count between devices at 2 of the 3 treadmill speeds (3.0 km/h, 4.5 km/h) and for the overall treadmill walk (ρ=0.51, P≤.001). The ANOVA results were significant for the overall treadmill walk and at the 2 higher treadmill speeds (P≤.001), while the Bland-Altman plots showed a mean difference of –4.98 steps, LOA –16.68 to 6.71, for the overall treadmill walk (Table 5, Figure 4). MAPE ranged from 5.39 at 3.0 km/h to 11.15 at 6 km/h, with 5.41 for the overall treadmill walk (Table 5). The correlation between ActiGraph GT3X estimations of energy expenditure and Fitbit estimations of step count was weak for the treadmill walk in the laboratory setting. However, the correlation between ActiGraph GT3X estimations of step count and Fitbit estimations of energy expenditure was significant and fair for the slowest (ρ=0.42, P=.01) and fastest (ρ=0.37, P=.37) treadmill speeds. Moderate and significant correlation (ρ=0.60, P≤.001) was found at 4.5 km/h (Table 5).

In the free-living setting, we found fair agreement between estimations of energy expenditure by Fitbit Versa and ActiGraph GT3X (ICC 0.46, 95% CI –0.16 to 0.80), and a significant and strong correlation (ρ=0.79, P≤.001). ANOVA results showed no systematic differences between estimations (P≤.001), which is confirmed by the Bland-Altman plot, with a mean difference of 1.20 MET, LOA 0.17 to 2.24 MET, and MAPE 31.11 (Table 5). The agreement between estimations of step count by Fitbit Versa and ActiGraph GT3X was good (ICC 0.70, 95% CI –0.21 to 0.91) and the correlation between estimations was strong (ρ=0.87, P≤.001). ANOVA results showed no systematic differences between step count estimations (P≤.001). The Bland-Altman plot showed a mean difference of –7.12 steps, LOA –16.25 to 2.00, confirming agreement (Figure 5). MAPE, on the other hand, was 82.45, indicating great individual bias (Table 5).

The correlation between ActiGraph GT3X estimations of energy expenditure and Fitbit Versa estimations of step count was significant and fair (ρ=0.41, P=.01). A corresponding association was found (ρ=0.55, P≤.001) between step count estimated by ActiGraph GT3X and energy expenditure estimated by Fitbit Versa (Table 5).
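For readers less familiar with the Spearman coefficients (ρ) reported above, a minimal sketch follows: each series is converted to ranks (ties receive their average rank) and the Pearson correlation of the ranks is taken. The helper names and example values are hypothetical and are not taken from the study.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for idx in order[i:j + 1]:
            ranks[idx] = tied_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical values: y increases with x but not linearly
steps = [1, 2, 3, 4]
met = [1, 4, 9, 16]
rho = spearman_rho(steps, met)
print(rho)
```

Because ρ depends only on rank order, it can be high even when one device reads consistently higher than the other, so it complements rather than replaces ICC and Bland-Altman analysis.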

Principal Findings

To our knowledge, this is the first study that has evaluated the criterion validity of Fitbit Versa’s estimations of energy expenditure, step count, and heart rate for patients with chronic pain. Evaluation of the criterion validity of wrist-worn devices’ outputs of energy expenditure, heart rate, and step count is essential before any clinical application may be implemented [39]. Poor agreement (ICC, mean difference and LOA, MAPE) as well as poor correlation were found between the criterion measurement (Jaeger Oxycon Pro) and the experimental measurement (Fitbit Versa) regarding energy expenditure for the overall treadmill walk as well as the 3 specific treadmill speeds (Table 4). However, good agreement and fair correlation emerged between estimations of step count by Fitbit Versa and ActiGraph GT3X for the majority of the treadmill walk as well as the overall treadmill walk (Table 5). Good agreement and correlation were shown for the estimation of heart rate during seated rest as well, but this decreased during all treadmill speeds.

Comparison With Previous Studies

Findings suggest that Fitbit Versa systematically overestimated energy expenditure across the full range of the testing protocol when compared to both the criterion (Jaeger Oxycon Pro) and the relative criterion measurement (ActiGraph GT3X). A strict comparison of our study findings with other research within this population was not possible due to the lack of such studies. On the other hand, studies have been conducted to evaluate the criterion validity of wrist-worn activity trackers among an elderly population [56] as well as among populations suffering from chronic cardiac conditions [57,58]. These reports also suggest an overestimation of energy expenditure. Herkert et al [57] studied the accuracy of Fitbit Charge 2 among patients with chronic heart conditions and compared estimations of energy expenditure with indirect calorimetry (Oxycon Mobile) during several household activities and a treadmill walk (4.0 km/h, 5.5 km/h, 4.0 km/h + 5% slope). While their findings are not strictly applicable to our sample, both samples included patients with physical impairments, and findings suggested a clear overestimation of energy expenditure [57].

The results of other studies [19,24], conducted primarily among healthy participants and examining the validity of other Fitbit models’ ability to estimate energy expenditure, contradict our results: certain Fitbit models (Fitbit, Fitbit Ultra/Fitbit Zip, Fitbit Flex) underestimated energy expenditure and step count when data were compared to the criterion measurements. Furthermore, we found good agreement between step count by Fitbit Versa and ActiGraph GT3X for the first 2 treadmill speeds, as well as fair agreement for the overall treadmill test, but agreement decreased at 6 km/h. In the free-living setting, we found good agreement and excellent correlation between step count estimation by Fitbit Versa and ActiGraph GT3X. This corresponds to the findings of a study [59] that compared step count estimation of healthy participants in a free-living setting using a Fitbit device (Fitbit One) and ActiGraph GT3X, and reported excellent agreement between the 2 measurements.

The overestimation of energy expenditure found in our study could be explained by the proprietary algorithms applied by Fitbit, which are not tailored to specific populations. In the population studied here, altered movement patterns are indicated, owing to changed motor control and kinematics as well as to fear of pain causing protective avoidance of movement and activity [60-62]. Fitbit’s estimation of energy expenditure is based on body composition metrics and (if available in the device) estimated heart rate [33,34], which requires a valid heart rate measurement. Our findings indicate poor to fair criterion validity in estimations of heart rate for all treadmill speeds, which is consistent with previous studies [28,63]. Another important factor that may have influenced our findings is the placement of devices on the body. Our experimental device was placed on participants’ wrists according to the manufacturer’s instructions, and the relative criterion measurement device was placed around participants’ waists, near their right hip (also according to the manufacturer’s instructions). Feehan and colleagues conclude that placing devices on the wrist generally leads to an overestimation of energy expenditure, which may be explained by the waist sensor being placed closer to the center of the body [23].

Strengths and Limitations

Data collection was performed both in a laboratory, limiting many confounding variables, and in a natural free-living context, increasing the ecological validity, which is in line with current recommendations for validity studies examining wearable monitors for physical activity [39,40]. In this study, conventional statistics were applied in order to conduct a comprehensive evaluation of the criterion validity and report findings in a standardized manner [54]. Equivalence testing is also recommended, as it provides a risk evaluation of measurement agreement, with zones of equivalence between estimations established by consensus [54]. This method was not applied in this study, which may be a limitation, as knowledge of the statistically significant risk of misclassifying physical activity level would contribute to the interpretation of the results. The sample size was in accordance with the initial sample size calculation and equivalent to corresponding research studies [20,22,57,59,64]. Furthermore, when conducting validation studies, guidelines state that one should perform measurements across a large range of physical activity intensities [39]. In this study, treadmill speed was set to a maximum of 6 km/h, which may be interpreted as a low intensity; however, we suspected that a higher speed could have been problematic for some participants. The fact that 5 of the 41 participants were unable to complete the treadmill walk at 6 km/h indicates that a higher treadmill speed could have resulted in a higher number of discontinuations. A possible source of bias in validation studies is pharmacological use of beta blockers because it affects heart rate and possibly biases evaluation of physical activity intensity level. In this study, only 2 participants (<5%) reported taking beta blockers within the last 24 hours, which may have affected participants’ perceived exertion during treadmill use. However, we estimate that this had very little impact on our results.
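To make the idea of equivalence testing mentioned above concrete, the sketch below checks whether a confidence interval for the mean paired difference lies entirely within a predefined equivalence zone. Everything specific in it is an assumption for illustration: the zone of ±10% of the criterion mean, the 90% interval, the normal approximation (a t quantile would be more exact for small samples), the function name, and the example values. It was not part of this study's analysis.

```python
from statistics import NormalDist, mean, stdev


def equivalence_check(test, criterion, zone_fraction=0.10, alpha=0.05):
    """Return (is_equivalent, (ci_low, ci_high), bound).

    Equivalence is declared when the (1 - 2*alpha) confidence interval of the
    mean difference lies inside +/- zone_fraction * mean(criterion).
    The zone width and the normal approximation are illustrative assumptions.
    """
    diffs = [t - c for t, c in zip(test, criterion)]
    standard_error = stdev(diffs) / len(diffs) ** 0.5
    critical = NormalDist().inv_cdf(1 - alpha)  # about 1.645 for alpha=0.05
    ci_low = mean(diffs) - critical * standard_error
    ci_high = mean(diffs) + critical * standard_error
    bound = zone_fraction * mean(criterion)
    return (-bound < ci_low and ci_high < bound), (ci_low, ci_high), bound


# Hypothetical step counts, for illustration only (not study data)
criterion_steps = [100, 110, 120, 130, 140]
close_device = [101, 109, 121, 129, 141]
offset_device = [c + 30 for c in criterion_steps]

close_result = equivalence_check(close_device, criterion_steps)
offset_result = equivalence_check(offset_device, criterion_steps)
print(close_result[0], offset_result[0])
```

Unlike a conventional difference test, this approach places the burden of proof on showing that the devices agree within a stated margin, which is why the choice of the equivalence zone matters so much.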

Our sample included 76% women (31/41), which is in line with other studies describing people with chronic pain in Sweden [65,66] and other countries [67]. Patients in our sample had lived with pain for a shorter time than what is described in other studies of this population [68,69], and they rated their pain severity at baseline as equal to what has been reported in another study [70] describing patients with low back pain managed in primary care. In our study, 37% of participants reported an exercise level of >90 minutes/week and 49% reported being physically active >150 minutes/week. A previous study [71] reported that 38% of participating men and 37% of participating women with chronic pain were physically active >60 minutes/week. Since almost half of our participants report reaching the recommended weekly amounts of physical activity, it seems that our sample is slightly more physically active than the general population with chronic pain. In all, we believe that the external validity can be extended to the major group of individuals with chronic pain seeking care in primary and specialist care.


Conclusions

This study provides new knowledge on the criterion validity of Fitbit Versa’s estimations of energy expenditure, heart rate, and step count in patients with chronic pain. Findings show that Fitbit Versa overestimates energy expenditure when compared to criterion estimations in a controlled laboratory setting as well as in free-living settings, which needs to be considered when used clinically for patients with chronic pain. Step count measured from the wrist, however, seems to provide a valid estimation, suggesting that future guidelines should include this variable in this major patient group. Findings may contribute to the solicited documentation of estimation properties of wrist-worn activity tracking devices within specific patient groups and may therefore guide future application in further clinical research.


Acknowledgments

The authors would like to thank all the study participants who kindly devoted their time to the study. We are also truly grateful to all the involved staff at the primary and specialized health care clinics in Region Dalarna for their invaluable support in the recruitment process. We want to acknowledge Linnea McCarthy, MSc, at Dalarna University for her support with cross-checking data, and Anton Westman, PhD, for his initial support with the Actilife software.

The study was funded by the Swedish Research Council for Health, Working Life and Welfare (registration number 2017-00491 and 2017-00177), the Swedish Research Council (registration number 2018-02455 and 2015-02512), and research funding from Dalarna University. Sponsors had no part in study design, data collection, data analysis, data interpretation, writing of the report, or the decision to submit for publication.

Authors' Contributions

LV and BÄ are responsible for the conception and design of the study. VS and JW contributed to study design, recruited participants, and collected all data. VS conducted data management, data reduction, and all statistical analyses with guidance from AM, BÄ, RLM, and LV. VS, JW, LV, and BÄ had full access to all data. BÄ was the principal investigator and was involved in all methodological decisions. RLM assisted with the retrieval of data from the commercial activity tracker and provided valuable insights regarding the interpretation of the results. MH contributed with field expertise throughout all stages of data collection, statistical analysis, and writing of the manuscript. VS drafted the manuscript. All authors reviewed and approved the final manuscript.

Conflicts of Interest

None declared.

References

  1. IASP terminology. International Association for the Study of Pain (IASP).   URL: [accessed 2020-08-18]
  2. Merskey H, Bogduk N. The present classification. In: Classification of Chronic Pain. Descriptions of Chronic Pain Syndromes and Definitions of Pain Terms. Seattle: IASP Press; 1994:11-11.
  3. Geneen LJ, Moore RA, Clarke C, Martin D, Colvin LA, Smith BH. Physical activity and exercise for chronic pain in adults: an overview of Cochrane Reviews. Cochrane Database Syst Rev 2017 Apr 24;4:CD011279 [FREE Full text] [CrossRef] [Medline]
  4. Piercy KL, Troiano RP. Physical activity guidelines for Americans from the US Department of Health and Human Services. Circ Cardiovasc Qual Outcomes 2018 Nov;11(11):e005263. [CrossRef] [Medline]
  5. Warburton DER, Bredin SSD. Health benefits of physical activity: a systematic review of current systematic reviews. Curr Opin Cardiol 2017 Sep;32(5):541-556. [CrossRef] [Medline]
  6. Ambrose KR, Golightly YM. Physical exercise as non-pharmacological treatment of chronic pain: why and when. Best Practice & Research Clinical Rheumatology 2015 Feb;29(1):120-130. [CrossRef]
  7. Gatchel RJ, McGeary DD, McGeary CA, Lippe B. Interdisciplinary chronic pain management: past, present, and future. Am Psychol 2014;69(2):119-130. [CrossRef] [Medline]
  8. Ainsworth BE, Haskell WL, Whitt MC, Irwin ML, Swartz AM, Strath SJ, et al. Compendium of physical activities: an update of activity codes and MET intensities. Med Sci Sports Exerc 2000 Sep;32(9 Suppl):S498-S504. [Medline]
  9. Shephard R, Tudor-Locke C. A history of physical activity measurement in epidemiology. In: Shephard RJ, Tudor-Locke C, editors. The Objective Monitoring of Physical Activity: Contributions of Accelerometry to Epidemiology, Exercise Science and Rehabilitation. 1st edition. Switzerland: Springer International Publishing; 2016:39-83.
  10. Wright SP, Hall Brown TS, Collier SR, Sandberg K. How consumer physical activity monitors could transform human physiology research. Am J Physiol Regul Integr Comp Physiol 2017 Mar 01;312(3):R358-R367 [FREE Full text] [CrossRef] [Medline]
  11. Bai Y, Welk GJ, Nam YH, Lee JA, Lee J, Kim Y, et al. Comparison of consumer and research monitors under semistructured settings. Med Sci Sports Exerc 2015 Jul 6. [CrossRef] [Medline]
  12. Ferguson T, Rowlands AV, Olds T, Maher C. The validity of consumer-level, activity monitors in healthy adults worn in free-living conditions: a cross-sectional study. Int J Behav Nutr Phys Act 2015;12:42 [FREE Full text] [CrossRef] [Medline]
  13. Brickwood K, Watson G, O'Brien J, Williams AD. Consumer-based wearable activity trackers increase physical activity participation: systematic review and meta-analysis. JMIR Mhealth Uhealth 2019 Apr 12;7(4):e11819 [FREE Full text] [CrossRef] [Medline]
  14. Le Hello C, Trombert B, Morel A, Chieh A, Brouard B, Boissier C. Performance analysis of walking of 10,000 regular users of a connected activity tracker. J Med Vasc 2018 Jul;43(4):231-237. [CrossRef] [Medline]
  15. Almalki M, Gray K, Sanchez FM. The use of self-quantification systems for personal health information: big data management activities and prospects. Health Inf Sci Syst 2015;3(Suppl 1 HISA Big Data in Biomedicine and Healthcare 2013 Con):S1 [FREE Full text] [CrossRef] [Medline]
  16. Lobelo F, Kelli HM, Tejedor SC, Pratt M, McConnell MV, Martin SS, et al. The wild wild west: a framework to integrate mhealth software applications and wearables to support physical activity assessment, counseling and interventions for cardiovascular disease risk reduction. Prog Cardiovasc Dis 2016;58(6):584-594. [CrossRef] [Medline]
  17. Evenson KR, Goto MM, Furberg RD. Systematic review of the validity and reliability of consumer-wearable activity trackers. Int J Behav Nutr Phys Act 2015;12(1):159 [FREE Full text] [CrossRef] [Medline]
  18. Diaz KM, Krupka DJ, Chang MJ, Peacock J, Ma Y, Goldsmith J, et al. Fitbit®: An accurate and reliable device for wireless physical activity tracking. Int J Cardiol 2015 Apr 15;185:138-140. [CrossRef] [Medline]
  19. Adam NJ, Spierer DK, Gu J, Bronner S. Comparison of steps and energy expenditure assessment in adults of Fitbit Tracker and Ultra to the Actical and indirect calorimetry. J Med Eng Technol 2013 Oct;37(7):456-462. [CrossRef] [Medline]
  20. Gusmer R, Bosch T, Watkins A, Ostrem J, Dengel D. Comparison of FitBit® Ultra to ActiGraph™ GT1M for assessment of physical activity in young adults during treadmill walking. TOSMJ 2014 Apr 04;8(1):11-15. [CrossRef]
  21. Stahl ST, Insana SP. Caloric expenditure assessment among older adults: criterion validity of a novel accelerometry device. J Health Psychol 2014 Nov;19(11):1382-1387 [FREE Full text] [CrossRef] [Medline]
  22. Wahl Y, Düking P, Droszez A, Wahl P, Mester J. Criterion-validity of commercially available physical activity tracker to estimate step count, covered distance and energy expenditure during sports conditions. Front Physiol 2017;8:725 [FREE Full text] [CrossRef] [Medline]
  23. Feehan LM, Geldman J, Sayre EC, Park C, Ezzat AM, Yoo JY, et al. Accuracy of Fitbit devices: systematic review and narrative syntheses of quantitative data. JMIR Mhealth Uhealth 2018 Aug 09;6(8):e10527 [FREE Full text] [CrossRef] [Medline]
  24. Imboden MT, Nelson MB, Kaminsky LA, Montoye AH. Comparison of four Fitbit and Jawbone activity monitors with a research-grade ActiGraph accelerometer for estimating physical activity and energy expenditure. Br J Sports Med 2017 May 08. [CrossRef] [Medline]
  25. Dannecker KL, Sazonova NA, Melanson EL, Sazonov ES, Browning RC. A comparison of energy expenditure estimation of several physical activity monitors. Med Sci Sports Exerc 2013 Nov;45(11):2105-2112 [FREE Full text] [CrossRef] [Medline]
  26. Case MA, Burwick HA, Volpp KG, Patel MS. Accuracy of smartphone applications and wearable devices for tracking physical activity data. JAMA 2015 Feb 10;313(6):625-626. [CrossRef] [Medline]
  27. Kooiman TJM, Dontje ML, Sprenger SR, Krijnen WP, van DSCP, de GM. Reliability and validity of ten consumer activity trackers. BMC Sports Sci Med Rehabil 2015;7:24 [FREE Full text] [CrossRef] [Medline]
  28. Boudreaux BD, Hebert EP, Hollander DB, Williams BM, Cormier CL, Naquin MR, et al. Validity of wearable activity monitors during cycling and resistance exercise. Med Sci Sports Exerc 2018 Mar;50(3):624-633. [CrossRef] [Medline]
  29. Zou GY. Sample size formulas for estimating intraclass correlation coefficients with precision and assurance. Stat Med 2012 Dec 20;31(29):3972-3981. [CrossRef] [Medline]
  30. Bästa frågan till patienter om fysisk aktivitet. GIH.   URL: https:/​/www.​​OM-GIH/​Press-och-media/​Pressmeddelanden/​Basta-fragan-om-fysisk-aktivitet-till-patienter/​ [accessed 2020-08-20]
  31. Forskningsstudie om validering av indikatorfrågor till patienter om fysisk aktivitet. Socialstyrelsen. 2013.   URL: http:/​/www.​​static/​files/​1185/​forskningsstudie_validering_av_indikatorfragor_till_patienter_om_fysisk_aktivitet.​pdf [accessed 2020-08-20]
  32. Bergström G, Jensen IB, Bodin L, Linton SJ, Nygren AL, Carlsson SG. Reliability and factor structure of the Multidimensional Pain Inventory--Swedish language version (MPI-S). Pain 1998 Mar;75(1):101-110. [CrossRef] [Medline]
  33. How does my Fitbit device calculate calories burned? Fitbit.   URL: [accessed 2020-08-30]
  34. How does my Fitbit device calculate active minutes? Fitbit.   URL: [accessed 2020-08-30]
  35. Haugen HA, Chan L, Li F. Indirect calorimetry: a practical guide for clinicians. Nutr Clin Pract 2007 Aug;22(4):377-388. [CrossRef] [Medline]
  36. Rietjens GJ, Kuipers H, Kester AD, Keizer HA. Validation of a computerized metabolic measurement system (Oxycon-Pro) during low and high intensity exercise. Int J Sports Med 2001 May;22(4):291-294. [CrossRef] [Medline]
  37. Tucker JM, Welk GJ, Beyler NK. Physical activity in U.S.: adults compliance with the Physical Activity Guidelines for Americans. Am J Prev Med 2011 Apr;40(4):454-461. [CrossRef] [Medline]
  38. Sasaki JE, John D, Freedson PS. Validation and comparison of ActiGraph activity monitors. J Sci Med Sport 2011 Sep;14(5):411-416. [CrossRef] [Medline]
  39. Bassett DR, Rowlands A, Trost SG. Calibration and validation of wearable monitors. Med Sci Sports Exerc 2012 Jan;44(1 Suppl 1):S32-S38 [FREE Full text] [CrossRef] [Medline]
  40. Keadle SK, Lyden KA, Strath SJ, Staudenmayer JW, Freedson PS. A framework to evaluate devices that assess physical behavior. Exercise and Sport Sciences Reviews 2019;47(4):206-214. [CrossRef]
  41. Dowd KP, Harrington DM, Donnelly AE. Criterion and concurrent validity of the activPAL™ professional physical activity monitor in adolescent females. PLoS One 2012;7(10):e47633 [FREE Full text] [CrossRef] [Medline]
  42. Borg G. Perceived exertion as an indicator of somatic stress. Scand J Rehabil Med 1970;2(2):92-98. [Medline]
  43. Bijur PE, Silver W, Gallagher EJ. Reliability of the visual analog scale for measurement of acute pain. Acad Emerg Med 2001 Dec;8(12):1153-1157. [Medline]
  44. Web API reference. Fitbit Platform.   URL: [accessed 2021-01-04]
  45. Redenius N, Kim Y, Byun W. Concurrent validity of the Fitbit for assessing sedentary behavior and moderate-to-vigorous physical activity. BMC Med Res Methodol 2019 Feb 7;19(1). [CrossRef]
  46. Lyden K, Kozey SL, Staudenmeyer JW, Freedson PS. A comprehensive evaluation of commonly used accelerometer energy expenditure and MET prediction equations. Eur J Appl Physiol 2011 Feb;111(2):187-201 [FREE Full text] [CrossRef] [Medline]
  47. Freedson PS, Melanson E, Sirard J. Calibration of the Computer Science and Applications, Inc. accelerometer. Med Sci Sports Exerc 1998 May;30(5):777-781. [Medline]
  48. Barnhart HX, Haber MJ, Lin LI. An overview on assessing agreement with continuous measurements. J Biopharm Stat 2007;17(4):529-569. [CrossRef] [Medline]
  49. Liljequist D, Elfving B, Skavberg Roaldsen K. Intraclass correlation - a discussion and demonstration of basic features. PLoS One 2019;14(7):e0219854 [FREE Full text] [CrossRef] [Medline]
  50. Cicchetti DV. Guidelines, criteria, and rules of thumb for evaluating normed and standardized assessment instruments in psychology. Psychological Assessment 1994;6(4):284-290. [CrossRef]
  51. Bland JM, Altman DG. Statistical methods for assessing agreement between two methods of clinical measurement. Lancet 1986 Feb 8;1(8476):307-310. [Medline]
  52. Akoglu H. User's guide to correlation coefficients. Turk J Emerg Med 2018 Sep;18(3):91-93 [FREE Full text] [CrossRef] [Medline]
  53. Chan YH. Biostatistics 104: correlational analysis. Singapore Med J 2003 Dec;44(12):614-619 [FREE Full text] [Medline]
  54. Welk GJ, Bai Y, Lee J, Godino J, Saint-Maurice PF, Carr L. Standardizing analytic methods and reporting in activity monitor validation studies. Medicine & Science in Sports & Exercise 2019;51(8):1767-1780. [CrossRef]
  55. Fox-Wasylyshyn SM, El-Masri MM. Handling missing data in self-report measures. Res Nurs Health 2005 Dec;28(6):488-495. [CrossRef] [Medline]
  56. Tedesco S, Sica M, Ancillao A, Timmons S, Barton J, O’Flynn B. Accuracy of consumer-level and research-grade activity trackers in ambulatory settings in older adults. PLoS ONE 2019 May 21;14(5):e0216891. [CrossRef]
  57. Herkert C, Kraal JJ, van Loon EMA, van Hooff M, Kemps HMC. Usefulness of modern activity trackers for monitoring exercise behavior in chronic cardiac patients: validation study. JMIR Mhealth Uhealth 2019 Dec 19;7(12):e15045 [FREE Full text] [CrossRef] [Medline]
  58. Vetrovsky T, Siranec M, Marencakova J, Tufano JJ, Capek V, Bunc V, et al. Validity of six consumer-level activity monitors for measuring steps in patients with chronic heart failure. PLoS One 2019;14(9):e0222569 [FREE Full text] [CrossRef] [Medline]
  59. Gomersall SR, Ng N, Burton NW, Pavey TG, Gilson ND, Brown WJ. Estimating physical activity and sedentary behavior in a free-living context: a pragmatic comparison of consumer-based activity trackers and ActiGraph accelerometry. J Med Internet Res 2016 Sep 07;18(9):e239 [FREE Full text] [CrossRef] [Medline]
  60. Meulders A. From fear of movement-related pain and avoidance to chronic pain disability: a state-of-the-art review. Current Opinion in Behavioral Sciences 2019 Apr;26:130-136. [CrossRef]
  61. Burnett AF, Cornelius MW, Dankaerts W, O'sullivan PB. Spinal kinematics and trunk muscle activity in cyclists: a comparison between healthy controls and non-specific chronic low back pain subjects-a pilot investigation. Man Ther 2004 Nov;9(4):211-219. [CrossRef] [Medline]
  62. Vlaeyen JW, Linton SJ. Fear-avoidance and its consequences in chronic musculoskeletal pain: a state of the art. Pain 2000;85(3):317-332. [CrossRef]
  63. Nelson BW, Allen NB. Accuracy of consumer wearable heart rate measurement during an ecologically valid 24-hour period: intraindividual validation study. JMIR Mhealth Uhealth 2019 Mar 11;7(3):e10828 [FREE Full text] [CrossRef] [Medline]
  64. Sushames A, Edwards A, Thompson F, McDermott R, Gebel K. Validity and reliability of Fitbit Flex for step count, moderate to vigorous physical activity and activity energy expenditure. PLoS One 2016;11(9):e0161224 [FREE Full text] [CrossRef] [Medline]
  65. Tseli E, Vixner L, LoMartire R, Grooten WJA, Gerdle B, Äng BO. Prognostic factors for improved physical and emotional functioning one year after interdisciplinary rehabilitation in patients with chronic pain: Results from a national quality registry in Sweden. J Rehabil Med 2020 Feb 27;52(2):jrm00019 [FREE Full text] [CrossRef] [Medline]
  66. Gerdle B, Björk J, Cöster L, Henriksson K, Henriksson C, Bengtsson A. Prevalence of widespread pain and associations with work status: a population study. BMC Musculoskelet Disord 2008 Jul 15;9:102 [FREE Full text] [CrossRef] [Medline]
  67. Macfarlane GJ, Morris S, Hunt IM, Benjamin S, McBeth J, Papageorgiou AC, et al. Chronic widespread pain in the community: the influence of psychological symptoms and mental disorder on healthcare seeking behavior. J Rheumatol 1999 Feb;26(2):413-419. [Medline]
  68. Breivik H, Collett B, Ventafridda V, Cohen R, Gallacher D. Survey of chronic pain in Europe: prevalence, impact on daily life, and treatment. Eur J Pain 2006 May;10(4):287-333. [CrossRef] [Medline]
  69. Gerdle B, Boersma K, Åsenlöf P, Stålnacke B, Larsson B, Ringqvist Å. Influences of sex, education, and country of birth on clinical presentations and overall outcomes of interdisciplinary pain rehabilitation in chronic pain patients: a cohort study from the Swedish Quality Registry for Pain Rehabilitation (SQRP). J Clin Med 2020 Jul 25;9(8) [FREE Full text] [CrossRef] [Medline]
  70. Johansson E, Lindberg P. Low back pain patients in primary care: Subgroups based on the multidimensional pain inventory. Int J Behav Med 2000 Dec;7(4):340-352. [CrossRef]
  71. Nilsen TIL, Holtermann A, Mork PJ. Physical exercise, body mass index, and risk of chronic pain in the low back and neck/shoulders: longitudinal data from the Nord-Trondelag Health Study. Am J Epidemiol 2011 Aug 01;174(3):267-273. [CrossRef] [Medline]

Abbreviations

ANOVA: analysis of variance
ICC: intraclass correlation coefficient
LOA: limits of agreement
MAPE: mean absolute percentage error
MVPA: moderate-to-vigorous physical activity

Edited by G Eysenbach; submitted 06.10.20; peer-reviewed by J Kraal, B Boudreaux; comments to author 28.10.20; revised version received 06.12.20; accepted 12.12.20; published 12.01.21


©Veronica Sjöberg, Jens Westergren, Andreas Monnier, Riccardo Lo Martire, Maria Hagströmer, Björn Olov Äng, Linda Vixner. Originally published in JMIR mHealth and uHealth, 12.01.2021.

This is an open-access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR mHealth and uHealth, is properly cited. The complete bibliographic information, a link to the original publication, as well as this copyright and license information must be included.