Accuracy of Smart Scales on Weight and Body Composition: Observational Study

Background: Smart scales are increasingly used at home by patients to monitor their body weight and body composition, but scale accuracy has not often been documented.
Objective: The goal of the research was to determine the accuracy of 3 commercially available smart scales for weight and body composition compared with dual-energy x-ray absorptiometry (DEXA) as the gold standard.
Methods: We designed a cross-sectional study in consecutive patients evaluated by DEXA in a physiology unit in a tertiary hospital in France. There were no exclusion criteria except patients declining to participate. Patients were weighed with one smart scale immediately after DEXA. Three scales were compared (scale 1: Body Partner [Téfal], scale 2: DietPack [Terraillon], and scale 3: Body Cardio [Nokia Withings]). We determined absolute error between the gold standard values obtained from DEXA and the smart scales for body mass, fat mass, and lean mass.
Results: The sample for analysis included 53, 52, and 48 patients for each of the 3 tested smart scales, respectively. The median absolute error for body weight was 0.3 kg (interquartile range [IQR] –0.1, 0.7), 0 kg (IQR –0.4, 0.3), and 0.25 kg (IQR –0.10, 0.52), respectively. For fat mass, absolute errors were –2.2 kg (IQR –5.8, 1.3), –4.4 kg (IQR –6.6, 0), and –3.7 kg (IQR –8.0, 0.28), respectively. For muscular mass, absolute errors were –2.2 kg (IQR –5.8, 1.3), –4.4 kg (IQR –6.6, 0), and –3.65 kg (IQR –8.03, 0.28), respectively. Factors associated with fat mass measurement error were weight for scales 1 and 2 (P=.03 and P<.001, respectively), BMI for scales 1 and 2 (P=.034 and P<.001, respectively), body fat for scale 1 (P<.001), and muscular and bone mass for scale 2 (P<.001 for both). Factors associated with muscular mass error were weight and BMI for scale 1 (P<.001 and P=.004, respectively), body fat for scales 1 and 2 (P<.001 for both), and muscular and bone mass for scale 2 (P<.001 and P=.002, respectively).
Conclusions: Smart scales are not accurate for body composition and should not replace DEXA in patient care.
Trial Registration: ClinicalTrials.gov NCT03803098; https://clinicaltrials.gov/ct2/show/NCT03803098


Introduction
Cellular-connected scales, familiarly known as smart scales, are increasingly used at home for weight follow-up. They have been shown to increase the frequency of self-weighing and weight loss [1,2]. They can be connected to other smart objects, such as motion sensors, and thus may help subjects engage in greater physical activity and better nutritional habits. Most available smart scales combine a classic weight scale with a foot-to-foot impedance meter (FFI) that can estimate body composition (ie, fat mass [FM] and fat-free mass [FFM]) by measuring foot-to-foot impedance at different frequencies. Whole body FFM is calculated from a model comprising body impedance, height, weight, and age trained with dual-energy x-ray absorptiometry (DEXA) data [3]. Smart scales are easier to use than medical impedance meters since they do not require a supine position and their electrodes are indefinitely reusable. However, the accuracy of smart scales depends on the model itself and on the representativeness of the patient population used to train it. Although some FFIs have been compared with DEXA and with medical impedance meters [3,4], no data are available for smart scale FFIs, and the regression equations used are unknown. The purpose of this cross-sectional study was to assess the metrologic accuracy of 3 commercially available smart scales compared with DEXA as the gold standard in a population of adult patients from a tertiary hospital physiology unit.

Patients
Consecutive adult patients evaluated for body composition by DEXA at the Bichat Hospital during the study were eligible. Patients were referred for DEXA because of obesity or a chronic condition that can affect body composition (eg, chronic kidney disease, long-term steroids). All patients were included in the analysis except those who refused to participate or whose weight measured by Lunar iDXA (

Statistics
Patient characteristics were described as median and interquartile range for quantitative variables and percentages for categorical variables. Absolute and relative errors for each scale with respect to DEXA were reported with median and interquartile range. Bland-Altman representations were used to show systematic bias or trend in measurement error. Univariate linear models were used to estimate and test the association between measurement error and possible associated variables. For this analysis, we reported the slope estimation with its 95% confidence interval and the P value corresponding to the Wald statistic. We performed no imputation for missing data. The significance threshold was .05. All analyses were performed using R software version 4.0.2 (R Foundation for Statistical Computing).
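As an illustration only, the descriptive part of this analysis can be sketched in a few lines of Python (the study itself used R, and this is not the authors' code). The sketch assumes that "absolute error" means the signed difference in kilograms between the scale and DEXA, which is consistent with the negative values reported, and that relative error is that difference expressed as a percentage of the DEXA value. The function name, dictionary keys, and the 1.96 SD limits of agreement are choices made for this example.

```python
import statistics


def summarize_errors(scale_values, dexa_values):
    """Summarize agreement between a smart scale and DEXA (illustrative sketch).

    Both inputs are equal-length lists of measurements in kg, one pair per patient.
    """
    # Signed error in kg: smart scale minus reference (assumed reading of "absolute error").
    errors = [s - d for s, d in zip(scale_values, dexa_values)]
    # Relative error as a percentage of the reference value.
    rel_errors = [100 * e / d for e, d in zip(errors, dexa_values)]

    def median_iqr(values):
        # Quartiles computed with the inclusive method; returns (median, Q1, Q3).
        q1, q2, q3 = statistics.quantiles(values, n=4, method="inclusive")
        return q2, q1, q3

    # Bland-Altman summary: mean difference (bias) and 95% limits of agreement.
    bias = statistics.mean(errors)
    sd = statistics.stdev(errors)
    return {
        "error_kg_median_iqr": median_iqr(errors),
        "rel_error_pct_median_iqr": median_iqr(rel_errors),
        "bias": bias,
        "limits_of_agreement": (bias - 1.96 * sd, bias + 1.96 * sd),
        # Points for a Bland-Altman plot: x = mean of both methods, y = difference.
        "bland_altman_points": [
            ((s + d) / 2, s - d) for s, d in zip(scale_values, dexa_values)
        ],
    }
```

Calling `summarize_errors` once per scale and per variable (weight, fat mass, muscular mass) would give summaries of the kind reported in the Results; a visible slope in the Bland-Altman points corresponds to the trend in measurement error that the plots are meant to reveal.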

Ethical Aspects
This study is part of the Evaluation of the Metrological Reliability of Connected Objects in the Measurement of Medical Physiological Parameters (EvalExplo) study [NCT03803098]. Ethics approval was obtained from Comité de Protection des Personnes Sud Est VI (approval number AU 1443), and written nonopposition was obtained.

Patient Characteristics
The final sample for analysis included 53, 52, and 48 patients for scales 1, 2, and 3, respectively, after exclusion of patients with missing data (in all but one case because the smart scale did not retrieve data despite several attempts; the remaining patient was excluded because of excessive weight for the scales). Patient characteristics are presented in Table 1. There were no significant differences between the 3 groups.

Accuracy of the Scales
All 3 scales gave reasonably accurate weights, with a median absolute difference of less than a kilogram compared with DEXA. The median absolute error for body weight was 0.3 kg (interquartile range [IQR] –0.1, 0.7), 0 kg (IQR –0.4, 0.3), and 0.25 kg (IQR –0.10, 0.52) for scales 1, 2, and 3, respectively. Bland-Altman graphs are presented in Figure 1 for weight, body fat, and muscular mass for the 3 scales. They show a significant linear trend between body weight error and measured weight, but absolute errors remain reasonable and compatible with clinical practice. However, significant linear trends for fat and muscular mass (scales 1 and 2, respectively) lead to very high errors.

Factors Associated With Measurement Error
Tables 2 and 3 show the factors associated with fat and muscular mass measurement error. Factors associated with fat mass measurement error were weight for scales 1 and 2 (P=.03 and P<.001, respectively), BMI for scales 1 and 2 (P=.034 and P<.001, respectively), body fat for scale 1 (P<.001), and muscular and bone mass for scale 2 (P<.001 for both). Factors associated with muscular mass error were weight and BMI for scale 1 (P<.001 and P=.004, respectively), body fat for scales 1 and 2 (P<.001 for both), and muscular and bone mass for scale 2 (P<.001 and P=.002, respectively). We found no factor associated with measurement error of fat or muscular mass for scale 3. Sex did not show a significant influence on measurement error.
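For readers who want to see how one such association can be tested, the following is a minimal sketch of a univariate linear model in Python, not the authors' R code. It regresses measurement error on a single candidate factor (for example, body weight) and returns the slope, an approximate 95% confidence interval, and a two-sided Wald P value. One simplification is assumed here: the Wald statistic is referred to the standard normal distribution, whereas R's `lm` reports a t-based P value, so results can differ slightly in small samples. The function name is hypothetical.

```python
import math
import statistics


def univariate_slope(x, y):
    """Ordinary least squares slope of y on x with approximate 95% CI and Wald P value.

    x: candidate factor (e.g., body weight in kg); y: measurement error (kg).
    Assumes at least 3 pairs, non-constant x, and a fit that is not exact.
    """
    n = len(x)
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x

    # Residual variance with n - 2 degrees of freedom, then the slope's standard error.
    rss = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    se = math.sqrt(rss / (n - 2) / sxx)

    # Wald statistic, compared with the standard normal (an approximation; see lead-in).
    z = slope / se
    p_value = 2 * (1 - statistics.NormalDist().cdf(abs(z)))
    ci = (slope - 1.96 * se, slope + 1.96 * se)
    return slope, ci, p_value
```

A slope whose confidence interval excludes 0 indicates that measurement error changes systematically with the factor, which is how the associations in Tables 2 and 3 are read.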

Principal Findings
To our knowledge, this is the first study to assess the metrologic accuracy of commercially available smart scales (scale 1: Body Partner, scale 2: DietPack, scale 3: Body Cardio). We show that all scales were reasonably accurate for body weight but not for body composition. Total body weight was associated with fat mass and muscular mass measurement error for scales 1 and 2, but we were not able to find factors associated with measurement error for scale 3.

Possible Explanations for the Lack of Accuracy
Smart scales combine a classic weight scale and an FFI, a technology that has been widely available for almost three decades and has been compared with medical impedance meters and DEXA in several publications [5-8]. FFIs have been shown to be more sensitive to differences in morphology than whole body impedance measurements, since their data depend upon an extrapolation of measurements made on the lower part of the body to the entire body. Indeed, Bousbiat et al [3] conducted an extensive technical study on FFIs. They found that measurement can be affected by surface contact (ie, foot size and width) and sweat but also by foot position on the scale and flexion of the legs. Surface contact with the electrodes will vary depending on the subject's foot length and width, and thus can be affected by total body weight and total height. Since the scales give subjects no precise guidance on where to place their feet during body composition estimation, this may partly explain the differences between DEXA and scales. In clinical settings, clinicians or technicians should pay attention to the subject's position during measurement. In the same way, subjects should be advised not to bend their legs. At home, subjects should try to follow directions given by the scale as closely as possible and keep the same position on the scale for follow-up.
In FFIs, whole body FFM is calculated from a regression equation, generally expressed as a function of resistance, height, weight, and age and determined by comparison with DEXA data, while FM is calculated as weight minus FFM. Each smart scale has its own regression equation. It is thus plausible that BMI can affect measurement error. We did not find any factor associated with measurement error for scale 3 despite a very high dispersion, which can have different explanations: our study may be underpowered to detect such an association, or unobserved variables that are not part of the undisclosed regression equation implemented in the smart scale may explain the residual error.
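Schematically, and only as an illustration, a model of this kind can be written as follows. The function f and the linear form with coefficients a to e are placeholders assumed for this sketch; the manufacturers' actual equations are not public, and real equations may use transformed terms rather than the raw variables.

```latex
\begin{align}
\widehat{\mathrm{FFM}} &= f(Z, H, W, \mathrm{Age})
  \quad\text{for example}\quad a\,Z + b\,H + c\,W + d\,\mathrm{Age} + e, \\
\widehat{\mathrm{FM}} &= W - \widehat{\mathrm{FFM}}, \\
\mathrm{BMI} &= \frac{W}{H^{2}}.
\end{align}
```

Here Z is the measured foot-to-foot impedance, H is height, and W is weight. Because W and H enter the estimate directly, a model fitted on a population with a different weight or BMI distribution can produce errors that vary with those variables.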

Clinical Relevance of These Data
Weight was accurately measured by all 3 scales, but body fat was underestimated. For scales 1 and 2, we found a significant effect of higher body weight on fat mass error; this error remains small compared with total body weight in patients with obesity but can be of importance in patients with normal weight or underweight. Ross et al [9] compared the weight from in-person visits and BodyTrace brand smart scales in 58 patients and found a mean bias of 1.1 (SD 0.8) kg, 95% CI 0.5 to 2.6, but agreement seemed lower for patients weighing above 110 kg.
It is unlikely that body composition will be followed by DEXA in the clinical setting, since DEXA uses x-rays. Since smart scales are widely available, it is possible that patients will follow their body composition at home. Thus, if the first body composition is assessed by DEXA, clinicians and patients should be aware that there might be a difference in body composition, which can reach 1 kg. Also, follow-up in the clinical setting should use the same connected device to ensure repeatability of measurements.

Interest for Follow-Up
Despite very poor accuracy for estimating body composition, some authors have reported potential benefits of in-home use of smart scales. Indeed, in several randomized studies, when compared with commercial weight management programs or standard weight loss counseling, smart scale use was found to allow a greater proportion of participants to achieve significant weight loss after several months (3 to 12 months) [1,2,10]. In these studies, no data were available on body composition and its evolution. Also, several studies report a greater weight loss in patients using smart scales than in patients self-reporting weight loss or being weighed only during visits [11,12]. Last, data exist on the importance of weight variability in final weight loss and weight maintenance [13-15]; using smart scales at home with automated data processing could help health care professionals and patients in achieving and maintaining greater weight loss [16].

Strengths and Limitations
To our knowledge, this is the first cross-sectional study assessing metrologic accuracy of commercially available smart scales. We compared these scales to DEXA, which is considered the gold standard for body composition assessment.
Included patients were evaluated for body composition either because of obesity or because of a chronic condition (steroids, chronic renal failure, etc) without obesity. Thus, our sample covers a wide range of body weight: underweight, lean, and obese patients. The setting of the study in a tertiary hospital explains the relatively small proportion of patients with obesity since the main reason for prescribing DEXA was long-term treatment with steroids. A specific study focusing on patients with obesity and extreme obesity would be required since they are likely to benefit the most from body composition evaluation and follow-up and since the error on fat mass is affected by total weight. This should be done keeping in mind the maximum weight supported by the scales (180 kg for Body Cardio and Body Partner and 160 kg for DietPack), while DEXA supports higher weight (230 kg on our machine).
Finally, although body composition estimation is quick with the smart scales, total experimental time was 15 to 30 minutes for each scale due to scale and app setup and patient information. Thus, it was not possible to evaluate the 3 scales on the same patient or to replicate measures. However, due to the very poor accuracy of these scales shown in this study, comparisons between scales and estimation of intraindividual variability would have limited relevance. This study included only a limited number of patients. However, while it is always possible that the studied scales would perform better on an independent sample, the study results are so clear that it is unlikely that a larger sample size would change our conclusion regarding the accuracy of body composition estimation.

Conclusion
Our study shows that although smart scales are accurate for total body weight, they should not be used routinely to assess body composition, especially in patients with severe obesity. Further studies are needed to clarify their utility in patient follow-up.