Availability, Quality, and Evidence-Based Content of mHealth Apps for the Treatment of Nonspecific Low Back Pain in the German Language: Systematic Assessment

doi:10.2196/47502

Original Paper

Faculty of Social Sciences, City University of Applied Sciences Bremen, Bremen, Germany

Corresponding Author:

Annika Schwarz, Prof Dr

Faculty of Social Sciences, City University of Applied Sciences Bremen

Am Brill 2-4

Bremen, 28195

Germany

Phone: 49 176 151 40 227

Email: annika.schwarz@hs-bremen.de

Background: Nonspecific low back pain (NSLBP) carries significant socioeconomic relevance and leads to substantial difficulties for those who are affected by it. The effectiveness of app-based treatments has been confirmed, and clinicians are recommended to use such interventions. As 88.8% of the German population uses smartphones, apps could support therapy. The available apps in mobile app stores are poorly regulated, and their quality can vary. Overviews of the availability and quality of mobile apps for Australia, Great Britain, and Spain have been compiled, but this has not yet been done for Germany.

Objective: We aimed to provide an overview of the availability and content-related quality of apps for the treatment of NSLBP in the German language.

Methods: A systematic search for apps on iOS and Android was conducted on July 6, 2022, in the Apple App Store and Google Play Store. The inclusion and exclusion criteria were defined before the search. Apps in the German language that were available in both stores were eligible. To check for evidence, the apps found were assessed using checklists based on the German national guideline for NSLBP and its British equivalent from the National Institute for Health and Care Excellence. The quality of the apps was measured using the Mobile Application Rating Scale. To control for potential inaccuracies, a second reviewer reassessed the outcomes for 30% (3/8) of the apps and checked the inclusion and exclusion criteria for these apps. The outcomes, measured using the assessment tools, are presented in tables with descriptive statistics. Furthermore, the characteristics of the included apps were summarized.

Results: In total, 8 apps were included for assessment. Features provided with different frequencies were exercise tracking of prefabricated or adaptable workout programs, educational aspects, artificial intelligence–based therapy or workout programs, and motion detection. All apps met some recommendations by the German national guideline and used forms of exercises as recommended by the National Institute for Health and Care Excellence guideline. The mean value of items rated as “Yes” was 5.75 (SD 2.71) out of 16. The best-rated app received an answer of “Yes” for 11 items. The mean Mobile Application Rating Scale quality score was 3.61 (SD 0.55). The highest mean score was obtained in “Section B–Functionality” (mean 3.81, SD 0.54).

Conclusions: Available apps in the German language meet guideline recommendations and are mostly of acceptable or good quality. Their use as a therapy supplement could help promote the implementation of home-based exercise protocols. A new assessment tool to obtain ratings on apps for the treatment of NSLBP, combining aspects of quality and evidence-based best practices, could be useful.

Trial Registration: Open Science Framework Registries sq435; https://osf.io/sq435

JMIR Mhealth Uhealth 2023;11:e47502


Keywords

mobile health; mobile apps; smartphone; nonspecific low back pain; German language; intervention; digital health; home exercise; digital rehabilitation; workout; mobile phone

Background

Low back pain (LBP) is a major global health concern affecting millions of people, with an estimated 7.5% of the population or 577 million people experiencing LBP in 2017 [1]. Furthermore, the condition was the leading cause of years lived with disability worldwide from 1990 to 2017 [1]. In Germany, LBP affects 59.4% of the population and results in decreased work performance and pain persistence, with an average cost of €1322 (US $1456.10) per patient per year [2-4]. Physical exercise and a healthy lifestyle are recommended by national and international guidelines for the management of nonspecific LBP (NSLBP) [3,4]. According to national guidelines, it should be emphasized that exercising does not cause harm but can help to alleviate symptoms in NSLBP [3]. In addition, an understanding of the biopsychosocial model of illness should be developed [3].

In this regard, several studies have shown promising evidence for the app, “Kaia Rückenschmerzen—Rückentraining für Zuhause,” which provides a multidisciplinary pain treatment approach [5-7]. The app is based on 3 principles: education, physical exercise, and mindfulness and relaxation techniques [6]. This approach might even be superior to conventional physiotherapy [6]. Furthermore, a systematic review of eHealth and mobile health (mHealth) interventions for the treatment of chronic pain showed significant efficacy for short- and medium-term outcomes in pain intensity and depression, as well as short-term reductions in pain catastrophizing [8]. Because these interventions are widely available and inexpensive for patients, the authors of the systematic review encourage clinicians to use eHealth and mHealth interventions as an adjunct to their therapy [8].

Various sources report an increasing shortage of physiotherapists in Germany [9-11]. At the same time, 88% of the population use smartphones [12]. Considering the prevailing lack of physiotherapists in Germany, mHealth and eHealth apps could be an addition to the management of patients with NSLBP [8]. Guideline-based apps could support therapy and help to close gaps in therapy or continue to support patients after they have completed physiotherapy.

In Germany, the use of digital health apps (Digitale Gesundheitsanwendungen [DiGA]) is regulated by the Digital Health Care Act (Digitale-Versorgung-Gesetz) [13]. Health apps that are certified as a medical device of risk class I or IIa can be included in the so-called DiGA directory after a review process by the Federal Institute for Drugs and Medical Devices (Bundesinstitut für Arzneimittel und Medizinprodukte). In the review process, data protection, consumer protection, user-friendliness, and medical efficacy must be demonstrated. Apps included in the DiGA directory can be reimbursed by health insurance companies after medical prescription [13]. The apps listed in the DiGA directory, which have thus been subjected to a review process, can be considered safe, especially considering the risk classification that has taken place. However, many other health apps are commercially available that have not been institutionally reviewed. Their quality can therefore be highly variable [13].

Objective

Existing systematic reviews have evaluated the quality of apps in Australia [14,15], Spain, and the United Kingdom [16]; therefore, the included apps were restricted to the English and Spanish languages. Evaluated using the Mobile Application Rating Scale (MARS) [17], apps with good quality are available from app stores in Australia, Spain, and the United Kingdom [14-16]. Most Australian apps follow the recommendations of the UK guideline on LBP and sciatica by the National Institute for Health and Care Excellence (NICE) [4,14,15]. It has been shown that in-store user ratings do not correlate with assessed quality [14,15]. Consequently, they are a poor indicator of app quality. To date, there is no comparable, objective analysis of the quality of apps in the German language that could help patients or clinicians to estimate the quality and guideline fidelity of the available apps. The objective of this assessment was to provide an overview of the availability and quality of apps for patients with NSLBP and to offer recommendations for clinicians in advising their patients with NSLBP.

Overview

The methods used in this study were adapted from the studies by Machado et al [14] and Didyk et al [15] and reported in accordance with the PRISMA (Preferred Reporting Items for Systematic Reviews and Meta-Analyses) statement [18]. A systematic search for smartphone apps on Apple iOS and Google Android app stores was conducted. To facilitate this process, we used the web scraping software Octoparse (version 8.5.2; Octoparse). All methods used were planned and publicly preregistered on the Open Science Framework Registries before the searches were carried out (osf.io/sq435).

Inclusion and Exclusion Criteria

Previous research has shown that higher app price correlates with higher quality [14,15]. Therefore, no price limit was applied when including apps, and wherever available, the “pro” or “premium” version was considered for inclusion. The included apps were required to be available to the public for download and use, so that they could be accessed directly by the public and physiotherapists. In line with the research objective, they had to be available in the German language. To ensure that the recommendations resulting from our research are as general as possible and applicable regardless of the device and operating system, the identified apps had to be available for both iOS and Android, as these are the most used smartphone systems in Germany [19]. Included apps had to be stand-alone and ready to use without accessories; the only exceptions were a gym mat and a resistance band. Apps had to be released or last updated no earlier than 2021 to ensure technical support and compatibility with current software and devices [14]. Apps were required to be targeted at patients and consumers and to be physically and mentally engaging, as recommended by the German national guideline (GNG) for NSLBP and the NICE guideline [3,4]. For physical participation, we counted all forms of physical exercise. For mental participation, we counted interventions that incorporated mental or spiritual aspects, similar to the category of mind-body exercises in the NICE guideline.

Apps were excluded if they were designed only for diagnostic purposes (eg, the detection of risk factors). Finally, apps that explicitly addressed specific forms of LBP (eg, pregnancy-related LBP) and apps for general health promotion that did not address NSLBP were excluded.

Search

The search was performed on July 6, 2022. German synonyms for back pain were used as search terms: “Rückenschmerzen,” “Rückenschmerz,” “Kreuzschmerzen,” and “Kreuzschmerz.” A single search was performed for each search term. No search filters were used in either store. The metadata about the apps were collected from the browser-based view of the apps in each app store. These included the app name, developer name, last update of the app, app rating in the store, description of the app, and URL to the app. Duplicates were identified using these data. Once these were removed, a list was created for each store that contained all the apps available for the terms used.
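The deduplication step described above can be illustrated with a minimal sketch. It assumes that each scraped record carries the app name, developer name, store, and URL, and that two records from the same store are duplicates when the normalized name and developer match. The record type, field names, and matching rule are hypothetical and are not taken from the study's actual workflow.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AppRecord:
    # Hypothetical record layout; the study collected more fields
    # (last update, store rating, description).
    name: str
    developer: str
    store: str
    url: str


def dedupe(records):
    """Keep the first record per (store, name, developer) key.

    Names are stripped and case-folded so that the same app returned
    by several search terms is counted once (assumed matching rule).
    """
    seen = set()
    unique = []
    for rec in records:
        key = (rec.store, rec.name.strip().casefold(),
               rec.developer.strip().casefold())
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def per_store(records):
    """Return one deduplicated list of apps per store."""
    lists = {}
    for rec in dedupe(records):
        lists.setdefault(rec.store, []).append(rec)
    return lists
```

Running `per_store` on the combined results of all 4 search terms would yield one list per store, which corresponds to the input of the first screening phase.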

Screening

The screening process can be divided into three phases: (1) identification of apps available in both stores; (2) screening of app names and descriptions from the stores according to the inclusion and exclusion criteria, analogous to abstract screening [20]; and (3) screening of apps after installation. Apps that met the criteria and those for which it remained unclear whether they would meet the criteria were installed during the third screening phase. This screening was conducted by one rater (LU) according to the inclusion and exclusion criteria. A table of the apps screened and excluded in screening phases 1 and 2 can be found in Multimedia Appendix 1. After installation on an iPhone SE (2020 model; Apple Inc), the apps were used and examined for at least 10 minutes. If a criterion was rated as “Unclear,” it was discussed with additional raters (PT and AS) until a consensus was reached to include or exclude the app.

Outcome Measures

Apps included in this study were assessed for evidence according to guidelines on the treatment of LBP and for quality using the MARS [3,4,17].

To assess the consistency with guidelines, a checklist was created based on GNG chapters 4.1 (Principles of nonspecific low back pain therapy) and 4.2 (Management of nonspecific low back pain) [3]. This resulted in a list of 16 items, each containing 1 recommendation. The checklist, with the original German items and English translations, is presented in Multimedia Appendix 2. To evaluate apps, the question “Does the app meet the recommendation?” was asked for each recommendation or item. This could be answered with the response categories “Yes,” “No,” and “Unclear.” In addition, the exercises used in the apps were classified according to the classification of exercises used in the UK NICE guideline. These categories were “biomechanical exercise” (BE), “aerobic exercise,” “mind-body exercise,” and “mixed modality exercise.” Apps had to use at least one exercise that could be assigned to one of these categories.

App quality was assessed using the MARS tool [17]. It contains 23 items divided into 5 categories: 4 categories with objective quality criteria (“section A–engagement,” “section B–functionality,” “section C–aesthetics,” and “section D–information quality”) and 1 category with subjective quality criteria. Each item was rated on a 5-point scale (1=inadequate to 5=excellent). A full description of the categories is provided elsewhere [17]. The MARS tool has demonstrated excellent internal consistency (Cronbach α=.90) and interrater reliability (intraclass correlation coefficient [ICC]=0.79) [17]. Moreover, the tool has been used in methodologically related work [14-16]. Both raters (LU and PT) were trained using the MARS training video and proceeded according to it [21].
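The scoring logic can be sketched as follows. The sketch assumes the item-to-section layout of the original MARS publication (items 1-5 engagement, 6-9 functionality, 10-12 aesthetics, 13-19 information quality) and computes the app quality mean score as the mean of the 4 objective section means. Treating a rating of `None` as "not applicable" is an assumption of this sketch, as are all function and variable names.

```python
from statistics import mean

# Assumed item layout of the 19 objective MARS items (per the original
# MARS publication); the 4 subjective items are not scored here.
MARS_SECTIONS = {
    "A_engagement": range(1, 6),       # items 1-5
    "B_functionality": range(6, 10),   # items 6-9
    "C_aesthetics": range(10, 13),     # items 10-12
    "D_information": range(13, 20),    # items 13-19
}


def mars_quality_score(ratings):
    """Return (section means, overall mean) for one app.

    ratings maps item number (1-19) to a rating from 1 to 5,
    or to None when the item is not applicable (assumption).
    """
    section_means = {}
    for section, items in MARS_SECTIONS.items():
        values = [ratings[i] for i in items if ratings.get(i) is not None]
        if not values:
            raise ValueError(f"no rated items in {section}")
        if any(not 1 <= v <= 5 for v in values):
            raise ValueError("ratings must lie between 1 and 5")
        section_means[section] = mean(values)
    # Overall quality score: mean of the 4 section means.
    overall = mean(list(section_means.values()))
    return section_means, overall
```

Averaging the section means, rather than all 19 items at once, gives each section equal weight regardless of how many items it contains.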

To control for potential inaccuracies and check for reliability, around 30% (3/8) of the apps were rated by a second rater (PT) who used both instruments (MARS and guideline checklist). This approach follows the example of Machado et al [14], who checked a similar percentage of the MARS ratings. The second rater was trained by the first rater in the process and the use of the assessment instruments and also installed the apps on an iPhone SE (2020 model; Apple Inc). The control apps were randomly selected. For this purpose, a third person (AS), who was not involved in the evaluation process at the time, received a list of the included apps and created a randomization sequence using Research Randomizer [22]. As far as possible, the first and second raters reached a consensus on the identified differences between ratings. Where no consensus could be reached by the 2 raters, a third rater (AS) was consulted for a final verdict.

Data Analysis and Synthesis

App name, developer, models available, model used, date of last update or release, MARS quality mean score, and classification of exercises were compiled. The classification of exercises according to the NICE guideline is presented without further analysis. The results from the GNG checklist are presented with descriptive statistics (mean, median, SD, and range). Only the objective items 1 to 19 of the MARS were evaluated, as they are needed to calculate the app quality mean score [17]. In addition, an overview of the app characteristics is provided.

The agreement of the raters on the checklist was calculated using Cohen κ [23]. For this purpose, each item on the GNG checklist was considered as a case that raters could answer with “Yes,” “No,” or “Unclear.” For agreement in the classifications according to the NICE guideline, each exercise class was considered a case in which raters could accept or reject each app. To calculate the interrater reliability of the MARS, the ICC was used as a 2-way mixed model with average measures and absolute agreement [24]. The mean values of the sections were used. SPSS (version 27; IBM Corp) was used to calculate the ICC. PSPP (version 3.0, 2007; GNU Project) was used for all other calculations.
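For illustration, unweighted Cohen κ for 2 raters can be computed as in the following sketch. It is a plain-Python sketch of the standard formula, κ = (p_o − p_e) / (1 − p_e), and does not reproduce the PSPP procedure used in the study. The function name and the example data in the usage note are hypothetical.

```python
from collections import Counter


def cohen_kappa(rater1, rater2):
    """Unweighted Cohen kappa for two raters over the same cases.

    Each argument is a sequence of category labels, for example
    "Yes", "No", or "Unclear", one label per rated case.
    """
    if len(rater1) != len(rater2) or not rater1:
        raise ValueError("both raters must rate the same non-empty cases")
    n = len(rater1)
    # Observed agreement: share of cases with identical labels.
    p_o = sum(a == b for a, b in zip(rater1, rater2)) / n
    # Chance agreement from the marginal label frequencies of each rater.
    c1, c2 = Counter(rater1), Counter(rater2)
    p_e = sum(c1[k] * c2[k] for k in c1.keys() | c2.keys()) / n ** 2
    if p_e == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (p_o - p_e) / (1 - p_e)
```

With the hypothetical ratings `["Yes", "Yes", "No", "No"]` and `["Yes", "No", "No", "No"]`, the observed agreement is 0.75 and the chance agreement is 0.5, giving κ=0.5. In the design described above, 3 double-rated apps with 16 checklist items each would yield 48 cases per rater.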

Ethical Considerations

Ethical principles must be considered for medical research involving human subjects, including research on identifiable human material and data, according to Article 1 of the Preamble of the Declaration of Helsinki. As no patients were examined in this systematic assessment and only apps and data not requiring data protection were collected, no ethics vote is necessary according to the Declaration of Helsinki [25].

App Selection

A total of 20 apps available in both stores were identified. After the initial screening for the inclusion and exclusion criteria based on the name and description in the stores, 5 apps were excluded because they received their last update before 2021. After the second screening of the remaining 15 apps, a further 7 apps were excluded. Eight apps were included in the assessment. The screening process is depicted in a flowchart diagram based on the PRISMA statement (Figure 1) [18].

**Figure 1.** Flowchart of the selection of apps. *Multiple criteria applicable.

App Characteristics

We identified a series of characteristic elements that were commonly used, in varying combinations, in the assessed apps. Six apps delivered some form of educational content. One app used training videos to deliver the provided exercises. Three apps created individual exercise programs based on their algorithm. Seven apps provided an exercise tracking feature. Four apps suggested prefabricated workout plans, which were customizable in 3 of these apps. One app provided motion detection of the exercising person using the camera of the smartphone. The characteristic elements and different combinations used for each app are listed in Table 1.

Table 1. Summary of characteristics of included apps.

App name (iOS; version)	App name (Android)	Developer	Used version	Published or last update at the time of assessment	Characteristics	MARS^a score, mean (SD)	Classification of exercises according to NICE^b
Dein Rückentraining (3.0)	Dein ganzheitliches Rückentraining	EBL Media Production OG	Purchase upon download (€24.99^c)	July 10, 2021	Education and train-along videos	2.7 (0.29)	BE^d and MBE^e
ViViRA bei Rückenschmerzen (2.41.0)	ViViRA bei Rückenschmerzen	Vivira Health Lab GmbH	Monthly subscription (€79.99)	July 6, 2022	Artificial intelligence–based program with education and tracked exercises	4.2 (0.19)	BE
Rückenschmerzen Übungen (1.0.99)	Rückenschmerzen Übungen	Vladimir Ratsev	“Pro”-version via in-app purchase (€2.99)	August 16, 2022	Exercise tracker with prefabricated workout plans and educational aspects	3.1 (0.59)	BE
ratiopharm Rückenschule (2.2.5)	ratiopharm Rückenschule für einen starken Rücken	ratiopharm GmbH	Free	October 13, 2021	Exercise tracker with prefabricated workout plans and educational aspects	3.1 (0.69)	BE
eCovery: Rücken, Hüfte & Knie (2.2.12)	eCovery: Rücken, Hüfte & Knie	eCovery GmbH	Free trial for 3 weeks	July 6, 2022	Artificial intelligence–based program with education and tracked exercises	4.1 (0.45)	BE and MBE
heyvie: Migräne & Resilienz (2.4.2)	heyvie: Resilienz & Migräne	HAIVE UG (haftungsbeschränkt)	Monthly subscription “Pro” (€9.99)	July 6, 2022	Artificial intelligence–based program with education and tracked exercises	4.2 (0.54)	BE and MBE
Rückentraining Gerade Haltung (1.2.1)	Rückentraining&Gerade Haltung	Nexoft Yazilim Limited Sirketi	Monthly subscription “Mitgliedschaft” (€4.49)	April 15, 2022	Exercise tracker with prefabricated plans	3.3 (0.25)	BE, AE^f, and MME^g
AmbiCoach (1.1.27)	Dein Rückentraining: AmbiCoach	AmbiGate GmbH	Monthly subscription “Premium” (€49.99)	October 4, 2021	Exercise tracker with prefabricated plans and motion detection	3.6 (0.36)	BE

^aMARS: Mobile Application Rating Scale.

^bNICE: National Institute for Health and Care Excellence.

^cA currency exchange rate of €1=US $1.02 is applicable.

^dBE: biomechanical exercise.

^eMBE: mind-body exercises.

^fAE: aerobic exercises.

^gMME: mixed modality exercises.

Consistency With Guidelines

All the included apps met some recommendations of the GNG. Per app, the mean number of items with the response “Yes” was 5.75 (SD 2.71), the mean number with the response “No” was 8.0 (SD 4.72), and the mean number with the response “Unclear” was 2.25 (SD 3.11). “Yes” was the most frequent response for item 14 (7/8, 88%), followed by item 2 (6/8, 75%) and then by items 8 and 13 (5/8, 62% each). Items 6, 11, and 15 never received the response “Yes.” Every item received the response “No” at least once. Items 2, 8, and 14 never received the response “Unclear.” Item 11 received the response “Unclear” most frequently (3/8, 38%). The results by item are shown in Table 2.
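The percentages reported with each count out of 8 apps can be reproduced with a few lines of Python. This is a minimal sketch, not the authors' procedure: the helper name `percent` is hypothetical, and the rounding rule (exact halves go to the nearest even integer) is an inference from the reported values, not something stated in the paper.

```python
# Minimal sketch: whole-number percentages for counts out of 8 apps.
# Assumption: halves are rounded to the nearest even integer, which is
# what Python's built-in round() does for exactly representable halves.
N_APPS = 8


def percent(count, total=N_APPS):
    """Return count/total as a whole-number percentage."""
    return round(100 * count / total)


for count in (1, 3, 5, 7):
    print(f"{count}/{N_APPS} -> {percent(count)}%")
```

Under this rule, 5/8 (62.5%) is reported as 62%, whereas 3/8 (37.5%) is reported as 38%, which matches the values given in the text.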

Table 2. German national guideline checklist items in the included apps (outcomes in total and per app).

Item	Yes, n (%)^a	No, n (%)^b	Unclear, n (%)^c	Dein Rückentraining	ViViRA bei Rückenschmerzen	Rückenschmerzen Übungen	ratiopharm Rückenschule	eCovery: Rücken, Hüfte & Knie	heyvie: Migräne & Resilienz	Rückentraining Gerade Haltung	AmbiCoach
1. Functional status	2 (25)	5 (62)	1 (12)	No	Yes	No	No	Yes	Unclear	No	No
2. Patient preferences	6 (75)	2 (25)	0 (0)	No	Yes	Yes	Yes	No	Yes	Yes	Yes
3. Physical activity is safe	4 (50)	2 (25)	2 (25)	Yes	Yes	Yes	Unclear	Yes	Unclear	No	No
4. Health-conscious behavior	3 (38)	4 (50)	1 (12)	Yes	Yes	No	Yes	Unclear	No	No	No
5. Promote understanding	3 (38)	4 (50)	1 (12)	Yes	Yes	No	No	Unclear	Yes	No	No
6. Education on healthy lifestyle	0 (0)	6 (75)	2 (25)	No	Unclear	No	No	Unclear	No	No	No
7. Maintaining activities	2 (25)	5 (62)	1 (12)	No	Yes	No	Yes	Unclear	No	No	No
8. Strength and endurance	5 (62)	3 (38)	0 (0)	Yes	Yes	Yes	Yes	Yes	No	No	No
9. Importance of activity	4 (50)	3 (38)	1 (12)	Yes	Yes	Unclear	No	Yes	Yes	No	No
10. Loading and resting	2 (25)	5 (62)	1 (12)	No	Yes	No	Yes	Unclear	No	No	No
11. Performance and pain	0 (0)	5 (62)	3 (38)	No	Unclear	No	No	Unclear	Unclear	No	No
12. Appropriate activities	2 (25)	5 (62)	1 (12)	No	Unclear	No	No	Yes	No	Yes	No
13. Iatrogenic fixations	5 (62)	2 (25)	1 (12)	Yes	Yes	Yes	Yes	Unclear	Yes	No	No
14. Preventing passive role	7 (88)	1 (12)	0 (0)	Yes	Yes	Yes	No	Yes	Yes	Yes	Yes
15. Positive prognosis	0 (0)	6 (75)	2 (25)	No	No	No	No	Unclear	Unclear	No	No
16. Problematic patterns	1 (12)	6 (75)	1 (12)	No	No	No	No	Unclear	Yes	No	No

^aMean 5.75, SD 2.71; median 6.0; range 2.0-11.0.

^bMean 8.0, SD 4.72; median 9.0; range 1.0-14.0.

^cMean 2.25, SD 3.11; median 1.0; range 0.0-9.0.

The app “ViViRA bei Rückenschmerzen” met the most recommendations (“Yes”: 11/16, 69%; “No”: 2/16, 12%; and “Unclear”: 3/16, 19%). For the app “eCovery: Rücken, Hüfte & Knie,” the most frequent response to recommendations was “Unclear” (“Yes”: 6/16, 38%; “No”: 1/16, 6%; and “Unclear”: 9/16, 56%). The app “AmbiCoach” met the fewest recommendations (“Yes”: 2/16, 12%; “No”: 14/16, 88%). The results by app are shown in Table 2.
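The descriptive statistics for the “Yes” responses can be checked against the per-app columns of Table 2. The following is a minimal sketch using Python's standard library. The per-app counts were tallied by hand from Table 2, and the use of the sample SD (n - 1 denominator) is an assumption, as the paper does not state which formula was used.

```python
from statistics import mean, median, stdev

# Number of "Yes" responses per app, tallied from the columns of Table 2.
yes_per_app = {
    "Dein Rückentraining": 7,
    "ViViRA bei Rückenschmerzen": 11,
    "Rückenschmerzen Übungen": 5,
    "ratiopharm Rückenschule": 6,
    "eCovery: Rücken, Hüfte & Knie": 6,
    "heyvie: Migräne & Resilienz": 6,
    "Rückentraining Gerade Haltung": 3,
    "AmbiCoach": 2,
}

counts = list(yes_per_app.values())
summary = {
    "mean": round(mean(counts), 2),
    "sd": round(stdev(counts), 2),  # sample SD (n - 1); an assumption
    "median": median(counts),
    "range": (min(counts), max(counts)),
}
print(summary)
```

With these counts, the sketch gives a mean of 5.75, an SD of 2.71, a median of 6, and a range of 2 to 11, in line with the values in the footnote to the “Yes” column of Table 2.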

All apps contained at least one form of exercise according to the NICE guideline [4]. All apps contained BE. Three apps also contained mind-body exercise. One app contained aerobic exercise and mixed modality exercise, in addition to BE. The forms of exercise contained in each app are listed in Table 1.

App Quality

The overall mean MARS score for all apps included in the assessment was 3.61 (SD 0.55). “Section A–Engagement” surveyed whether apps were fun, engaging, and customizable in ways that increase users’ engagement. The overall mean score for this section was 3.65 (SD 0.72). “Section B–Functionality” surveyed the functionality of the apps in terms of usability, navigation, logical structure, and motor-gestural handling. The overall mean score for this section was 3.81 (SD 0.54), which was the highest mean score obtained. “Section C–Aesthetics” surveyed the esthetics of the apps in terms of graphic design, visual stimuli, color design, and stylistic unity. The overall mean score for this section was 3.5 (SD 0.95). “Section D–Information” surveyed the quality of the information provided by the apps. The overall mean score for this section was 3.43 (SD 0.41), which was the lowest mean score obtained. The MARS scores for each section are shown in Figure 2.

**Figure 2.** Mobile Application Rating Scale (MARS) scores per section and mean scores across all apps. Error bars indicate the SD of the mean values.