Published on 30.05.17 in Vol 5, No 5 (2017): May

    Original Paper

    Human-Centered Design Study: Enhancing the Usability of a Mobile Phone App in an Integrated Falls Risk Detection System for Use by Older Adult Users

    1NUI Galway, Electrical and Electronic Engineering, School of Engineering & Informatics, Galway, Ireland

    2CÚRAM SFI Centre for Research in Medical Devices, Human Movement Laboratory, NUI Galway, Galway, Ireland

    3NUI Galway, Physiology, School of Medicine, Galway, Ireland

    4NUI Galway, General Practice, School of Medicine, Galway, Ireland

    5Consorci Sanitari del Garraf, Clinical Research Unit, Vilanova i la Geltrú, Barcelona, Spain

    6CACP Center for Advanced Communications Policy Georgia Institute of Technology, North Avenue NW, GA 30332, Atlanta, GA, United States

    7NUI Galway, Irish Centre for Social Gerontology, Institute for Lifecourse and Society, Galway, Ireland

    Corresponding Author:

    Leo R Quinlan, BSc, PhD

    NUI Galway

    Physiology

    School of Medicine

    University Road

    Galway,

    Ireland

    Phone: 353 91 493710

    Fax: 353 91 494544

    Email:


    ABSTRACT

    Background: Design processes such as human-centered design (HCD), which involve the end user throughout the product development and testing process, can be crucial in ensuring that the product meets the needs and capabilities of the user, particularly in terms of safety and user experience. The structured and iterative nature of HCD can often conflict with the necessary rapid product development life-cycles associated with the competitive connected health industry.

    Objective: The aim of this study was to apply a structured HCD methodology to the development of a smartphone app that was to be used within a connected health fall risk detection system. Our methodology utilizes so-called discount usability engineering techniques to minimize the burden on resources during development and maintain a rapid pace of development. This study will provide prospective designers with a detailed description of the application of an HCD methodology.

    Methods: A 3-phase methodology was applied. In the first phase, a descriptive “use case” was developed by the system designers and analyzed by both expert stakeholders and end users. The use case described the use of the app and how various actors would interact with it and in what context. A working app prototype and a user manual were then developed based on this feedback and were subjected to a rigorous usability inspection. Further changes were made both to the interface and support documentation. The now advanced prototype was exposed to user testing by end users where further design recommendations were made.

    Results: Combined expert and end-user analysis of a comprehensive use case originally identified 21 problems with the system interface. Only 3 of these problems were observed in user testing, implying that 18 problems were eliminated between phases 1 and 3. Satisfactory ratings were obtained during validation testing by both experts and end users, and final testing by users shows the system requires low mental, physical, and temporal demands according to the NASA Task Load Index (NASA-TLX).

    Conclusions: From our observation of older adults’ interactions with smartphone interfaces, there were some recurring themes. Clear and relevant feedback as the user attempts to complete a task is critical. Feedback should include pop-ups, sound tones, color or texture changes, or icon changes to indicate that a function has been completed successfully, such as for the connection sequence. For text feedback, clear and unambiguous language should be used so as not to create anxiety, particularly when it comes to saving data. Warning tones or symbols, such as caution symbols or shrill tones, should only be used if absolutely necessary. Our HCD methodology, designed and implemented based on the principles of the International Organization for Standardization (ISO) 9241-210 standard, produced a functional app interface within a short production cycle, which is now suitable for use by older adults in long-term clinical trials.

    JMIR Mhealth Uhealth 2017;5(5):e71

    doi:10.2196/mhealth.7046

    KEYWORDS



    Introduction

    Utilizing a human-centered design (HCD) approach, such as that outlined in the International Organization for Standardization (ISO) 9241-210 standard [1], during the design of connected health devices ensures that the needs and requirements of the user are taken into consideration throughout the design process. HCD is a multi-stage process that allows for various iterations of a design and subsequent updates to the requirements. The importance of involving end users in the design process of health products is recognized, and different approaches have been demonstrated in the literature [2-8]. In this paper, we present the implementation of a structured HCD methodology, based on ISO 9241-210, which utilized standard, established techniques to assess and develop the usability and human factors of a smartphone interface with the full involvement of end users and stakeholders. The smartphone interface that was developed and tested is a component of the wireless insole for independent and safe elderly living (WIISEL) system, a system designed to continuously assess fall risk by measuring gait and balance parameters associated with fall risk. The system is also designed to detect falls. The architecture of the system is illustrated in Figure 1. It is proposed that the system can be worn at home by a user for a period of time in order to identify specific gait and balance patterns that may be affecting a user’s fall risk. The system is targeted at older adults who represent a high fall risk group. The system consists of a pair of instrumented insoles and a smartphone that are worn by the user. Data collected by embedded sensors in the insoles are sent to the smartphone, where they are then uploaded to a server in a clinic for processing and analysis. The smartphone represents a major interface in the system, as the home user will primarily interact with the WIISEL system through the WIISEL app, which allows the user to check the system status, sync with the insoles, send data to their local clinic, and monitor their daily activity.

    Figure 1. The wireless insole for independent and safe elderly living (WIISEL) system.

    The acquisition and comprehension of information from interfaces can become more difficult as a person progresses into older age. Interfaces in electronic health or medical apps can often be crowded with text and characters, have poor contrast, contain many different colors, and may not present adequate haptic or audio feedback. In terms of visual perception, age-related declines in acuity, contrast sensitivity, and ability to discriminate colors can affect reading rates, character and symbol identification, and button striking accuracy, even with optimal corrections in place [9]. Age-related cognitive decline in domains such as reasoning and memory can affect the ability of the user to comprehend the process they are perceiving on the interface [10]. Deterioration of psychomotor processes such as fine motor control and dexterity can cause problems for users attempting to interact with the physical hardware of the interface [4]. Typically, between the ages of 60 and 80 years, individuals can expect up to a 50% decline in visual acuity (particularly in low luminance, low contrast, and glare environments), a reduction in hearing sensitivity of 20 dB, a 14% decline in short-term memory, and a 30% decline in power grip strength, all of which impact how one interacts with computer interfaces [11]. In addition to these physical considerations, older adults can also present a complex user group in terms of attitude toward and previous experience with technology [11].


    Methods

    A 3-phase HCD methodology was utilized to enhance the usability and user experience of the smartphone app. This methodology was previously described by Harte et al [12].

    Phase 1

    Use Case Development

    The use case document outlined 7 scenarios where the user must directly interact with the smartphone interface. These scenarios were (1) the user logs in to the app, (2) the user syncs the app to the insoles, (3) the user checks the system status, (4) the user uploads the data, (5) the user minimizes the app, (6) the user resets the app, and (7) the user triggers a fall alarm. The use case, which was termed paper prototype version 1, was presented to 2 groups of stakeholders for structured analysis in order to elicit their feedback [7,13,14].

    Expert Use Case Analysis

    A total of 10 experts were selected to analyze the use case. The experts were selected from National University of Ireland (NUI), Galway based on their involvement with work related to the use of technology by older adults. We sought multi-disciplinary perspectives, as advised in ISO 9241-210, and therefore the group consisted of nurses, occupational therapists, physiotherapists, general practitioners, gerontologists, and engineers. Table 1 lists the precise expertise of each expert, as well as a self-reported measure of their knowledge of (1) usability and human factors and how it can influence technology use; (2) the end user, their capabilities, and their preferences for technology; and (3) connected health devices that are used in the home.

    In addition to filling out the Likert statements at the end of each scenario, the expert was instructed to engage in a think-aloud protocol as they walked through each scenario [15]. All feedback was captured by an audio recorder.

    End User Representatives Use Case Analysis

    A total of 12 older adults were recruited using a typical purposive sample (Inclusion: age 65+ years, community dwelling; Exclusion: profound hearing or vision loss, psychiatric morbidities, and severe neurological impairments) to analyze the use case. The same protocol and interview structure were used to expose the use case document to the older adults, and the analysis was carried out in the home of the participant. Ethical approval to carry out the interviews and assessments was granted by the University Hospital Galway (UHG) research ethics committee. For this analysis, we sought to measure, where applicable, the capabilities a user would call upon to successfully use an interface, so that we could be satisfied that test participants were representative of the target end-user population.

    Table 1. Experts involved in use case analysis. Each of the experts was asked to mark out of 10 where they felt their own expertise of usability, the end user, and connected health lay.

    We measured the cognitive and visual capabilities of each user; the components of the processes measured are illustrated in Figure 2.

    We used a short battery of standardized tests to measure each of the capabilities presented in Figure 2. The tests and their relevance to the analysis are listed in Table 2.

    High contrast acuity (HCA) was measured using a Snellen chart at a distance of 3 m. Low contrast acuity (LCA) was measured for 5% and 25% contrast using SLOAN letter charts at a distance of 3 m. Standardized illumination was provided for these 2 tests using a light box from Precision Vision (precision-vision.com). Contrast sensitivity (CS) was measured using a MARS chart at a distance of 40 cm, whereas low contrast acuity in low luminance (LCALL) was measured with a SKI chart at a distance of 40 cm. Color discrimination (CD) was measured using a Farnsworth D-15 test. Reading acuity (RA) was measured using a Jaeger chart at a distance of 40 cm. Each participant also completed 2 cognitive performance tests based on the Whitehall study [22]. Spatial reasoning was assessed using the Alice Heim 4-I (AH4-I). The AH4-I tests inductive reasoning, measuring one’s ability to identify patterns and to infer principles and rules [24]. Short-term memory was assessed with a 20-word free recall test. The expected values of each test per age group and the actual measured values can be found in Tables 3 and 4.

    Figure 2. Physiological capabilities required to interact with use case.
    Table 2. Battery of tests.
    Table 3. Average visual performance metrics measured and split by age group. The average is compared with the expected score for that age group. Data presented in each column as expected or measured
    Table 4. Expected scores and mean measured scores for cognitive tests for all 12 participants. The average is compared with the expected score for that age group. Data presented in each column.
    Identification and Categorisation of Usability Problems

    The audio feedback acquired during the analysis of the use case document by the experts and end users was “intelligently” transcribed [25], and clearly defined usability problems were extracted from the transcript. All of the problems identified by each expert and end user were collated for each scenario. All problems were documented and illustrated in a structured usability and human factors problems report [26] and were accompanied by selected testimony from a corresponding expert or end user who elaborated on the nature of the problem for the benefit of the design team. This report was analyzed by system designers, who provided potential solutions to each problem where possible.

    Phase 2

    In response to the feedback from phase 1, a new paper prototype was developed (paper prototype version 2) and made available for expert inspection. A working version of the app with accompanying user manuals was also developed on a Google Nexus 5 smartphone (working prototype version 1) and made available for expert walkthrough. We returned to the original experts and carried out a 2-part usability inspection. First, the experts inspected the solutions to the problems they had identified in phase 1 using a new version of the use case (paper prototype version 2) as a guide. This use case only presented the problems that the experts identified in their original analysis and showed how the problems had been addressed. Second, they inspected the prototype app (working prototype version 1) utilizing a cognitive and contextual walkthrough methodology.

    Phase 3

    The new manuals and updated interface (working prototype version 2) were exposed to 10 of the older adults who had previously analyzed the use case (2 of the 12 subjects who had originally analyzed the use case were unavailable for phase 3 testing). For each task, we measured the time taken to complete it and the number of errors made; the after scenario questionnaire (ASQ) and the NASA Task Load Index (NASA-TLX) were then administered to the participant after the task was completed. The ASQ is a Likert scale that interrogates a user’s perception of efficiency, ease of use, and satisfaction with manual support [27]. The NASA-TLX is a multi-dimensional rating procedure that provides an overall workload score based on a weighted average of ratings on 6 subscales: (1) mental demands, (2) physical demands, (3) temporal demands, (4) own performance, (5) effort, and (6) frustration [28].
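
    The weighted average behind the overall NASA-TLX score can be sketched as follows. This is a minimal illustration, assuming the conventional scoring in which each subscale is rated from 0 to 100 and the weights are tallies from 15 pairwise comparisons of the subscales; the function name and all ratings and weights shown are hypothetical and are not data from this study.

```python
# Minimal sketch of NASA-TLX weighted scoring (assumed conventional procedure).
SUBSCALES = ("mental", "physical", "temporal", "performance", "effort", "frustration")


def nasa_tlx_overall(ratings, weights):
    """Return the weighted overall workload score on a 0-100 scale.

    ratings: subscale name -> rating between 0 and 100
    weights: subscale name -> number of pairwise comparisons "won" (sums to 15)
    """
    if set(ratings) != set(SUBSCALES) or set(weights) != set(SUBSCALES):
        raise ValueError("ratings and weights must cover exactly the 6 subscales")
    if sum(weights.values()) != 15:
        raise ValueError("weights must sum to 15 (one per pairwise comparison)")
    if any(not 0 <= ratings[s] <= 100 for s in SUBSCALES):
        raise ValueError("ratings must lie between 0 and 100")
    return sum(ratings[s] * weights[s] for s in SUBSCALES) / 15


# Hypothetical participant data for one scenario (illustration only).
example_ratings = {"mental": 30, "physical": 10, "temporal": 20,
                   "performance": 15, "effort": 25, "frustration": 5}
example_weights = {"mental": 4, "physical": 1, "temporal": 3,
                   "performance": 2, "effort": 4, "frustration": 1}
example_score = nasa_tlx_overall(example_ratings, example_weights)
```

    Under this scoring, a subscale that the participant never selects in the pairwise comparisons receives a weight of 0 and does not contribute to the overall score.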


    Results

    This section presents the summary of results from each phase, as well as the changes made to the interface and support documentation after each phase.

    Phase 1: Use Case Analysis (Paper Prototype Version 1)

    The combined expert analysis and end user analysis identified 21 problems. We have provided 13 examples of problems, which are presented in Table 5. These 13 problems were chosen for illustration because they represent unique problems; the other 8 problems were considered repetitions or derivatives of these 13 and are therefore not described. The problem ID number assigned to each problem was used for the remainder of the design process to allow for easier problem tracking.

    The problems from Table 5 are presented in Table 6 in order of severity rating based on the mean Likert scores assigned by the experts. The maximum individual score that was given by the 10 experts is also included to highlight the fact that some experts may have given a more severe rating than what the mean or standard deviation indicates. The heuristic category to which each problem belongs is also included.
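
    The ranking described above can be sketched in a few lines. This is a minimal illustration, assuming each expert assigns one severity rating per problem on a 0-4 scale (the range given for the Likert items in Table 15); the function names and all ratings below are hypothetical and are not the study data.

```python
from statistics import mean, stdev


def summarize_problem(problem_id, ratings):
    """Summarize one problem's expert ratings as mean, standard deviation, and maximum."""
    return {
        "id": problem_id,
        "mean": mean(ratings),
        "sd": stdev(ratings),   # sample standard deviation; needs at least 2 ratings
        "max": max(ratings),    # most severe individual rating
    }


def rank_by_severity(ratings_by_problem):
    """Order problems from most to least severe by mean rating."""
    rows = [summarize_problem(pid, r) for pid, r in ratings_by_problem.items()]
    return sorted(rows, key=lambda row: row["mean"], reverse=True)


# Hypothetical ratings from 4 experts for 3 problem IDs (illustration only).
example_ratings = {2: [4, 3, 4, 3], 6: [1, 0, 4, 1], 9: [2, 2, 2, 2]}
example_ranking = rank_by_severity(example_ratings)
```

    In this hypothetical data, problem 6 ranks last by mean rating even though one expert gave it the maximum severity, which is why reporting the maximum individual score alongside the mean can be informative.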

    Table 5. List of identified problems and which use case scenario it was identified in.
    Table 6. Problems uncovered by experts and rated based on mean Likert scores.
    Table 7. Problems uncovered by end users and rated based on mean Likert scores.

    The older adult end user analysis found 14 problems, all of which were problems that had been identified by the expert group (the same problem ID number is used). Of the 13 problems listed in Table 6, 9 were uncovered by end users. These are presented in Table 7 in order of severity (as in Table 6).

    Testimony from experts and users alike was used to provide insight into the problems and help designers better understand them. Themes were sought from the transcripts to uncover which characteristics of the interface experts and users most commonly found problematic. For example, regarding the login sequence for the smartphone app:

    If not absolutely necessary this sequence should be removed from the use of the phone. At the very least it should be made sure that this only needs to be carried out by the clinician in the clinic once.
    Maybe a voice password could be used or simply a pin number that only requires numerical values and does not require an email address.

    Insufficient screen feedback and prompts for the user when carrying out certain tasks was identified as a recurring theme:

    There should be a prompt to upload the data. When he (the user) presses the back button it should prompt the user that the data is about to be uploaded. The warning sign on the Exit pop-up box will cause anxiety and should be avoided.
    I suggest that the interface should have one indicator saying if everything is working OK and if not, the interface should say specifically what the issue is.
    The battery icon needs to change colour/shape when it is decreasing. There needs to be a message which appears on the screen telling the user to initiate this (connection) sequence (PLEASE PRESS HERE TO ATTEMPT CONNECTION) and an indicator on the screen should tell them where to press.
    [Recommended by Expert 8]

    The size of screen elements such as icons, buttons, and text was identified as being problematic:

    (Made in reference to the pop up boxes in particular, for example “Invalid mail or Password” during login,) the screen needs to be utilised better, pop up boxes need to be bigger and more prominent.
    There is no reason why the large screen space could not be utilised more effectively for these buttons (referring to exit pop-up buttons).
    [Expert 1]
    This (referring to an icon in top left hand corner to show that the app is running) is a good idea, but it is just too small for older adult users.

    The results of the expert analysis and the end user analysis were compiled separately and then presented in a problem report for system developers, with all problems listed with severity ratings and related testimony. The developers returned proposals on how each problem could be solved, which were then reviewed by the usability engineering team. Examples of proposals that were accepted by the usability engineers are shown in Table 8.

    Not all identified problems could be easily fixed by the system developers. Some aspects of the interface were built into the Android operating system (OS) and therefore could not be changed, whereas some problems could not be solved within the time constraints of the project. Where it was clear that the developer could not effectively address a problem through interface changes, the usability team proposed an alternative through which the problem severity could be at least reduced, if not completely eliminated. Some examples of these problems are shown in Table 9.

    Table 8. Problems that were directly addressed by system developers.
    Table 9. Problems that could not be directly addressed by system developers and which in turn had a proposed solution by the usability team.
    Update of Paper Prototype and Development of First Working Prototype

    Based on this communication between the development team and the usability engineers, a working app prototype for the Google Nexus 5 smartphone was developed as well as a full set of user manuals based on the use cases and the feedback from the use case analyses. The use case was also updated to reflect the changes to the interfaces. Figures 3 and 4 show examples of how the updated interface (paper prototype version 2) compares with the paper prototype version 1. In Figure 3, we see how color indicators have been introduced to enhance the feedback on the system status screen. Text size has been increased and some elements have been removed from the interface to reduce crowding. Figure 4 shows how the login screen has been updated with a decrypted password as well as increased text size and button size.

    Figure 3. (a) The old interface showing the system status. Experts did not like the dull colors and crowded interface. Some users did not like the fact that there was no change of colors to indicate low battery, weak signal etc; (b) The updated interface with color indicators for connection, signal strength, and battery life, as well as increased text size and contrast.
    Figure 4. (a) Experts were concerned with the small button size and the fact that the password was encrypted meaning an older adult might lose their place when typing. This problem was also identified by end users; (b) Increased text size and a larger, more prominent sign in button as well as a decrypted password.
    Figure 5. One side of the basic instruction sheet (short form manual) describing the connection and uploading sequences.

    Where the problems identified by the experts could not be addressed by an interface change, user manuals were created to offset any confusion or difficulties the user might encounter with the interface. In order to create an effective user manual, the original use case was updated with all the interface changes made by designers. Each use case scenario now became a section of the user manual with the same chronological order maintained where applicable. For example, the use case scenario where the user connects to the insoles became a “how to connect” section in the user manual and was followed by a “how to upload” section, as in the use case. Two forms of manual were created: a short form manual entitled the “basic instruction sheet,” which contained basic instructions on a double-sided laminated sheet, and a longer form manual laid out in a similar style to the use case, which elaborated on the instructions provided in the basic instruction sheet and provided additional instructions for procedures that would not be considered routine. Another version of these 2 forms was also created for clinicians with additional information on how to set up the system for the user, change settings, calibrate insoles, and adjust fall detection settings. A selected section of the manual is presented in Figure 5.

    Phase 2: Expert Inspection Results

    Use Case Inspection of Paper Prototype Version 2

    Table 10 presents examples of how the various problems uncovered during the use case analysis in phase 1 were addressed and compares the problem rating it received from the first use case analysis (paper prototype version 1) with the new rating it received from the analysis of the updated interface in phase 2 (paper prototype version 2).

    The inspection found that of the 21 original problems identified by the experts, 3 had now received a rating of 0 from the experts, 17 had received decreased ratings, and 1 (ID# 11) had received an increased rating.
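
    The tally reported above (ratings that dropped to 0, decreased, or increased) could be reproduced from before-and-after ratings with a sketch such as the following. It assumes each problem has one mean severity rating per prototype version; the function names, category labels, and ratings are hypothetical and are not taken from Table 10.

```python
from collections import Counter


def classify_change(before, after):
    """Label how a problem's severity rating changed between two prototype versions."""
    if after == 0:
        return "resolved"      # experts no longer rate it as a problem
    if after < before:
        return "decreased"
    if after > before:
        return "increased"
    return "unchanged"


def tally_changes(ratings_by_problem):
    """Count problems per change category; values are (before, after) rating pairs."""
    return Counter(classify_change(b, a) for b, a in ratings_by_problem.values())


# Hypothetical mean ratings for 3 problem IDs (illustration only).
example_pairs = {1: (3.0, 0.0), 2: (3.5, 2.0), 11: (1.0, 2.5)}
example_tally = tally_changes(example_pairs)
```

    Flagging the "increased" category explicitly is useful because it identifies design changes that made a problem worse and so need to be revisited in the next iteration.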

    Table 10. Comparison of problem ratings between paper prototype V1 problems and the updated interface (paper prototype V2). The max individual score that was given by the 10 experts is also included to highlight the fact that some experts may have given a more severe rating than the mean or standard deviation indicates.
    Table 11. Average metrics and consensus for 9 experts. After scenario questionnaire (ASQ) scores range from 1-7, where 1 is the most satisfied and 7 is the least satisfied the user can be.
    Expert Cognitive or Contextual Walkthrough With Working Prototype Version 1

    Table 11 shows the average metrics captured for each scenario, including the time taken and the number of errors made. Accompanying the metrics is a selection of comments from experts.

    Of the 8 scenarios, three achieved a score of “satisfied” and four achieved a score of “somewhat satisfied,” whereas one achieved a neutral score. No scenarios scored a perfect score of 1, indicating that all scenarios require some improvement, particularly regarding the clarity and flow of the supporting documentation. These data are best illustrated in a radar chart (Figure 6). A radar chart allows for multiple data series to be displayed across common variables, each variable having its own axis (the dotted line). The axis values go from low to high as you read toward the center of the chart, with lower scores indicating a better outcome (data points near the edge of the chart). The chart in Figure 6 shows the 3 individual components of the ASQ score: satisfaction with ease of completion, time taken, and effect of supporting documentation.

    In response to comments by the experts during the inspection, the user manuals were updated, and several minor changes were made to the interface. These updates are listed in Table 12.

    Figure 6. All basic scenarios scored consistently well regarding ease of completion (blue) with just slight superficial changes; the more challenging scenarios such as login and reset registered higher (worse) scores. Only one scenario, connection routine, scored poorly in the time taken (red) metric, owing to the length of time it takes the insoles to sync with the app. Several experts were confused by some of the layout and instructions in the manuals (green), with improvement required for several scenarios, particularly the instructions for the fall alarm sequence.
    Table 12. Changes made to the user manuals and interface based on expert inspection.
    Figure 7. (a) Fall alarm interface before expert inspection, the red and green caused confusion as the red was associated with “cancel” as you would find on a phone call interface; (b) Fall alarm interface after expert inspection, a more appropriate symbol was introduced for the help button whereas the cancel button was changed to a more neutral blue with appropriate labeling.

    These changes led to working prototype version 2 and a new set of user manuals that now contained 4 laminated sheets. Figure 7 shows an example of how the fall alarm interface has been updated.

    Phase 3: Usability Testing With End Users

    Table 13 shows the average metrics for the 10 test participants during the usability testing of working prototype version 2, whereas Figure 8 illustrates the breakdown of the ASQ metric in terms of satisfaction with ease of completion, time taken, and support documentation. There was some confusion with the reset and login sequences in the user manual (green), which is explained further in Table 13.

    The NASA-TLX was administered on paper, and the resulting metrics are shown in Table 14. A score of 100 indicates maximum burden on the user, whereas a score of 0 indicates no burden. The first 4 tasks scored very well, indicating little to no burden on the user. The login and reset procedures, due to the number of steps involved, created the most mental, physical, and effort burden, as well as the most frustration, particularly the login procedure. The most temporal burden was created by the fall alarm procedure, due to the timer on the screen forcing the user to make a hasty choice.

    Table 13. Performance metrics for each scenario during user testing with working prototype 2, with related commentary as observed during the testing. The after scenario questionnaire (ASQ) score ranges from 1-7, where 1=best score possible and 7=worst score possible.
    Table 14. NASA Task Load Index (NASA-TLX) scale breakdown by scenario. The NASA-TLX score ranges from 0-100, where 0=no burden on the user (best score possible) and 100=maximum burden (worst score possible).
    Figure 8. All scenarios scored maximum for ease of completion (blue) apart from the fall alarm 1 which caused slight confusion. Time taken (red) was not considered a major issue for any of the scenarios, with the connection routine not scoring maximum due to the nature of the syncing process, whereas the unfamiliarity with typing caused some users to mark down the login sequence. There was some confusion with the reset and login sequences in the user manual.
    Table 15. Likert items severity rating (range 0-4, 0=no problem, 4=most severe problem) for interface ergonomics by scenario. Some Likert items did not apply to certain scenarios. An x indicates that there was no Likert statement for that particular interface aspect for that scenario.
    Table 16. Evolution of three distinct problems through the testing lifecycle, with the usability metrics taken at each stage.
    Table 17. System usability scale (SUS) metric, split into overall usability and learnability, captured at each phase.

    Table 15 shows the Likert responses for different aspects of the interface in each scenario. The severity rating is calculated in the same manner as in phases 1 and 2.

    Summary of Results

    Combined expert and end-user analysis of a comprehensive use case originally identified 21 problems with the system interface; only 3 of these problems were observed in user testing (problem IDs 1, 2, and 12). Satisfactory ASQ ratings were obtained during validation testing by both experts and end users, and final testing by users shows the system requires low mental, physical, and temporal demands according to the NASA-TLX. Table 16 shows how three of the problems (problems involving flow, consistency, and feedback) have evolved over the testing cycle. Problems 2 and 6 show a clear linear improvement from phase 1 to phase 3, with problem 2 an example of a problem that, despite best efforts, remained a cause of potential user frustration due to the unfamiliar style of touchscreen keyboards. Problem 6 represents an example of a problem that was effectively mitigated through interface changes and manual support. Problem 11 is an example of a problem that was actually exacerbated by an interface change, causing greater confusion to users, although this was effectively identified and mitigated between phases 2 and 3.

    The system usability scale (SUS) metrics after each phase are presented in Table 17. The SUS is split into 2 scales: (1) overall usability and (2) learnability [29]. Early phases showed widely variable SUS scores, particularly among experts, whereas phase 3 scores showed agreement among end users that the interface had achieved some level of acceptability.
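
    As an illustration of how a single SUS questionnaire can be split into the two scales, the following minimal Python sketch applies the standard SUS scoring together with the two-factor split commonly attributed to [29], in which items 4 and 10 form the learnability scale and the remaining 8 items form the usability scale. The function name and the dictionary keys are hypothetical and are not taken from the study; the scoring actually used to produce Table 17 is not reproduced here.

```python
def sus_scores(responses):
    """Score one SUS questionnaire.

    responses: ten integers from 1 (strongly disagree) to 5 (strongly
    agree), ordered from item 1 to item 10.
    Returns overall, usability, and learnability scores on a 0-100 scale.
    """
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("expected ten responses, each from 1 to 5")

    # Odd-numbered items are positively worded: contribution = response - 1.
    # Even-numbered items are negatively worded: contribution = 5 - response.
    contributions = [
        (r - 1) if index % 2 == 0 else (5 - r)
        for index, r in enumerate(responses)
    ]

    # Assumed two-factor split: items 4 and 10 (zero-based 3 and 9)
    # form the learnability scale; the other eight form usability.
    learnability_raw = contributions[3] + contributions[9]
    usability_raw = sum(contributions) - learnability_raw

    return {
        "overall": sum(contributions) * 2.5,        # 40 raw points -> 100
        "usability": usability_raw * 3.125,         # 32 raw points -> 100
        "learnability": learnability_raw * 12.5,    # 8 raw points -> 100
    }


if __name__ == "__main__":
    # Hypothetical responses, for illustration only.
    print(sus_scores([4, 2, 4, 4, 4, 2, 4, 2, 4, 5]))
```

    The multipliers rescale each scale to 0-100 so that the three scores can be compared directly. Because the learnability scale rests on only 2 items, it will tend to vary more between participants than the overall score.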


    Discussion

    Overview

    We have presented a multi-phase, mixed-method HCD approach to improve the user experience of a smartphone interface, which forms part of a connected health system. Our approach was designed to uncover and mitigate any usability problems as early as possible, before they were exposed to end users during usability testing and in formal clinical trials. This paper presents one full cycle of our HCD process, with each phase representing an iteration where a design update or refinement took place. Our approach has met the specific recommendations for an HCD process [30]. We have adopted the input of multi-disciplinary skills and perspectives by eliciting the feedback of both an end-user group and an appropriately experienced expert group throughout the process. We have sought to gain an explicit understanding of users, tasks, and environments, and to consider the whole user experience, through the adoption of a use case that provided context of use for system tasks and scenarios and through the examination of the perceptual and cognitive needs of the target end user. We utilized a user-centered, evaluation-driven design using standard usability evaluation metrics at each point in the cycle. We involved users throughout the design process, at both early and later stages. Finally, we employed an iterative process, split into 3 stages or phases, that allowed user feedback to be worked into design updates.

    Principal Findings

    From our observation of older adults’ interactions with smartphone interfaces, there were some recurring themes. Clear and relevant feedback as the user attempts to complete a task is critical, in line with contemporary literature [31,32]. Feedback should include pop-ups, sound tones, color or texture changes, or icon changes to indicate that a function has been completed successfully, such as for the connection sequence (problem ID 9). For text feedback, clear and unambiguous language should be used so as not to create anxiety, particularly when it comes to saving data, such as in the data upload sequence (problem ID 6). Older adults not familiar with technology are often afraid that they might delete something by accident or fail to save important data properly. Warning tones or symbols, such as a caution symbol, should only be used if absolutely necessary. For audio feedback, clear and low-frequency tones should be used. Login sequences where the user is required to input text with a QWERTY keyboard should be avoided (problem ID 2), particularly for those who have no previous touchscreen experience. If a login sequence is considered necessary for security or identification purposes, the login process should be made as simple as possible (do not hide the password, be clear about what username is required, and supply ample support documentation for the process). For simple interface elements, text sizes should be at least 10 pt (Didot system), whereas button sizes should have a surface area of no less than approximately 200 mm² [11,33].
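
    To make the sizing guidance concrete, the following minimal Python sketch converts the two physical recommendations (10 Didot points of text height and a 200 mm² button area) into Android density-independent units. It rests on several assumptions that are not part of the study: the button is taken to be square, one Didot point is approximated as 0.376 mm, and the conversion uses Android's nominal baseline of 160 units per inch, which only approximates physical size on real devices. All function and constant names are hypothetical.

```python
import math

MM_PER_INCH = 25.4
ANDROID_BASELINE_DPI = 160   # Android defines 1 dp as 1 px on a 160 dpi screen
DIDOT_POINT_MM = 0.376       # assumed approximate size of one Didot point


def mm_to_dp(millimetres):
    """Convert a physical length to nominal Android dp (or sp at default font scale)."""
    return millimetres / MM_PER_INCH * ANDROID_BASELINE_DPI


def square_side_mm(area_mm2):
    """Side length of a square button with the given surface area."""
    return math.sqrt(area_mm2)


def didot_points_to_mm(points):
    """Convert Didot points to millimetres using the approximation above."""
    return points * DIDOT_POINT_MM


if __name__ == "__main__":
    side_mm = square_side_mm(200)        # minimum button area from the text
    text_mm = didot_points_to_mm(10)     # minimum text size from the text
    print(f"button side: {side_mm:.1f} mm, about {mm_to_dp(side_mm):.0f} dp")
    print(f"text height: {text_mm:.2f} mm, about {mm_to_dp(text_mm):.0f} sp")
```

    Under these assumptions, a square button of 200 mm² is roughly 14 mm on each side. The dp and sp figures should be treated as starting points to be checked on the target handset rather than as exact physical sizes.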

    In terms of metrics, we used 4 different subjective measurement systems (Likert scales, ASQ, NASA-TLX, and SUS) to assess the usability of the interface at different stages. The Likert scales allowed for quick satisfaction ratings of the perceived ease of use of each task in the use case and of the suitability of interface elements such as text and button size. The ASQ was more suitable for postscenario ratings when the user had actually completed the task, whereas the NASA-TLX was used to supplement the ASQ to provide further details on what kind of burden, be it physical or cognitive, the task placed on the user. The SUS was utilized when the user had completed a full use of the system and carried out all tasks. We observed that all of these metrics provide similar information, just at slightly different resolutions, and that a mixture of metrics allows different insights into user perceptions of usability. For example, in phase 3, from looking at the ASQ scores of the login sequence, we could conclude that the user was satisfied with the ease of the task. However, when we looked at the NASA-TLX scores, we observed that the task was placing a large mental demand on them. These 2 metrics, although showing seemingly conflicting information, may be telling us that the user judged the task as easy simply because they completed it successfully, regardless of the difficulty they encountered or the time it had taken them. It is only when users think about the task in terms of the NASA-TLX dimensions that they report candidly on the kind of burden the task placed on them. The SUS was a useful general indicator of overall usability, but its wide variability (Table 17) suggests that it is best used with larger sample sizes. High SUS scores do not guarantee that the system will not suffer usability problems in the field [34]. These metrics are probably best used to supplement more objective metrics such as task times and error rates.

    Procedural Observations

    In terms of efficiency, our methodology proved to be successful. The use case analysis activities during phase 1 provided a focus for all stakeholders on the context and the intended use of the system. The time it took for each individual to analyze and provide feedback was on average 1 h. Within this hour, the individual was experiencing and commenting on context, was being formally interviewed, was filling out questionnaires, and was providing opinions on interface concepts. Therefore, in one session the use case analysis provides multiple streams of data, whereas in previous literature this kind of feedback would need to be gathered across multiple activities, such as surveys, interviews, and ethnographic observations. In phase 2, the use of expert inspection groups also proved highly efficient. We recommend that research groups and design teams maintain an inspection group who can carry out on-hand inspections of new system versions. This group, which can comprise 4-6 members, need not necessarily be qualified usability engineers but can be trained in techniques such as heuristic evaluations and cognitive walkthroughs. In terms of how long it took to complete each phase, as this was a case study within a research project, the amount of time spent on each phase was probably drawn out longer than it would be in a more industrial setting. In all, the 3 phases together took approximately 12 months, with phase 1 taking the bulk of the time (approximately 6 months) as use cases were developed and redeveloped and end users were interviewed and tested. After the app was developed and testable, the phases became shorter, with phases 2 and 3 taking approximately 3-4 months each. As the methodology is applied in future, it will become more refined, allowing for quicker development cycles.

    Limitations

    Time and technology constraints meant that not all design requirements could be implemented. For example, the replacement of the manual data upload with an automatic periodic data upload could not be implemented in time by the engineering team. Similarly, the structure of the Android OS meant that some user and expert recommendations could not be implemented, particularly regarding the positioning of pop-ups or the nature of data storage. Some design changes led to a decrease in user experience, particularly for the fall alarm sequence (problem ID 11). It became clear during user testing that the use of red and green in an emergency situation may not be the best practice, with some users confusing the red emergency button for a cancel button, as it may be presented on a phone call screen (red for “hang-up”). In this case, the design team failed to take into account the recommendation of one expert who predicted that a red or green option may cause confusion. We can conclude from this that taking on board opinions from different stakeholders can present a challenge for designers. However, the nature of our iterative methodology meant that this problem was identified and addressed between phases 2 and 3.

    In phase 1, the older adult end users tended to be very optimistic about how they would handle the system and the smartphone interface, overall giving higher scores in response to Likert statements and for the overall SUS score. Experts tended to be more pessimistic, but this was probably due to their vast experience with older adults and technology. Most experts conceded that the use case analysis was a hypothetical one and that the capabilities of the older adult population are extremely variable; however, they felt that it was an extremely useful exercise in identifying major potential problems and addressing them early in the design process. Despite the difference in outlook between the experts and older adults, both groups reached agreement on most problems, particularly about the perceived difficulty of the login process and the lack of clear feedback when checking the system status and during the data upload process. We can conclude from this that utilizing multiple perspectives from different groups is an important feature of a good human-centered design process.

    Conclusions

    The HCD methodology we have designed and implemented, based on the principles of ISO 9241-210, has produced a functional app interface that is now suitable for exposure to older adults in long-term clinical trials. We have applied appropriate testing techniques given the context of the interface being assessed. We would consider this a thorough and robust method for testing and informing design changes of all types of interactive connected health systems.

    Acknowledgments

    This work was part funded by the EU FP7 project Wireless Insole for Independent and Safe Elderly Living (WIISEL), project number FP7-ICT-2011-288878.

    Authors' Contributions

    The methodology for this study was conceived and designed by RH, LRQ, and GOL. The experiments were carried out by RH with the support of LG, TS, and ARM, each of whom contributed both usability and medical knowledge to the testing. The data was compiled and analyzed by RH, LRQ, and GOL and reviewed by LG, ARM, and PMAB. All authors contributed equally to the introduction and discussion sections of the paper. The paper as a whole was reviewed and edited where necessary by all authors before submission.

    Conflicts of Interest

    None declared.

    References

    1. International Organization for Standardization. ISO. 2010. Ergonomics of human-system interaction -- Part 210: Human-centred design for interactive systems URL: https://www.iso.org/standard/52075.html [accessed 2017-05-18]
    2. Martínez-Pérez B, de la Torre-Díez I, Candelas-Plasencia S, López-Coronado M. Development and evaluation of tools for measuring the quality of experience (QoE) in mHealth applications. J Med Syst 2013 Oct;37(5):9976.
    3. Shah Syed Ghulam Sarwar, Robinson I, AlShawi S. Developing medical device technologies from users' perspectives: a theoretical framework for involving users in the development process. Int J Technol Assess Health Care 2009 Oct;25(4):514-521.
    4. Sesto ME, Irwin CB, Chen KB, Chourasia AO, Wiegmann DA. Effect of touch screen button size and spacing on touch characteristics of users with and without disabilities. Hum Factors 2012 Jun;54(3):425-436.
    5. Abugabah AJ, Alfarraj O. Issues to consider in designing health care information systems: a user-centred design approach. E-Journal of Health Inform 2015;9(1):8.
    6. Borycki E, Kushniruk A, Nohr C, Takeda H, Kuwata S, Carvalho C, et al. Usability methods for ensuring health information technology safety: evidence-based approaches. contribution of the IMIA working group health informatics for patient safety. Yearb Med Inform 2013;8:20-27.
    7. Vermeulen J, Neyens JC, Spreeuwenberg MD, van RE, Sipers W, Habets H, et al. User-centered development and testing of a monitoring system that provides feedback regarding physical functioning to elderly people. Patient Prefer Adherence 2013 Aug;7:843-854.
    8. Developing an insole for elderly fall prevention. WIISEL: Wireless Insole for Independent and Safe Elderly Living URL: http://www.wiisel.eu/ [accessed 2017-03-16]
    9. Echt KV, Burridge AB. Predictors of reported internet use in older adults with high and low health literacy: the role of socio-demographics and visual and cognitive function. Phys Occup Ther Geriatr 2011;29(1):23-43.
    10. Wagner N, Hassanein K, Head M. The impact of age on website usability. Comput Hum Behav 2014;37:270-282.
    11. Harte RP, Glynn LG, Broderick BJ, Rodriguez-Molinero A, Baker PM, McGuiness B, et al. Human centred design considerations for connected health devices for the older adult. J Pers Med 2014 Jun 04;4(2):245-281.
    12. Harte R, Glynn L, Rodriquez-Molinero A, Baker PM, Scharf T, Quinlan LR, et al. A human-centered design methodology to enhance the usability, human factors, and user experience of connected health systems: a three-phase methodology. JMIR Hum Factors 2017;4(1):e8.
    13. Hull E, Jackson K, Dick J. Requirements Engineering. London: Springer; 2011.
    14. Khajouei R, Peute LW, Hasman A, Jaspers MW. Classification and prioritization of usability problems using an augmented classification scheme. J Biomed Inform 2011 Dec;44(6):948-957.
    15. Cooke L. Assessing concurrent think-aloud protocol as a usability test method: a technical communication approach. IEEE Trans Profess Commun 2010 Sep;53(3):202-205.
    16. Balcer LJ, Galetta SL, Polman CH, Eggenberger E, Calabresi PA, Zhang A, et al. Low-contrast acuity measures visual improvement in phase 3 trial of natalizumab in relapsing MS. J Neurol Sci 2012 Jul 15;318(1-2):119-124.
    17. Findl O, Leydolt C. Meta-analysis of accommodating intraocular lenses. J Cataract Refract Surg 2007 Mar;33(3):522-527.
    18. Pineles SL, Birch EE, Talman LS, Sackel DJ, Frohman EM, Calabresi PA, et al. One eye or two: a comparison of binocular and monocular low-contrast acuity testing in multiple sclerosis. Am J Ophthalmol 2011 Jul;152(1):133-140.
    19. Thayaparan K, Crossland MD, Rubin GS. Clinical assessment of two new contrast sensitivity charts. Br J Ophthalmol 2007 Jun;91(6):749-752.
    20. Schneck ME, Haegerstrom-Portnoy G, Lott LA, Brabyn JA. Comparison of panel D-15 tests in a large older population. Optom Vis Sci 2014 Mar;91(3):284-290.
    21. Sofi F, Valecchi D, Bacci D, Abbate R, Gensini GF, Casini A, et al. Physical activity and risk of cognitive decline: a meta-analysis of prospective studies. J Intern Med 2011 Jan;269(1):107-117.
    22. Singh-Manoux A, Kivimaki M, Glymour MM, Elbaz A, Berr C, Ebmeier KP, et al. Timing of onset of cognitive decline: results from Whitehall II prospective cohort study. BMJ 2012;344:d7622.
    23. Bundesen C, Habekost T, Kyllingsbæk S. A neural theory of visual attention and short-term memory (NTVA). Neuropsychologia 2011 May;49(6):1446-1457.
    24. Heim AW. SJDM. 1968. AH4 group test of intelligence URL: http://www.sjdm.org/dmidi/AH4_intelligence_test.html [accessed 2017-03-16]
    25. Isaac C. Weloty. 2017. Intelligent verbatim transcription URL: https://weloty.com/intelligent-verbatim-transcription/ [accessed 2017-03-16]
    26. Sears A, Jacko JA. Human-Computer Interaction: Development Process. New York: CRC Press; 2009.
    27. Lewis JR. Psychometric evaluation of an after-scenario questionnaire for computer usability studies: the ASQ. ACM SIGCHI Bulletin 1991;23(1):78-81.
    28. Hart SG. Development of NASA-TLX (Task Load Index): results of empirical theoretical research. Adv Psychol 1988;52:139-183.
    29. Lewis JR, Sauro J. The factor structure of the system usability scale. 2009 Presented at: Proceedings of the 1st International Conference on Human Centered Design; 2009; Berlin, Heidelberg.
    30. Giacomin J. What is human centred design? Design J 2014 Dec 01;17(4):606-623.
    31. Fisk D. Designing for older adults. In: Principles and Creative Human Factors Approaches. New York: CRC Press; 2009.
    32. Zhou J, Rau PP, Salvendy G. Use and design of handheld computers for older adults: a review and appraisal. Int J Hum Comput Interact 2012 Dec;28(12):799-826.
    33. Jin ZX, Plocher T, Kiff L. Touch screen user interfaces for older adults: button size spacing. Springer; 2007 Presented at: Universal Access in Human Computer Interaction. Coping with Diversity; 2007; Berlin p. 933-941.
    34. Bangor A, Kortum PT, Miller JT. An empirical evaluation of the system usability scale. Int J Hum Comput Interact 2008;24(6):594.


    Abbreviations

    ADL: activities of daily living
    ASQ: after scenario questionnaire
    CD: color discrimination
    CS: color sensitivity
    GP: general practitioner
    HCA: high contrast acuity
    HCD: human-centered design
    HCI: human-computer interaction
    HRB: health research board
    IT: information technology
    LCA: low contrast acuity
    LCALL: low contrast acuity in low luminance
    OS: operating system
    RA: reading acuity
    SUS: system usability scale
    UHG: University Hospital Galway


    Edited by A Keepanasseril; submitted 24.11.16; peer-reviewed by E Afari-kumah, R Berenbaum; comments to author 27.12.16; revised version received 21.03.17; accepted 21.03.17; published 30.05.17

    ©Richard Harte, Leo R Quinlan, Liam Glynn, Alejandro Rodríguez-Molinero, Paul MA Baker, Thomas Scharf, Gearóid ÓLaighin. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 30.05.2017.

    This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR mhealth and uhealth, is properly cited. The complete bibliographic information, a link to the original publication on http://mhealth.jmir.org/, as well as this copyright and license information must be included.