This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR mHealth and uHealth, is properly cited. The complete bibliographic information, a link to the original publication on http://mhealth.jmir.org/, as well as this copyright and license information must be included.
Ecological momentary assessment (EMA) enables individuals to self-report their subjective momentary physical and emotional states. However, certain conditions, including routine observable behaviors (eg, moods, medication adherence) as well as behaviors that may suggest declines in physical or mental health (eg, memory loss, compulsive disorders), cannot be easily and reliably measured via self-reports.
This study aims to examine a method complementary to EMA, denoted as
We conducted 2 studies of 4 weeks each, collecting self-reports from 20 participants about their stress, fatigue, anxiety, and well-being, in addition to collecting peer-reported perceptions from 27 of their peers.
Preliminary results showed that some of the peers reported daily assessments for stress, fatigue, anxiety, and well-being statistically equal to those reported by the participant. We also showed how pairing assessments of participants and peers in time enables a qualitative and quantitative exploration of unique research questions not possible with EMA-only assessments. We reported on the usability and implementation aspects based on the participants’ experience to guide the use of PeerMA to complement the information obtained via self-reports for observable behaviors and physical and emotional states among healthy individuals.
It is possible to leverage the PeerMA method as a complement to EMA to assess constructs that fall in the realm of observable behaviors and states in healthy individuals.
The ecological momentary assessment (EMA) [
In clinical settings,
Consider an illustrative case of Bob in the early stages of developing an obsessive-compulsive disorder (OCD) [
Inspired by the principles of EMA, we evaluated the peer-ceived momentary assessment (PeerMA) method, previously defined by Berrocal and Wac [
We explored 2 research aims in this study: (1) to evaluate the feasibility of the PeerMA method for studying real-life phenomena in healthy populations and (2) to identify the critical operational aspects and human factors that influence the quality of the data collected and their potential scaling. Toward this end, we conducted 2 in-the-wild studies (ie, outside the laboratory) using the PeerMA method.
Both the ESM and the EMA methods were introduced in psychology [
In psychology, self-assessment and other-assessment methods (also referred to as proxy, observer, informant, or peer assessments) have been used in various research contexts. For example, Vazire [
Gosling et al [
Balsis et al [
In clinical settings, in a group of older adults (
Focusing more on the characteristics of observers, Watson et al [
Finally, although not using any form of EMA or PeerMA as presented in this paper, other studies showed empirical evidence to support 2 assumptions underlying PeerMA. Namely, (1) people often rely on their peers and trust essential information to them [
Following the trend of
However, despite its notable value, passively sensed data from smartphone sensors do not always enable accurate modeling of the perceptions of highly subjective individuals [
We examined informant, proxy, and observer reports from the literature and observed 3 main characteristics:
They captured individual and observer assessments using long surveys or instruments.
These assessments are usually carried out infrequently. Sometimes, these are one time–only assessments or are carried out every few months or years.
Proxies are usually involved for patients in clinical settings (due to physical or cognitive impairments).
We researched the use of PeerMA instead by (1) specifically using short surveys, for example, single-item or few-item questionnaires capturing one variable; (2) conducting frequent assessments, from >1 per day to just a few per week or month (in the case of longitudinal studies); and (3) exploring its value by focusing on healthy populations (ie, not having been diagnosed with a disease).
Our research makes a unique contribution by exploring the use of PeerMA [
This section describes the experimental design of the studies. To explore research aim 1 on the feasibility of the PeerMA method to study real-life phenomena in healthy populations, we collected variables such as user retention during the study, overall agreement between the EMA and PeerMA assessments, and the experimental value of the method by enabling the study of self-assessments and observer assessments paired in time. Moreover, to explore research aim 2 on operational aspects and human factors that influence the quality of the collected data, we gathered qualitative elements such as user reflections after using the method, difficulty in using the technology, and reliability of the technologies that can influence the quality of the collected data.
As explained in this section,
We implemented the PeerMA method by leveraging the
To be included in this study, participants and peers had to be >18 years old and own a data-enabled smartphone with Android version 8.1+ or iOS version 7+.
For
For
This section explains the entry, ambulatory, and exit surveys from each study. These are summarized in
Study participants: type and gender distribution.
Studies | Participants, n (%) | Peers, n (%)
Study Aa | |
Male | 7 (54) | 12 (60)
Female | 6 (46) | 8 (40)
Study Bb | |
Male | 2 (20) | 3 (43)
Female | 8 (80) | 4 (57)
aTotal number of participants is 13, and total number of peers is 20.
bTotal number of participants is 10, and total number of peers is 7.
Study design: surveys and ecological momentary assessment/peer-ceived momentary assessment content in each study.
Types of survey | Study A | Study B |
Entry surveys | Study entry, GERTa, PSSb, and SDSc | Study entry, PSSb, and SDSc |
Daily surveys: EMAd and PeerMAe | Stress, fatigue, and anxiety; frequency: 8 times a day (9 AM-9 PM); silent push notification; expires after 40 min | Stress, fatigue, anxiety, and well-being; frequency: 3 times a day (8 AM-8 PM); silent push notification; expires after 30 min |
Exit survey | Study exit | Study exit |
aGERT: Geneva Emotion Recognition Test (0-42; higher scores reflect higher ability).
bPSS: Perceived Stress Scale (0-40; higher scores reflect higher perceived stress).
cSDS: Social Desirability Scale (0-10; higher scores reflect higher social approval concern).
dEMA: ecological momentary assessment.
ePeerMA: peer-ceived momentary assessment.
Participants and peers initially completed the entry surveys before beginning the daily EMA/PeerMA. The study entry survey had 2 parts: (1) socioeconomic status, including gender, age range, education, marital status, and employment status and (2) open-ended questions asking participants whether they considered themselves stressed, what causes their stress, and whether they think others notice when they are stressed. For peers, open-ended questions asked whether they noticed when people around them project stress, what signs they observe in those projecting stress, how they react when someone around projects stress, and whether they get stressed or change their behavior when exposed to someone who is stressed. Peers also indicated their relationship with the participant (eg, friend, spouse) and whether they cohabit with the participant.
We employed the 42-item Geneva Emotion Recognition Test (GERT) that measures a person’s ability to recognize someone’s emotions from facial, voice, and body inputs (higher scores reflect higher ability) [
For
In both studies, we used single-item questions proposed by Rosenzveig et al [
For
For
Because the explicit well-being question had been adopted only in
Examples of ambulatory assessment: (a) self-assessment (ecological momentary assessment) of stress, (b) peer assessment (peer-ceived momentary assessment) of stress, and (c) confidence assessment required from peers. Assessments of fatigue, anxiety, and well-being followed the same approach.
At the end of the study, both participants and peers completed an exit survey commenting on usability aspects of the mobile device app (eg, usability, positive and negative aspects perceived). The survey also asked how participants felt about reflecting on their states during the day, whereas peers answered how they felt about reflecting on their peers’ states during the day.
At the beginning of the study, participants had a 15-min web-based or face-to-face meeting with the researcher with the following objectives: (1) explain the nature of the study, (2) hand out the informed consent, (3) train the participant to use the
During this meeting, the researcher explained to the participants that, given the nature of the study, peers had to be people with whom they had regular contact (at least daily), either face-to-face or virtually, using communication tools. We explained that peers could be spouses (significant others), close relatives (family), or friends from school or work. After the meeting, participants would complete the entry surveys, enroll their peers, and explain to them how to use the app. In these 2 studies, the researcher had no interaction with peers. After enrolling their peers and completing the entry surveys, participants pushed a button in the app to start the study and receive daily EMAs and PeerMAs. The researchers were in touch with the participants remotely to follow up with them about these steps, if needed.
The first part, the section on
The second part, the section on
We present the types of quantitative and qualitative data that can be obtained with PeerMA as a method and tool in the 2 observational studies, rather than the strength of the results regarding stress, fatigue, anxiety, and well-being, which were explored as use cases. Nevertheless, we also present a detailed examination of the results to support the findings and observations that emerge from the analyses.
The first part of the dataset, extracted from the entry surveys, describes the samples in each study.
In
In
Participants’ socioeconomic characteristics by study.
Variables | Study A participants, n (%) | Study A peers, n (%) | Study B participants, n (%) | Study B peers, n (%)
Gender | | | |
Male | 6 (46) | 8 (40) | 2 (29) | 3 (43)
Female | 7 (54) | 12 (60) | 5 (71) | 4 (57)
Age range (years) | | | |
18-20 | 1 (8) | 2 (10) | 0 (0) | 0 (0)
21-29 | 9 (69) | 8 (40) | 3 (43) | 3 (43)
30-39 | 2 (15) | 4 (20) | 4 (57) | 4 (57)
40-49 | 1 (8) | 6 (30) | 0 (0) | 0 (0)
Marital status | | | |
Single | 10 (77) | 12 (60) | 3 (43) | 3 (43)
Married | 2 (15) | 5 (25) | 4 (57) | 4 (57)
Other | 1 (8) | 3 (15) | 0 (0) | 0 (0)
Education | | | |
Undergraduate | 8 (62) | 11 (55) | 1 (14) | 3 (43)
Graduate | 5 (38) | 9 (45) | 6 (86) | 4 (57)
||||||
Yes | 3 (23) | 9 (45) | 3 (43) | 6 (86)
No | 10 (77) | 11 (55) | 4 (57) | 1 (14)
||||||
Yes | 9 (69) | 15 (75) | 7 (100) | 6 (86)
No | 4 (31) | 5 (25) | 0 (0) | 1 (14)
Survey scores of participants and peers by study.
Instruments and roles | Study A, minimum score | Study A, maximum score | Study A, mean (SD) | Study B, minimum score | Study B, maximum score | Study B, mean (SD)
GERTa | | | | | |
Participant | 18 | 33 | 26 (5.2) | N/Ab | N/A | N/A
Peer | 20 | 31 | 25 (3.8) | N/A | N/A | N/A
PSSc | | | | | |
Participant | 12 | 31 | 23 (5.9) | 19 | 31 | 24 (4.6)
Peer | 0 | 36 | 22 (8.2) | 12 | 28 | 21 (5.9)
SDSd | | | | | |
Participant | 3 | 12 | 7 (2.5) | 0 | 10 | 5 (3.2)
Peer | 0 | 11 | 6 (3.0) | 1 | 11 | 6 (3.4)
|||||||
Participant | 2.8 | 9.8 | 6.1 (2.4) | 3.1 | 9.0 | 6.2 (2.4)
Peer | 0.3 | 8.7 | 4.5 (2.6) | 1.9 | 7.9 | 5.0 (1.8)
|||||||
Participant | N/A | N/A | N/A | 4.0 | 8.9 | 5.7 (1.7)
Peer | 3.9 | 8.2 | 6.9 (1.07) | 5.2 | 10 | 7.5 (1.5)
aGERT: Geneva Emotion Recognition Test.
bN/A: not applicable.
cPSS: Perceived Stress Scale.
dSDS: Social Desirability Scale.
Summary of the engagement of participants and peers.
Studies, participant ID | Peer ID | Days | Ecological momentary assessments or peer-ceived momentary assessments triggered | Response rate, % | Peer-participant relationship
Study A | | | | |
S1 | N/Aa | 27 | 220 | 61.8 | N/A
N/A | S1P1 | 27 | 127 | 87.4 | 3: parent
S2 | N/A | 31 | 147 | 68.0 | N/A
N/A | S2P1 | 31 | 100 | 61.0 | 3: parent
S3 | N/A | 29 | 243 | 97.9 | N/A
N/A | S3P1 | 29 | 220 | 35.5 | 4: friend
S4 | N/A | 28 | 211 | 36.5 | N/A
N/A | S4P1 | 27 | 165 | 15.2 | 3: parent
S5 | N/A | 29 | 265 | 47.9 | N/A
N/A | S5P1 | 27 | 252 | 23.0 | 3: sibling
S6 | N/A | 29 | 243 | 97.5 | N/A
N/A | S6P1 | 22 | 183 | 61.7 | 2: boyfriend
S7 | N/A | 29 | 245 | 70.6 | N/A
N/A | S7P1 | 20 | 122 | 30.3 | 4: friend
N/A | S7P2 | 29 | 199 | 50.8 | 3: parent
S8 | N/A | 28 | 250 | 94.0 | N/A
N/A | S8P1 | 26 | 214 | 32.2 | 4: friend
N/A | S8P2 | 28 | 213 | 61.5 | 2: boyfriend
S9 | N/A | 34 | 225 | 44.9 | N/A
N/A | S9P1 | 33 | 130 | 36.9 | 4: friend
N/A | S9P2 | 33 | 133 | 45.9 | 3: sibling
S10 | N/A | 29 | 267 | 31.1 | N/A
N/A | S10P1 | 30 | 102 | 41.2 | 2: girlfriend
N/A | S10P2 | 16 | 152 | 4.6 | 4: friend
S11 | N/A | 29 | 250 | 82.0 | N/A
N/A | S11P1 | 22 | 153 | 86.3 | 4: friend
N/A | S11P2 | 17 | 100 | 50.0 | 3: sibling
S12 | N/A | 28 | 252 | 77.4 | N/A
N/A | S12P1 | 26 | 198 | 13.1 | 3: sibling
N/A | S12P2 | 26 | 206 | 50.0 | 4: friend
S13 | N/A | 30 | 268 | 35.1 | N/A
N/A | S13P1 | 15 | 130 | 9.2 | 3: parent
N/A | S13P2 | 14 | 79 | 79.7 | 4: friend
Study B | | | | |
S1 | N/A | 31 | 93 | 69.9 | N/A
N/A | S1P1 | 31 | 93 | 172.0 | 4: friend
S2 | N/A | 28 | 84 | 78.6 | N/A
N/A | S2P1 | 28 | 84 | 77.4 | 4: friend
S3 | N/A | 28 | 84 | 108.3 | N/A
N/A | S3P1 | 28 | 84 | 60.7 | 1: spouse
S4 | N/A | 28 | 84 | 69.0 | N/A
N/A | S4P1 | 28 | 84 | 59.5 | 1: spouse
S5 | N/A | 28 | 84 | 50.0 | N/A
N/A | S5P1 | 27 | 81 | 28.4 | 1: spouse
S6 | N/A | 17 | 51 | 96.1 | N/A
N/A | S6P1 | 16 | 48 | 102.1 | 4: friend
S7 | N/A | 27 | 81 | 91.4 | N/A
N/A | S7P1 | 28 | 84 | 40.5 | 4: friend
aN/A: not applicable.
As noted in
Additionally, in
For each participant and peer, we normalized the EMA/PeerMA assessments to a 0-1 range based on the highest and lowest assessments given by that person.
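The per-person normalization described above can be illustrated with a minimal Python sketch. The function name, the example ratings, and the handling of a person who always gives the same rating are assumptions made for illustration; the source does not specify them.

```python
from typing import List


def normalize_min_max(ratings: List[float]) -> List[float]:
    """Rescale one person's ratings to 0-1 using that person's own lowest and highest rating."""
    lowest, highest = min(ratings), max(ratings)
    if highest == lowest:
        # Assumption: a person who always gives the same rating is mapped to 0.0.
        return [0.0 for _ in ratings]
    span = highest - lowest
    return [(rating - lowest) / span for rating in ratings]


# Hypothetical raw ratings for one participant (EMA) and one peer (PeerMA).
participant_raw = [2, 4, 6, 10]
peer_raw = [5, 5, 7, 9]

participant_norm = normalize_min_max(participant_raw)
peer_norm = normalize_min_max(peer_raw)
print(participant_norm, peer_norm)
```

Because each person is rescaled against their own range, a normalized value of 1 means "the highest rating this person ever gave," not the top of the response scale.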
Study A: Summary of the ecological momentary assessment and peer-ceived momentary assessment values. Each row shows the median, mean, and SD for the corresponding participant or peer calculated from all the assessments issued by that person.
Participant ID | Peer ID | Stress (0-1), median | Stress (0-1), mean (SD) | Fatigue (0-1), median | Fatigue (0-1), mean (SD) | Anxiety (0-1), median | Anxiety (0-1), mean (SD) | Computed well-being (0-1), median | Computed well-being (0-1), mean (SD)
S1 | N/Aa | 0.35 | 0.41 (0.27) | 0.22 | 0.27 (0.23) | 0.30 | 0.37 (0.27) | 0.69 | 0.65 (0.21) |
N/A | S1P1 | 0.70 | 0.62 (0.27) | 0.62 | 0.65 (0.27) | 0.70 | 0.57 (0.29) | 0.35 | 0.39 (0.27) |
S2 | N/A | 0.17 | 0.28 (0.29) | 0.29 | 0.31 (0.26) | 0.19 | 0.31 (0.33) | 0.65 | 0.70 (0.23) |
N/A | S2P1 | 0.72 | 0.62 (0.30) | 0.58 | 0.59 (0.28) | 0.37 | 0.34 (0.23) | 0.42 | 0.48 (0.23) |
S3 | N/A | 0.61 | 0.51 (0.30) | 0.61 | 0.62 (0.27) | 0.37 | 0.39 (0.24) | 0.40 | 0.49 (0.22) |
N/A | S3P1 | 0.52 | 0.49 (0.27) | 0.39 | 0.40 (0.21) | 0.48 | 0.50 (0.27) | 0.55 | 0.54 (0.22) |
S4 | N/A | 0.18 | 0.26 (0.26) | 0.30 | 0.36 (0.25) | 0.35 | 0.43 (0.32) | 0.71 | 0.65 (0.17) |
N/A | S4P1 | 0.65 | 0.60 (0.30) | 0.65 | 0.50 (0.36) | 0.79 | 0.63 (0.33) | 0.35 | 0.42 (0.28) |
S5 | N/A | 0.11 | 0.23 (0.28) | 0.35 | 0.36 (0.24) | 0.00 | 0.10 (0.23) | 0.82 | 0.77 (0.17) |
N/A | S5P1 | 0.64 | 0.58 (0.32) | 0.60 | 0.58 (0.33) | 0.12 | 0.19 (0.28) | 0.53 | 0.55 (0.27) |
S6 | N/A | 0.37 | 0.44 (0.28) | 0.32 | 0.34 (0.23) | 0.40 | 0.43 (0.22) | 0.74 | 0.60 (0.19) |
N/A | S6P1 | 0.49 | 0.49 (0.23) | 0.38 | 0.45 (0.32) | 0.46 | 0.43 (0.24) | 0.55 | 0.55 (0.24) |
S7 | N/A | 0.13 | 0.17 (0.20) | 0.36 | 0.37 (0.25) | 0.12 | 0.15 (0.20) | 0.80 | 0.77 (0.16) |
N/A | S7P1 | 0.55 | 0.58 (0.22) | 0.64 | 0.64 (0.25) | 0.68 | 0.60 (0.23) | 0.36 | 0.39 (0.22) |
N/A | S7P2 | 0.31 | 0.34 (0.22) | 0.65 | 0.57 (0.29) | 0.34 | 0.39 (0.29) | 0.60 | 0.57 (0.21) |
S8 | N/A | 0.08 | 0.18 (0.25) | 0.52 | 0.48 (0.25) | 0.07 | 0.14 (0.22) | 0.75 | 0.73 (0.18) |
N/A | S8P1 | 0.70 | 0.68 (0.19) | 0.62 | 0.65 (0.18) | 0.67 | 0.57 (0.23) | 0.34 | 0.37 (0.17) |
N/A | S8P2 | 0.43 | 0.41 (0.25) | 0.58 | 0.55 (0.27) | 0.47 | 0.47 (0.25) | 0.50 | 0.52 (0.24) |
S9 | N/A | 0.00 | 0.16 (0.26) | 0.38 | 0.43 (0.30) | 0.26 | 0.27 (0.29) | 0.74 | 0.71 (0.19) |
N/A | S9P1 | 0.42 | 0.43 (0.31) | 0.67 | 0.61 (0.29) | 0.68 | 0.56 (0.25) | 0.42 | 0.47 (0.21) |
N/A | S9P2 | 0.50 | 0.55 (0.27) | 0.51 | 0.55 (0.24) | 0.45 | 0.58 (0.34) | 0.45 | 0.44 (0.20) |
S10 | N/A | 0.58 | 0.61 (0.32) | 0.58 | 0.57 (0.27) | 0.68 | 0.61 (0.28) | 0.37 | 0.40 (0.12) |
N/A | S10P1 | 0.36 | 0.38 (0.28) | 0.34 | 0.34 (0.20) | 0.34 | 0.44 (0.28) | 0.63 | 0.62 (0.22) |
N/A | S10P2 | 0.19 | 0.39 (0.43) | 0.77 | 0.61 (0.39) | 0.42 | 0.48 (0.46) | 0.52 | 0.51 (0.35) |
S11 | N/A | 0.48 | 0.52 (0.23) | 0.18 | 0.29 (0.28) | 0.41 | 0.42 (0.22) | 0.62 | 0.59 (0.18) |
N/A | S11P1 | 0.49 | 0.49 (0.23) | 0.40 | 0.45 (0.31) | 0.52 | 0.51 (0.25) | 0.52 | 0.52 (0.24) |
N/A | S11P2 | 0.48 | 0.42 (0.34) | 0.63 | 0.57 (0.37) | 0.50 | 0.50 (0.24) | 0.52 | 0.50 (0.28) |
S12 | N/A | 0.36 | 0.46 (0.29) | 0.60 | 0.55 (0.27) | 0.07 | 0.19 (0.28) | 0.63 | 0.60 (0.17) |
N/A | S12P1 | 0.67 | 0.56 (0.33) | 0.77 | 0.65 (0.30) | 0.57 | 0.49 (0.31) | 0.31 | 0.43 (0.29) |
N/A | S12P2 | 0.10 | 0.24 (0.32) | 0.53 | 0.52 (0.25) | 0.00 | 0.19 (0.29) | 0.74 | 0.68 (0.20) |
S13 | N/A | 0.38 | 0.40 (0.30) | 0.03 | 0.16 (0.24) | 0.30 | 0.31 (0.25) | 0.74 | 0.71 (0.18) |
N/A | S13P1 | 0.43 | 0.47 (0.36) | 0.45 | 0.50 (0.30) | 0.41 | 0.38 (0.33) | 0.56 | 0.55 (0.28) |
N/A | S13P2 | 0.51 | 0.55 (0.27) | 0.42 | 0.43 (0.26) | 0.61 | 0.62 (0.22) | 0.48 | 0.47 (0.24) |
aN/A: not applicable.
Study B: Summary of ecological momentary assessment and peer-ceived momentary assessment values. Each row shows the median, mean, and SD for the corresponding participant or peer calculated from all the assessments issued by that person.
Participant ID | Peer ID | Stress (0-1), median | Stress (0-1), mean (SD) | Fatigue (0-1), median | Fatigue (0-1), mean (SD) | Anxiety (0-1), median | Anxiety (0-1), mean (SD) | Reported well-being, median | Reported well-being, mean (SD) | Computed well-being, median | Computed well-being, mean (SD)
S1 | N/Aa | 0.49 | 0.36 (0.55) | 0.58 | 0.37 (0.60) | 0.58 | 0.66 (0.32) | 0.76 | 0.72 (0.24) | 0.39 | 0.52 (0.30) |
N/A | S1P1 | 0.44 | 0.44 (0.21) | 0.50 | 0.47 (0.23) | 0.57 | 0.54 (0.21) | 0.41 | 0.41 (0.21) | 0.51 | 0.49 (0.21) |
S2 | N/A | 0.35 | 0.34 (0.29) | 0.42 | 0.44 (0.27) | 0.42 | 0.43 (0.25) | 0.73 | 0.63 (0.28) | 0.66 | 0.63 (0.28) |
N/A | S2P1 | 0.76 | 0.69 (0.24) | 0.59 | 0.57 (0.20) | 0.66 | 0.64 (0.25) | 0.61 | 0.56 (0.29) | 0.26 | 0.35 (0.22) |
S3 | N/A | 0.50 | 0.49 (0.27) | 0.60 | 0.58 (0.27) | 0.39 | 0.42 (0.23) | 0.48 | 0.47 (0.26) | 0.37 | 0.43 (0.28) |
N/A | S3P1 | 0.47 | 0.43 (0.24) | 0.58 | 0.53 (0.32) | 0.59 | 0.59 (0.23) | 0.79 | 0.76 (0.20) | 0.39 | 0.43 (0.29) |
S4 | N/A | 0.50 | 0.53 (0.25) | 0.56 | 0.59 (0.25) | 0.42 | 0.44 (0.31) | 0.64 | 0.56 (0.24) | 0.56 | 0.52 (0.25) |
N/A | S4P1 | 0.63 | 0.66 (0.28) | 0.75 | 0.67 (0.25) | 0.69 | 0.59 (0.32) | 0.65 | 0.62 (0.28) | 0.31 | 0.39 (0.30) |
S5 | N/A | 0.22 | 0.27 (0.26) | 0.15 | 0.24 (0.27) | 0.33 | 0.34 (0.33) | 0.78 | 0.66 (0.32) | 0.67 | 0.67 (0.27) |
N/A | S5P1 | 0.44 | 0.40 (0.29) | 0.31 | 0.41 (0.32) | 0.43 | 0.43 (0.30) | 0.63 | 0.54 (0.29) | 0.59 | 0.56 (0.29) |
S6 | N/A | 0.54 | 0.56 (0.27) | 0.53 | 0.54 (0.24) | 0.55 | 0.59 (0.29) | 0.48 | 0.51 (0.23) | 0.45 | 0.39 (0.26) |
N/A | S6P1 | 0.49 | 0.47 (0.28) | 0.51 | 0.50 (0.29) | 0.62 | 0.59 (0.32) | 0.55 | 0.49 (0.32) | 0.47 | 0.48 (0.29) |
S7 | N/A | 0.00 | 0.21 (0.33) | 0.54 | 0.56 (0.20) | 0.26 | 0.35 (0.20) | 0.62 | 0.59 (0.30) | 0.81 | 0.66 (0.27) |
N/A | S7P1 | 0.36 | 0.35 (0.34) | 0.55 | 0.56 (0.28) | 0.37 | 0.41 (0.36) | 0.60 | 0.53 (0.26) | 0.66 | 0.59 (0.30) |
aN/A: not applicable.
As this study primarily focused on assessing the feasibility of the method, we started the data analysis with the simplest visualization of the raw datasets. We plotted the values reported by the participants and their peers to understand the magnitude of agreement/disagreement in their ratings over time. We also imputed the missing PeerMA values using a spline function of order 4.
To illustrate, sample plots from
Finally, the 5 plots at the bottom of
Ecological momentary assessment/peer-ceived momentary assessment. Plots from Study A. The x-axis represents days in the study, and the y-axis represents the magnitude of the normalized assessments for stress, fatigue, anxiety, and computed well-being.
Ecological momentary assessment/peer-ceived momentary assessment. Plots from Study B. The x-axis represents days in the study, and the y-axis represents the magnitude of the normalized assessments for stress, fatigue, anxiety, and computed well-being.
Various techniques can be used to quantify the daily agreement between the EMAs and PeerMAs. For instance, residual analysis such as mean absolute percent error (MAPE) or more robust alternatives such as mean arctangent absolute percentage error (MAAPE) [
Therefore, we analyzed the daily agreement between the EMAs and the PeerMAs as follows. As a first approach, we reported the mean directional accuracy (MDA), which measures the agreement (ie, we report
In the 2 studies, the number of assessments between participants and peers differed every day. Hence it was not possible to calculate the MDA for each individual EMA/PeerMA. Thus, we calculated the daily average for stress, fatigue, anxiety, and well-being for each participant and peer using all the assessments given that day. We then counted the number of days with
Mean directional accuracy. “Same Day” is the average percentage of days on which participants and peers agreed in the directional change of their assessments on the same day (close to chance). “+1 Day” is the average percentage of days on which participants and peers agreed in the directional change of their assessments on the same day or the day after.
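As a rough illustration of the daily-average and directional-agreement steps, the following Python sketch computes a same-day and a +1-day mean directional accuracy for one participant-peer pair. It is a sketch under stated assumptions, not the analysis code used in the studies: the function names and the example data are hypothetical, days are assumed to be consecutive integer indices, the direction of change is taken relative to the previous day that has data, and "no change" is treated as its own direction.

```python
from statistics import mean
from typing import Dict, List


def daily_means(assessments: Dict[int, List[float]]) -> Dict[int, float]:
    """Average all assessments given on each day (days without assessments are skipped)."""
    return {day: mean(values) for day, values in assessments.items() if values}


def directions(daily: Dict[int, float]) -> Dict[int, int]:
    """Direction of change (+1 up, -1 down, 0 flat) versus the previous day with data."""
    days = sorted(daily)
    result = {}
    for previous, current in zip(days, days[1:]):
        change = daily[current] - daily[previous]
        result[current] = (change > 0) - (change < 0)
    return result


def mean_directional_accuracy(self_daily: Dict[int, float],
                              peer_daily: Dict[int, float],
                              lag_days: int = 0) -> float:
    """Share of days on which the peer's direction matches the participant's.

    With lag_days=1, a match on the same day or on the following day counts.
    """
    self_dir, peer_dir = directions(self_daily), directions(peer_daily)
    compared = agreed = 0
    for day, direction in self_dir.items():
        candidates = [peer_dir[day + k] for k in range(lag_days + 1) if day + k in peer_dir]
        if not candidates:
            continue  # nothing to compare against on this day
        compared += 1
        agreed += direction in candidates
    return agreed / compared if compared else float("nan")


# Hypothetical normalized assessments keyed by study day.
ema = {1: [0.2, 0.4], 2: [0.5], 3: [0.3, 0.1], 4: [0.6]}
peerma = {1: [0.2], 2: [0.4], 3: [0.5, 0.7], 4: [0.5]}

same_day = mean_directional_accuracy(daily_means(ema), daily_means(peerma))
plus_one = mean_directional_accuracy(daily_means(ema), daily_means(peerma), lag_days=1)
print(same_day, plus_one)
```

In this toy example, allowing a one-day lag raises the agreement, which mirrors the distinction between the “Same Day” and “+1 Day” values.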
To further investigate the values of EMA and PeerMA, we conducted a correlation analysis for EMA/PeerMA values in both studies. By focusing on correlation, we did not assume that the EMA and the PeerMA measure the same constructs; we investigate that assumption later in this paper. We applied the Spearman rank correlation method because (1) participant and peer assessments are not independent, as they both refer to a state of the participant, and (2) the Shapiro-Wilk, D’Agostino
For
Study A: ecological momentary assessment/peer-ceived momentary assessment Spearman correlations calculated throughout the study. Each row shows the correlation between the participants’ and peers’ assessments.
Participant ID | Peer ID | Stress, rs | Stress, P value | Fatigue, rs | Fatigue, P value | Anxiety, rs | Anxiety, P value | Computed well-being, rs | Computed well-being, P value
S1 | S1P1 | 0.44 | .02 | 0.28 | .15 | 0.57 | .002 | 0.58 | .001 | |
S2 | S2P1 | 0.50 | .003 | 0.26 | .15 | 0.24 | .17 | 0.63 | <.001 | |
S3 | S3P1 | –0.18 | .36 | –0.16 | .40 | –0.09 | .64 | –0.15 | .45 | |
S4 | S4P1 | 0.06 | .76 | –0.10 | .62 | –0.09 | .64 | 0.09 | .67 | |
S5 | S5P1 | –0.10 | .62 | –0.23 | .24 | –0.35 | .07 | –0.19 | .33 | |
S6 | S6P1 | –0.23 | .23 | –0.11 | .57 | 0.27 | .16 | –0.08 | .66 | |
S7 | S7P1 | 0.19 | .33 | –0.01 | .96 | –0.31 | .11 | –0.13 | .49 | |
S7 | S7P2 | –0.01 | .96 | 0.28 | .14 | 0.19 | .32 | 0.36 | .05 | |
S8 | S8P1 | 0.11 | .57 | 0.08 | .68 | 0.22 | .25 | 0.26 | .18 | |
S8 | S8P2 | 0.32 | .10 | 0.59 | <.001 | 0.33 | .08 | 0.63 | <.001 | |
S9 | S9P1 | –0.25 | .16 | –0.05 | .80 | –0.28 | .10 | –0.37 | .03 | |
S9 | S9P2 | 0.02 | .91 | 0.29 | .10 | –0.09 | .60 | –0.16 | .38 | |
S10 | S10P1 | 0.29 | .13 | –0.18 | .35 | 0.04 | .84 | 0.35 | .06 | |
S10 | S10P2 | 0.14 | .48 | 0.05 | .80 | –0.15 | .45 | –0.14 | .46 | |
S11 | S11P1 | 0.39 | .03 | 0.39 | .03 | –0.06 | .75 | 0.28 | .13 | |
S11 | S11P2 | 0.07 | .70 | 0.03 | .89 | –0.06 | .75 | 0.15 | .42 | |
S12 | S12P1 | 0.32 | .10 | 0.27 | .16 | 0.21 | .28 | 0.56 | .002 | |
S12 | S12P2 | 0.07 | .73 | –0.24 | .22 | –0.02 | .94 | 0.07 | .71 | |
S13 | S13P1 | 0.41 | .02 | –0.19 | .30 | 0.41 | .02 | 0.27 | .14 | |
S13 | S13P2 | 0.36 | .05 | –0.34 | .06 | 0.43 | .02 | 0.27 | .15 |
Study A: summary of ecological momentary assessment/peer-ceived momentary assessment Spearman correlations.
Correlation strength | Stress, n (%) | Fatigue, n (%) | Anxiety, n (%) | Computed well-being, n (%) | Total, n (%) |
Highly positivea | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
Moderately positiveb | 5 (25) | 2 (10) | 3 (15) | 6 (30) | 16 (20) |
Weakly positivec | 10 (50) | 8 (40) | 7 (35) | 7 (35) | 32 (40) |
Weakly negatived | 5 (25) | 9 (45) | 9 (45) | 6 (30) | 29 (36) |
Moderately negativee | 0 (0) | 1 (5) | 1 (5) | 1 (5) | 3 (4) |
Highly negativef | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
a(0.67 to 1.00): Values of the Spearman correlation inside this interval are considered highly positive.
b(0.34 to 0.66): Values of the Spearman correlation inside this interval are considered moderately positive.
c(0.00 to 0.33): Values of the Spearman correlation inside this interval are considered weakly positive.
d(−0.33 to 0.00): Values of the Spearman correlation inside this interval are considered weakly negative.
e(−0.66 to −0.34): Values of the Spearman correlation inside this interval are considered moderately negative.
f(−1.00 to −0.67): Values of the Spearman correlation inside this interval are considered highly negative.
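To make the correlation and its strength categories concrete, the following Python sketch computes a Spearman rank correlation (as the Pearson correlation of average ranks) and maps the coefficient to the intervals listed in the table footnotes. It is an illustrative sketch only: the function names and the example series are hypothetical, the coefficient is rounded to 2 decimals before binning, and a coefficient of exactly 0 is counted as weakly positive; neither of these two choices is stated in the source.

```python
from typing import List


def average_ranks(values: List[float]) -> List[float]:
    """Ranks starting at 1; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def pearson(x: List[float], y: List[float]) -> float:
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one series is constant
    return cov / (var_x * var_y) ** 0.5


def spearman(x: List[float], y: List[float]) -> float:
    """Spearman rank correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def strength(rs: float) -> str:
    """Map a coefficient to the strength labels used in the summary tables."""
    r = round(rs, 2)  # assumption: bin the value as it would be reported
    magnitude = abs(r)
    level = "highly" if magnitude >= 0.67 else "moderately" if magnitude >= 0.34 else "weakly"
    direction = "positive" if r >= 0 else "negative"  # assumption: 0 counts as positive
    return f"{level} {direction}"


# Hypothetical daily averages for one participant and one peer.
ema_daily = [0.2, 0.4, 0.6, 0.8, 0.5]
peerma_daily = [0.1, 0.3, 0.7, 0.6, 0.2]
rs = spearman(ema_daily, peerma_daily)
print(rs, strength(rs))
```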
For
Study B: ecological momentary assessment/peer-ceived momentary assessment Spearman correlations calculated throughout the study. Each row shows the correlation between the participants’ and peers’ assessments.
Participant ID | Peer ID | Stress, rs | Stress, P value | Fatigue, rs | Fatigue, P value | Anxiety, rs | Anxiety, P value | Reported well-being, rs | Reported well-being, P value | Computed well-being, rs | Computed well-being, P value
S1 | S1P1 | –0.39 | .02 | –0.36 | .03 | –0.33 | .05 | 0.18 | .30 | –0.30 | .08 | |
S2 | S2P1 | –0.31 | .10 | 0.33 | .08 | 0.26 | .18 | 0.40 | .03 | 0.13 | .51 | |
S3 | S3P1 | 0.37 | .04 | 0.44 | .01 | 0.04 | .84 | –0.08 | .68 | 0.44 | .01 | |
S4 | S4P1 | 0.24 | .21 | 0.16 | .42 | 0.10 | .60 | –0.17 | .38 | 0.37 | .048 | |
S5 | S5P1 | 0.04 | .84 | 0.03 | .88 | 0.06 | .76 | 0.28 | .13 | –0.01 | .95 | |
S6 | S6P1 | 0.44 | .07 | –0.06 | .81 | 0.24 | .34 | 0.83 | <.001 | 0.26 | .30 | |
S7 | S7P1 | 0.41 | .03 | 0.20 | .29 | –0.25 | .19 | 0.58 | <.001 | –0.58 | .001 |
Study B: summary of ecological momentary assessment/peer-ceived momentary assessment Spearman correlations.
Correlation strength | Stress, n (%) | Fatigue, n (%) | Anxiety, n (%) | Reported well-being, n (%) | Computed well-being, n (%) | Total, n (%) |
Highly positivea | 0 (0) | 0 (0) | 0 (0) | 1 (14) | 0 (0) | 1 (3) |
Moderately positiveb | 3 (43) | 1 (14) | 0 (0) | 2 (29) | 2 (29) | 8 (23) |
Weakly positivec | 2 (29) | 4 (57) | 5 (71) | 2 (29) | 2 (29) | 15 (43) |
Weakly negatived | 1 (14) | 1 (14) | 2 (29) | 2 (29) | 2 (29) | 8 (23) |
Moderately negativee | 1 (14) | 1 (14) | 0 (0) | 0 (0) | 1 (14) | 3 (9) |
Highly negativef | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) | 0 (0) |
a(0.67 to 1.00): Values of the Spearman correlation inside this interval are considered highly positive.
b(0.34 to 0.66): Values of the Spearman correlation inside this interval are considered moderately positive.
c(0.00 to 0.33): Values of the Spearman correlation inside this interval are considered weakly positive.
d(−0.33 to 0.00): Values of the Spearman correlation inside this interval are considered weakly negative.
e(−0.66 to −0.34): Values of the Spearman correlation inside this interval are considered moderately negative.
f(−1.00 to −0.67): Values of the Spearman correlation inside this interval are considered highly negative.
We conducted 3 more correlation analyses relevant to these studies, including the entry survey reports (GERT, PSS, and SDS) as collected within the studies. We again chose the Spearman rank correlation method because of the small number of samples: 17 for
The correlation between participants’ and peers’ SDS, PSS, and self-considered stressed score (from
For both participants and peers, we derived the correlation between the median of the following pairs of states: stress-fatigue, stress-anxiety, stress–computed well-being, fatigue-anxiety, fatigue–computed well-being, and anxiety–computed well-being.
The correlation between the participants’ median of stress/fatigue/anxiety/well-being and the peers’ median of the perceived state was assessed. This is important to understand whether there is high or low agreement in the assessments of the states at the sample level. We used the median because, at the individual level, the assessments are not independent and are not normally distributed. In
Following the correlation analyses in the previous sections, we focused on the statistical agreement of the EMA/PeerMA assessments, for which we assumed that the EMA and PeerMA measure the same constructs. To quantify the overall agreement between EMA and PeerMA, we applied the Wilcoxon signed-rank test to determine whether the medians of the 2 sets (EMAs from participants and PeerMAs from peers) are statistically equal. We chose this test because (1) participants’ and peers’ assessments are not independent; (2) the participants’ and peers’ assessments are paired; and (3) in our datasets, not all individual and peer assessments are normally distributed. The null hypothesis,
For
The results for
Study A: Wilcoxon signed-rank significance tests for ecological momentary assessment/peer-ceived momentary assessment (
Peer ID | ||||
S1P1 | .001 | <.001 | <.001 | .34 |
S2P1 | <.001 | <.001 | .58 | .002 |
S3P1 | .74 | .004 | .21 | <.001 |
S4P1 | <.001 | .19 | .07 | .002 |
S5P1 | <.001 | .03 | .09 | <.001 |
S6P1 | .96 | .63 | .09 | .61 |
S7P2 | .003 | .002 | .001 | .18 |
S8P2 | <.001 | .13 | <.001 | <.001 |
S9P1 | <.001 | .02 | <.001 | .23 |
S9P2 | <.001 | .05 | .001 | <.001 |
S10P1 | <.001 | <.001 | .02 | .28 |
S10P2 | .36 | .01 | .29 | .72 |
S11P1 | .37 | .003 | .40 | <.001 |
S11P2 | .77 | <.001 | .10 | <.001 |
S12P1 | .22 | .01 | <.001 | <.001 |
S12P2 | .03 | .82 | .94 | <.001 |
S13P1 | .06 | <.001 | .12 | .002 |
S13P2 | .13 | .43 | .80 | .19 |
Study B: Wilcoxon signed-rank significance tests for ecological momentary assessment/peer-ceived momentary assessment (
Peer ID | |||||
S1P1 | .003 | .03 | .01 | .07 | .42 |
S2P1 | .001 | .02 | .002 | .30 | <.001 |
S3P1 | .21 | .18 | .01 | <.001 | .50 |
S4P1 | .01 | .29 | .07 | .64 | .05 |
S5P1 | .02 | .02 | .35 | .03 | .13 |
S6P1 | .06 | .50 | .56 | .45 | .06 |
S7P1 | <.001 | .03 | .01 | .07 | .01 |
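For readers who wish to reproduce this kind of paired comparison, the following Python sketch implements a two-sided Wilcoxon signed-rank test for one participant-peer pair. It is a sketch under stated assumptions rather than the procedure used in the studies: it drops zero differences, uses average ranks for ties, and relies on the normal approximation with a tie correction and no continuity correction (for small samples, an exact test from a statistics package may be preferable). The function name and the example values are hypothetical.

```python
import math
from collections import Counter
from typing import List, Tuple


def wilcoxon_signed_rank(x: List[float], y: List[float]) -> Tuple[float, float]:
    """Return (W, two-sided P value) for paired samples x and y.

    W is the smaller of the positive and negative rank sums.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]  # zero differences are dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")

    # Average rank for each distinct absolute difference.
    abs_sorted = sorted(abs(d) for d in diffs)
    rank_of = {}
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs_sorted[j + 1] == abs_sorted[i]:
            j += 1
        rank_of[abs_sorted[i]] = (i + j) / 2 + 1
        i = j + 1

    w_plus = sum(rank_of[abs(d)] for d in diffs if d > 0)
    w_minus = sum(rank_of[abs(d)] for d in diffs if d < 0)
    statistic = min(w_plus, w_minus)

    # Normal approximation of the null distribution, corrected for ties.
    mean_w = n * (n + 1) / 4
    tie_term = sum(t ** 3 - t for t in Counter(abs_sorted).values()) / 48
    var_w = n * (n + 1) * (2 * n + 1) / 24 - tie_term
    z = (statistic - mean_w) / math.sqrt(var_w)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return statistic, p_value


# Hypothetical paired daily averages (participant EMA vs peer PeerMA).
ema_daily = [0.35, 0.41, 0.22, 0.30, 0.69, 0.50, 0.28, 0.45]
peerma_daily = [0.70, 0.62, 0.65, 0.57, 0.39, 0.58, 0.60, 0.49]
w, p = wilcoxon_signed_rank(ema_daily, peerma_daily)
print(w, p)
```

A P value above the chosen significance level means that the test does not reject equality of the paired assessments, which is how agreement between a participant and a peer is read in the tables above.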
In general, users found the app easy to use and liked the brief surveys. Some users said that the app helped them to be more aware of their emotions. We quote some of their comments,
For
We asked participants how they felt about reflecting on their own emotional states. For
I had the impression of being more aware of my anxiety during the study, before, I did not pay special attention to it.
For
It allowed me to anchor myself and see what is causing me to feel stressed.
Loved it. It made me more aware of my emotional states than ever before.
The higher sampling frequency in
We asked peers how they felt about reflecting on someone else’s emotional states during the study. In both studies, some peers said that the task was challenging to complete at times. Others said that participating in the study allowed them to learn more about emotional states. For example, for
I have the feeling to take stress problems more seriously, not like everyone is stressed and it is normal, but to understand that stress can block life of some people.
It made me think if she is doing well and this experience made me write to her more often.
For
It was a little harder than I had expected. I typically use facial expression and tone to determine my friends emotional states. On days that I don’t see her, I’d have to rely on how she texts.
It allowed me to learn about his mood every day and know him better and any problems going on.
This section summarizes the results related to the technology choices in the 2 studies and how they may influence the methods’ feasibility. In the 2 studies, we assumed that both EMA and PeerMA have the same technological requirements: (1) the mechanism to trigger the questions to participants at desired moments and (2) the channel to trigger the questions to the user and collect the answers reliably. As explained in the
To trigger the questions, van Berkel et al [
Regarding the channel itself, smartphones are commonly used for mobile human studies as they are often close to the owner [
The following are recommendations made by users (participants and peers) in our studies. First, they wanted a dashboard showing their previous assessments and how many they had completed each day. This may improve response compliance; however, it may also increase reactivity to the study itself, because a momentary, ecologically valid EMA may be influenced by the number or content of past EMAs (depending on the dashboard design). Second, the participants wanted greater freedom to select the moods or states that they felt confident about reporting at a given moment. Such freedom is possible, but it should be granted carefully to preserve the collection of ecologically valid EMAs and PeerMAs, to ensure that participants and peers share an understanding of the state being assessed (eg, snacking on foods throughout the day), and to ensure that the data collected relate directly to the main goal of the study. Additionally, such a participant-driven EMA/PeerMA study design may introduce bias because only the states that participants want to report are collected.
One peer in
In summary, our first research aim was to explore the feasibility of applying the PeerMA method to involve peers to assess phenomena such as mental states in healthy individuals. The second aim was to determine operational aspects and human factors that need to be taken into account to most effectively use the PeerMA method. We reflect upon the overall results to answer these questions.
In this section, to address the first research aim, we summarize the experience from both studies related to the tangible contributions from participants and peers using both the EMA and the PeerMA methods.
Regarding user recruitment and participation, we first noticed that some participants had difficulty finding a peer. In
Once enrolled, participant retention was high. Although recruiting participants with peers required more effort than recruiting participants for EMA-only studies, we successfully completed two field studies of 4 weeks’ duration in two geographically distant locations, as described here. In both studies, we observed that almost all participants who enrolled with a peer from the beginning were able to continue in the study till the end, on average, 29 (SD 1
When it comes to the overall agreement between EMA and PeerMA assessments, the evidence for the feasibility of the PeerMA method is as follows. Although not conclusive or generalizable, we observed a strong, close to statistically significant correlation between participants' and peers' assessments of stress and anxiety (for
Additionally, we recall the overall percentage of peers who reported daily assessments statistically equal to those reported by their participants. In
Anxiety is a complex state and highly involuntary [
In addition, we consider that detecting fatigue in peers is less challenging as people tend to talk about it openly. Nevertheless, one person may be highly fatigued after 1 night of poor sleep, whereas another person may not reach that same level after 2 or 3 nights of poor sleep. For this purpose, psychometric models at the individual level help make comparisons among the assessments [
Finally, stress has a social component affected by stereotypes [
In summary, there are challenges and open questions regarding interoceptive awareness (ability to consciously sense the inner state of the body) to consider when using EMA to study emotional and physical states [
We address the second research aim by summarizing operational aspects and human factors derived from the experience of using the PeerMA method in the two studies. We discuss the implications of certain technological choices and offer recommendations for researchers who wish to include the method in their studies.
To begin with, recruiting participants for studies is a known challenge [
One recommendation is to first apply the PeerMA method in cohorts in which reaching out to a peer is less complicated and more motivating and valuable for both participants and peers. For instance, at the time of writing, we are conducting a study with adult patients of the Stanford Medical Center who are recovering from a liver transplant. In this case, the patients answer EMAs, whereas their support persons answer PeerMAs. Recruiting peers for this group was more straightforward because of the anticipated clinical value of the study.
Another recommendation, which relates to the technological choices influencing the feasibility of the methods, is to include gamification techniques (eg, scoring points, winning prizes, and solving puzzles) as part of the study dynamics, which can provide an incentive for users to contribute data along the study duration and complete the study [
In addition, based on the experience from these two exploratory studies, we find it worth exploring how EMA and PeerMA can be combined during a study to provide accurate, timely information about the observed participant state (and its changes). One suggestion is to modulate the administration of EMA and PeerMA based on prior knowledge about the participants’ state of interest and on individual just-in-time answers. This would resemble computerized adaptive tests, in which questions are tailored to the individual’s past answers. In our case, the question on well-being (or a similarly discriminating question) could always come first. From our study design and results, we know that well-being encompasses stress, fatigue, or anxiety, and potentially other states. The next question would then be chosen based on the answer (high or low well-being): for low well-being, the relevant question(s) for the current state (eg, stress, fatigue, or anxiety) would be triggered to the participant and the peer; for high well-being, the survey could end. This approach may reduce the participants’ burden of monotonously answering questions when there may be no new information to provide. In other cases, the peers could be asked only to validate the response of the participant (eg, “Have you taken your medication?”). Nevertheless, some participants may feel that their privacy is at stake when their peers answer a PeerMA every time they answer an EMA. Participants could be given the option to decide, in real time, not to send the PeerMA to peers if they want to retain control over their privacy.
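The adaptive administration suggested above can be sketched as a simple branching rule. This is only an illustration under stated assumptions: the 1-5 well-being scale, the `low_threshold` cutoff, the `plan_survey` and `SurveyPlan` names, and the choice to mirror the participant's questions to the peer are all hypothetical and were not part of the platform used in the studies.

```python
from dataclasses import dataclass, field

# States asked about only when well-being is low (taken from the text above).
FOLLOW_UP_STATES = ("stress", "fatigue", "anxiety")


@dataclass
class SurveyPlan:
    participant_questions: list = field(default_factory=list)
    peer_questions: list = field(default_factory=list)


def plan_survey(well_being, low_threshold=3, share_with_peer=True):
    """Choose the questions to trigger after the well-being answer.

    well_being: participant's answer on a hypothetical 1-5 scale.
    low_threshold: hypothetical cutoff; answers below it count as low.
    share_with_peer: False when the participant opts out, in real time,
        of sending the matching PeerMA to the peer.
    """
    plan = SurveyPlan(participant_questions=["well-being"])
    if well_being < low_threshold:
        # Low well-being: follow up on the states it may encompass.
        plan.participant_questions.extend(FOLLOW_UP_STATES)
    if share_with_peer:
        # Assumption: the peer receives the same questions as the participant.
        plan.peer_questions = list(plan.participant_questions)
    return plan


print(plan_survey(well_being=5))
print(plan_survey(well_being=2))
print(plan_survey(well_being=2, share_with_peer=False))
```

With high well-being, the survey ends after one question; with low well-being, the follow-up states are added for both roles unless the participant withholds the PeerMA.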
Another variation of the PeerMA design is to give users the option to customize, within limits, the time windows in which they feel available and willing to answer certain types of questions. Some users may be more engaged if they can choose the type of signal as well as the time window in which they are better prepared to take an EMA or PeerMA (in the same way as they make other daily choices, such as having a cup of coffee or making a personal phone call). On the other hand, this could be detrimental to the research if participants prepare a specific answer, knowing that the questions are triggered at specific times, or if they choose not to answer questions when they feel more stressed, causing data loss and biased results. Additionally, some studies with PeerMA may allow users to take the roles of participant and peer simultaneously, so that they report on each other’s states, which may increase engagement.
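One way to combine user-selected time windows with some remaining unpredictability is to draw the exact prompt time at random within each window. The sketch below illustrates this idea only; randomizing within windows is our assumption rather than a design evaluated in the two studies, and the `draw_prompt_times` name and the example windows are hypothetical.

```python
import random
from datetime import date, datetime, time, timedelta


def draw_prompt_times(windows, day, rng=None):
    """Draw one random prompt time inside each user-selected window.

    windows: list of (start, end) datetime.time pairs chosen by the user.
    day: the datetime.date for which prompts are scheduled.
    rng: optional random.Random instance, useful for reproducible schedules.
    """
    if rng is None:
        rng = random.Random()
    prompts = []
    for start, end in windows:
        start_dt = datetime.combine(day, start)
        end_dt = datetime.combine(day, end)
        if end_dt <= start_dt:
            raise ValueError("window end must be after its start")
        span_seconds = int((end_dt - start_dt).total_seconds())
        # Uniform offset in [0, span_seconds), so the prompt falls before `end`.
        offset = rng.randrange(span_seconds)
        prompts.append(start_dt + timedelta(seconds=offset))
    return sorted(prompts)


# Hypothetical morning and evening windows picked by a user.
windows = [(time(9, 0), time(11, 0)), (time(18, 0), time(20, 0))]
for prompt in draw_prompt_times(windows, date(2020, 1, 6)):
    print(prompt.isoformat(timespec="minutes"))
```

The user still controls when prompts may arrive, but cannot anticipate the exact moment, which may limit the prepared answers discussed above. It does not address selective nonresponse at stressful times.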
Overall, because PeerMA is still an exploratory method, many opportunities remain to design studies that leverage it. The main methodological question is what information or measure about the user state could be collected in reliable and minimally obtrusive ways from the participant and his/her peers. If chosen properly, we believe that such information collected from participants and peers simultaneously could enable further understanding of the observed state.
One limitation of these studies is the lack of ground truth for the assessed conditions (stress, fatigue, or anxiety). Despite the data presented in this preliminary analysis, we were unable to determine whether the EMA or the PeerMA was closer to the actual state of the participants. This limitation is inherent to any self-report, EMA-based study. To further examine the reliability of PeerMA, more research is needed that incorporates additional modalities, such as heart rate variability for physiological stress.
Another limitation of our work is the small sample size of participants and peers. As indicated earlier, a larger sample is necessary to further investigate the reliability of the assessments as well as the effects introduced by sample characteristics such as the amount of time peers interact with the participants during a day, type of relationship between participant and peer, or coresidence as reported by Neumann et al [
Additionally, research is needed to examine whether PeerMA affects the usual behavior of the participant during a study, namely, whether the participant, knowing that he/she is being observed, explicitly changes his/her state and behavior when interacting with the peer. Similarly, further research is needed to include the state of the peer at the moment of answering a PeerMA, for example, to understand the possible effects or biases related to the ego depletion theory [
Finally, our future work includes a case-by-case examination of how the peers’ reported level of confidence (ie, low, moderate, or high), as well as other socioeconomic characteristics, influences the results derived from the EMA and PeerMA agreement.
We presented results from two user studies conducted in the participants’ natural daily life environments, evaluating the first version of a platform implementing the PeerMA method deployed on users’ personal smartphones. The studies showed encouraging results from a total of 20 participants and 27 peers contributing multiple daily assessments for approximately 4 weeks each. In the studies, we collected empirical evidence regarding the feasibility of the method. We discussed the methodological and human aspects related to the application of the PeerMA method to study real-life phenomena, including those related to mental health. We demonstrated that users accepted the method and provided valuable feedback. We identified and discussed improvement opportunities that could lead to higher user engagement as well as more elaborate methodological options for researchers to explore when leveraging PeerMA in their studies. We discussed technical aspects to consider for a reliable, technology agnostic, and minimally obtrusive implementation of the PeerMA method.
We believe that the PeerMA method evaluated in this study opens a new perspective to study an individual’s state based on frequent and possibly paired observations from trusted peers beyond the information traditionally obtained with EMA. As an independent observation, it has value for applications in clinical settings to evaluate the severity of and support treatment of mental disorders such as OCD or addictions. However, more research is needed to guarantee reliable utilization with sufficient control to manage potential emergent biases stemming from either the participants or the peers or the momentary context in which PeerMA is triggered.
EMA: ecological momentary assessment
ESM: Experience Sampling Method
GERT: Geneva Emotion Recognition Test
MAAPE: mean arctangent absolute percentage error
MAPE: mean absolute percent error
MDA: mean directional accuracy
OCD: obsessive-compulsive disorder
PeerMA: peer-ceived momentary assessment
PSS: Perceived Stress Scale
SDS: Social Desirability Scale
UNIGE: University of Geneva
The authors acknowledge the Swiss Federal Commission for Scholarships for Foreign Students (2016-19), H2020 WellCo (no. 769765, 2018-21), AAL CoME (2014-7-127), and the University of Costa Rica.
None declared.