This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR mhealth and uhealth, is properly cited. The complete bibliographic information, a link to the original publication on http://mhealth.jmir.org/, as well as this copyright and license information must be included.
Tracking individuals in environmental epidemiological studies using novel mobile phone technologies can provide valuable information on geolocation and physical activity, which will improve our understanding of environmental exposures.
The objective of this study was to assess the performance of one of the least expensive mobile phones on the market to track people's travel-activity pattern.
Adults living and working in Barcelona (N=162; 72 bicycle commuters) simultaneously carried a mobile phone and a Global Positioning System (GPS) tracker and filled in a travel-activity diary (TAD) for 1 week. The CalFit app for mobile phones was used to log participants’ geographical location and physical activity. The geographical location data were assigned to different microenvironments (home, work or school, in transit, others) with a newly developed spatiotemporal map-matching algorithm. The tracking performance of the mobile phones was compared with that of the GPS trackers using the chi-square test and the Kruskal-Wallis rank sum test. The minute-by-minute agreement across all microenvironments between the TAD and the algorithm was assessed using the Gwet agreement coefficient (AC1).
The mobile phone acquired locations for 905 (29.2%) more trips reported in travel diaries than the GPS tracker (
Mobile phones running the CalFit app provide better information on the microenvironments in which people spend their time than previous approaches based only on GPS trackers. Mobile phones improve microenvironment determination because they are faster at obtaining a first location and can obtain locations in challenging environments, thanks to the combination of assisted-GPS technology and network positioning systems. Moreover, collecting location information from mobile phones, which are already carried by individuals, allows monitoring more people with a cheaper and less burdensome method than deploying GPS trackers.
Environmental exposures are crucial determinants of people's health [
Mobile phone technology may help to overcome the previous limitations because of its widespread use around the world and the combination of assisted GPS technology and network positioning systems [
In this context, the aim of this study was to assess the performance of mobile phone technology in tracking people's travel-activity pattern in a dense city while they perform their daily life activities.
This is a concurrent validation study comparing the tracking and travel-activity determination of a mobile phone versus a GPS tracker and a travel-activity diary (TAD), respectively. The study is nested in the Transportation, Air Pollution and Physical Activities (TAPAS) Travel Survey study [
For this study, a convenience sample of 178 participants from TAPAS Travel Survey study was used. The TAPAS sample was composed of 815 healthy participants recruited following stratified sampling according to commute mode (bicycle vs motorized commuters) in 4 randomized spatiotemporal sampling points, for each of the 10 districts of Barcelona [
The study protocol was approved by the Clinical Research Ethical Committee of the Parc de Salut Mar (CEIC-Parc de Salut Mar), and written informed consent was obtained from all participants.
Participants were instructed to wear a belt with a Samsung Galaxy Y S5360 mobile phone (Samsung Electronics Co Ltd, Suwon, South Korea) and a GlobalSat BT-335 GPS tracker (GlobalSat WorldCom Corp, Taipei, Taiwan) during waking hours for 7 consecutive days and to fill in a TAD for all of their trips throughout the day.
The Samsung Galaxy Y S5360 mobile phone was selected because it has a built-in accelerometer and GPS sensor, it was available in many countries, and it was among the cheapest mobile phones on the market when the study began. It runs Android 2.3.6 and operates with a Broadcom BCM21553 chipset and a BCM4751 GPS module. The Broadcom BCM4751 is a single-chip, 12-channel GPS receiver with all-in-view tracking [
CalFit is software for Android mobile phones developed by the University of California, Berkeley [
Once data collection was completed, each geographical coordinate provided by the mobile phone was assigned to 1 of the 4 predefined microenvironments (home, work or school, in transit, and others) using a newly developed spatiotemporal map-matching algorithm. This map-matching algorithm was developed for this study because no algorithms were available for postprocessing the clouds of geographical coordinates generated when participants stay at a place. The chosen cutoff points are based on an extensive review of the mobile phones’ location data. In brief, the algorithm computes the azimuth between sequential coordinates and calculates the circular variance within groups of 30 coordinates that lie within less than 100 m of linear distance. When the circular variance is greater than 0.7, the group of coordinates is identified as a potential place. Then, all coordinates within 30 minutes and 150 m are considered to belong to this spatiotemporal place. Finally, these spatiotemporal places are assigned to a specific microenvironment when the distance between the group of coordinates and the geocoded microenvironment is less than 150 m. The remaining groups of coordinates, which do not belong to any of the previous microenvironments, are classified as other microenvironments and their central coordinates are calculated.
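The first step of the algorithm (azimuths between sequential coordinates, then circular variance within a group) can be sketched as follows. This is a minimal illustration and not the study's implementation: the function names are hypothetical, the circular variance is taken as 1 minus the mean resultant length (a common definition that the text does not confirm), and the 100 m, 30-minute, and 150 m rules are omitted.

```python
import math

def azimuth(lat1, lon1, lat2, lon2):
    """Initial bearing (radians) from point 1 to point 2; inputs in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlon = math.radians(lon2 - lon1)
    y = math.sin(dlon) * math.cos(phi2)
    x = (math.cos(phi1) * math.sin(phi2)
         - math.sin(phi1) * math.cos(phi2) * math.cos(dlon))
    return math.atan2(y, x)

def circular_variance(angles):
    """Assumed definition: 1 - mean resultant length.

    0 when all directions are identical, close to 1 when they are scattered.
    """
    n = len(angles)
    c = sum(math.cos(a) for a in angles) / n
    s = sum(math.sin(a) for a in angles) / n
    return 1.0 - math.hypot(c, s)

def is_potential_place(coords, threshold=0.7):
    """coords: list of (lat, lon) for one group (30 coordinates in the study).

    The 0.7 threshold comes from the text; everything else is illustrative.
    """
    bearings = [azimuth(*coords[i], *coords[i + 1])
                for i in range(len(coords) - 1)]
    return circular_variance(bearings) > threshold

# A straight northbound walk versus jitter around a fixed point.
line = [(41.38 + i * 0.0001, 2.17) for i in range(30)]
offsets = [(0.0001, 0.0), (0.0, 0.0001), (-0.0001, 0.0), (0.0, -0.0001)]
jitter = [(41.38 + offsets[i % 4][0], 2.17 + offsets[i % 4][1])
          for i in range(30)]
print(is_potential_place(line), is_potential_place(jitter))
```

The intuition is that a person in motion produces consistent bearings (low variance), whereas the cloud of fixes recorded while staying at a place produces bearings pointing in all directions (high variance).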
The GlobalSat BT-335 GPS tracker was selected because of its good performance in the study by Wu and colleagues [
At the end of the study week, participants returned the TAD, which was checked day by day and trip by trip to ensure that all trips, destinations, durations, and addresses were congruent, and participants were helped to correct any illogical situation found. The main travel mode of all multimodal trips of the TAD (n=177 trips) was defined as the most motorized travel mode according to the following ranking: car > motorcycle > bus > metro > bicycle > walk. The geographical coordinates of both the mobile phone–based CalFit and the GPS tracker that fell outside the European continent or implied a speed of ≥200 km/h were flagged. Finally, owing to schedule incompatibilities, not all participants were sampled for exactly a week (3 were monitored for fewer than 7 days and 25 for more than 7 days). As a result, the total number of monitored days was 1173. Among these, 187 (16%) days were excluded for the following reasons: (1) the sensors were worn less than 10 hours during waking hours according to the wearing time estimates derived from CalFit physical activity measurements [
For the analysis, 2 datasets were generated. The first dataset was a trip-level spatiotemporal dataset, with the geographical coordinates of both mobile phone and GPS tracker at 10-second resolution for the episodes identified as trips by the TAD to compare their tracking performance and accuracy. Tracking performance of the trips reported in the TAD was measured by 2 dimensions, identifiability and traceability. Identifiability of TAD trips was defined as having ≥30% of trip duration with geolocation information because it was understood as the minimum cutoff point to distinguish between a real displacement and a measurement error. Traceability of TAD trips was quantified for each identifiable trip by the percentage of the trip duration with geolocation information. On the other hand, the tracking accuracy was quantified by the distance between the geographical coordinates of mobile phone and GPS tracker throughout TAD trips. This distance was calculated between concomitant locations (locations with a difference in time of <10 seconds between both monitors) and corrected for time difference and traveling speed. The second dataset was a microenvironment-level dataset, with information on whether participants were at home, work or school, in transit, or other locations at 1-minute resolution to assess the agreement and variability between map-matching algorithm and TAD.
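The identifiability and traceability definitions above can be illustrated with a short sketch. It assumes that trip duration is divided into 10-second slots and that a slot counts as covered when it contains at least one fix. The function names and the slot-based counting are illustrative assumptions; only the 30% cutoff and the 10-second resolution come from the text.

```python
def coverage(trip_start, trip_end, fix_times, step=10):
    """Share of 10-second slots in [trip_start, trip_end) with at least one fix.

    Times are in seconds; the slot-based counting is an assumption.
    """
    n_slots = max(1, (trip_end - trip_start) // step)
    covered = {(t - trip_start) // step
               for t in fix_times if trip_start <= t < trip_end}
    return len(covered) / n_slots

def tracking_performance(trip_start, trip_end, fix_times, min_share=0.30):
    """Return (identifiable, traceability in % or None if not identifiable)."""
    share = coverage(trip_start, trip_end, fix_times)
    identifiable = share >= min_share
    traceability = 100 * share if identifiable else None
    return identifiable, traceability

# A 10-minute diary trip with fixes during the first 5 minutes only.
half_covered = tracking_performance(0, 600, list(range(0, 300, 10)))
# The same trip with fixes during the first 100 seconds only.
sparse = tracking_performance(0, 600, list(range(0, 100, 10)))
print(half_covered, sparse)
```

Under these assumptions, the first trip is identifiable with a traceability of 50%, while the second falls below the 30% cutoff and receives no traceability value.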
Finally, other measurements to contextualize participants’ characteristics and built environment around the home included sociodemographic characteristics (eg, age, sex, civil and working status), health status (the question “In general, would you say your health is: Excellent, Very Good, Good, Fair, or Poor” from the SF-36 Health Survey [
To assess the validity of the tracking performance of the mobile phone, its identifiability and average traceability for all trips and for each travel mode were compared with those of the GPS tracker using the chi-square test and the Kruskal-Wallis rank sum test, respectively. The validity of mobile phone tracking accuracy was assessed by the distance between the concomitant geographical coordinates of the mobile phone and GPS tracker. The tracking accuracy of each mobile phone location was overlaid on a Catalonia street map and a district map of Barcelona city to inspect the spatial coverage and distribution.
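As an illustration of the identifiability comparison, the sketch below applies a Pearson chi-square test to the overall counts reported in the tracking performance table (1803 vs 2708 identifiable trips out of 3098). It is an assumption-laden example: it uses no continuity correction and treats the two devices as independent samples, and the text does not state which options the authors used in R.

```python
import math

def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for [[a, b], [c, d]].

    Returns (statistic, p value); the p value uses the 1-df identity
    P(X > x) = erfc(sqrt(x / 2)).
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    chi2 = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    p = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p

TRIPS = 3098  # trips reported in the travel-activity diary
gps_identified, phone_identified = 1803, 2708

chi2, p = chi_square_2x2(
    gps_identified, TRIPS - gps_identified,
    phone_identified, TRIPS - phone_identified,
)
print(f"chi2 = {chi2:.1f}, P < .001: {p < 0.001}")
```

The resulting P value is far below .001, in line with the value reported for all trips.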
On the other hand, the validity of our map-matching algorithm to determine the time in each microenvironment (home, work or school, in transit, and others) was estimated by building a misclassification matrix versus the TAD. From this matrix, the sensitivity (recall), positive predictive value (precision), specificity, negative predictive value, F-score, and Gwet agreement coefficient (AC1) statistics were computed. F-score is the harmonic mean of recall and precision. The AC1 is similar to the multicategory kappa statistic but circumvents the known weakness of kappa [
Finally, two sensitivity analyses were performed. The first compared the geolocation accuracy measure used in this study (distance to the GPS tracker) with the usual measure (distance to the nearest street), using only a subset of mobile phone geolocations. The subset includes the locations between latitudes 41.59 and 41.62 and longitudes 2.605 and 2.645, which correspond to the village of Sant Pol de Mar and its surroundings. In the second sensitivity analysis, we assessed the effect of participants' characteristics on the performance of our travel-activity algorithm and the need for specific calibration. The characteristics of participants studied were as follows: (1) main travel mode for commuting; (2) weekdays versus weekend days; (3) median distance from home to work; and (4) working versus studying status. The effect of these characteristics on the performance of the algorithm was assessed by comparing the agreement across their categories using the Kruskal-Wallis rank sum test.
All analyses were conducted during 2014-2015, using R 3.1.3 (The R Foundation for Statistical Computing), Python 2.7 (Python Software Foundation), NumPy ≥ 1.6.1 (Travis Oliphant), Pandas ≥ 0.12 (Wes McKinney), SQLite ≥ 3.7.13 (D. Richard Hipp), and SpatiaLite ≥ 4.0.0-RC1 (Alessandro Furieri).
The 162 participants were on average 33 years old, 50% were female, 20% were single, 40% had at least 1 child, 77% were currently employed, and 50% were bicycle commuters (
The mobile phone obtained locations for 905 (29%) more TAD trips than the GPS tracker (
Description of participants’ sociodemographic and home characteristics according to the main commute mode.
Characteristics | All participants (N=162) | Bicycle (n=72) | Car, motorcycle, or bus (n=47) | Underground (n=43)
Sex, female | 83 (51.2) | 34 (47.2) | 26 (55.3) | 23 (53.5)
Age in years, median (25th-75th) | 33 (26-41) | 34 (29-41) | 34 (28-42) | 27 (21-39)
Civil status: single | 29 (19.3) | 17 (25.4) | 5 (11.6) | 7 (17.1)
Has at least 1 child: yes | 59 (39.3) | 22 (32.8) | 22 (51.2) | 15 (37.5)
Education level: more than secondary | 100 (66.9) | 54 (80.6) | 25 (58.1) | 21 (53.7)
Working status: yes | 115 (76.8) | 59 (88.1) | 31 (72.1) | 25 (63.4)
Nationality: Spanish | 133 (88.7) | 59 (88.1) | 40 (93.0) | 34 (85.4)
Smoking status: current smoker | 42 (28.0) | 21 (31.3) | 15 (34.9) | 6 (15.0)
Body mass index, ≥25 | 37 (24.5) | 14 (20.9) | 15 (34.9) | 8 (19.5)
High stress level^b: yes | 33 (22.1) | 10 (15.2) | 12 (27.9) | 11 (27.5)
Health status: very good or excellent | 73 (48.3) | 35 (52.2) | 20 (46.5) | 18 (43.9)
Deprivation index | −0.2 (0.7) | −0.3 (0.7) | −0.1 (0.7) | 0.0 (0.7)
Population density^a, persons/km² | 30295 (12212) | 32002 (11372) | 29575 (12753) | 28175 (12859)
Distance to work, kilometers | 3.4 (1.8) | 2.8 (1.4) | 3.3 (1.8) | 4.6 (2.0)
Slope, % | 4.0 (5.3) | 3.4 (3.7) | 4.5 (6.7) | 4.4 (5.8)
Altitude, meters | 41 (42.7) | 37 (28.8) | 44 (54.2) | 44 (48.2)
Walkability index^a | 0.4 (2.1) | 0.6 (2.1) | 0.4 (2.1) | 0.0 (2.0)
^a Variables Age, Civil status, Has at least 1 child, Education level, Working status, Nationality, Smoking status, Body mass index, and Health status have 12 missing values; High stress level has 14 missing values; and Population density and Walkability index have 1 missing value.
^b High stress level: having a score of ≥4 in each question of the short form of the Perceived Stress Scale [
Comparison of Global Positioning System and mobile phone tracking performance and description of mobile phone tracking accuracy.
Measures, by travel mode from travel-activity diary | All | Motorcycle | Walk | Metro | Bicycle | Car | Bus | Others
Travel diary
No. of trips, n | 3098 | 358 | 706 | 581 | 839 | 409 | 199 | 6
Duration, minutes, mean (SD) | 28 (12) | 21 (9) | 27 (13) | 35 (11) | 26 (11) | 29 (13) | 32 (11) | 32 (11)
Identifiability^a
GPS^b logger, n (%) | 1803 (58.2) | 280 (78.2) | 409 (57.9) | 161 (27.7) | 574 (68.4) | 257 (62.8) | 121 (60.8) | 1 (16.7)
Mobile phone, n (%) | 2708 (87.4) | 311 (86.9) | 623 (88.2) | 511 (88.0) | 766 (91.3) | 320 (78.2) | 172 (86.4) | 5 (83.3)
P value | <.001 | <.001 | <.001 | <.001 | <.001 | <.001 | <.001 | .08
Traceability^c
GPS logger, %, median (25th-75th) | 74 (55-88) | 80 (65-94) | 74 (55-88) | 44 (36-53) | 76 (61-90) | 79 (61-89) | 74 (54-87) | 86 (86-86)
Mobile phone, %, median (25th-75th) | 76 (58-90) | 77 (60-92) | 79 (60-91) | 53 (47-62) | 85 (71-95) | 64 (46-83) | 65 (51-84) | 87 (87-87)
P value | .009 | .60 | .009 | <.001 | <.001 | <.001 | .05 | .32
Tracking accuracy^d
Overall, m, median (25th-75th) | 24 (10-51) | 22 (11-47) | 21 (9-46) | 22 (11-43) | 23 (10-49) | 34 (13-76) | 25 (12-51) | 12 (4-29)
Satellite, m, median (25th-75th) | 22 (10-47) | 21 (10-43) | 20 (8-43) | 21 (10-40) | 22 (9-45) | 29 (11-61) | 23 (11-45) | 12 (4-28)
Network, m, median (25th-75th) | 97 (26-574) | 66 (22-290) | 42 (19-183) | 104 (29-609) | 69 (23-223) | 464 (80-2311) | 54 (22-372) | 177 (104-242)
^a Identifiability of travel-activity diary trips was defined as having ≥30% of trip duration with location information.
^b GPS: Global Positioning System.
^c Traceability of travel-activity diary trips was quantified among the identifiable trips by the percentage of the trip duration with location information.
^d Tracking accuracy was quantified by the distance between the geographical coordinates of the mobile phone and the GPS tracker throughout travel-activity diary trips. This distance was calculated between concomitant geographical coordinates (geolocations with a difference in time of <10 seconds between both monitors) and corrected for time difference and traveling speed. Overall includes satellite and network locations, while satellite and network refer to the specific accuracy for each signal.
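The pairing of concomitant locations behind the tracking accuracy figures can be sketched as follows. This is a simplified illustration: each mobile phone fix is matched to the nearest-in-time GPS fix and kept only when the time difference is below 10 seconds, and the great-circle (haversine) distance is used. The correction for time difference and traveling speed mentioned above is not specified in the text and is therefore not reproduced; all function names are hypothetical.

```python
import math
from bisect import bisect_left

EARTH_RADIUS_M = 6371000.0  # mean Earth radius, an approximation

def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))

def concomitant_distances(phone, gps, max_dt=10):
    """phone, gps: lists of (time_s, lat, lon), each sorted by time.

    Returns the phone-to-GPS distance for every phone fix whose nearest
    GPS fix is less than max_dt seconds away.
    """
    gps_times = [g[0] for g in gps]
    distances = []
    for t, lat, lon in phone:
        i = bisect_left(gps_times, t)
        candidates = [j for j in (i - 1, i) if 0 <= j < len(gps)]
        if not candidates:
            continue
        j = min(candidates, key=lambda k: abs(gps_times[k] - t))
        if abs(gps_times[j] - t) < max_dt:
            distances.append(haversine_m(lat, lon, gps[j][1], gps[j][2]))
    return distances

phone_fixes = [(0, 41.0, 2.0), (100, 41.0, 2.0)]
gps_fixes = [(5, 41.0, 2.0), (200, 41.0, 2.0)]
print(concomitant_distances(phone_fixes, gps_fixes))
```

In the example, only the first phone fix has a GPS fix within 10 seconds, so a single distance is returned; the median and quartiles of such distances would then be summarized per travel mode.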
The comparison of the overall time in each microenvironment between the map-matching algorithm and the TAD showed good overall agreement on time spent in microenvironments, with differences ranging from only 0.1% (work) to 1.2% (other) across microenvironment types.
The confusion matrix (
Travel-activity confusion matrix between the travel-activity diary and our map-matching algorithm and its interrelationship statistics (cluster defined as 150 m; Gwet agreement coefficient AC1=81%).
Map-matching | Travel-activity diary: Home | Work | Others | Trip | Sens^a, % | Spec^b, % | PPV^c, % | NPV^d, % | ACC^e, % | F-score, %
Home | 758495 | 5235 | 29575 | 26580 | 94 | 90 | 93 | 91 | 92 | 93
Work | 10320 | 257200 | 10110 | 9140 | 85 | 97 | 90 | 96 | 95 | 87
Others | 20575 | 23000 | 104315 | 15485 | 61 | 95 | 64 | 95 | 91 | 62
Trip | 20755 | 15880 | 27445 | 79160 | 61 | 95 | 55 | 96 | 92 | 58
^a Sens: sensitivity.
^b Spec: specificity.
^c PPV: positive predictive value.
^d NPV: negative predictive value.
^e ACC: accuracy.
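The per-microenvironment statistics and the Gwet AC1 can be recomputed from the counts in the confusion matrix. The sketch below takes the travel-activity diary (columns) as the reference and the map-matching algorithm (rows) as the prediction. The AC1 formula is the standard multicategory definition, which is assumed here to be the one the authors applied, and the function names are illustrative.

```python
LABELS = ["Home", "Work", "Others", "Trip"]
# Rows: map-matching algorithm; columns: travel-activity diary (minutes).
MATRIX = [
    [758495, 5235, 29575, 26580],
    [10320, 257200, 10110, 9140],
    [20575, 23000, 104315, 15485],
    [20755, 15880, 27445, 79160],
]

def per_class_stats(m, k):
    """One-vs-rest statistics for class k, with the diary as reference."""
    n = sum(map(sum, m))
    tp = m[k][k]
    fp = sum(m[k]) - tp                    # algorithm says k, diary disagrees
    fn = sum(row[k] for row in m) - tp     # diary says k, algorithm disagrees
    tn = n - tp - fp - fn
    sens = tp / (tp + fn)
    ppv = tp / (tp + fp)
    return {
        "sens": sens,
        "spec": tn / (tn + fp),
        "ppv": ppv,
        "npv": tn / (tn + fn),
        "acc": (tp + tn) / n,
        "f_score": 2 * sens * ppv / (sens + ppv),  # harmonic mean
    }

def gwet_ac1(m):
    """Gwet AC1 for two raters and q categories (standard definition)."""
    n = sum(map(sum, m))
    q = len(m)
    p_agree = sum(m[k][k] for k in range(q)) / n
    # Average of the row and column marginal proportions for each category.
    pi = [(sum(m[k]) + sum(row[k] for row in m)) / (2 * n) for k in range(q)]
    p_chance = sum(p * (1 - p) for p in pi) / (q - 1)
    return (p_agree - p_chance) / (1 - p_chance)

for k, label in enumerate(LABELS):
    stats = per_class_stats(MATRIX, k)
    print(label, {name: round(100 * v) for name, v in stats.items()})
print("AC1:", round(gwet_ac1(MATRIX), 2))
```

Run on the table's counts, this reproduces the rounded sensitivity, positive predictive value, and F-score columns as well as an AC1 of about 0.81.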
Catalonia street map and Barcelona district map with the spatial distribution of the mobile phone tracking accuracy among the 986 person-days monitored. Gray points represent those locations without concomitant locations from Global Positioning System (GPS) tracker to estimate accuracy. In the district map of Barcelona (inset), the median geolocation accuracy of the mobile phone is shown in the 10 districts of Barcelona city.
Comparison of distances between smartphone and Global Positioning System (GPS) tracker concomitant locations and to the nearest street, while travelling along the C-32 and N-II highways in the surroundings of the village of Sant Pol de Mar.
The main findings of this study are that (1) the mobile phone obtained locations for 905 (29%) more trips than a commercial GPS tracker; (2) mobile phone had enough geolocation accuracy to locate the participants at the street level; and (3) the developed map-matching algorithm was able to determine people's travel-activity pattern with an overall accuracy of 83% and in-transit time with a recall of 61% and precision of 55%.
To our knowledge, this is the first study describing and comparing tracking performance and accuracy between a mobile phone and a GPS tracker in free-living conditions. Previous studies were mainly focused on evaluating the geolocation accuracy of mobile phones and were conducted in more car-dependent environments and through experimental designs [
In this deployment, the mobile phone–based CalFit obtained locations for 905 (29%) more trips than the GPS tracker. According to previous literature, this could be a consequence of the faster time to first fix position and the use of network positioning systems [
The geolocation accuracy of mobile phones using only satellite signal in previous dynamic experimental studies was between 2 m and 8 m [
This is the first study to monitor a large sample of adults during a full week while they are performing real-life activities using mobile phone technology. Previous studies mainly focused on commercial GPS trackers and experimental or quasi-experimental designs (
The performance of the map-matching algorithm to determine the time spent at home or work has been shown to be very sensitive and precise, which is consistent with previous research [
The use of the GlobalSat BT-335 as the GPS tracker, which was found by Wu and colleagues [
The interpretation of the results on tracking performance of TAD trips by mobile phone–based CalFit calls for prudence because this tracking definition is based on the percentage of the trip duration with location information, which does not take into account geolocation accuracy. On the other hand, the present assessment of geolocation accuracy is based on the comparison against a GPS tracker, and it is well known that GPS trackers are affected by environmental factors (ie, visibility and geometry of satellites) [
The mobile phone–based CalFit, together with our map-matching algorithm, provides clean tracking of people's activities. This gives researchers the opportunity to determine and understand the causal and temporal relationships of natural and urban environments with health-related behaviors and exposures, as well as with physical and mental health conditions. Moreover, this study lays the basis for future studies assessing whether this map-matching algorithm for mobile phone geolocation shows the same feasibility and precision in other built environments.
Finally, future improvements in personal monitoring must include making the apps downloadable from the Internet and transferring the measurements through the Internet directly to a cloud server, which we believe will minimize efforts during the deployment and the burden on participants and will increase participants' compliance. Furthermore, future developments should also add automatic algorithms for travel mode recognition and outdoor time determination, probably using additional recorded information from location provider (ie, number of satellites in view, number of satellites used during location determination, and HDOP) and from other mobile phone built-in sensors (ie, barometer and light and sound sensors).
Therefore, mobile phones running the CalFit app provide better information on the microenvironments in which people spend their time than previous approaches based only on GPS trackers. Mobile phones improve microenvironment determination because they are faster at obtaining a first location and can obtain locations in challenging environments, thanks to the combination of assisted-GPS technology and network positioning systems. Moreover, collecting location information from mobile phones, which are already carried by individuals, allows monitoring more people with a cheaper and less burdensome method.
Comparison of sample, monitoring duration, setting, travel behavior, and time-activity microenvironments definition across studies focused on time-activity pattern.
GPS: Global Positioning System
HDOP: horizontal dilution of precision
TAD: travel-activity diary
TAPAS: Transportation, Air Pollution and Physical Activities
The study participants were from the Europe-wide project Transportation, Air Pollution and Physical Activities: an integrated health risk assessment program of climate change and urban policies. The research leading to these results has received funding from the National Institutes of Health under the NIEHS (National Institute of Environmental Health Sciences) Grant Agreement number R01-ES020409—the CAVA project. The funders did not have any role in study design, data collection, analysis and interpretation of data, and the writing of this paper and the decision to submit it for publication. All researchers are independent from funders.
None declared.