Accepted for/Published in: Online Journal of Public Health Informatics
- Mehrab B, Ian W H, Kimmo K, Chenglin H, Cory C, Elizabeth S C W, Callisto B, Alexandra C A, Elizabeth A Y, Majid S
- Identifying Substance Use and High-Risk Sexual Behavior Among Sexual and Gender Minority Youth by Using Mobile Phone Data: Development and Validation Study
- Online Journal of Public Health Informatics
- DOI: 10.2196/11848
- PMID: 30303485
- PMCID: 6352016
Identifying Substance Use and High-Risk Sexual Behavior Among Sexual and Gender Minority Youth by Using Mobile Phone Data: Development and Validation Study
Abstract
background
Sexual and gender minority (SGM) individuals are at higher risk for substance use and sexually transmitted infections (STIs) than their non-SGM peers. Passively collecting mobile phone usage data may open new opportunities for personalizing interventions, as behavioral risks could be identified without user input.
objective
Our objectives were to determine (1) whether passively sensed mobile phone data can be used to identify substance use and sexual risk behaviors for STI and HIV transmission among young SGM who have sex with men, (2) which outcomes can be predicted with a high level of accuracy, and (3) which passive data sources are most predictive of these outcomes.
methods
We developed a mobile phone app to collect participants’ messaging, location, and app use data and trained machine learning models to predict risk behaviors for STI and HIV transmission. We used Scikit-learn to train logistic regression and gradient boosting classification models with a simple linear model specification to predict participants’ substance use and sexual behaviors (i.e., condomless anal sex, number of sexual partners, and methamphetamine use); predictions were validated against self-report questionnaires. F1 scores were used to quantify the prediction accuracy of the models when using different data sources, alone and in combination. Differences in text, location, app use, and Linguistic Inquiry and Word Count (LIWC) domains by outcome were investigated using independent t-tests, and associations were considered significant at p<0.05.
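As a reminder of what the F1 score measures, the following minimal sketch computes it for a binary outcome as the harmonic mean of precision and recall. The function name `f1_score_binary` and the label vectors are hypothetical illustrations, not study data or the authors' code; in practice, Scikit-learn's `sklearn.metrics.f1_score` provides the same metric.

```python
def f1_score_binary(y_true, y_pred):
    """F1 for the positive class (label 1): harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives
    if tp == 0:
        # No correct positive predictions: F1 is defined here as 0.
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


# Hypothetical self-reported outcomes (1 = behavior reported) and model predictions.
y_true = [1, 1, 1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 1, 0, 0]
score = f1_score_binary(y_true, y_pred)
print(round(score, 2))
```

Because F1 ignores true negatives, it is better suited than plain accuracy to outcomes that only a minority of participants report.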
results
Among participants (n=82) who identified as SGM, were sexually active, and reported recent substance use, our model was highly predictive of methamphetamine use and of having 6 or more sexual partners (F1 scores as high as 0.83 and 0.69, respectively). The model was less predictive of condomless anal sex (highest F1 score 0.38). Overall, text-based features were most predictive, but app use and location data improved predictive accuracy, particularly for detecting 6 or more sexual partners. Methamphetamine use was significantly associated with dating app use (p=0.01) and use of sex-related words (p=0.002). Having 6 or more sexual partners was associated with dating app use (p=0.02), use of sex-related words (p=0.001), and, on average, traveling farther from home (p=0.03) compared with participants who had fewer sexual partners. Methamphetamine users were more likely to use social (p=0.002) and affect words (p=0.003) and less likely to use drive-related words (p=0.02). Participants with 6 or more sexual partners were more likely to use social, affect words, and cognitive process-related words (p=0.003 and 0.004 respectively).
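The group comparisons reported here rest on independent t-tests. A minimal sketch of the test statistic follows, assuming the pooled-variance (Student) form; the abstract does not state whether equal variances were assumed. The function name `independent_t` and the two samples are hypothetical and are not study data.

```python
from math import sqrt
from statistics import mean, variance


def independent_t(a, b):
    """Return (t statistic, degrees of freedom) for two independent samples.

    Assumes equal variances in both groups (pooled-variance Student t-test).
    """
    n1, n2 = len(a), len(b)
    df = n1 + n2 - 2
    # Pooled sample variance, weighting each group by its degrees of freedom.
    pooled = ((n1 - 1) * variance(a) + (n2 - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


# Hypothetical feature values (e.g., a weekly app-use count) for two outcome groups.
group_with_outcome = [4, 6, 5, 7, 8]
group_without_outcome = [2, 3, 1, 4, 5]
t_stat, dof = independent_t(group_with_outcome, group_without_outcome)
print(t_stat, dof)
```

The p value is then read from the t distribution with `dof` degrees of freedom; a statistics package such as SciPy would normally handle that step.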
conclusions
Our results show that passively collected mobile phone data may be useful in detecting sexual risk behaviors. Expanding data collection may improve the results further, as certain behaviors, such as injection drug use, were quite rare in the study sample. These models may be used to personalize STI and HIV prevention as well as substance use harm reduction interventions.
International Registered Report
RR2-10.2196/58448
Copyright
© The authors. All rights reserved. This is a privileged document currently under peer-review/community review (or an accepted/rejected manuscript). Authors have provided JMIR Publications with an exclusive license to publish this preprint on its website for review and ahead-of-print citation purposes only. While the final peer-reviewed paper may be licensed under a CC BY license on publication, at this stage authors and publisher expressly prohibit redistribution of this draft paper other than for review purposes.