Published on 31.05.19 in Vol 7, No 5 (2019): May
Preprints (earlier versions) of this paper are available at http://preprints.jmir.org/preprint/11461, first published Jul 01, 2018.
An App Detecting Dengue Fever in Children: Using Sequencing Symptom Patterns for a Web-Based Assessment
Background: Dengue fever (DF) is one of the most common arthropod-borne viral diseases worldwide, particularly in South East Asia, Africa, the Western Pacific, and the Americas. However, DF symptoms are usually assessed using a dichotomous (ie, absent vs present) evaluation. There has been no published study that has reported using the specific sequence of symptoms to detect DF. An app is required to help patients or their family members or clinicians to identify DF at an earlier stage.
Objective: The aim of this study was to develop an app examining symptoms to effectively predict DF.
Methods: We extracted statistically significant features from 17 DF-related clinical symptoms in 177 pediatric patients (69 diagnosed with DF) using (1) the unweighted summation score and (2) the nonparametric HT person fit statistic, which can be combined with (3) the weighted score (yielded by logistic regression) to jointly predict DF risk.
Results: A total of 6 symptoms (family history, fever ≥39°C, skin rash, petechiae, abdominal pain, and weakness) significantly predicted DF. At a cutoff point of >−0.68 (P=.34), obtained by combining the weighted score and the HT coefficient, the sensitivity was 0.87 and the specificity was 0.84. The area under the receiver operating characteristic curve was 0.91. The combined approach was the better predictor: its specificity was 10.2% higher than that of traditional logistic regression.
Conclusions: A total of 6 simple symptoms analyzed using logistic regression were useful and valid for early detection of DF risk in children. Predictive specificity increased after the nonparametric HT coefficient was combined with the weighted regression score. A self-assessment using patients' mobile phones is available to discriminate DF, and it may eliminate the need for a costly and time-consuming dengue laboratory test.
JMIR Mhealth Uhealth 2019;7(5):e11461
Symptoms of Dengue Fever
Dengue fever (DF) is one of the most common arthropod-borne viral diseases worldwide, especially in South East Asia, Africa, the Western Pacific, and the Americas [ , ].
However, there is no accurate and speedy diagnostic screening test for DF at an early stage, as its signs and symptoms—for example, fever, headache, and myalgia—are similar to those of other illnesses [- ]. Some studies [ , ] that used a univariate analysis report that the presumptive diagnosis of DF is imprecise. Multivariate logistic regressions also do not significantly distinguish patients with dengue from those with other febrile illnesses [ ]. The multivariate discrimination analyses reported sensitivity and a specificity 0.76 and an area under the receiver operating characteristic (ROC) curve (AUC) of 0.93, but costly laboratory tests (Dengue Duo Immunoglobulin M and Rapid Strips, Panbio, Queensland, Australia) [ - ] were needed before DF was serologically confirmed.
Assessment of Dengue Fever
DF symptoms are usually assessed using a dichotomous (ie, absent vs present) evaluation. The dependent variable (DF+ vs DF−) predicted using independent evaluations with a weighted summation score is more accurate than that predicted using simple evaluations with an unweighted summation score. So far, there has been no published study that has reported using the specific sequence of symptoms reported or observed in specific patients suspected of having DF. All published studies to date still report results using only a standard group of symptoms with an unweighted summation score, and they merely apply their results to a general group of patients who might have DF.
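The difference between the two scoring schemes can be sketched in a few lines of Python. This is a minimal illustration only: the function names, the three-symptom responses, and the weights are hypothetical and are not the study's fitted values.

```python
# Illustrative sketch: all weights and responses below are made-up assumptions.

def unweighted_score(responses):
    """Unweighted summation score: each present symptom (1) counts equally."""
    return sum(responses)


def weighted_score(responses, weights):
    """Weighted summation score: each present symptom contributes its own weight."""
    if len(responses) != len(weights):
        raise ValueError("responses and weights must have the same length")
    return sum(w * x for w, x in zip(weights, responses))


weights = [0.5, 1.0, 2.0]   # hypothetical per-symptom weights
patient_a = [1, 1, 0]       # two low-weight symptoms present
patient_b = [0, 0, 1]       # one high-weight symptom present

print(unweighted_score(patient_a), unweighted_score(patient_b))                  # 2 1
print(weighted_score(patient_a, weights), weighted_score(patient_b, weights))    # 1.5 2.0
```

In this toy case the two schemes rank the patients in opposite order, which is why the choice of scoring can change who is flagged.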
The HT Fit Statistic Applied to Detect Dengue Fever
The nonparametric HT fit statistic has been used in education and psychometrics to identify aberrant test respondents [, ]. It is a transposed formulation of a scalability coefficient for items (eg, symptoms in this study), and it was the best among 36 person-fit statistics for detecting abnormal behaviors [ ].
In this study, we used the HT coefficient combined with weighted and unweighted variables to examine whether these combinations provide a valid and reliable approach for the early detection of DF in children.
Sample and Clinical Symptoms
The sample of 177 pediatric patients (≤16 years old; DF+: 69; DF−: 108) was the same as in our previous paper (see data in ). Guided by the literature [ - ], we collected 19 DF-related clinical symptoms from the patients’ medical records to develop the initial set of items, each coded as 0=“absent” or 1=“present,” to screen for DF infection: (1) personal history of DF, (2) family history of DF, (3) mosquito bites within the previous 2 weeks, (4) fever ≥39°C, (5) biphasic fever, (6) rash, (7) petechiae, (8) retroorbital pain, (9) bone pain (arthralgia), (10) headache, (11) myalgia, (12) abdominal pain, (13) anorexia, (14) occult hematuria, (15) stool occult blood, (16) cough, (17) sore throat, (18) soft (watery) stool, and (19) flushed skin. The use of data from these patients’ charts was approved by the Research Ethics Review Board of the Chi-Mei Medical Center.
The HT Fit Statistic
HT is defined for the persons of a dichotomous dataset with L items (in columns) and N persons (in rows) [- ], where Xni is the scored (0,1) response of person n to item i, and Pn=Sn/L. Here, Sm is the raw score for person m, and Sn is the raw score for person n.
HT is the sum of the covariances between person n and the other persons divided by the maximum possible sum of those covariances, so that HT ranges from −1 to +1 (see formula (1) in ). When the responses by person n are positively correlated with those of all the other persons, HT(n) will be positive. In contrast, when the responses by person n are negatively correlated with those of all the other persons, HT(n) will be negative. When person n's responses are random, HT(n) will be close to zero [ ]. We hypothesized that DF+ patients have different HT coefficients than DF− patients. All DF+ group members were sequenced to the DF− group members to obtain an HT coefficient using formula (1) in .
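The verbal definition above can be sketched in Python. The function below assumes the usual formulation of the person-fit HT statistic, in which the covariance between persons n and m is the mean cross-product of their responses minus Pn×Pm, and the maximum possible covariance is min(Pn(1−Pm), Pm(1−Pn)). The article's own formula (1) is not reproduced in this text, so the code is an illustration under that assumption; the function name and toy data are hypothetical.

```python
def ht_coefficient(data, n):
    """HT for person n (row index) in a 0/1 matrix given as a list of rows.

    Assumed formulation: sum of covariances with every other person,
    divided by the sum of the maximum covariances possible given the
    persons' raw scores.
    """
    n_items = len(data[0])
    p = [sum(row) / n_items for row in data]   # P_m = S_m / L for every person
    numerator = 0.0
    denominator = 0.0
    for m, row in enumerate(data):
        if m == n:
            continue
        joint = sum(a * b for a, b in zip(data[n], row)) / n_items
        numerator += joint - p[n] * p[m]
        denominator += min(p[n] * (1 - p[m]), p[m] * (1 - p[n]))
    if denominator == 0:
        return float("nan")   # undefined, eg, when person n has all 0s or all 1s
    return numerator / denominator


# Toy data: three persons with consistently ordered patterns plus one reversed pattern.
toy = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [0, 0, 0, 1],   # responds opposite to the others
]
print(ht_coefficient(toy[:3], 0))   # 1.0
print(ht_coefficient(toy, 3))       # -1.0
```

The consistent respondent reaches the upper bound of +1, and the reversed respondent reaches the lower bound of −1, in line with the range described above.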
Selecting Symptoms and Determining Predictor Variables
All symptoms were examined by the probability of Type 1 error using the following 3 steps to determine predictor variables. First, each symptom was separately examined by the univariate approach using a Chi-square test and logistic regression, respectively, to identify a significant association with DF. Second, 2 models (ie, the univariate and the multivariate approaches) were investigated to determine valid predictor variables associated with DF when the probability of Type 1 error was less than .05. Third, the predictor variables were used in a weighted combination to discriminate patients suspected of having dengue virus infection.
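As an illustration of the first step, the Chi-square statistic for a single symptom can be computed from its 2×2 table of counts. The sketch below uses one pair of rows from the symptom table in the Results (82 and 20 patients without the symptom, 26 and 49 with it, for DF− and DF+, respectively). The use of Yates' continuity correction is an assumption, as the text does not state it; the function name is hypothetical.

```python
def chi_square_2x2(a, b, c, d, yates=True):
    """Chi-square statistic for the 2x2 table [[a, b], [c, d]].

    yates=True applies Yates' continuity correction (an assumption here;
    the article does not say whether a correction was used).
    """
    n = a + b + c + d
    diff = abs(a * d - b * c)
    if yates:
        diff = max(0.0, diff - n / 2)
    denominator = (a + b) * (c + d) * (a + c) * (b + d)
    return n * diff ** 2 / denominator


# Rows: symptom absent / present; columns: DF- / DF+.
print(round(chi_square_2x2(82, 20, 26, 49), 1))               # 36.1
print(round(chi_square_2x2(82, 20, 26, 49, yates=False), 1))  # 38.0
```

With the correction, the result agrees with the value of 36.1 tabulated for these counts, which suggests (but does not prove) that a continuity correction was applied.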
Detecting Dengue Fever: A Comparison of Three Models
The efficacy of 3 models (A, B, and C) for detecting dengue fever was examined: (1) A comparison was made using univariate logistic regression in Model A to examine effects through the AUC, yielded by unweighted (ie, summed item) scores, weighted (ie, logistic regression) scores, and HT coefficients, respectively. (2) Multivariate logistic regression with the 3 aforementioned factors combined was used in Model B. (3) After selecting the significant variables in Model B, the combined predictive variables were analyzed using multivariate logistic regression in Model C to obtain effective weighted coefficients. (4) Finally, we wanted to use a single continuous variable yielded by the combined predictive variables in Model C to compare the AUC with the counterparts in Model A and C.
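Because the models are compared through the AUC, a small sketch of how an AUC can be obtained from raw scores may be useful. The version below uses the pairwise (Mann-Whitney) interpretation: the AUC is the proportion of DF+/DF− pairs in which the DF+ patient has the higher score, with ties counted as one half. This is not necessarily how MedCalc computes it internally; the function name and the scores are illustrative assumptions.

```python
def auc_pairwise(scores_pos, scores_neg):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties contribute 0.5. Quadratic in sample size, which is fine for
    small samples such as the one in this study.
    """
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical scores for two DF+ and two DF- patients.
print(auc_pairwise([3, 2], [1, 2]))   # 0.875
```

The same function can be applied to the unweighted score, the weighted score, the HT coefficient, or the combined logit to compare their discriminating power.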
Moreover, we provide the F-measure for evaluating the predictive effect, which is calculated using the following equations: precision=True Positives/(True Positives+False Positives); recall=True Positives/(True Positives+False Negatives); F-measure=(2×precision×recall)/(precision+recall).
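These three equations translate directly into code. In the sketch below, the function names and the counts in the first example are hypothetical; the second example plugs in .87 and 0.78, which are read from the last row of the model comparison table on the assumption that they are the precision and recall of the combined model.

```python
def precision_recall(tp, fp, fn):
    """Precision and recall from true positives, false positives, false negatives."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return precision, recall


def f_measure(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical counts: 8 true positives, 2 false positives, 2 false negatives.
p, r = precision_recall(tp=8, fp=2, fn=2)
print(p, r)                                # 0.8 0.8

# Values assumed to be the combined model's precision and recall.
print(round(f_measure(0.87, 0.78), 2))     # 0.82
```

Under that reading of the table, the computed value agrees with the reported F-measure of 0.82.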
Statistical Tools and Data Analyses
SPSS 15.0 for Windows (SPSS Inc) and MedCalc 184.108.40.206 for Windows (MedCalc Software) were used to calculate (1) the probability of false positives (Type 1 error) using a Chi-square test and logistic regression; (2) the Youden J index (the higher, the better), AUC, sensitivity, specificity, and the cutoff point at the maximal sum of specificity and sensitivity; and (3) correlation coefficients among the unweighted, weighted, and HT scores.
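The cutoff selection in item (2) can be sketched as a simple search: for each candidate cutoff, compute sensitivity and specificity, and keep the cutoff with the largest Youden J (sensitivity+specificity−1). This is a minimal illustration, not the MedCalc procedure; the function name and the scores are hypothetical, and a patient is classified as positive when the score is strictly greater than the cutoff.

```python
def youden_best_cutoff(scores, labels):
    """Return (J, cutoff, sensitivity, specificity) with the largest Youden J.

    labels are 1 (DF+) or 0 (DF-); a case is called positive when
    score > cutoff. Candidate cutoffs are the observed scores.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for cutoff in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s > cutoff)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s <= cutoff)
        sensitivity = tp / n_pos
        specificity = tn / n_neg
        j = sensitivity + specificity - 1
        if best is None or j > best[0]:   # the first cutoff wins on ties
            best = (j, cutoff, sensitivity, specificity)
    return best


# Hypothetical scores: three DF- patients followed by four DF+ patients.
scores = [0.1, 0.2, 0.5, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 0, 1, 1, 1, 1]
print(youden_best_cutoff(scores, labels))   # (0.75, 0.5, 0.75, 1.0)
```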
Demographic Characteristics of the Study Sample and the Likelihood of Dengue Fever
A total of 69 pediatric patients clinically diagnosed with DF and 108 pediatric patients with no evidence of DF infection were included in this study (). A Chi-square test and logistic regression analyses showed that only 6 symptoms (family history, fever ≥39°C, skin rash, petechiae, abdominal pain, and weakness) were significant for assessing the likelihood of DF ( ).
|Demographic variables||Dengue fever (–)a, n (%)||Dengue fever (+)b, n (%)||Total, n (%)||P valuec|
|Gender|
|Female||47 (43.5)||29 (42)||76 (42.9)||.84|
|Male||61 (56.5)||40 (58)||101 (57.1)||—d|
|Age (years)|
|0-4||48 (44.4)||11 (16.2)||59 (33.5)||.005|
|5-9||24 (22.2)||20 (29.4)||44 (25)||—|
|9-16||36 (33.3)||37 (54.4)||73 (41.5)||—|
aDengue fever (–): patients with a negative dengue fever strip test.
bDengue fever (+): patients with a positive dengue fever strip test.
cP values were determined by the Chi-square test.
|Symptom variables and presence||Dengue fever (–)a, n (%)||Dengue fever (+)b, n (%)||Total, n (%)||Chi-square (df)||P valuec||Logistic regression, B||Logistic regression, P valuec|
|Family history|
|No||79 (73.1)||40 (58.0)||119 (67.2)||3.7 (2)||.053||1.35||.002|
|Yes||29 (26.9)||29 (42.0)||58 (32.8)||—d||—||—||—|
|High fever of 39°C|
|No||87 (80.6)||37 (53.6)||124 (70.1)||13.3 (2)||<.001||1.48||.048|
|Yes||21 (19.4)||32 (46.4)||53 (29.9)||—||—||—||—|
|Skin rash|
|No||82 (75.9)||20 (29.0)||102 (57.6)||36.1 (2)||<.001||2.63||<.001|
|Yes||26 (24.1)||49 (71.0)||75 (42.4)||—||—||—||—|
|Petechiae|
|No||106 (98.1)||60 (87.0)||166 (93.8)||7.3 (2)||.007||2.34||.026|
|Yes||2 (1.9)||9 (13.0)||11 (6.2)||—||—||—||—|
|Abdominal pain|
|No||104 (96.3)||53 (76.8)||157 (88.7)||14.1 (2)||<.001||2.89||<.001|
|Yes||4 (3.7)||16 (23.2)||20 (11.3)||—||—||—||—|
|Weakness|
|No||90 (83.3)||48 (69.6)||138 (78.0)||3.9 (2)||.049||0.98||.048|
|Yes||18 (16.7)||21 (30.4)||39 (22.0)||—||—||—||—|
aDengue fever (–): patients with a negative dengue fever strip test.
bDengue fever (+): patients with a positive dengue fever strip test.
cP values were determined by the Chi-square test and the Wald test of logistic regression.
Comparisons of the Area Under Receiver Operating Characteristic Curve for the Three Study Models
Comparisons of the AUCs for the 3 study models (A, B, and C) showed that the weighted variable (derived by logistic regression) and the HT coefficient could be jointly used for predicting DF risk using equation (2):
$$\text{logit} = -3.32 + 0.93 \times \text{weighted\_score} + 1.92 \times \text{HT\_coefficient} \qquad (2)$$
The risk probability can be computed using formula (3):
$$P = \frac{\exp(\text{logit})}{1 + \exp(\text{logit})} \qquad (3)$$
where logit denotes a unit of log odds.
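Equations (2) and (3) can be checked numerically with a short Python sketch. The coefficients are those given in equation (2); the function names and the example inputs (a weighted score of 2.0 and an HT coefficient of 0.5) are hypothetical.

```python
import math


def combined_logit(weighted_score, ht_coefficient):
    """Equation (2): logit from the weighted score and the HT coefficient."""
    return -3.32 + 0.93 * weighted_score + 1.92 * ht_coefficient


def risk_probability(logit):
    """Equation (3): convert a logit (log odds) to a probability."""
    return math.exp(logit) / (1 + math.exp(logit))


# The two cutoffs quoted in the article, converted to probabilities.
print(round(risk_probability(-0.68), 2))   # 0.34
print(round(risk_probability(-1.03), 2))   # 0.26

# Hypothetical patient: weighted score 2.0, HT coefficient 0.5.
logit = combined_logit(2.0, 0.5)
print(round(logit, 2), round(risk_probability(logit), 2))   # -0.5 0.38
```

The first two lines reproduce the pairing of −0.68 logits with P=.34 and of −1.03 logits with P=.26 used in the text.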
A cutoff point of >−0.68 (P=.34) was determined using the combined predictive variables in Model C: sensitivity=0.91, specificity=0.76, AUC=0.88, and the highest F-measure=0.82 (see and ). Predictive power was better: specificity was 10.2% (ie, 84.30−74.10, shown in ) higher than when using traditional logistic regression, in which the independent variable is the sum of each symptom's weighted score multiplied by the respective symptom response (1 or 0), and the dependent variable is DF status (1 or 0). The AUC using the HT coefficient alone (0.72) was lower than the AUCs using the unweighted (0.84) and the weighted (0.87) variables (Table 3), and the correlations of the HT coefficient with the weighted and unweighted scores were 0.26 and 0.22, respectively; that is, the weighted score correlated more highly with the HT coefficient than the unweighted score did. Nevertheless, the combined strategy of Model C, or the single continuous variable yielded by the combined predictor variables ( ), is verified and available for use in practice. More importantly, sensitivity is more critical than specificity in clinical settings because no case of a potentially fatal disease should be missed.
|Approach and steps||Logistic regression||Receiver operating characteristic curve analysis||F-measure|
|Ba||P value||Area under receiver operating characteristic curve||Youden Jb||Cut point||Sensitivity||Specificity|
|Comparison of models|
|Model A: Univariate approach with a single variable compared with dengue fever using logistic regression and receiver operating characteristic analysis|
|Model B: Multivariate approach combining these 3 variables to regress dengue fever using logistic regression|
|Model C: Combining the 2 significant predictor variables using logistic regression|
|A single continuous variable yielded by the combined predictor variables in Model C|
|The predictive effect: precision recall|
|The combined model||—||.87||0.78||—||—||—||—||0.82|
aB: coefficient of logistic regression.
bYouden J index.
cItem-score summation method.
fMultiplying item score with the weighted regression coefficient.
gSee for the HT equation.
hUsing the 2 combined variables to predict patient’s dengue fever.
A snapshot of a mobile phone responding to questions (, top) was generated, and the results for assessing whether the patient has DF ( , bottom) were determined. Patients suspected of having DF can directly scan the Quick Response Code to obtain their DF logit scores (or the risk probability) and examine whether these 6 symptoms indicate a high DF risk (>−1.03 logits, or P≥.26=exp(−1.03 logits)/(1+exp(−1.03 logits))). Interested readers can watch the demonstration in the accompanying MP4 video.
We found that using the HT coefficient yielded predictions that were 10.2% more specific (ie, 84.30−74.10, shown in ) than those of traditional logistic regression. The HT index is promising when a patient's symptom sequence pattern is compared with that of the DF+ group to detect dengue fever in children. It can be combined with the weighted summation score to jointly predict the DF risk and then report that risk on mobile phones.
The HT coefficient has been used in education and psychometrics to identify aberrant test respondents [- ]. Although some have used item response theory fit statistics (eg, outfit mean square error >2.0) to select abnormal responses that indicate cheating, careless responding, lucky guessing, creative responding, or random responding [ ], our literature review revealed no published papers that reported using the HT coefficient in medical settings, especially for detecting individual aberrant response patterns different from the study reference sample, or, like this study, identifying the DF risk by comparing their sequence symptom pattern with that of the DF+ group.
What This Knowledge Adds to What We Already Knew
A diagnosis of DF is usually confirmed in 3 steps: (1) observing DF-related symptoms, (2) testing laboratory data, such as white blood cells and platelets, and (3) serologically verifying DF using dengue Immunoglobulin M and Immunoglobulin G antibodies, polymerase chain reaction analysis, and virus isolation tests. The latter 2 are relatively expensive. A self-assessment approach (eg, scanning a Quick Response Code, responding to questions, and obtaining the DF risk on a smartphone) is needed to (1) help patients seek consultation at an earlier stage and (2) prompt doctors to order laboratory tests when a patient's DF risk reaches the cut point of P=.26 (=exp(−1.03 logits)/(1+exp(−1.03 logits))).
We found that the weighted score was a better predictor than the unweighted score (see Model A and Model B in ). However, many scales used in medical settings still rely on unweighted summation scores to determine the presence or absence of disease. Now that mobile phones are in widespread use, obtaining the DF risk on a mobile phone using the combined HT coefficient and weighted score is feasible and worth recommending to health care providers for detecting DF risk.
Limitations and Future Study
This study has some limitations. First, the DF cut point based on the symptoms of this study sample might be biased toward that population. Moreover, we did not remove abnormal data when the HT coefficient was less than the critical value of 0.22, which best identifies aberrantly responding examinees. Second, although the sample size was small, using the HT coefficient combined with the AUC yielded highly accurate discriminatory screening. However, this finding requires confirmation in prospective studies of other regions with a substantial incidence of DF. Third, the study sample size (n=177) is too small to make the inference reliable and supportable. Future studies should include more patients with DF. In particular, artificial intelligence (AI), which has become increasingly prevalent in recent years, could be considered in such studies.
Analyzing 6 simple symptoms using logistic regression is useful and valid for the early detection of DF risk in children. Combining the HT coefficient with the weighted score yields a prediction that is 10.2% more specific than that yielded by traditional logistic regression. A self-assessment app using patient mobile phones is available to help people suspected of having DF, and it might eliminate the need for costly and time-consuming laboratory tests.
TWC conceived and designed the study, performed the statistical analyses, and was in charge of recruiting study participants. CC and TWC helped design the study, collected information, and interpreted data. WC monitored the research. All authors read and approved the final article. This research was supported by the grant Chi-Mei Foundation Hospital research CMFCR10593 from the Chi-Mei Medical Center.
Conflicts of Interest
Data for the sample of 177 pediatric patients used in this study. XLSX File (Microsoft Excel File), 48 KB
How to run the check on DF online. MP4 File (MP4 Video), 4848 KB
Response to the editors. DOCX File, 14 KB
- World Health Organization. Geneva; 2002. Dengue and Dengue Haemorrhagic Fever. Fact sheet N°117 URL: http://apps.searo.who.int/PDS_DOCS/B5318.pdf [accessed 2019-05-06]
- Henchal EA, Putnak JR. The dengue viruses. Clin Microbiol Rev 1990 Oct;3(4):376-396.
- Gubler DJ. Dengue and dengue hemorrhagic fever. Clin Microbiol Rev 1998 Jul;11(3):480-496.
- Phuong HL, de Vries PJ, Nga TT, Giao PT, Hung LQ, Binh TQ, et al. Dengue as a cause of acute undifferentiated fever in Vietnam. BMC Infect Dis 2006 Jul 25;6:123.
- Nunes-Araújo FR, Ferreira M, Nishioka S. Dengue fever in Brazilian adults and children: assessment of clinical findings and their validity for diagnosis. Ann Trop Med Parasitol 2003 Jun;97(4):415-419.
- Hammond S, Balmaseda A, Pérez L. Differences in dengue severity in infants, children, and adults in a 3-year hospital-based study in Nicaragua. Am J Trop Med Hyg 2005;73:1063-1070.
- Potts J, Rothman A. Clinical and laboratory features that distinguish dengue from other febrile illnesses in endemic populations. Trop Med Int Health 2008 Nov;13(11):1328-1340.
- Lai W, Chien T, Lin H, Su S, Chang C. A screening tool for dengue fever in children. Pediatr Infect Dis J 2013 Apr;32(4):320-324.
- Lai W, Chien T, Lin H, Kan W, Su S, Chou M. An approach for early and appropriate prediction of dengue fever using white blood cells and platelets. HealthMED 2012;6(3):806-812.
- Kittigul L, Suankeow K. Use of a rapid immunochromatographic test for early diagnosis of dengue virus infection. Eur J Clin Microbiol Infect Dis 2002 Mar;21(3):224-226.
- Vaughn D, Nisalak A, Kalayanarooj S, Solomon T, Dung NM, Cuzzubbo A, et al. Evaluation of a rapid immunochromatographic test for diagnosis of dengue virus infection. J Clin Microbiol 1998 Jan;36(1):234-238.
- Sijtsma K. A coefficient of deviant response patterns. Kwantitatieve Methoden 1986;7:131-145.
- Linacre J. A comment on the HT person fit statistic. Rasch Meas Trans 2012;26(1):1358.
- Karabatsos G. Comparing the aberrant response detection performance of thirty-six person-fit statistics. Appl Meas Educ 2003;16(4):277-298.
- Lin C, Hsu C, Lou Y, Yeh S, Lee C, Su S, et al. Artificial intelligence learning semantics via external resources for classifying diagnosis codes in discharge notes. J Med Internet Res 2017 Dec 6;19(11):e380.
- Linacre J. Optimizing rating scale category effectiveness. J Appl Meas 2002;3(1):85-106.
|AI: artificial intelligence|
|AUC: area under receiver operating characteristic curve|
|DF: dengue fever|
|ROC: receiver operating characteristic|
Edited by G Eysenbach; submitted 01.07.18; peer-reviewed by S Su, I Yoo; comments to author 08.10.18; revised version received 02.12.18; accepted 07.04.19; published 31.05.19
©Tsair-Wei Chien, Julie Chi Chow, Willy Chou. Originally published in JMIR Mhealth and Uhealth (http://mhealth.jmir.org), 31.05.2019.
This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR mhealth and uhealth, is properly cited. The complete bibliographic information, a link to the original publication on http://mhealth.jmir.org/, as well as this copyright and license information must be included.