This is an open-access article distributed under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work, first published in JMIR mHealth and uHealth, is properly cited. The complete bibliographic information, a link to the original publication on http://mhealth.jmir.org/, as well as this copyright and license information must be included.
According to the World Health Organization, achieving targets for control of leprosy by 2030 will require disease elimination and interruption of transmission at the national or regional level. India and Brazil have reported the highest leprosy burden in the last few decades, revealing the need for strategies and tools to help health professionals correctly manage and control the disease.
The main objective of this study was to develop a cross-platform app for leprosy screening based on artificial intelligence (AI) with the goal of increasing accessibility of an accurate method of classifying leprosy treatment for health professionals, especially for communities further away from major diagnostic centers. Toward this end, we analyzed the quality of leprosy data in Brazil on the National Notifiable Diseases Information System (SINAN).
Leprosy data were extracted from the SINAN database, carefully cleaned, and used to build AI decision models based on the random forest algorithm to predict the operational classification (paucibacillary or multibacillary leprosy). We used the Python programming language to extract and clean the data, and the R programming language to train and test the AI model via cross-validation. To allow broad access, we deployed the final random forest classification model in a web app via ShinyApp using data available from the Brazilian Institute of Geography and Statistics and the Department of Informatics of the Unified Health System.
We mapped the dispersion of leprosy incidence in Brazil from 2014 to 2018, and found a particularly high number of cases in central Brazil in 2014 that further increased in 2018 in the state of Mato Grosso. For some municipalities, up to 80% of cases showed some data discrepancy. Of a total of 21,047 discrepancies detected, the most common was “operational classification does not match the clinical form.” After data processing, we identified a total of 77,628 cases with missing data. The sensitivity and specificity of the AI model applied for the operational classification of leprosy were 93.97% and 87.09%, respectively.
The proposed app was able to recognize patterns in leprosy cases registered in the SINAN database and to classify new patients with paucibacillary or multibacillary leprosy, thereby reducing the probability of incorrect assignment by health centers. The collection and notification of data on leprosy in Brazil seem to lack the specific validation needed to make the data suitable for AI implementations. The AI models implemented in this work had satisfactory accuracy across Brazilian states and could be a complementary diagnostic tool, especially in remote areas with few specialist physicians.
Leprosy is an infectious disease caused by
Currently, the conventional diagnosis of leprosy is typically based on clinical evaluation alone, especially when histopathological analysis is not available. The clinical diagnosis is based on cardinal signs such as the presence of skin lesions (often with loss of sensitivity), thickening of the nerves, and presence of the pathogen in a skin smear or histological tissue samples. Based on this information, classifications are applied to aid in the understanding and treatment of the disease [
The Madrid classification divides leprosy patients into the following four categories: indeterminate, tuberculoid, borderline, and lepromatous [
Currently, multidrug therapy is the main treatment for leprosy, which is based on schemes supported by the operational classification. The Guidelines Development Group, established by the WHO in 2018, recommends the same regimen of three drugs (rifampicin, dapsone, and clofazimine) for all leprosy patients, with a 6-month duration for paucibacillary cases and a 12-month duration for multibacillary cases. Some evidence suggests a potential increase in the risk of relapse for patients with paucibacillary leprosy using the previous two-drug regimen [
Laboratory diagnosis can help differentiate leprosy from other dermatological/neurological diseases, especially in cases of suspected recurrence, and determine an appropriate treatment duration. In these cases, microscopic examination of the dermal smear is the method most commonly used because it is easy to perform and is of low cost. The bacilloscopy index (BI) is negative (0) in the tuberculoid and indeterminate forms, is strongly positive in the lepromatous type, and reveals a variable result in borderline cases [
In 2020, the WHO outlined the goal to interrupt leprosy transmission at the national or regional level by 2030. However, to achieve this goal, it is necessary to routinely implement active case detection and contact tracing. Therefore, it is urgent to improve the tools used for an early and precise diagnosis of new cases [
Brazil uses the Sistema de Informação Nacional de Agravos de Notificação/National Notifiable Diseases Information System (SINAN) to deal with epidemiological aspects of diseases with compulsory notification. A leprosy-specific form has to be filled out for each confirmed case, which involves information about the patient, including the number of lesions and affected nerves, grade of physical disability, and demographic variables, among others. All of these data are stored in SINAN’s online database and are available for epidemiological studies [
The app proposed in this study was based on an in-depth analysis of the SINAN database using machine learning, a research area that focuses on how computers acquire knowledge from data. Machine learning can be subclassified into two general types: unsupervised learning and supervised learning. Unsupervised learning does not have a focus on a predictable output, as its main objective is to identify data patterns. By contrast, supervised learning focuses on an outcome such as determining if an animal in a picture is a cat or a dog [
There are some existing apps that were also designed to help diagnose neglected tropical diseases (NTDs). According to the WHO in 2018, an app was developed to facilitate the diagnosis of NTDs of the skin, including leprosy. This app allows health care workers and the public to obtain information about a specific disease, such as its clinical features, management, and geographical distribution, and provides a list of potential diagnoses. The training guide is now updated with recent information and has been translated into an easy-to-use interactive mobile app available free of charge on both Android and iOS platforms [
Given the above background, the main objective of our study was to develop a cross-platform app for leprosy screening based on AI. This app was designed to recognize patterns in leprosy cases registered in the SINAN database and to classify new cases as paucibacillary or multibacillary, thereby reducing the probability of misclassification by the health center.
We divided the stages of app construction into two steps: (1) processing raw data and obtaining a decision matrix, and (2) using the decision matrix to build the app for classifying a case given an input. An overview of the entire process is shown in
Flow diagram summarizing the data-processing and app-building steps. SINAN: Sistema de Informação Nacional de Agravos de Notificação (National Notifiable Diseases Information System); DATASUS: Sistema Único de Saúde (Unified Health System) data portal; RF: random forest; csv: comma-separated values.
Initially, we downloaded all SINAN records related to leprosy cases from 2014 to 2019, which were converted to a single Comma Separated Values file. This procedure resulted in a 54-column file containing data on 174,871 cases, with each column corresponding to a specific variable reported by Brazilian health professionals about the leprosy cases notified.
Many of these columns (variables) were not relevant to our study. Therefore, we removed those that did not fulfill the criteria as shown in
Exclusion criteria and justifications.
Exclusion criteria | Justifications |
Columns with more than 25,000 “NA” (not available) | The objective was to remove variables whose values many professionals did not report, as a large amount of missing data may impair processing. The threshold of 25,000 was defined arbitrarily, with the aim of not drastically reducing the total amount of data |
Categorical variable with more than 53 input possibilities | R shows an alert when a categorical variable with more than 53 input possibilities is being used, given that the greater the number of input possibilities, the smaller the meaning of each input to the model |
Variables that may induce a result | Some variables imply an operational classification (eg, “g-MB” therapeutic scheme implies that the patient has a case of multibacillary leprosy), causing bias to the model |
Variables with no apparent correlation with the prediction. | Boruta [ |
Redundant variables | Redundant variables do not provide additional information to the model, and therefore there is no reason to keep both. An analysis using Python showed that some variables had almost 100% correspondence with another (eg, the state where the case was notified and the state where the patient lives). The Boruta algorithm is also useful to remove redundant variables. |
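The two count-based criteria in the table above can be sketched in a few lines. This is a minimal illustration, not the authors' script: the function `columns_to_keep`, the toy columns `OCCUPATION` and `MOSTLY_EMPTY`, and the reduced thresholds used in the demonstration are all hypothetical, and missing values are assumed to be represented as `None`.

```python
from typing import Dict, List, Optional, Set

MAX_MISSING = 25_000  # missing-value threshold stated in the exclusion table
MAX_LEVELS = 53       # category limit stated in the exclusion table


def columns_to_keep(table: Dict[str, List[Optional[str]]],
                    categorical: Set[str],
                    max_missing: int = MAX_MISSING,
                    max_levels: int = MAX_LEVELS) -> List[str]:
    """Return the column names that pass both count-based exclusion criteria."""
    kept = []
    for name, values in table.items():
        missing = sum(v is None for v in values)
        if missing > max_missing:
            continue  # too many undeclared values
        if name in categorical:
            levels = {v for v in values if v is not None}
            if len(levels) > max_levels:
                continue  # too many input possibilities
        kept.append(name)
    return kept


# Toy data with small thresholds so that the effect is visible.
toy = {
    "CS_SEXO": ["M", "F", "F", None],
    "OCCUPATION": ["a", "b", "c", "d"],        # hypothetical high-cardinality column
    "MOSTLY_EMPTY": [None, None, None, "x"],   # hypothetical sparse column
}
kept = columns_to_keep(toy, categorical={"CS_SEXO", "OCCUPATION"},
                       max_missing=2, max_levels=3)
print(kept)
```

The remaining criteria (variables that induce a result, Boruta-based relevance, and redundancy) depend on domain judgment or on a feature-selection run and are not covered by this sketch.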
The remaining dataset was composed of the variables age, gender, race, education, grade of disability, operational classification, BI, number of affected nerves, clinical form, municipality ID, number of household contacts, and the number of skin lesions. After this processing, we removed all lines with any entry of “NA” (not applicable), leaving a total of 123,054 cases.
The Brazilian Practical Guide on Leprosy was reviewed to define the following criteria to remove cases with any inconsistency: (i) samples with a positive BI are always cases of multibacillary leprosy; (ii) patients with paucibacillary leprosy should have five or fewer skin lesions; (iii) indeterminate and tuberculoid are always paucibacillary forms; (iv) borderline and lepromatous cases are always multibacillary forms; (v) indeterminate cases have no disability; and (vi) a maximum of approximately 18 nerve trunks are evaluated in clinical examinations [
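The six criteria listed above lend themselves to a rule-based check. The sketch below is one possible encoding under stated assumptions: the dictionary keys (`operational`, `form`, `bi_positive`, `lesions`, `disability_grade`, `nerves`) and the function name are hypothetical and do not correspond to SINAN field names.

```python
from typing import Dict, List

PB_FORMS = {"indeterminate", "tuberculoid"}   # always paucibacillary (rule iii)
MB_FORMS = {"borderline", "lepromatous"}      # always multibacillary (rule iv)


def find_inconsistencies(case: Dict) -> List[str]:
    """Return the list of consistency rules that a single case record violates."""
    issues = []
    is_pb = case["operational"] == "PB"
    if is_pb and case["bi_positive"]:                      # rule i
        issues.append("paucibacillary with positive bacilloscopy")
    if is_pb and case["lesions"] > 5:                      # rule ii
        issues.append("paucibacillary with more than 5 skin lesions")
    if (is_pb and case["form"] in MB_FORMS) or (not is_pb and case["form"] in PB_FORMS):
        issues.append("operational classification does not match the clinical form")  # rules iii and iv
    if case["form"] == "indeterminate" and case["disability_grade"] > 0:  # rule v
        issues.append("indeterminate with disability")
    if case["nerves"] > 18:                                # rule vi
        issues.append("more than 18 affected nerves")
    return issues


# Invented example records, for illustration only.
flagged = find_inconsistencies({
    "operational": "PB", "form": "borderline", "bi_positive": True,
    "lesions": 7, "disability_grade": 0, "nerves": 2,
})
clean = find_inconsistencies({
    "operational": "MB", "form": "lepromatous", "bi_positive": True,
    "lesions": 10, "disability_grade": 1, "nerves": 3,
})
print(flagged)
print(clean)
```

Returning the list of violated rules, rather than a single flag, makes it straightforward to tabulate the number of occurrences per inconsistency.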
The Instituto Brasileiro de Geografia e Estatística (Brazilian Institute of Geography and Statistics), responsible for conducting the census, provides tables containing the estimated population by year and by municipality [
Method to calculate the error rate.
Confidence intervals of these municipality inconsistencies were calculated for each state of Brazil. We calculated the median of household contacts by city and year. All data processing up to this point was performed using Python 3.8 and WPS spreadsheets.
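As a rough illustration of these summaries, the sketch below computes a per-municipality inconsistency rate, a state-level mean with a 95% confidence interval, and a median of household contacts. Several points are assumptions, since the text does not specify them: the rate is taken as inconsistent cases divided by total cases, the interval uses a normal approximation (z = 1.96), and all numbers are invented toy values.

```python
import statistics
from collections import defaultdict

# (state, municipality, inconsistent cases, total cases): invented toy values
rows = [
    ("MT", "A", 8, 10),
    ("MT", "B", 2, 10),
    ("MT", "C", 5, 10),
    ("PR", "D", 1, 10),
    ("PR", "E", 3, 10),
]

# Assumed definition: rate = inconsistent cases / total cases per municipality.
rates_by_state = defaultdict(list)
for state, _municipality, inconsistent, total in rows:
    rates_by_state[state].append(inconsistent / total)


def mean_ci(values, z=1.96):
    """Mean and normal-approximation confidence interval of a list of rates."""
    m = statistics.mean(values)
    half_width = z * statistics.stdev(values) / len(values) ** 0.5
    return m, m - half_width, m + half_width


summary = {state: mean_ci(rates) for state, rates in rates_by_state.items()}

# Median of household contacts for one city and year (toy values).
median_contacts = statistics.median([2, 3, 10])

print(summary)
print(median_contacts)
```

With only a handful of municipalities per state, a t-based or bootstrap interval may be more appropriate than the normal approximation used here.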
After initial processing with Python, the RF algorithm was applied to the resulting data using the R software package Random Forest. In addition to RF, there are several other machine-learning classification algorithms that could be appropriate for this task, such as naive Bayes [
After several tests to improve model accuracy, the following subset of variables was used to predict the operational classification of each case: region, state, city, age, number of skin lesions, affected nerves, household contacts, and bacilloscopy. Prediction of operational classification was chosen as the metric for evaluation instead of prediction of the clinical form for two main reasons: (1) Brazilian treatment is based on the operational classification [
This multiplatform app was designed to meet the demand for technological innovation and to address the lack of safe and accurate diagnoses in remote regions of Brazil, where training in clinical practice does not always match the international standards recommended by the WHO. For this reason, we incorporated only clinical variables reported by SINAN so that the app would be useful in the Sistema Único de Saúde (SUS; Unified Health System) throughout Brazil.
The decision forest obtained by the RF algorithm from the R package was deployed in a ShinyApp environment, which is a web service for constructing a friendly user interface.
Screenshot representing the R ShinyApp input and output flows. The layout of the app may eventually change to improve user experience. ROC: receiver operating characteristic; AUC: area under the curve; FPR: false positive rate.
A good model should maximize the TPR while minimizing the FPR. Thus, the quality of the model may be summarized by the area under the ROC curve (AUC) value [
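To make the relationship between TPR, FPR, and AUC concrete, the following sketch builds ROC points by sweeping a threshold over predicted scores and integrates them with the trapezoidal rule. The scores and labels are invented, and label 1 is assumed to denote the positive class; this is a generic illustration rather than the computation performed in the app.

```python
def roc_points(scores, labels):
    """Return (FPR, TPR) pairs obtained by sweeping the threshold over all scores."""
    positives = sum(labels)
    negatives = len(labels) - positives
    points = [(0.0, 0.0)]
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2
    return area


# Invented scores (predicted probability of the positive class) and true labels.
scores = [0.9, 0.8, 0.7, 0.4, 0.3, 0.1]
labels = [1, 1, 0, 1, 0, 0]

curve = roc_points(scores, labels)
area = auc(curve)
print(curve)
print(round(area, 4))
```

An AUC of 1 corresponds to perfect separation of the two classes, and 0.5 corresponds to a model that ranks cases no better than chance.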
The value of each variable in the database carries a different weight in the model. Some values push a case toward a paucibacillary classification, whereas others push it toward a multibacillary classification. Representing the distance between paucibacillary and multibacillary as a scale from 0 to 1, we can choose different values on this scale as the cutoff between the two classifications. These possible values are represented by the colored scale to the right of the ROC curve in
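The effect of moving the cutoff along the 0 to 1 scale can be illustrated with a small example. The sketch assumes that the score is the predicted probability of multibacillary leprosy and that multibacillary is the positive class (label 1); the scores, labels, and cutoffs are invented.

```python
def sens_spec(scores, labels, cutoff):
    """Sensitivity and specificity when scores >= cutoff are classified as positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    return tp / (tp + fn), tn / (tn + fp)


# Invented scores on the 0-1 scale and true labels (1 = multibacillary, assumed).
toy_scores = [0.95, 0.80, 0.65, 0.40, 0.30, 0.10]
toy_labels = [1, 1, 0, 1, 0, 0]

tradeoff = {c: sens_spec(toy_scores, toy_labels, c) for c in (0.25, 0.50, 0.75)}
for cutoff, (sensitivity, specificity) in tradeoff.items():
    print(cutoff, round(sensitivity, 3), round(specificity, 3))
```

A low cutoff favors sensitivity (few multibacillary cases missed) at the cost of specificity, and a high cutoff does the opposite; which balance is preferable depends on the clinical cost of each type of error.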
Number of occurrences per inconsistency.
Inconsistency | Number of occurrences |
Operational classification does not match the clinical form | 8545 |
Indeterminate with disability | 4867 |
Indeterminate with affected nerves | 3785 |
Paucibacillary with positive bacilloscopy | 2825 |
Paucibacillary with more than 5 skin lesions | 938 |
Patients with more than 18 affected nerves | 93 |
Before data processing, the SINAN database had 35,616 lines with at least one NA, 26,539 lines with at least one unknown item (gender, race, schooling, grade of disability, operational classification, bacilloscopy, or clinical form), and 15,473 lines with both NA and an unknown item. There were a total of 77,628 cases with missing data, accounting for 44.39% of the total 174,871 cases.
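The reported total and percentage can be checked directly. The sum below reproduces the total only if the three groups are read as mutually exclusive (NA only, unknown item only, and both), which is an interpretation rather than something the text states explicitly.

```latex
\begin{align*}
35{,}616 + 26{,}539 + 15{,}473 &= 77{,}628 \\
\frac{77{,}628}{174{,}871} &\approx 0.4439 = 44.39\%
\end{align*}
```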
Based on available data in SUS, after cleaning the dataset, it was possible to geographically visualize the dispersion of leprosy incidence in Brazil over the period from 2014 to 2018 (
Geographic distribution of new annual cases of leprosy in Brazilian municipalities. inhab: inhabitants; NA: not available.
In addition to showing the geographical extent of the disease, the cleaned database included individuals from 4 to 106 years old (mean of 44 years), with an average of seven lesions, two affected nerves, and three household contacts per case. The BI was not calculated in 37.00% (32,853/88,783) of cases, and for the remaining cases, 41.18% (23,034/55,930) of the BI results were positive. In total, the database reported multibacillary leprosy in 76.66% (68,061/88,783) of patients, who were scattered throughout the Brazilian territory (
Distribution of leprosy cases in the Brazilian states from 2014 to 2018.
Geographic leprosy misclassification distribution in Brazilian municipalities. NA: not available.
Starting from the assumption that the error of clinical diagnosis is properly characterized, it was possible to develop a support model based on AI. The RF algorithm presented the smallest mean of misclassification error compared with the other algorithms using 10-fold cross-validation with default hyperparameters considered in the mlr R package [
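For readers unfamiliar with the procedure, the sketch below shows how a 10-fold cross-validated MMCE can be computed. It does not reproduce the mlr workflow or the RF model: the learner is a deliberately simple majority-class baseline standing in for any classifier, and the 60 toy cases (45 multibacillary, 15 paucibacillary) are invented.

```python
import random
from collections import Counter


def kfold_indices(n, k, seed=0):
    """Shuffle the indices 0..n-1 and split them into k folds."""
    indices = list(range(n))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


def mmce(y_true, y_pred):
    """Mean misclassification error: the share of wrong predictions."""
    return sum(t != p for t, p in zip(y_true, y_pred)) / len(y_true)


def cross_validate(X, y, fit, predict, k=10, seed=0):
    """Average the MMCE over k folds, training on the other k-1 folds each time."""
    errors = []
    for test_idx in kfold_indices(len(X), k, seed):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(X)) if i not in held_out]
        model = fit([X[i] for i in train_idx], [y[i] for i in train_idx])
        predictions = [predict(model, X[i]) for i in test_idx]
        errors.append(mmce([y[i] for i in test_idx], predictions))
    return sum(errors) / len(errors)


# Stand-in learner (an assumption): always predict the majority training class.
def fit_majority(X_train, y_train):
    return Counter(y_train).most_common(1)[0][0]


def predict_majority(model, x):
    return model


# Invented toy data: 45 multibacillary and 15 paucibacillary cases.
y = ["MB"] * 45 + ["PB"] * 15
X = list(range(len(y)))  # features are ignored by the baseline learner

cv_error = cross_validate(X, y, fit_majority, predict_majority, k=10)
print(round(cv_error, 4))
```

Comparing candidate algorithms then amounts to calling `cross_validate` with a different `fit` and `predict` pair on the same folds and choosing the learner with the smallest mean error.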
Comparison of algorithms according to the mean of misclassification error (MMCE).
We next sought to determine if AI can assist in choosing the correct treatment for leprosy. Since the Brazilian Ministry of Health and the WHO suggest patterns (an algorithm) to classify the disease, our group was able to develop this new strategy to help control leprosy. A separate model was trained for each Brazilian state and evaluated on the remaining states; that is, a cross-validation strategy of training in one state and testing in the others was applied to avoid overfitting. The number of lesions, incidence, and number of affected nerves were among the most important variables in the three best models (Mato Grosso, Rio Grande do Sul, and Paraná).
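The train-in-one-state, test-in-the-others strategy can be sketched as follows. The learner here is a hypothetical single-variable cutoff on the number of skin lesions, used only as a stand-in for the RF models, and the records are invented.

```python
from collections import defaultdict


def fit_cutoff(cases):
    """Stand-in learner: pick the lesion cutoff with the fewest training errors."""
    best_cutoff, best_errors = None, None
    for cutoff in range(0, 21):
        errors = sum((lesions > cutoff) != (label == "MB") for lesions, label in cases)
        if best_errors is None or errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff


def accuracy(cutoff, cases):
    """Share of cases for which 'lesions > cutoff' agrees with the MB label."""
    return sum((lesions > cutoff) == (label == "MB") for lesions, label in cases) / len(cases)


# (state, number of skin lesions, operational classification): invented toy records
records = [
    ("MT", 2, "PB"), ("MT", 5, "PB"), ("MT", 6, "MB"), ("MT", 12, "MB"),
    ("RS", 1, "PB"), ("RS", 3, "MB"), ("RS", 9, "MB"),
    ("PR", 4, "PB"), ("PR", 7, "MB"), ("PR", 8, "MB"),
]

by_state = defaultdict(list)
for state, lesions, label in records:
    by_state[state].append((lesions, label))

# Train on one state, then test on the cases from all other states.
results = {}
for train_state, train_cases in by_state.items():
    model = fit_cutoff(train_cases)
    test_cases = [case for state, cases in by_state.items()
                  if state != train_state for case in cases]
    results[train_state] = accuracy(model, test_cases)

print(results)
```

Because each model is scored only on states it never saw during training, a high score indicates that the learned pattern transfers across regions rather than reflecting local reporting habits.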
Quality in the classification of leprosy cases by artificial intelligence models in Brazilian states.
Importance (in percent) of each variable utilized in the models that represent the highest accuracy.
Variable | Meaning | Mato Grosso model | Rio Grande do Sul model | Paraná model |
INCIDÊNCIA | Incidence | 15.0 | 9.5 | 6.1 |
NU_IDADE_N | Age | 9.3 | 8.1 | 5.5 |
CS_SEXO | Gender | 1.6 | 1.7 | 1.9 |
CS_RACA | Race | 2.7 | 6.9 | 1.2 |
CS_ESCOL_N | Educational level | 4.7 | 6.4 | 3.4 |
NU_LESOES | Number of skin lesions | 23.6 | 37.0 | 41.9 |
AVALIA_N | Grade of disability | 4.4 | 4.2 | 5.0 |
BACILOSCOP | Bacilloscopy | 6.1 | 4.5 | 23.7 |
CONTREG | Number of household contacts | 5.1 | 5.9 | 2.8 |
NERVOSAFET | Number of affected nerves | 27.5 | 15.8 | 8.5 |
Quality of the artificial intelligence model applied to the differential diagnosis of paucibacillary and multibacillary leprosy.
Quality parameter | Mato Grosso model | Rio Grande do Sul model | Paraná model |
Accuracy | 0.970 | 0.812 | 0.929 |
Sensitivity | 0.926 | 0.977 | 0.877 |
Specificity | 0.812 | 0.218 | 0.919 |
PPV | 0.936 | 0.803 | 0.972 |
NPV | 0.786 | 0.740 | 0.698 |
PPV: positive predictive value.
NPV: negative predictive value.
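The quality parameters in the table above all derive from the four cells of a confusion matrix. The sketch below shows the standard definitions; the counts are invented, and which class is treated as positive (assumed here to be multibacillary) is not stated in the text.

```python
def diagnostic_metrics(tp, fp, tn, fn):
    """Standard diagnostic quality parameters from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
        "ppv": tp / (tp + fp),           # positive predictive value
        "npv": tn / (tn + fn),           # negative predictive value
    }


# Invented counts for illustration only.
metrics = diagnostic_metrics(tp=90, fp=10, tn=40, fn=10)
for name, value in metrics.items():
    print(name, round(value, 3))
```

Accuracy is a prevalence-weighted average of sensitivity and specificity, so for any single confusion matrix it lies between those two values.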
This study analyzed the SINAN database from 2014 to 2018 considering the epidemiological, clinical, and sociodemographic context of patients diagnosed with leprosy in Brazil. In analyzing the frequency of inconsistencies in the SINAN database (
According to Grossi et al [
According to WHO goals for 2030, it is necessary to employ strategic methodologies to assist leprosy control. The use of AI is a novel method with potential to expand the capacities to diagnose diseases, especially those that are neglected. The use of AI therefore allows for obtaining higher coverage in the initial diagnosis process and facilitates the sharing of secure information, with the aim to expand and reach a larger number of health professionals [
We recognize the importance of a tool to improve the diagnostic accuracy of insufficiently trained health professionals, especially in the most remote areas of Brazil. The app presented in this work proved to be a promising option to improve the coverage and scalability of the Brazilian health service regarding the choice of an appropriate treatment for leprosy [
Another relevant issue to mention involves problems in incorrectly filling out the reporting form. To fill out a SINAN form correctly, it is necessary to list the Madrid Classification and the treatment according to the operational classification. Therefore, a patient given a classification of indeterminate or tuberculoid has to receive treatment for paucibacillary leprosy, whereas a classification of borderline or lepromatous requires treatment for multibacillary leprosy.
According to the Brazilian Ministry of Health in 2017, a case with a positive BI result should be considered as multibacillary leprosy. However, doubts arise for cases that are considered to be borderline and close to the tuberculoid pole. Despite the difficulty of correct classification, these cases are generally considered to be multibacillary leprosy [
In addition to the implementation of an algorithm that helps choose the correct therapy, the use of intelligent data collection devices would allow for higher quality and validation [
Notably, the high accuracy (92.38%), sensitivity (93.97%), and specificity (87.09%) of this app suggest that the same multiplatform approach could support scalable characterization and classification of other neglected diseases in remote communities in Brazil and worldwide.
As previously mentioned, SINAN is a platform launched to manage notifications from each Brazilian state. The records of this platform are obtained from assessments made by many health professionals with different levels of qualification. Thus, the quality of data depends on many factors, including (i) quality of the items requested by the forms and their correct interpretation, (ii) correct clinical assessment of the patient, and (iii) proper filling out of the forms. In addition, it is important to note that the possibility to add more items of information about serological and molecular integrated tests for leprosy diagnoses could undoubtedly improve the accuracy of the method, as we have done in our research group [
The proposed app showed good accuracy in classifying a case as paucibacillary or multibacillary leprosy by recognizing patterns in leprosy cases registered in the SINAN database. After validation, this app could be an essential tool to help health professionals make an accurate operational classification of leprosy and decide which treatment to use for patients with paucibacillary or multibacillary leprosy, thereby reducing the likelihood of incorrect treatment. This study also highlights the importance of improving data collection methods given that prediction accuracy markedly increases with improved data quality.
AI: artificial intelligence
AUC: area under the receiver operating characteristic curve
BI: bacilloscopy index
FN: false negative
FP: false positive
FPR: false positive rate
NA: not applicable/not available
NTD: neglected tropical disease
RF: random forest
ROC: receiver operating characteristic
SINAN: Sistema de Informação Nacional de Agravos de Notificação (National Notifiable Diseases Information System)
SUS: Sistema Único de Saúde (Unified Health System)
TN: true negative
TP: true positive
TPR: true positive rate
WHO: World Health Organization
The authors thank Artur José Vilar Sette, Davi Metzker Júnior, and Vladmir Machado Rios for helping in some aspects of app building, and Tillman Rauh for helping to translate some parts of this article. We are also grateful to all members of CREDEN-PES, Programa Multicêntrico de Bioquímica e Biologia Molecular at Universidade Federal de Juiz de Fora Campus Governador Valadares, and to PROEX/PROPP/UFJF. This study received financial support from the Conselho de Desenvolvimento Tecnológico e Científico/CNPq/BRAZIL, FAPEMIG. This study was also financed in part by the Coordenação de Aperfeiçoamento de Pessoal de Nível Superior – Brasil (CAPES; Finance Code 001, file number: 88881.361990/2019-01 [Migrated-SICAPES3]). The funding sources had no role in the design of the study; in the collection, analysis, implementation, and interpretation of data; and in writing the manuscript.
GAL processed all raw data from SINAN using Python 3.8 and WPS spreadsheets. MLMDS used these data to apply the RF algorithm in R software and programmed the app (via ShinyApp). GAL and MLMDS prepared the figures and, along with LADOF, authored, reviewed, and edited this article. ACB and JKF provided expertise for interpretation of the results and critically edited the manuscript. All authors have read and approved the final version of the manuscript.
None declared.