Comparative analysis of predictive machine learning algorithms for diabetes mellitus

Kirti Kangra, Jaswinder Singh

Abstract


Diabetes mellitus (DM) is a serious worldwide health issue, and its prevalence is growing rapidly. It is a spectrum of metabolic illnesses defined by persistently elevated blood glucose levels. Undiagnosed diabetes can lead to a variety of complications, including retinopathy, nephropathy, neuropathy, and other vascular abnormalities. In this context, machine learning (ML) technologies may be particularly useful for early disease identification, diagnosis, and therapy monitoring. The core aim of this study is to identify the best-performing ML algorithm for predicting DM. To this end, six ML algorithms were chosen based on the reviewed literature: support vector machine (SVM), Naïve Bayes (NB), K nearest neighbor (KNN), random forest (RF), logistic regression (LR), and decision tree (DT). Two datasets were used, the Pima Indian diabetes (PID) dataset and a Germany diabetes dataset, and the experiments were performed using the Waikato environment for knowledge analysis (WEKA) 3.8.6 tool. This article discusses the performance metrics and error rates of the classifiers on both datasets. The results show that on the PID database (PIDD), SVM performs best with an accuracy of 74%, whereas on the Germany dataset, KNN and RF perform best with 98.7% accuracy. This study can help healthcare facilities and researchers understand the value and application of ML algorithms in predicting diabetes at an early stage.
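
As a rough illustration of the kind of performance metrics and error rates the abstract refers to, the following is a minimal Python sketch that derives accuracy, error rate, precision, recall, and F1 from a binary confusion matrix. It is not the authors' WEKA workflow; the names `ConfusionCounts` and `classification_metrics` are hypothetical, and the counts in the usage lines are made-up values, not results from the paper.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    """Binary confusion-matrix counts (positive class = diabetic)."""
    tp: int  # diabetic, predicted diabetic
    fp: int  # non-diabetic, predicted diabetic
    tn: int  # non-diabetic, predicted non-diabetic
    fn: int  # diabetic, predicted non-diabetic


def classification_metrics(c: ConfusionCounts) -> dict:
    """Return common classifier metrics computed from confusion counts."""
    total = c.tp + c.fp + c.tn + c.fn
    if total == 0:
        raise ValueError("confusion matrix is empty")

    accuracy = (c.tp + c.tn) / total
    error_rate = (c.fp + c.fn) / total
    # Guard against division by zero when a class is never predicted or absent.
    precision = c.tp / (c.tp + c.fp) if (c.tp + c.fp) else 0.0
    recall = c.tp / (c.tp + c.fn) if (c.tp + c.fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)

    return {
        "accuracy": accuracy,
        "error_rate": error_rate,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }


# Illustrative counts only (assumed, not taken from the study).
example = ConfusionCounts(tp=40, fp=10, tn=130, fn=20)
metrics = classification_metrics(example)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Comparing classifiers on the same test split then amounts to computing these values once per model and ranking them on the metric of interest, for example accuracy as reported above.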

Keywords


Diabetes mellitus; Logistic regression; Machine learning; Support vector machine; WEKA

DOI: https://doi.org/10.11591/eei.v12i3.4412


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
