Classification of gene expression dataset for type 1 diabetes using machine learning methods

Noor AlRefaai, Sura Zaki AlRashid


Type 1 diabetes (T1D) is considered one of the most prevalent chronic diseases in the world; it causes a high level of glucose in the blood. Despite its seriousness, T1D can progress to an advanced stage without the patient noticing, which makes the disease difficult to control. Early prediction of the disease may significantly reduce its risks. Many attempts have been made to overcome this disease, some pursuing biological solutions and others bioinformatic solutions. Several studies have used a single feature selection method with a machine learning (ML) model to predict or classify T1D. In this paper, ML techniques such as Naive Bayes (NB), support vector machine (SVM), and random forest (RF) were applied to a multiclass T1D gene expression dataset to classify the genes associated with this disease. The proposed model can identify the genes related to T1D with high efficiency, which helps in predicting whether a person carries the disease before symptoms appear. The highest accuracy of 89.1% was obtained when applying SVM with chi-square as the feature selection method.
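The workflow summarized above (feature selection followed by NB, SVM, and RF classification) can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation: the data are a synthetic stand-in for a multiclass gene expression matrix, and the number of selected features (k=50), the min-max scaling step (added because the chi-square score needs non-negative inputs), the linear SVM kernel, and the 5-fold cross-validation are all assumptions chosen for illustration.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic stand-in for a gene expression matrix (samples x genes)
# with three classes; the real T1D dataset is not reproduced here.
X, y = make_classification(
    n_samples=120, n_features=500, n_informative=20,
    n_classes=3, random_state=0,
)

# The three classifier families named in the abstract; the
# hyperparameters are illustrative assumptions, not the paper's.
classifiers = {
    "NB": GaussianNB(),
    "SVM": SVC(kernel="linear"),
    "RF": RandomForestClassifier(n_estimators=100, random_state=0),
}


def evaluate(clf, k=50):
    """Mean 5-fold accuracy of: scale -> chi-square top-k -> classifier."""
    # Scaling and selection sit inside the pipeline so they are fitted
    # on the training folds only, which avoids selection leakage.
    pipe = make_pipeline(MinMaxScaler(), SelectKBest(chi2, k=k), clf)
    return cross_val_score(pipe, X, y, cv=5).mean()


scores = {name: evaluate(clf) for name, clf in classifiers.items()}
for name, acc in scores.items():
    print(f"{name}: {acc:.3f}")
```

Accuracies from this sketch come from synthetic data and are not comparable to the 89.1% reported for the real dataset.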


Classification; Gene expression; Naive Bayes; Random forest; Support vector machine; Type 1 diabetes


This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.

Bulletin of Electrical Engineering and Informatics (BEEI)
ISSN: 2089-3191, e-ISSN: 2302-9285
This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).