A predictive analysis framework of heart disease using machine learning approaches

Shourav Molla, F. M. Javed Mehedi Shamrat, Raisul Islam Rafi, Umme Umaima, Md. Ariful Islam Arif, Shahed Hossain, Imran Mahmud

Abstract


Heart disease is among the leading causes of death globally, so early identification and treatment are indispensable for preventing it. In this work, we propose a framework based on machine learning algorithms that addresses this problem by identifying the risk variables associated with the disease. To support the proposed model, data pre-processing and data transformation strategies are used to prepare accurate training data from the five most popular heart disease datasets in the UCI repository (Hungarian, Statlog, Switzerland, Long Beach VA, and Cleveland). The univariate feature selection technique is applied to identify essential features, and during the training phase five classifiers are deployed: extreme gradient boosting (XGBoost), support vector machine (SVM), random forest (RF), gradient boosting (GB), and decision tree (DT). Several performance measures are then evaluated to assess how accurately these algorithms predict. With univariate feature selection, the results indicate that the DT classifier achieves the highest accuracy of the five, around 97.75%. Thus, a machine learning approach is identified that can predict heart disease with high accuracy. Furthermore, the 10 selected attributes are used to analyze the explainability of the model's outcomes, indicating which attributes contribute most to its predictions.
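
As an illustration of the univariate feature selection step mentioned in the abstract, the following is a minimal sketch, not the authors' implementation. It assumes the one-way ANOVA F statistic as the per-feature score (the abstract does not name the scoring function), and it runs on synthetic data rather than the UCI datasets. The function names `anova_f_score` and `select_k_best` are hypothetical, and `k=2` is chosen only for the toy example (the paper keeps 10 attributes).

```python
import random
from statistics import mean


def anova_f_score(values, labels):
    """One-way ANOVA F statistic of a single feature against class labels."""
    groups = {}
    for v, y in zip(values, labels):
        groups.setdefault(y, []).append(v)
    n, k = len(values), len(groups)
    if k < 2:
        return 0.0  # a single class carries no discriminative signal
    grand = mean(values)
    ss_between = 0.0  # spread of class means around the grand mean
    ss_within = 0.0   # spread of samples around their own class mean
    for g in groups.values():
        m = mean(g)
        ss_between += len(g) * (m - grand) ** 2
        ss_within += sum((v - m) ** 2 for v in g)
    if ss_within == 0:
        return float("inf") if ss_between > 0 else 0.0
    return (ss_between / (k - 1)) / (ss_within / (n - k))


def select_k_best(rows, labels, k):
    """Score each feature independently and keep the k highest-scoring ones."""
    n_features = len(rows[0])
    scores = [
        anova_f_score([row[j] for row in rows], labels)
        for j in range(n_features)
    ]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return ranked[:k], scores


# Synthetic data (an assumption for illustration): features 0 and 2 depend
# on the label, features 1 and 3 are pure noise.
rng = random.Random(0)
labels = [i % 2 for i in range(200)]
rows = [
    [rng.gauss(2.0 * y, 1.0), rng.gauss(0.0, 1.0),
     rng.gauss(-1.5 * y, 1.0), rng.gauss(0.0, 1.0)]
    for y in labels
]

selected, scores = select_k_best(rows, labels, k=2)
print("selected feature indices:", sorted(selected))
```

Each feature is scored on its own, without regard to the other features, which is what makes the method univariate. The retained columns would then be passed to a classifier such as a decision tree.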

Keywords


Decision tree; Gradient boosting; Heart diseases; Random forest; Univariate feature selection


DOI: https://doi.org/10.11591/eei.v11i5.3942


Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
