Evaluation of feature scaling for improving the performance of supervised learning methods

Tsehay Admassu Assegie, Vadivel Elanangai, Josephin Shermila Paulraj, Mani Velmurugan, Daya Florance Devesan

Abstract


This article evaluates the performance of the support vector machine (SVM), decision tree (DT), and random forest (RF) on a dataset containing the medical records of 299 patients with heart failure (HF), collected at the Faisalabad Institute of Cardiology and the Allied Hospital in Pakistan. The dataset contains 13 descriptive features covering physical, clinical, and lifestyle information. The study compares the three classification algorithms using pre-processing techniques such as min-max scaling and principal component analysis (PCA). The simulation results show that the performance of the DT and RF decreased with dimensionality reduction, while that of the SVM improved. The SVM achieved 84.44%, the RF 82.22%, and the DT 81.11%. With scaled features, the SVM improved by 1.64% compared to the original dataset; thus, feature scaling improves the performance of the SVM.
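
To make the min-max scaling step concrete, the following is a minimal sketch in plain Python, not the authors' code. The function name `min_max_scale` and the four sample rows are hypothetical and chosen only for illustration; none of the values come from the study's dataset. The sketch shows a general property: distance-based methods such as the SVM are sensitive to feature ranges, so rescaling each feature to [0, 1] can change which records count as close to each other.

```python
from math import dist


def min_max_scale(rows):
    """Rescale each column of `rows` to [0, 1] using that column's min and max."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    scaled = []
    for row in rows:
        scaled.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0  # constant column maps to 0
            for value, lo, hi in zip(row, lows, highs)
        ])
    return scaled


# Made-up illustrative records (NOT from the study's dataset):
# column 0 is a count with a large numeric range, column 1 is a fraction.
records = [
    [150000.0, 0.20],
    [160000.0, 0.80],
    [250000.0, 0.25],
    [400000.0, 0.50],
]

scaled = min_max_scale(records)

# Euclidean distances from record 0 to records 1 and 2, before and after scaling.
raw_d01 = dist(records[0], records[1])
raw_d02 = dist(records[0], records[2])
scaled_d01 = dist(scaled[0], scaled[1])
scaled_d02 = dist(scaled[0], scaled[2])

print("raw:    d(0,1) < d(0,2)?", raw_d01 < raw_d02)
print("scaled: d(0,1) < d(0,2)?", scaled_d01 < scaled_d02)
```

On the raw values, the large-range column dominates the distance, so record 1 looks closest to record 0. After scaling, both columns contribute comparably and record 2 becomes the nearer one. Threshold-based splits in tree models are unaffected by this kind of per-feature rescaling, which is a general property of such models rather than a result reported in the abstract.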

Keywords


Decision tree; Feature selection; Heart failure; Random forest; Support vector machine

DOI: https://doi.org/10.11591/eei.v12i3.5170

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
