Detecting spam using Harris Hawks optimizer as a feature selection algorithm
Mosleh M. Abualhaj, Ahmad Adel Abu-Shareha, Sumaya Nabil Alkhatib, Qusai Y. Shambour, Adeeb M. Alsaaidah
Abstract
This study uses the Harris Hawks optimization (HHO) metaheuristic as a feature selection technique to improve spam detection. HHO retains only the features with a high influence on spam detection. The selected features were evaluated on the ISCX-URL2016 dataset, where HHO reduces the original 72 features to just 10. Extra tree (ET), extreme gradient boosting (XGBoost), and support vector machine (SVM) classifiers then perform the classification task. ET attains 99.81% accuracy, XGBoost 99.60%, and SVM 98.74%. All three classifiers thus achieve accuracy above 98% on the HHO-selected features, with ET performing best. ET outperforms the other methods due to its robust ensemble approach, which benefits from the diverse and relevant feature subset selected by HHO. HHO's effective reduction of noisy or redundant features enhances ET's ability to generalize and avoid overfitting, making the two a highly efficient combination for spam detection. Combining ET for classification with HHO for feature selection therefore looks promising for combating spam emails.
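The abstract does not state how HHO was configured, so the following is only a minimal sketch of HHO-style wrapper feature selection, not the authors' implementation. It follows the commonly published HHO update rules (exploration, soft and hard besiege, and Levy-flight rapid dives). The population size, iteration count, the [0, 1] position range, the 0.5 binarization threshold, and the `toy_fitness` function are all assumptions made for illustration. In an actual experiment the fitness would be the error of a classifier such as ET trained on the selected columns.

```python
import math
import random

def levy(dim, rng, beta=1.5):
    """Levy-flight step via Mantegna's method, scaled by 0.01."""
    sigma = (math.gamma(1 + beta) * math.sin(math.pi * beta / 2)
             / (math.gamma((1 + beta) / 2) * beta * 2 ** ((beta - 1) / 2))) ** (1 / beta)
    return [0.01 * rng.gauss(0, sigma) / max(abs(rng.gauss(0, 1)), 1e-12) ** (1 / beta)
            for _ in range(dim)]

def to_mask(x, threshold=0.5):
    """Assumed binarization: a feature is kept when its position exceeds 0.5."""
    return [v > threshold for v in x]

def clip(x):
    """Keep every coordinate inside the assumed [0, 1] search range."""
    return [min(1.0, max(0.0, v)) for v in x]

def hho_select(fitness, dim, n_hawks=10, n_iter=30, seed=0):
    """Minimize fitness(mask) over boolean feature masks with HHO-style updates."""
    rng = random.Random(seed)
    hawks = [[rng.random() for _ in range(dim)] for _ in range(n_hawks)]
    rabbit, rabbit_fit = None, float("inf")  # best position found so far
    for t in range(n_iter + 1):
        for h in hawks:  # refresh the best solution (the "rabbit")
            f = fitness(to_mask(h))
            if f < rabbit_fit:
                rabbit, rabbit_fit = h[:], f
        if t == n_iter:
            break
        mean = [sum(h[d] for h in hawks) / n_hawks for d in range(dim)]
        for i, x in enumerate(hawks):
            e = 2 * rng.uniform(-1, 1) * (1 - t / n_iter)  # escaping energy
            if abs(e) >= 1:  # exploration
                if rng.random() >= 0.5:
                    xr = hawks[rng.randrange(n_hawks)]
                    r1, r2 = rng.random(), rng.random()
                    new = [xr[d] - r1 * abs(xr[d] - 2 * r2 * x[d]) for d in range(dim)]
                else:
                    # With bounds LB=0 and UB=1, LB + r4*(UB-LB) reduces to r4.
                    r3, r4 = rng.random(), rng.random()
                    new = [(rabbit[d] - mean[d]) - r3 * r4 for d in range(dim)]
            else:  # exploitation
                j = 2 * (1 - rng.random())  # jump strength of the rabbit
                if rng.random() >= 0.5:
                    if abs(e) >= 0.5:  # soft besiege
                        new = [(rabbit[d] - x[d]) - e * abs(j * rabbit[d] - x[d])
                               for d in range(dim)]
                    else:  # hard besiege
                        new = [rabbit[d] - e * abs(rabbit[d] - x[d]) for d in range(dim)]
                else:  # besiege with progressive rapid dives
                    base = x if abs(e) >= 0.5 else mean
                    y = clip([rabbit[d] - e * abs(j * rabbit[d] - base[d])
                              for d in range(dim)])
                    step = levy(dim, rng)
                    z = clip([y[d] + rng.random() * step[d] for d in range(dim)])
                    fx = fitness(to_mask(x))
                    new = x  # keep the old position unless a dive improves it
                    if fitness(to_mask(y)) < fx:
                        new = y
                    elif fitness(to_mask(z)) < fx:
                        new = z
            hawks[i] = clip(new)
    return to_mask(rabbit), rabbit_fit

# Hypothetical stand-in for classifier error: features 0, 3 and 7 are
# "informative"; missing them is penalized heavily, subset size lightly.
INFORMATIVE = {0, 3, 7}

def toy_fitness(mask):
    missing = sum(1 for i in INFORMATIVE if not mask[i])
    return 0.99 * missing / len(INFORMATIVE) + 0.01 * sum(mask) / len(mask)

if __name__ == "__main__":
    mask, fit = hho_select(toy_fitness, dim=20)
    print([i for i, keep in enumerate(mask) if keep], fit)
```

The weighted fitness (error term plus a small penalty on subset size) is a common design choice in wrapper feature selection, since it rewards subsets that are both accurate and small. Whether the study used this weighting is not stated in the abstract.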
Keywords
Feature selection; Harris Hawks algorithm; ISCX-URL2016 dataset; Machine learning; Spam
DOI:
https://doi.org/10.11591/eei.v14i3.9198
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.
Bulletin of Electrical Engineering and Informatics (BEEI), ISSN: 2089-3191, e-ISSN: 2302-9285. This journal is published by the Institute of Advanced Engineering and Science (IAES) in collaboration with Intelektual Pustaka Media Utama (IPMU).