Obfuscated computer virus detection using machine learning algorithm

Tan Hui Xin, Ismahani Ismail, Ban Mohammed Khammas


Computer virus attacks are becoming increasingly advanced. New obfuscated computer viruses created by virus writers automatically generate a new form of themselves at every iteration and download. These constantly evolving viruses pose a significant threat to the information security of computer users, organizations, and even governments. However, the signature-based detection technique used by conventional antivirus software on the market fails to identify them because no signatures are available. This research proposes an alternative to the traditional signature-based detection method and investigates the use of machine learning techniques for obfuscated computer virus detection. In this work, text strings extracted from virus program code are used as features to generate a classifier model that can correctly classify obfuscated virus files. Text string features are used because they are informative and potentially require only a small amount of memory. Results show that unknown files can be correctly classified with 99.5% accuracy using the SMO classifier model. Thus, it is believed that current computer virus defenses can be strengthened through a machine learning approach.
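
As a rough illustration of the string-feature idea described in the abstract, the following Python sketch pulls printable text strings out of raw file bytes and turns them into binary presence vectors. It is not the authors' pipeline: the helper names (extract_strings, build_vocabulary, to_feature_vector), the minimum string length of 4, and the sample byte sequences are all assumptions made for this example.

```python
import re

def extract_strings(data: bytes, min_len: int = 4) -> list[str]:
    """Return runs of at least min_len printable ASCII bytes (assumed threshold)."""
    pattern = re.compile(rb"[\x20-\x7e]{%d,}" % min_len)
    return [match.decode("ascii") for match in pattern.findall(data)]

def build_vocabulary(samples: list[bytes], min_len: int = 4) -> list[str]:
    """Collect every distinct string seen in the samples, in sorted order."""
    return sorted({s for data in samples for s in extract_strings(data, min_len)})

def to_feature_vector(data: bytes, vocab: list[str], min_len: int = 4) -> list[int]:
    """Mark each vocabulary string as present (1) or absent (0) in one file."""
    present = set(extract_strings(data, min_len))
    return [1 if s in present else 0 for s in vocab]

# Hypothetical byte sequences standing in for two program files.
suspect = b"\x00\x01CreateFileA\x00\xff\x02ab\x00WriteProcessMemory\x90"
benign = b"\x00GetTickCount\x00\x00CreateFileA\x01"

vocab = build_vocabulary([suspect, benign])
suspect_vec = to_feature_vector(suspect, vocab)
benign_vec = to_feature_vector(benign, vocab)
```

Vectors of this kind, paired with virus or benign labels, could then be passed to a support vector machine trainer such as the SMO classifier named in the abstract. The training step is omitted here because the abstract does not give its configuration.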


Machine learning; Obfuscated computer virus; Signature based; SMO classifier model; String features



DOI: https://doi.org/10.11591/eei.v8i4.1584



Bulletin of EEI Stats