INTEGRATING INFORMATION GAIN FOR HYBRID OPTIMIZATION IN SUPERVISED LEARNING ALGORITHMS

Authors

  • Ahmad Syahrul Ramadhan-Firdaus, Department of Informatics, Universitas Islam Kebangsaan Indonesia, Aceh, Indonesia

DOI:

https://doi.org/10.5281/zenodo.15847076

Keywords:

Hybrid; Optimization; Information Gain; Supervised Learning; K-NN; SVM; Naïve Bayes

Abstract

This study aims to improve the performance of supervised learning models for dermatology data classification through a hybrid approach that combines Information Gain-based feature selection with three established supervised learning algorithms: K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Naïve Bayes. Using the Dermatology dataset from the UCI Machine Learning Repository, which consists of 366 instances with 34 numeric attributes and 6 class labels, the research identifies the attributes with the lowest Information Gain values, including Family History, Eosinophils in the infiltrate, and Hyperkeratosis. These attributes are removed to reduce the dimensionality of the dataset, with the aim of expediting computation and improving model performance. The study then evaluates the impact of this dimensionality reduction on the performance of KNN, SVM, and Naïve Bayes. Experimental results show a significant improvement in the performance of the supervised learning models. With the KNN algorithm, the generated models achieve a True Positive Rate (TPR) of up to 82.52%, a True Negative Rate (TNR) of 98.81%, a Positive Predictive Value (PPV) of 33.55%, a Negative Predictive Value (NPV) of 98.78%, and an accuracy of 96.29%. SVM and Naïve Bayes also show significant improvements in model performance.
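
The abstract does not spell out how Information Gain is computed or how many attributes are removed. As an illustration only, the sketch below uses the standard definition for a discrete attribute, IG(Y; A) = H(Y) − Σ_v (|Y_v| / |Y|) · H(Y_v), ranks the attributes by that score, and drops the lowest-ranked ones. The function names, the `n_drop` parameter, and the toy data are hypothetical and are not taken from the study or from the Dermatology dataset.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def information_gain(values, labels):
    """Label entropy minus the expected label entropy after splitting on a discrete attribute."""
    total = len(labels)
    groups = {}
    for value, label in zip(values, labels):
        groups.setdefault(value, []).append(label)
    remainder = sum(len(g) / total * entropy(g) for g in groups.values())
    return entropy(labels) - remainder


def drop_lowest_gain(rows, labels, names, n_drop):
    """Rank attributes by information gain and remove the n_drop lowest-ranked ones."""
    columns = list(zip(*rows))
    gains = {name: information_gain(col, labels) for name, col in zip(names, columns)}
    ranked = sorted(names, key=lambda name: gains[name], reverse=True)
    dropped = set(ranked[len(ranked) - n_drop:])
    keep = [i for i, name in enumerate(names) if name not in dropped]
    reduced = [[row[i] for i in keep] for row in rows]
    return reduced, [names[i] for i in keep], gains


# Hypothetical toy data (assumption): three discrete attributes, two classes.
names = ["attr_a", "attr_b", "attr_c"]
rows = [
    [0, 0, 0],
    [0, 0, 1],
    [0, 1, 0],
    [1, 1, 1],
    [1, 1, 0],
    [1, 1, 1],
]
labels = ["x", "x", "x", "y", "y", "y"]

reduced_rows, kept_names, gains = drop_lowest_gain(rows, labels, names, n_drop=1)
for name in names:
    print(f"{name}: {gains[name]:.4f}")
print("kept:", kept_names)
```

In a hybrid pipeline of the kind described, the reduced rows would then be passed to the classifier (KNN, SVM, or Naïve Bayes) in place of the full attribute set. How many attributes to drop is a design choice that the abstract does not specify.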

Published

2025-07-09

Section

Articles