TY - JOUR
T1 - An efficient henry gas solubility optimization for feature selection
AU - Neggaz, Nabil
AU - Houssein, Essam H.
AU - Hussain, Kashif
PY - 2020/3/13
Y1 - 2020/3/13
N2 - In classification, regression, and other data mining applications, feature selection (FS) is an important pre-process step which helps avoid advert effect of noisy, misleading, and inconsistent features on the model performance. Formulating it into a global combinatorial optimization problem, researchers have employed metaheuristic algorithms for selecting the prominent features to simplify and enhance the quality of the high-dimensional datasets, in order to devise efficient knowledge extraction systems. However, when employed on datasets with extensively large feature-size, these methods often suffer from local optimality problem due to considerably large solution space. In this study, we propose a novel approach to dimensionality reduction by using Henry gas solubility optimization (HGSO) algorithm for selecting significant features, to enhance the classification accuracy. By employing several datasets with wide range of feature size, from small to massive, the proposed method is evaluated against well-known metaheuristic algorithms including grasshopper optimization algorithm (GOA), whale optimization algorithm (WOA), dragonfly algorithm (DA), grey wolf optimizer (GWO), salp swarm algorithm (SSA), and others from recent relevant literature. We used k-nearest neighbor (k-NN) and support vector machine (SVM) as expert systems to evaluate the selected feature-set. Wilcoxon’s ranksum non-parametric statistical test was carried out at 5% significance level to judge whether the results of the proposed algorithms differ from those of the other compared algorithms in a statistically significant way. Overall, the empirical analysis suggests that the proposed approach is significantly effective on low, as well as, considerably high dimensional datasets, by producing 100% accuracy on classification problems with more than 11,000 features.
AB - In classification, regression, and other data mining applications, feature selection (FS) is an important pre-process step which helps avoid advert effect of noisy, misleading, and inconsistent features on the model performance. Formulating it into a global combinatorial optimization problem, researchers have employed metaheuristic algorithms for selecting the prominent features to simplify and enhance the quality of the high-dimensional datasets, in order to devise efficient knowledge extraction systems. However, when employed on datasets with extensively large feature-size, these methods often suffer from local optimality problem due to considerably large solution space. In this study, we propose a novel approach to dimensionality reduction by using Henry gas solubility optimization (HGSO) algorithm for selecting significant features, to enhance the classification accuracy. By employing several datasets with wide range of feature size, from small to massive, the proposed method is evaluated against well-known metaheuristic algorithms including grasshopper optimization algorithm (GOA), whale optimization algorithm (WOA), dragonfly algorithm (DA), grey wolf optimizer (GWO), salp swarm algorithm (SSA), and others from recent relevant literature. We used k-nearest neighbor (k-NN) and support vector machine (SVM) as expert systems to evaluate the selected feature-set. Wilcoxon’s ranksum non-parametric statistical test was carried out at 5% significance level to judge whether the results of the proposed algorithms differ from those of the other compared algorithms in a statistically significant way. Overall, the empirical analysis suggests that the proposed approach is significantly effective on low, as well as, considerably high dimensional datasets, by producing 100% accuracy on classification problems with more than 11,000 features.
U2 - 10.1016/j.eswa.2020.113364
DO - 10.1016/j.eswa.2020.113364
M3 - Article
SN - 0957-4174
VL - 152
JO - Expert Systems with Applications
JF - Expert Systems with Applications
ER -