Feature selection optimization with filtering and wrapper methods: two disease classification cases

Date

2023

Publisher

Scientific and Technological Research Council Turkey

Access Rights

info:eu-repo/semantics/openAccess

Abstract

Discarding less informative and redundant features reduces the time needed to train a learning algorithm and the storage it requires, while improving learning accuracy and the quality of the results. In this study, we present different feature selection approaches for disease classification on the Parkinson and Cardiac Arrhythmia datasets. First, we apply three filtering algorithms: the Pearson correlation coefficient, the Spearman correlation coefficient, and Relief. Second, we compare metaheuristic algorithms to find the most informative subset of features and thereby obtain better classification accuracy. Finally, we apply a hybrid model in which the filtering algorithms first eliminate half of the features and a metaheuristic algorithm based on a proposed genetic algorithm is then applied to the remaining features. With all three methods, we use three classification algorithms: support vector machine, k-nearest neighbor, and random forest. The results show that, for both datasets, the best scores are obtained with the metaheuristic algorithm based on the proposed genetic algorithm. This comparative study contributes to the literature by increasing classification accuracy on both datasets and by presenting a hybrid model that combines filtering with a metaheuristic algorithm.
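The hybrid idea in the abstract (a filter that drops half of the features, followed by a genetic-algorithm wrapper on the rest) can be illustrated with a minimal sketch. This is not the authors' proposed genetic algorithm, whose details are not given in this record. It assumes a plain generational GA with truncation selection, one-point crossover, and bit-flip mutation, uses only the Pearson filter, and substitutes leave-one-out 1-nearest-neighbor accuracy for the three classifiers. All function names, parameters, and the synthetic data are hypothetical.

```python
import math
import random

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return 0.0 if sx == 0 or sy == 0 else cov / (sx * sy)

def filter_top_half(X, y):
    """Filter step: keep the half of the features with the largest |r| against the label."""
    n_features = len(X[0])
    scores = [abs(pearson([row[j] for row in X], y)) for j in range(n_features)]
    ranked = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[: max(1, n_features // 2)])

def loo_1nn_accuracy(X, y, features):
    """Wrapper fitness (assumed): leave-one-out accuracy of a 1-nearest-neighbor classifier."""
    if not features:
        return 0.0
    correct = 0
    for i, xi in enumerate(X):
        nearest = min(
            (j for j in range(len(X)) if j != i),
            key=lambda j: sum((xi[f] - X[j][f]) ** 2 for f in features),
        )
        correct += y[nearest] == y[i]
    return correct / len(X)

def genetic_search(X, y, candidates, pop_size=12, generations=15, p_mut=0.1, seed=0):
    """Generic generational GA over bit masks of the candidate features (illustrative only)."""
    rng = random.Random(seed)

    def fitness(mask):
        return loo_1nn_accuracy(X, y, [f for f, bit in zip(candidates, mask) if bit])

    pop = [[rng.randint(0, 1) for _ in candidates] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # truncation selection; the best half survives
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, len(candidates)) if len(candidates) > 1 else 0
            child = a[:cut] + b[cut:]  # one-point crossover
            child = [bit ^ (rng.random() < p_mut) for bit in child]  # bit-flip mutation
            children.append(child)
        pop = parents + children
    best = max(pop, key=fitness)
    return [f for f, bit in zip(candidates, best) if bit], fitness(best)

# Synthetic data (hypothetical): features 0 and 1 carry the class signal, 2-7 are noise.
data_rng = random.Random(1)
y = [i % 2 for i in range(40)]
X = [
    [label + data_rng.gauss(0, 0.3), 2 * label + data_rng.gauss(0, 0.3)]
    + [data_rng.gauss(0, 1) for _ in range(6)]
    for label in y
]

kept = filter_top_half(X, y)                # filter: 8 features reduced to 4
selected, acc = genetic_search(X, y, kept)  # wrapper: GA over the remaining 4
print("kept by filter:", kept)
print("selected by GA:", selected, "LOO accuracy:", acc)
```

In this sketch the filter is cheap and classifier-independent, while the GA evaluates each candidate subset by actually running a classifier, which is why restricting the GA to the filtered half shrinks its search space.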

Keywords

Feature selection, optimization algorithms, metaheuristic algorithms, genetic algorithms, filtering methods

Journal or Series

Turkish Journal of Electrical Engineering and Computer Sciences

WoS Q Value

Q4
