Attribute Reduction Based on Rough Set Theory by Soccer League Competition Algorithm

Authors: Majid Abdolrazzagh-Nezhad, Ali Adibiyan
Journal: Nashriyyah-i Muhandisi-i Barq va Muhandisi-i Kampyutar-i Iran
Article type: Full Paper
Publication date: 2021
Journal ranking: ISI
Journal type: Print
Country of publication: Iran

Abstract

This research paper introduces a novel approach to feature selection in data mining by combining rough set theory (RST) with a modified version of the Soccer League Competition Algorithm (SLC). The increasing volume and dimensionality of datasets pose significant challenges for knowledge extraction, making feature reduction a critical preprocessing step. Traditional metaheuristic algorithms often struggle with local optima, convergence speed, or parameter tuning when applied to feature selection problems. The authors propose an adapted SLC algorithm, originally a continuous optimization method, to effectively handle the discrete nature of feature selection based on RST.
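
To make the RST criterion concrete, the sketch below computes the standard rough set dependency degree, the fraction of objects whose values on a chosen attribute subset determine the decision unambiguously. The function name `dependency_degree` and the toy decision table are illustrative assumptions and are not taken from the paper.

```python
from collections import defaultdict


def dependency_degree(rows, decisions, subset):
    """Return gamma_B(D) = |POS_B(D)| / |U| for attribute indices `subset`.

    rows      : list of tuples, one tuple of attribute values per object
    decisions : list of decision (class) labels, aligned with rows
    subset    : iterable of column indices forming the candidate subset B
    """
    if not rows or len(rows) != len(decisions):
        raise ValueError("rows and decisions must be non-empty and aligned")
    subset = list(subset)
    labels_in_block = defaultdict(set)   # equivalence class -> decisions seen
    block_size = defaultdict(int)        # equivalence class -> object count
    for row, decision in zip(rows, decisions):
        key = tuple(row[i] for i in subset)  # objects indiscernible on B share a key
        labels_in_block[key].add(decision)
        block_size[key] += 1
    # The positive region is the union of blocks carrying a single decision.
    positive = sum(block_size[k] for k, seen in labels_in_block.items()
                   if len(seen) == 1)
    return positive / len(rows)


# Hypothetical toy decision table with three condition attributes.
toy_rows = [(0, 0, 1), (0, 1, 1), (1, 0, 0), (1, 1, 0), (0, 0, 1), (1, 1, 1)]
toy_decisions = ["no", "no", "yes", "yes", "no", "yes"]

full = dependency_degree(toy_rows, toy_decisions, [0, 1, 2])
first_only = dependency_degree(toy_rows, toy_decisions, [0])
second_only = dependency_degree(toy_rows, toy_decisions, [1])
```

In this toy table the first attribute alone reaches the same dependency as the full attribute set, so it would be a valid reduct, while the second attribute alone determines no decision at all.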

The core innovation lies in several modifications made to the standard SLC to suit the problem. These include using the combined power of both fixed and reserve players to calculate team strength, designing a new binarization mechanism to convert continuous player positions into discrete feature subsets, introducing a hydraulic penalty analysis for the fitness function to penalize solutions that do not achieve full RST dependency, and refining the imitation and provocation operators. The algorithm is structured around a two-level population of teams and players, promoting both global exploration and local exploitation within sub-populations, which helps escape local optima and accelerates convergence.
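
The abstract does not spell out the binarization mechanism or the penalty term, so the following is only a minimal sketch of the general idea under stated assumptions. It swaps in a common sigmoid transfer function with a fixed threshold for the binarization, and a simple linear penalty on missing dependency for the fitness. The names `binarize` and `penalized_fitness`, the threshold of 0.5, and the penalty weight of 10 are all hypothetical choices, not the authors' definitions.

```python
import math


def sigmoid(x):
    """Numerically stable logistic function."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)
    return e / (1.0 + e)


def binarize(position, threshold=0.5):
    """Map a continuous player position to a 0/1 feature mask.

    Assumption: a sigmoid transfer function with a fixed threshold,
    a common stand-in rather than the mechanism designed in the paper.
    """
    return [1 if sigmoid(x) > threshold else 0 for x in position]


def penalized_fitness(mask, gamma, penalty=10.0):
    """Fitness to minimize: relative subset size plus a dependency penalty.

    mask    : 0/1 list marking the selected features
    gamma   : RST dependency degree of the selected subset, in [0, 1]
    penalty : hypothetical weight applied when gamma falls short of 1
    """
    if not mask:
        raise ValueError("mask must not be empty")
    size_term = sum(mask) / len(mask)
    return size_term + penalty * (1.0 - gamma)


mask = binarize([2.0, -1.0, 0.0, 0.3])
full_dependency_score = penalized_fitness(mask, gamma=1.0)
partial_dependency_score = penalized_fitness([1, 0, 0, 0], gamma=0.8)
```

With this weighting, a smaller subset that loses dependency scores worse than a larger subset that keeps full dependency, which is the behavior a penalty of this kind is meant to enforce.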

The proposed method was evaluated on 11 benchmark datasets from the UCI repository, spanning small, medium, and large dimensions. Its performance was compared against four well-known metaheuristic algorithms: Particle Swarm Optimization (PSO), Genetic Algorithm (GA), Artificial Immune System (AIS), and the League Championship Algorithm (LCA). The experimental results demonstrated the competitive advantage of the modified SLC, particularly for medium and large datasets. It consistently found smaller feature subsets with higher RST dependency coefficients and converged faster in the convergence curve analyses. Statistical tests, including the Wilcoxon signed-rank test, confirmed that the improvements were statistically significant for most datasets.
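
For readers unfamiliar with the Wilcoxon signed-rank test, the sketch below computes its statistic and an exact two-sided p-value for paired results, such as the subset sizes two algorithms reach on the same runs. The function name `wilcoxon_signed_rank` is an assumption, and the numbers are made-up illustrative values, not results from the paper.

```python
def wilcoxon_signed_rank(a, b):
    """Exact two-sided Wilcoxon signed-rank test for paired samples.

    Returns (statistic, p_value), where statistic = min(W+, W-).
    Zero differences are dropped and tied magnitudes get average ranks.
    Intended for small samples with exactly comparable values (e.g. integers).
    """
    diffs = [x - y for x, y in zip(a, b) if x != y]
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    order = sorted(range(n), key=lambda i: abs(diffs[i]))
    doubled_rank = [0] * n  # ranks stored times two so tied averages stay integers
    i = 0
    while i < n:
        j = i
        while j + 1 < n and abs(diffs[order[j + 1]]) == abs(diffs[order[i]]):
            j += 1
        for k in range(i, j + 1):
            # Average of 1-based ranks i+1 .. j+1, doubled.
            doubled_rank[order[k]] = i + j + 2
        i = j + 1
    w_plus = sum(r for r, d in zip(doubled_rank, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(doubled_rank, diffs) if d < 0)
    t = min(w_plus, w_minus)
    # Count sign assignments by their positive rank sum (subset-sum DP).
    total = sum(doubled_rank)
    counts = [0] * (total + 1)
    counts[0] = 1
    for r in doubled_rank:
        for s in range(total, r - 1, -1):
            counts[s] += counts[s - r]
    tail = sum(counts[: t + 1])
    p_value = min(1.0, 2.0 * tail / 2 ** n)
    return t / 2, p_value


# Hypothetical subset sizes from six paired runs (illustrative only).
sizes_method_a = [12, 11, 13, 12, 14, 11]
sizes_method_b = [15, 13, 18, 16, 20, 12]
statistic, p_value = wilcoxon_signed_rank(sizes_method_a, sizes_method_b)
```

Because every difference in this made-up sample has the same sign, the statistic is 0 and the exact p-value is 2/64, below the usual 0.05 level.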

A further validation step involved applying the Naive Bayes classifier to both the original datasets and the reduced feature subsets obtained by each algorithm. The results indicated that the subsets selected by the SLC-based method often maintained or even improved classification accuracy, precision, sensitivity, and specificity compared to using all features, while significantly reducing computational complexity. The paper successfully addresses the "peaking phenomenon," where adding too many features degrades classifier performance, by providing a robust mechanism for selecting minimal, informative feature subsets.
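
The four reported measures follow directly from a binary confusion matrix. The sketch below computes them from true and predicted labels. The function name `binary_metrics` and the label vectors are hypothetical, and the classifier itself is left out.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, sensitivity and specificity for binary labels."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("label lists must be non-empty and the same length")
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive:
            if truth == positive:
                tp += 1
            else:
                fp += 1
        else:
            if truth == positive:
                fn += 1
            else:
                tn += 1

    def ratio(num, den):
        # Report 0.0 when a measure is undefined (empty denominator).
        return num / den if den else 0.0

    return {
        "accuracy": ratio(tp + tn, tp + tn + fp + fn),
        "precision": ratio(tp, tp + fp),
        "sensitivity": ratio(tp, tp + fn),   # true positive rate (recall)
        "specificity": ratio(tn, tn + fp),   # true negative rate
    }


# Hypothetical labels, for illustration only.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(y_true, y_pred)
```

Computing the same dictionary once for the full feature set and once for a reduced subset gives the side-by-side comparison described above.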

In conclusion, this work makes a valuable contribution to the field of feature selection by effectively hybridizing rough set theory with a creatively adapted sports-inspired metaheuristic. The modified Soccer League Competition Algorithm proves to be a powerful, parameter-efficient tool for dimensionality reduction, especially for high-dimensional data. Its layered population structure and the introduced enhancements allow it to outperform established algorithms in finding compact, high-quality feature subsets, thereby facilitating more efficient and accurate data mining and classification tasks.


tags: Attribute Reduction