Weighted rank aggregation based on ranker accuracies for feature selection

Abdolrazzagh-Nezhad, M. and M. Kherad

Authors	Abdolrazzagh-Nezhad, M. and M. Kherad
Journal	Soft Computing
Article type	Full Paper
Publication date	2025
Journal rank	ISI
Journal type	Print
Country of publication	Germany

Abstract

This paper introduces a novel unsupervised method for feature selection based on weighted rank aggregation, named Rank Aggregation by Agreement/Disagreement (RAAD). The core challenge addressed is the filter rank selection problem: no single filter-based feature selection method consistently performs best across different datasets and classifiers. Existing rank aggregation methods that combine multiple ranked feature lists often fail to account for the varying quality and accuracy of the individual base rankers in a practical, assumption-free manner.

The key innovation of RAAD lies in its principled approach to estimating the accuracy or reliability of each base ranker without requiring ground truth labels or prior assumptions. The method models the pairwise ordering of features by each ranker as an opinion. It then computes the empirical disagreement ratios between rankers and formulates an optimization problem to find the set of ranker accuracies that minimize the discrepancy between these observed disagreements and their theoretical counterparts derived from the hypothesized accuracies. These estimated accuracies serve as intelligent weights for a subsequent weighted majority voting aggregation process.
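
The estimation step described above can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the rankers err independently, so that two rankers with accuracies a_i and a_j are expected to disagree on a feature pair with probability a_i(1 - a_j) + (1 - a_i)a_j, it uses a squared-error objective, and it solves it with plain projected gradient descent. The function names, the step size, and the restriction of accuracies to [0.5, 1] are all hypothetical choices made for illustration.

```python
from itertools import combinations


def pairwise_opinions(ranking):
    """Map each feature pair (a, b) to True if the ranker places a ahead of b."""
    pos = {f: i for i, f in enumerate(ranking)}
    return {(a, b): pos[a] < pos[b] for a, b in combinations(sorted(pos), 2)}


def disagreement_ratios(rankings):
    """Empirical fraction of feature pairs on which each pair of rankers disagrees."""
    opinions = [pairwise_opinions(r) for r in rankings]
    pairs = list(opinions[0])
    n = len(rankings)
    d = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        differing = sum(opinions[i][p] != opinions[j][p] for p in pairs)
        d[i][j] = d[j][i] = differing / len(pairs)
    return d


def estimate_accuracies(d, steps=20000, lr=0.05):
    """Fit accuracies so that modelled disagreements match the observed ones.

    Assumed model (not taken from the paper): independent rankers, so the
    expected disagreement is a_i * (1 - a_j) + (1 - a_i) * a_j.
    """
    n = len(d)
    acc = [0.75] * n  # neutral starting point inside [0.5, 1]
    for _ in range(steps):
        grad = [0.0] * n
        for i, j in combinations(range(n), 2):
            model = acc[i] * (1 - acc[j]) + (1 - acc[i]) * acc[j]
            residual = model - d[i][j]
            # d(model)/d(a_i) = 1 - 2 * a_j, and symmetrically for a_j
            grad[i] += 2 * residual * (1 - 2 * acc[j])
            grad[j] += 2 * residual * (1 - 2 * acc[i])
        # Projection onto [0.5, 1]: a ranker is assumed to be no worse than chance,
        # which also removes the a -> 1 - a ambiguity of the model.
        acc = [min(1.0, max(0.5, a - lr * g)) for a, g in zip(acc, grad)]
    return acc
```

Calling `estimate_accuracies(disagreement_ratios(rankings))` on a list of ranked feature lists returns one estimated accuracy per ranker, which can then be used as that ranker's voting weight. At least three rankers are needed for the accuracies to be identifiable under this model.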

A significant technical contribution is the method's handling of potential inconsistencies in the aggregated ranking, such as cyclic orderings. The final aggregated results are modeled as a weighted directed graph, where edges represent pairwise orderings and their weights reflect the confidence from the weighted votes. A greedy algorithm is employed to find a maximum-weight acyclic subgraph, which is then topologically sorted to produce a consistent final feature ranking. This approach ensures a robust and logically coherent aggregated list.
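
The graph-based step can be sketched as follows. This is an illustrative reading of the paragraph above, not the paper's exact procedure: each ranker's weight is used directly as its vote, the greedy rule adds edges in decreasing weight order and skips any edge that would close a cycle, and the topological sort uses Kahn's algorithm with alphabetical tie-breaking. All function names are hypothetical.

```python
from itertools import combinations


def weighted_pairwise_votes(rankings, weights):
    """Return {(winner, loser): margin} for every feature pair by weighted majority."""
    features = sorted(rankings[0])
    positions = [{f: i for i, f in enumerate(r)} for r in rankings]
    edges = {}
    for a, b in combinations(features, 2):
        score = sum(w if pos[a] < pos[b] else -w
                    for pos, w in zip(positions, weights))
        if score >= 0:
            edges[(a, b)] = score
        else:
            edges[(b, a)] = -score
    return edges


def creates_cycle(adj, src, dst):
    """True if dst already reaches src, so adding src -> dst would close a cycle."""
    stack, seen = [dst], set()
    while stack:
        node = stack.pop()
        if node == src:
            return True
        if node not in seen:
            seen.add(node)
            stack.extend(adj.get(node, ()))
    return False


def greedy_acyclic_ranking(edges):
    """Keep the heaviest edges that stay acyclic, then sort topologically."""
    nodes = {n for edge in edges for n in edge}
    adj = {}
    for (src, dst), _ in sorted(edges.items(), key=lambda kv: -kv[1]):
        if not creates_cycle(adj, src, dst):
            adj.setdefault(src, set()).add(dst)
    # Kahn's algorithm; ties are broken alphabetically for determinism.
    indegree = {n: 0 for n in nodes}
    for targets in adj.values():
        for dst in targets:
            indegree[dst] += 1
    ready = sorted(n for n in nodes if indegree[n] == 0)
    order = []
    while ready:
        node = ready.pop(0)
        order.append(node)
        for nxt in adj.get(node, ()):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)
        ready.sort()
    return order
```

For example, the three rankings `["A", "B", "C"]`, `["B", "C", "A"]`, `["C", "A", "B"]` with weights 0.9, 0.7, 0.6 produce the cyclic preferences A over B, B over C, and C over A. The weakest edge (C over A) is dropped, and the sketch returns the ranking A, B, C.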

The proposed RAAD method was extensively evaluated against five individual filter-based methods (Chi-Square, ReliefF, Information Gain, MRMR, CFS) and several state-of-the-art rank aggregation techniques, including Borda variants, Markov Chain methods, and distribution-based methods like RRA. Experiments were conducted on nine diverse UCI datasets using a Naïve Bayes classifier and evaluated across six metrics: accuracy, precision, recall, specificity, F-measure, and G-mean.
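
For reference, the six evaluation metrics can be computed from a confusion matrix as in the sketch below. It uses the standard binary definitions (F-measure as the harmonic mean of precision and recall, G-mean as the geometric mean of recall and specificity); how the paper averages them over multi-class datasets is not stated here, so the binary setting and the function name are assumptions.

```python
import math


def classification_metrics(tp, fp, tn, fn):
    """Six standard metrics from binary confusion-matrix counts."""
    def ratio(num, den):
        # Guard against empty denominators (e.g. no predicted positives).
        return num / den if den else 0.0

    precision = ratio(tp, tp + fp)
    recall = ratio(tp, tp + fn)          # also called sensitivity
    specificity = ratio(tn, tn + fp)
    return {
        "accuracy": ratio(tp + tn, tp + fp + tn + fn),
        "precision": precision,
        "recall": recall,
        "specificity": specificity,
        "f_measure": ratio(2 * precision * recall, precision + recall),
        "g_mean": math.sqrt(recall * specificity),
    }
```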

The experimental results show that RAAD outperforms both the individual filter methods and the other aggregation techniques on most datasets and performance metrics. The results also indicate that rank aggregation methods generally yield better feature subsets than any single filter method, with RAAD achieving the best overall performance. A further advantage of RAAD is its computational efficiency: it requires significantly less time than metaheuristic-based aggregation methods such as genetic algorithms, which makes it practical for moderate to large-scale problems. The method also provides interpretable outputs in the form of estimated weights for each base ranker, offering insight into their relative reliability on specific datasets.

In conclusion, the RAAD method presents an effective, efficient, and unsupervised solution to the feature selection problem via intelligent rank aggregation. It successfully leverages the agreement and disagreement patterns among base rankers to assign meaningful weights, leading to superior and more robust feature subsets for classification tasks. The method's lack of dependency on ground truth or strong prior assumptions makes it highly applicable to real-world scenarios. Future work may focus on enhancing the accuracy estimation with learning-based approaches and applying the framework to other domains beyond feature selection.
