Chemical reaction optimization to disease diagnosis by optimizing hyper-planes classifiers

Authors: Jalayeri, S. and M. Abdolrazzagh-Nezhad
Journal: Soft Computing
Paper Type: Full Paper
Published At: 2019
Journal Grade: ISI
Journal Type: Typographic
Journal Country: Germany

Abstract

This paper presents a hybrid approach to medical disease diagnosis that combines a new classification method, the Hyper-Planes Classifier (HPC), with the Chemical Reaction Optimization (CRO) algorithm. Early and accurate diagnosis is critical for patient survival, but existing methods such as Support Vector Machines (SVM) are sensitive to the number of features, depend on feature selection, and have difficulty with non-convex, non-integrated class distributions. To overcome these limitations, the authors propose HPC, which divides the feature space into distinct regions using multiple hyper-planes, assigns a unique binary code to each region, and tags each region with the class that holds the majority of the patients falling in it.
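The region-coding idea described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names (`region_code`, `fit_region_labels`, `predict`, `error_rate`), the sign convention (a bit is 1 when w·x + b >= 0), and the toy data are hypothetical, and the hyper-plane coefficients are fixed by hand rather than optimized.

```python
from collections import Counter, defaultdict


def region_code(x, planes):
    """Binary code of the region containing x.

    Each plane is a (weights, bias) pair; bit k is 1 when w_k . x + b_k >= 0
    (an assumed sign convention).
    """
    return tuple(
        1 if sum(w_i * x_i for w_i, x_i in zip(w, x)) + b >= 0 else 0
        for w, b in planes
    )


def fit_region_labels(X, y, planes):
    """Tag every occupied region with the majority class of its samples."""
    votes = defaultdict(Counter)
    for x, label in zip(X, y):
        votes[region_code(x, planes)][label] += 1
    return {code: counts.most_common(1)[0][0] for code, counts in votes.items()}


def predict(x, planes, region_labels, default=None):
    """Look up the class tag of x's region; unseen regions get `default`."""
    return region_labels.get(region_code(x, planes), default)


def error_rate(X, y, planes, region_labels):
    """Fraction of samples whose region tag differs from their true label."""
    wrong = sum(predict(x, planes, region_labels) != t for x, t in zip(X, y))
    return wrong / len(X)


# Toy XOR-like data (hypothetical): no single hyper-plane separates it,
# but two axis-aligned hyper-planes create four regions that do.
X = [(0, 0), (1, 1), (0, 1), (1, 0)]
y = ["healthy", "healthy", "sick", "sick"]
planes = [((1, 0), -0.5), ((0, 1), -0.5)]

labels = fit_region_labels(X, y, planes)
print(error_rate(X, y, planes, labels))
```

With K hyper-planes there are at most 2^K region codes, which is why adding hyper-planes lets the classifier follow more fragmented class layouts. In the paper's setting the coefficients in `planes` would be the quantities tuned by the optimizer.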

The core contribution is the design of HPC and its optimization via CRO. Unlike SVM, HPC does not require feature reduction and can model complex class distributions by increasing the number of hyper-planes. Finding the optimal coefficients for these hyper-planes is treated as an NP-hard problem and is solved with CRO, a metaheuristic inspired by chemical reactions. CRO employs four types of reactions (decomposition, synthesis, on-wall ineffective collision, and inter-molecular ineffective collision) to balance exploration and exploitation of the search space. This lets the algorithm share information between solutions (molecules) and escape local optima while searching for the HPC parameters.
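The interplay of the four reaction types can be illustrated with a deliberately simplified search loop. This is a sketch under assumptions, not the paper's algorithm: canonical CRO also tracks kinetic energy and a central energy buffer, which are omitted here and replaced by greedy acceptance; the function names, parameter names, default values, and stagnation rule are all hypothetical.

```python
import random


def neighbour(w, rng, step):
    """Small local move: perturb one randomly chosen coefficient."""
    out = list(w)
    out[rng.randrange(len(out))] += rng.gauss(0.0, step)
    return out


def cro_minimize(objective, dim, pop_size=10, iters=3000, step=0.3,
                 inter_rate=0.3, stuck_limit=40, seed=0):
    """Simplified CRO-style minimizer (assumes pop_size >= 2)."""
    rng = random.Random(seed)

    def make(w):
        # A molecule: solution, potential energy (objective), stagnation count.
        return {"w": w, "pe": objective(w), "stuck": 0}

    def try_move(m, new, accept):
        # Greedy stand-in for CRO's energy check; count hits without gain.
        if accept:
            improved = new["pe"] < m["pe"]
            m["w"], m["pe"] = new["w"], new["pe"]
            m["stuck"] = 0 if improved else m["stuck"] + 1
        else:
            m["stuck"] += 1

    pop = [make([rng.uniform(-1, 1) for _ in range(dim)])
           for _ in range(pop_size)]
    best = dict(min(pop, key=lambda m: m["pe"]))

    for _ in range(iters):
        if rng.random() < inter_rate:
            a, b = rng.sample(pop, 2)
            if min(a["stuck"], b["stuck"]) > stuck_limit and len(pop) > 2:
                # Synthesis: merge two stagnant molecules into one.
                child = make([rng.choice(p) for p in zip(a["w"], b["w"])])
                if child["pe"] <= max(a["pe"], b["pe"]):
                    pop = [m for m in pop if m is not a and m is not b]
                    pop.append(child)
            else:
                # Inter-molecular ineffective collision: both move locally.
                na = make(neighbour(a["w"], rng, step))
                nb = make(neighbour(b["w"], rng, step))
                ok = na["pe"] + nb["pe"] <= a["pe"] + b["pe"]
                try_move(a, na, ok)
                try_move(b, nb, ok)
        else:
            m = rng.choice(pop)
            if m["stuck"] > stuck_limit and len(pop) < 2 * pop_size:
                # Decomposition: split a stagnant molecule into two distant
                # ones (accepted unconditionally here to force exploration).
                kids = [make([x + rng.gauss(0.0, 5 * step) for x in m["w"]])
                        for _ in range(2)]
                pop = [p for p in pop if p is not m] + kids
            else:
                # On-wall ineffective collision: one molecule moves locally.
                new = make(neighbour(m["w"], rng, step))
                try_move(m, new, new["pe"] <= m["pe"])
        champion = min(pop, key=lambda m: m["pe"])
        if champion["pe"] < best["pe"]:
            best = dict(champion)
    return best["w"], best["pe"]


# Toy objective (hypothetical): squared distance from the origin.
best_w, best_pe = cro_minimize(lambda w: sum(x * x for x in w), dim=3)
print(best_pe)
```

The two collision types make small local moves (exploitation), while decomposition and synthesis restructure stagnant molecules (exploration). For CRO-HPC, the objective would presumably be the diagnosis error of an HPC whose hyper-plane coefficients are flattened into `w`; that mapping is an assumption of this sketch.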

The proposed CRO-HPC method was evaluated on twelve medical datasets from the UCI and KEEL repositories, covering diseases such as breast cancer, diabetes, hepatitis, and lung cancer. The experiments used 10-fold cross-validation and tested HPC with 1 to 5 hyper-planes. With five hyper-planes, the method reached a 0.000% diagnosis error on test data for most datasets, the exceptions being the chronic kidney disease and Breast Cancer Coimbra datasets. Four and five hyper-planes generally gave the best and fastest convergence. The gap between the best and average results was small, which indicates that the method behaves consistently across runs.
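For readers unfamiliar with the evaluation protocol, the following is a minimal sketch of how 10-fold cross-validation splits can be generated. The function name `k_fold_indices`, the shuffling seed, and the absence of stratification are assumptions of this sketch; the summary does not say how the authors built their folds.

```python
import random


def k_fold_indices(n_samples, k=10, seed=0):
    """Yield (train, test) index lists for k-fold cross-validation.

    Indices are shuffled once, dealt into k folds, and each fold serves
    as the test set exactly once.
    """
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        train = [j for f in folds[:i] + folds[i + 1:] for j in f]
        yield train, test


# Example: 25 hypothetical patients split into 10 folds.
for train, test in k_fold_indices(25, k=10):
    print(len(train), len(test))
```

The reported test error would then be the average over the ten held-out folds, with the classifier refit on each training split.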

The method was compared with over 50 existing diagnostic methods from the literature, including various SVM hybrids, neural networks, decision trees, and other metaheuristic-based classifiers. CRO-HPC proved highly competitive and often outperformed state-of-the-art techniques on multiple datasets. It achieved this without reducing the dimensionality of the original datasets, using all patient symptoms to find diagnostic patterns. The time complexity of HPC is linear in the number of features, which keeps it computationally manageable even as the number of hyper-planes increases.

In conclusion, this work addresses key limitations of traditional classifiers such as SVM by introducing a flexible, region-based hyper-plane classifier optimized with a chemical-reaction-inspired algorithm. The CRO-HPC framework achieves high diagnostic accuracy, including perfect test classification for several diseases, and remains robust and consistent across diverse medical datasets. It offers a promising direction for automated medical diagnosis, with possible future extensions into parallel processing, fuzzy HPC for uncertain data, and dynamic parameter tuning.

tags: disease diagnosis