Automatically Evolving Interpretable Feature Vectors using Genetic Programming for an Ensemble Classifier in Skin Cancer Detection
  • Qurrat Ul Ain,
  • Harith Al-Sahaf,
  • Bing Xue,
  • Mengjie Zhang
Qurrat Ul Ain
Victoria University of Wellington

Corresponding Author: [email protected]


Abstract

Early diagnosis of skin cancer saves lives because skin cancer detected early can be treated successfully through complete excision. Computer-aided diagnosis methods developed using artificial intelligence techniques help detect skin cancer earlier and identify hidden causes leading to cancer in skin lesion images. In skin cancer image classification, ensembles of classifiers have demonstrated better classification ability than single classification algorithms. Traditionally, an ensemble is trained on the complete set of original features, some of which can be redundant or irrelevant and hence may not provide useful information for generating good models for ensemble classification. Moreover, newly created features may help improve the classification performance. To address this, existing methods have used feature construction to build an ensemble classifier, but they usually create a fixed number of features that may overfit the training data, resulting in poor test performance. This study develops a novel classification approach that uses genetic programming (GP) to combine ensemble learning, feature selection, and feature construction and thereby address these limitations. The proposed method automatically evolves variable-length feature vectors consisting of GP-selected and GP-constructed features suitable for training an ensemble classifier. The study evaluates the effectiveness of the proposed method on two benchmark real-world skin image datasets that include dermoscopy and standard camera images. The experimental results reveal that the proposed algorithm significantly outperforms six state-of-the-art convolutional neural network methods, existing GP approaches, and ten commonly used machine learning methods. Furthermore, the study interprets evolved individuals, which highlight important skin cancer characteristics that play a vital role in discriminating between images of different cancer classes.
This study shows that high classification performance can be achieved at a low computational cost and with low inference time; accordingly, the method is potentially suitable for implementation on mobile devices for automated screening of skin lesions and many other malignancies in low-resource settings.
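
To make the idea of a variable-length feature vector built from GP-selected and GP-constructed features more concrete, the following is a minimal illustrative sketch, not the authors' implementation. Everything in it is an assumption made for illustration: the tuple encoding of an individual, the arithmetic operators, the three simple ensemble members (nearest centroid, 1-NN, 3-NN), the majority vote, the accuracy-based fitness, and the toy data. The abstract does not specify any of these details.

```python
from collections import Counter

# Hypothetical encoding: an individual is a variable-length list of genes.
# ("sel", i) selects original feature i; ("add" | "sub" | "mul", a, b)
# constructs a new feature from two sub-expressions.
def evaluate_gene(gene, x):
    op = gene[0]
    if op == "sel":
        return x[gene[1]]
    a, b = evaluate_gene(gene[1], x), evaluate_gene(gene[2], x)
    if op == "add":
        return a + b
    if op == "sub":
        return a - b
    if op == "mul":
        return a * b
    raise ValueError(f"unknown operator: {op}")

def transform(individual, x):
    """Map an original feature row to the individual's feature vector."""
    return [evaluate_gene(gene, x) for gene in individual]

def sq_dist(a, b):
    return sum((u - v) ** 2 for u, v in zip(a, b))

def nearest_centroid(train_X, train_y, q):
    centroids = {}
    for label in sorted(set(train_y)):
        rows = [r for r, y in zip(train_X, train_y) if y == label]
        centroids[label] = [sum(col) / len(rows) for col in zip(*rows)]
    return min(centroids, key=lambda label: sq_dist(centroids[label], q))

def knn(train_X, train_y, q, k):
    order = sorted(range(len(train_X)), key=lambda i: sq_dist(train_X[i], q))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

def ensemble_predict(individual, train_X, train_y, x):
    """Majority vote of three simple classifiers on the transformed features."""
    T = [transform(individual, row) for row in train_X]
    q = transform(individual, x)
    votes = [
        nearest_centroid(T, train_y, q),
        knn(T, train_y, q, 1),
        knn(T, train_y, q, 3),
    ]
    return Counter(votes).most_common(1)[0][0]

def fitness(individual, train_X, train_y, val_X, val_y):
    """Assumed fitness: ensemble accuracy on held-out validation rows."""
    hits = sum(
        ensemble_predict(individual, train_X, train_y, x) == y
        for x, y in zip(val_X, val_y)
    )
    return hits / len(val_y)

# Toy data (made up): three original features per image.
train_X = [[1.0, 5.0, 4.0], [1.2, 6.0, 5.2], [3.0, 9.0, 4.0], [3.2, 8.0, 3.5]]
train_y = ["benign", "benign", "malignant", "malignant"]
val_X = [[2.9, 7.0, 2.5], [1.1, 5.5, 4.6]]
val_y = ["malignant", "benign"]

# One selected feature plus one constructed feature (f1 - f2).
individual = [("sel", 0), ("sub", ("sel", 1), ("sel", 2))]
print(fitness(individual, train_X, train_y, val_X, val_y))
```

In a full GP search, a population of such individuals would be varied and selected according to a fitness of this kind, so that the number of genes, and hence the length of the feature vector, is free to change between individuals. The search loop itself is omitted here.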