Improving biomarker selection for cancer subtype classification through multi-objective optimization
  • Luca Cattelani ,
  • Arindam Ghosh ,
  • Teemu Rintala ,
  • Vittorio Fortino
Vittorio Fortino
University of Eastern Finland (Kuopio)

Corresponding Author: [email protected]

Current machine learning (ML) approaches to omics-based biomarker discovery often yield panels that do not reproduce on external validation datasets, and optimizing the size of their feature sets remains an open problem; both issues jeopardize their translation into cost-effective clinical tools. The present study investigates how to optimize feature set size by testing six algorithms on eight large-scale transcriptomics datasets covering breast, lung, renal, and ovarian cancer. Most importantly, we propose a new evaluation metric, Cross Hypervolume (CHV), to assess the performance of multi-objective feature selection algorithms on both training and test datasets. CHV improves on other metrics because it accounts for the trade-off between classification accuracy and the number of selected features. It thereby allows better assessment of biomarker models and helps to select the most accurate and biologically relevant ones.
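
To make the accuracy versus feature set size trade-off concrete, the sketch below computes the ordinary two-objective hypervolume of a set of candidate panels, once from training accuracy and once from test accuracy. It is not the CHV metric itself, whose definition is not given in this abstract. The helper names (`hypervolume_2d`, `front_points`), the scaling of feature set size by an assumed `MAX_FEATURES`, and all numbers are hypothetical and chosen only for illustration.

```python
def hypervolume_2d(points, ref=(0.0, 0.0)):
    """Area dominated by `points` relative to `ref`; both objectives are maximized."""
    # Ignore points that do not improve on the reference in both objectives.
    pts = sorted((p for p in points if p[0] > ref[0] and p[1] > ref[1]), reverse=True)
    area = 0.0
    best_y = ref[1]
    # Sweep from the largest first objective to the smallest, adding one
    # vertical strip per point; the strip height is the best second
    # objective seen so far, so dominated points add nothing.
    for i, (x, y) in enumerate(pts):
        best_y = max(best_y, y)
        next_x = pts[i + 1][0] if i + 1 < len(pts) else ref[0]
        area += (x - next_x) * (best_y - ref[1])
    return area


def front_points(solutions, max_features, split):
    """Map (n_features, train_acc, test_acc) tuples to (accuracy, parsimony) points."""
    idx = {"train": 1, "test": 2}[split]
    # Parsimony is 1 for an empty panel and 0 for a panel of max_features.
    return [(s[idx], 1.0 - s[0] / max_features) for s in solutions]


# Hypothetical candidate panels: (number of features, training accuracy, test accuracy).
solutions = [(5, 0.80, 0.74), (20, 0.90, 0.79), (60, 0.95, 0.78)]
MAX_FEATURES = 100  # assumed upper bound used to scale feature set size

hv_train = hypervolume_2d(front_points(solutions, MAX_FEATURES, "train"))
hv_test = hypervolume_2d(front_points(solutions, MAX_FEATURES, "test"))
print(f"train hypervolume: {hv_train:.3f}, test hypervolume: {hv_test:.3f}")
```

In this toy example the hypervolume shrinks when the same panels are scored on test data, which is the kind of gap between training and test performance that a metric evaluated on both datasets is meant to expose.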