Transparent Dimension Reduction by Feature Construction with Genetic Algorithm
  • Nikita Radeev
Novosibirsk State University

Corresponding Author:[email protected]


In some domains, such as medicine and finance, all transformations of data must be transparent and interpretable. Dimension reduction is an important part of a preprocessing pipeline, but existing algorithms for it are not transparent. In this work, we provide a genetic algorithm for transparent dimension reduction of numerical data. The algorithm constructs features in the form of expression trees built from a subset of the numerical features in the source data and common arithmetic operations. It is designed to maximize quality in binary classification tasks and to generate features that a human can explain, which is achieved by using only human-interpretable operations in feature construction. Data transformed by the algorithm can also be used in visual analysis: the algorithm builds features that make the feature space linearly separable, using distance criteria in the fitness function to shift the classes as far from each other as possible without loss of classification quality. A multicriteria dynamic fitness function is provided to build features with high diversity.
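To make the idea of an expression-tree feature concrete, the following is a minimal sketch and not the author's implementation. The operator set `OPS`, the classes `Leaf` and `Node`, the helper `random_tree`, and the `class_separation` score (absolute distance between the class means of a constructed feature) are all hypothetical names and choices introduced here for illustration; the abstract does not specify the actual operators, the tree encoding, or the exact distance criterion.

```python
import random
from dataclasses import dataclass
from typing import Sequence, Union

# Assumed operator set: the abstract only says "common arithmetic operations".
OPS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    # Protected division (an assumption): return 0.0 for a near-zero divisor.
    "/": lambda a, b: a / b if abs(b) > 1e-12 else 0.0,
}


@dataclass(frozen=True)
class Leaf:
    """Terminal node: refers to one source feature by column index."""
    index: int

    def evaluate(self, row: Sequence[float]) -> float:
        return row[self.index]

    def __str__(self) -> str:
        return f"x{self.index}"


@dataclass(frozen=True)
class Node:
    """Internal node: applies a binary arithmetic operation to two subtrees."""
    op: str
    left: "Tree"
    right: "Tree"

    def evaluate(self, row: Sequence[float]) -> float:
        return OPS[self.op](self.left.evaluate(row), self.right.evaluate(row))

    def __str__(self) -> str:
        # The printed formula is what makes the feature readable by a human.
        return f"({self.left} {self.op} {self.right})"


Tree = Union[Leaf, Node]


def random_tree(n_features: int, depth: int, rng: random.Random) -> Tree:
    """Grow a random tree of at most the given depth (hypothetical initializer)."""
    if depth == 0 or rng.random() < 0.3:
        return Leaf(rng.randrange(n_features))
    return Node(
        rng.choice(list(OPS)),
        random_tree(n_features, depth - 1, rng),
        random_tree(n_features, depth - 1, rng),
    )


def class_separation(tree: Tree, rows, labels) -> float:
    """Illustrative distance criterion: gap between the two class means."""
    pos = [tree.evaluate(r) for r, y in zip(rows, labels) if y == 1]
    neg = [tree.evaluate(r) for r, y in zip(rows, labels) if y == 0]
    return abs(sum(pos) / len(pos) - sum(neg) / len(neg))


rows = [(3.0, 1.0), (4.0, 1.0), (1.0, 3.0), (1.0, 4.0)]
labels = [1, 1, 0, 0]
feature = Node("-", Leaf(0), Leaf(1))
print(feature, class_separation(feature, rows, labels))
```

In a full genetic algorithm, a score of this kind would be only one term of the fitness function, combined with classification quality and a diversity criterion as the abstract describes; how those terms are weighted is not stated in the source.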