Filter method-based feature selection process for unattributed-identity multi-target regression problem
  • Iker Garcia,
  • Roberto Santana

Abstract

In this paper, the feature selection (FS) problem for unattributed-identity multi-target regression (UIMTR) is presented for the first time. UIMTR is defined as a multi-target regression problem in which the sets of target and predictor variables are undetermined, i.e., the identity of the variables is unattributed. Two forward-selection, filter-based sequential methods built on mutual information are proposed. In particular, the proposed methods are multi-objective adaptations of the classical Mutual Information Maximization (MIM) and Maximum Relevance Minimum Redundancy (mRMR) methods. The concept of “sentinel variable” is also introduced: any variable selected by the methods that will, a posteriori, serve as a predictor variable (its real-time data will be used by the models to predict the values of the target variables). To highlight that this type of problem exists in industry, and thus the need for this approach, a current problem in low-voltage power grids is presented and modelled: selecting a subset of smart meters (“sentinel smart meters”) that serve as predictors of a certain electrical measurement for the rest of the smart meters in the grid. The approach is applied empirically to voltage curves of smart meters from six different transformer substations. The results are evaluated from three perspectives: (i) the quality of the predictions, (ii) the stability of the methods and (iii) the execution time. In addition, the results are compared with three other methods: a purely empirical one proposed in the article, based on voltage patterns (VP), and two that are well known in the literature, (a) RReliefF (Relief for regression) and (b) Fisher Score.
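To make the idea of selecting sentinel variables by forward selection concrete, the following is a minimal Python sketch, not the authors' implementation. The abstract does not specify how MIM and mRMR are adapted to the multi-objective setting, so the scoring rule here is an assumption: a candidate's relevance is its average mutual information with every variable not yet selected (the would-be targets), and the optional mRMR-style penalty is its average mutual information with the sentinels already chosen. The function names (`discretize`, `mutual_information`, `select_sentinels`), the equal-width binning and the plug-in mutual information estimate are all illustrative choices.

```python
import math
import random
from collections import Counter


def discretize(values, bins=8):
    """Map a numeric sequence to integer labels with equal-width bins."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins or 1.0  # constant column: avoid a zero width
    return [min(int((v - lo) / width), bins - 1) for v in values]


def mutual_information(x, y):
    """Plug-in mutual information (in nats) of two label sequences."""
    n = len(x)
    px, py, pxy = Counter(x), Counter(y), Counter(zip(x, y))
    return sum((c / n) * math.log(c * n / (px[a] * py[b]))
               for (a, b), c in pxy.items())


def select_sentinels(columns, k, redundancy=True):
    """Greedy forward selection of k sentinel variables (assumed scoring).

    columns: list of equally long numeric sequences, one per variable.
    redundancy=False gives a MIM-like score, True an mRMR-like score.
    """
    labels = [discretize(col) for col in columns]
    m = len(labels)
    # Pairwise mutual information between all variables, computed once.
    mi = [[0.0 if i == j else mutual_information(labels[i], labels[j])
           for j in range(m)] for i in range(m)]
    selected = []
    while len(selected) < min(k, m):
        best, best_score = None, -math.inf
        for i in range(m):
            if i in selected:
                continue
            # Assumption: every variable not yet selected is a target.
            targets = [j for j in range(m) if j != i and j not in selected]
            relevance = (sum(mi[i][j] for j in targets) / len(targets)
                         if targets else 0.0)
            penalty = (sum(mi[i][s] for s in selected) / len(selected)
                       if redundancy and selected else 0.0)
            if relevance - penalty > best_score:
                best, best_score = i, relevance - penalty
        selected.append(best)
    return selected


if __name__ == "__main__":
    # Synthetic stand-in for voltage curves: one shared signal plus noise.
    random.seed(0)
    base = [math.sin(t / 5.0) for t in range(200)]
    meters = [[b + random.gauss(0, 0.1 * (i + 1)) for b in base]
              for i in range(6)]
    print(select_sentinels(meters, k=2))
```

The returned list holds the column indices of the chosen sentinels in selection order. On real data, a continuous mutual information estimator would likely be preferable to the coarse binning used here.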