Abstract
In this paper, we propose an ensemble-learning-based model to synthesize
the logarithmic magnitude responses of head-related transfer functions
(HRTFs) from anthropometric features. We first cluster subjects based on
relevant anthropometric features to reduce differences within each
group, and then apply an ensemble learning algorithm to the clustered
subjects to predict the log-magnitude HRTF. In the training phase, three deep
neural networks (DNNs), each of which aims to predict log-magnitude
HRTFs in a particular group, are trained using anthropometric and
angle-related features. Afterward, another DNN is trained to integrate
estimates from the three group-wise DNNs into log-magnitude HRTFs. The
proposed model is compared with a baseline DNN model and our previously
proposed model, which incorporates an auto-encoder for dimensionality
reduction. Experimental results show that, of the three models, the
proposed model synthesizes log-magnitude HRTFs with the lowest
log-spectral distortion (LSD) and does so with great stability.
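
For reference, LSD is commonly defined as the root-mean-square difference, in dB, between the measured and estimated magnitude responses over the frequency bins. The sketch below assumes that common definition for a single direction and subject. The function name and the plain-list inputs are illustrative assumptions, and the averaging over directions and subjects used in the experiments is not shown.

```python
import math


def log_spectral_distortion(h_true, h_pred):
    """Return the LSD in dB between two magnitude responses.

    Assumed definition: sqrt(mean_k (20 * log10(|H(k)| / |H_hat(k)|))^2),
    where k indexes frequency bins. Inputs are linear (not log) magnitudes
    and must be strictly positive.
    """
    if len(h_true) != len(h_pred) or len(h_true) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(a <= 0 or b <= 0 for a, b in zip(h_true, h_pred)):
        raise ValueError("magnitudes must be strictly positive")

    # Squared per-bin difference of the two log-magnitude spectra, in dB^2.
    squared_db_errors = [
        (20.0 * math.log10(a / b)) ** 2 for a, b in zip(h_true, h_pred)
    ]
    # Root of the mean over frequency bins.
    return math.sqrt(sum(squared_db_errors) / len(squared_db_errors))
```

Under this definition, identical spectra give an LSD of 0 dB, and an estimate that is off by a constant factor of 10 in every bin gives 20 dB.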