Dilated Convolutional Model for Melody Extraction
Preprint posted on 03.02.2022, 04:51 by Xian Wang, Lingqiao Liu, Javen Shi
Melody extraction is a challenging task in music information retrieval that enables many downstream applications. In this paper we propose a simple dilated convolutional model for melody extraction. It takes variable-Q transforms as inputs. It first uses consecutive convolutional layers to capture local time-frequency patterns. Afterward, it relies on only a single layer of dilated convolution to capture the global frequency patterns formed by the pitches and harmonics of active notes. The model is effective: it achieves state-of-the-art performance on most datasets, for both general and vocal melody extraction. In addition, it achieves the best performance with the least training data.
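To illustrate why a single dilated convolution along the frequency axis can pick up harmonic structure, here is a minimal pure-Python sketch. It is not the authors' model. The resolution of 12 bins per octave, the 3-tap kernel of ones, the dilation of one octave, and the helper names `harmonic_offset`, `dilated_conv1d`, and `receptive_field` are all assumptions chosen for illustration; the abstract does not state the actual settings. The sketch relies on one known property of log-frequency representations such as the variable-Q transform: the h-th harmonic of a note sits at a fixed bin offset above its fundamental, regardless of the pitch.

```python
import math


def harmonic_offset(h, bins_per_octave):
    """Bin offset of the h-th harmonic above the fundamental on a log-frequency axis."""
    return round(bins_per_octave * math.log2(h))


def dilated_conv1d(x, kernel, dilation):
    """Compute y[i] = sum_k kernel[k] * x[i + k * dilation], with zeros past the end of x."""
    n = len(x)
    y = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = i + k * dilation
            if j < n:
                acc += w * x[j]
        y.append(acc)
    return y


def receptive_field(kernel_size, dilation):
    """Number of input bins spanned by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


# Illustrative settings (assumptions, not taken from the paper).
BINS_PER_OCTAVE = 12
N_BINS = 60
FUNDAMENTAL_BIN = 10

# One synthetic spectral frame: a note with its first four harmonics.
frame = [0.0] * N_BINS
for h in (1, 2, 3, 4):
    frame[FUNDAMENTAL_BIN + harmonic_offset(h, BINS_PER_OCTAVE)] = 1.0

# A 3-tap kernel dilated by one octave sums bins i, i + 12, and i + 24,
# which are the positions of harmonics 1, 2, and 4 of a note at bin i.
response = dilated_conv1d(frame, kernel=[1.0, 1.0, 1.0], dilation=BINS_PER_OCTAVE)
peak_bin = max(range(N_BINS), key=response.__getitem__)

print(peak_bin, response[peak_bin], receptive_field(3, BINS_PER_OCTAVE))
```

In this toy frame the response peaks at the fundamental bin, because only there do all three kernel taps land on energy. With a dilation of one octave, three taps already span 25 bins, which is how one dilated layer can cover a frequency range that would otherwise need a deep stack of ordinary convolutions. A trained model would learn the kernel weights rather than use fixed ones.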