
Hybrid Model Compression for BERT-Based Polyphone Disambiguation in Mandarin Chinese
  • Wei Zhao,
  • Zuyi Wang,
  • Li Xu
Wei Zhao
Zhejiang University

Corresponding Author: [email protected]



Polyphone disambiguation has long been a crucial problem for Chinese text-to-speech (TTS) systems. Researchers have greatly improved the performance of neural disambiguation models by leveraging pre-trained BERT models. However, the heavy computational cost of BERT makes its deployment on edge devices extremely difficult. To mitigate this problem, we present a hybrid model compression strategy that combines network pruning and knowledge distillation for polyphone disambiguation. In contrast to previous methods that compress the model only through knowledge distillation, our approach can produce compact alternatives by directly pruning an existing BERT, thus substantially reducing training cost. In addition, we provide a new dataset of common Chinese polyphonic characters (CCPC) to facilitate future research. Experimental results demonstrate that our approach offers efficient disambiguation models whose performance is highly comparable to the BERT-based baseline.
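To make the two compression components concrete, the following is a minimal, generic sketch of magnitude-based weight pruning and a temperature-softened distillation loss. The abstract does not specify the paper's actual pruning criterion or loss, so this illustrates only the standard textbook forms of both techniques:

```python
import math

def prune_by_magnitude(weights, sparsity):
    """Zero out the smallest-magnitude entries of a flat weight list.

    Generic magnitude pruning: the fraction `sparsity` of weights with
    the smallest absolute values is set to zero. The paper's actual
    pruning scheme for BERT layers is not specified here.
    """
    k = int(len(weights) * sparsity)  # number of weights to zero out
    if k == 0:
        return list(weights)
    threshold = sorted(abs(w) for w in weights)[k - 1]
    return [0.0 if abs(w) <= threshold else w for w in weights]

def softmax(logits, temperature=1.0):
    """Numerically stable temperature-scaled softmax over a list of logits."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """Cross-entropy between the softened teacher distribution and the
    student distribution -- the standard knowledge-distillation objective."""
    t = softmax(teacher_logits, temperature)
    s = softmax(student_logits, temperature)
    return -sum(ti * math.log(si) for ti, si in zip(t, s))
```

In a hybrid pipeline of this kind, pruning would first shrink the existing BERT, and the distillation loss would then guide fine-tuning of the pruned model against the original model's soft predictions.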