Data-Centric Content Classification of Smart City Residential Services
Residential services in smart cities accumulate massive volumes of real-time inquiry data, written in natural language, that describe the services residents need. These inquiry requests cover diverse topics and vary considerably in length. Moreover, the departments that may handle an inquiry span a large number of organizations, from metropolitan administration to local communities. The accumulated request data is therefore central to the task of accurately dispatching requests to the responsible departments. The challenge is to devise a data-centric approach that fits the data to state-of-the-art (SOTA) models and improves request classification accuracy. In this paper, we analyze four factors: embedding tokens, data segmentation, model structures, and classification methods. We devise a unified modelling process with multiple dataflows that combine these factors so that their interactions can be observed. The experimental results demonstrate the compound effects and provide insights into how SOTA models respond differently to variations in these factors. These observations allow us to fine-tune the learning task at each stage and achieve a maximum F1-score of 82.4%.
Email Address of Submitting Author
ding.li@mail.concordia.ca
ORCID of Submitting Author
0000-0001-5311-953X
Submitting Author's Institution
Concordia University
Submitting Author's Country
Canada