Graph Embedding-based Matching Multilingual Out-of-Vocabulary Terms on Social Media
This study aims to detect multilingual Out-of-Vocabulary (OOV) terms and to match them across languages. Alongside the original OOV problem, many multilingual OOV terms have emerged. To address this, the paper proposes a graph embedding-based method for matching multilingual OOV terms. The method has two parts. The first part extracts OOV terms from an online corpus and interprets them. The second part takes each OOV term from the first part as a target node and its interpretation as the feature nodes of that target node, constructs a graph from these nodes, and embeds the graph. The experiments use Chinese, Korean, and Japanese. The proposed method reaches an F1-score of 93.94%. Compared with other embedding algorithms, its F1-score is 9.62% higher than the average F1-score of those algorithms.
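The two-part structure described in the abstract can be illustrated with a small sketch. It is not the authors' implementation: the abstract does not name the embedding algorithm, so the sketch substitutes a deliberately simple stand-in (each target node is represented by its normalized feature-node incidence vector) and matches terms across languages by cosine similarity. The terms, feature words, function names, and the 0.5 threshold are all hypothetical and chosen only for illustration.

```python
import math
from collections import defaultdict

# Hypothetical toy data: (language, OOV term) -> feature words describing it.
# Real input would come from the extraction and interpretation step.
oov_features = {
    ("zh", "zh_oov_1"): {"laugh", "funny", "online"},
    ("ko", "ko_oov_1"): {"laugh", "funny", "chat"},
    ("ja", "ja_oov_1"): {"laugh", "online", "chat"},
    ("zh", "zh_oov_2"): {"food", "delicious"},
    ("ko", "ko_oov_2"): {"food", "delicious", "photo"},
}


def build_graph(features_by_term):
    """Build an undirected graph linking each target node to its feature nodes."""
    graph = defaultdict(set)
    for target, feats in features_by_term.items():
        for word in feats:
            feature_node = ("feat", word)
            graph[target].add(feature_node)
            graph[feature_node].add(target)
    return graph


def embed(graph, targets):
    """Stand-in embedding (an assumption): unit-length incidence vectors."""
    feature_nodes = sorted(n for n in graph if n[0] == "feat")
    index = {node: i for i, node in enumerate(feature_nodes)}
    vectors = {}
    for target in targets:
        vec = [0.0] * len(feature_nodes)
        for neighbor in graph[target]:
            vec[index[neighbor]] = 1.0
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors[target] = [x / norm for x in vec]
    return vectors


def match(vectors, threshold=0.5):
    """Pair target nodes from different languages whose cosine similarity
    reaches the (hypothetical) threshold."""
    pairs = []
    items = sorted(vectors)
    for i, a in enumerate(items):
        for b in items[i + 1:]:
            if a[0] == b[0]:  # same language: not a cross-lingual match
                continue
            sim = sum(x * y for x, y in zip(vectors[a], vectors[b]))
            if sim >= threshold:
                pairs.append((a, b, sim))
    return pairs


graph = build_graph(oov_features)
vectors = embed(graph, list(oov_features))
pairs = match(vectors)
for a, b, sim in pairs:
    print(a, b, round(sim, 3))
```

In this toy setting, terms that share most of their feature nodes end up with similar vectors and are paired. A learned graph embedding would replace `embed` while leaving the graph construction and matching steps in place.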
This study was supported by a grant from the National Research Foundation of Korea (NRF) funded by the Korean government (MSIP) (NRF-2020R1A2B5B01002207).
Email Address of Submitting Author: gabrielpondc@cau.ac.kr
ORCID of Submitting Author: 0000-0001-9913-7864
Submitting Author's Institution: Department of Computer Engineering, ChungAng University
Submitting Author's Country: Korea, Republic of (South Korea)