TechRxiv

Learning From How Human Correct For Data-Centric Deep Learning

preprint
posted on 2021-09-13, 19:51, authored by Tong Guo
In industrial NLP applications, manually labeled data contains a certain amount of noise. We present a simple method to find the noisy examples and relabel them manually, collecting the correction information as we do so. We then present a novel method for incorporating this human correction information into a deep learning model. Humans know how to correct noisy data, so the correction information can be injected into the model. We run the experiments on our own manually labeled text classification dataset, because we had already relabeled its noisy data for our industrial application. The results show that our method improves classification accuracy from 91.7% to 92.5%, an absolute gain of 0.8 percentage points. The 91.7% baseline is BERT trained on the corrected dataset, which is hard to surpass.


Email Address of Submitting Author

779222056@qq.com

Submitting Author's Institution

Never Stop Research

Submitting Author's Country

  • China
