Reviewer Recommendations Using Document Vector Embeddings and a Publisher Database: Implementation and Evaluation

Yue Zhao; Ajay Anand; Gaurav Sharma

doi:10.36227/techrxiv.14816538.v1

Abstract

We develop and evaluate an automated data-driven framework for providing reviewer recommendations for submitted manuscripts. Given inputs comprising a set of manuscripts for review and a list of prospective reviewers, our system uses a publisher database to extract papers authored by the reviewers, from which a Paragraph Vector (doc2vec) neural network model is learned and used to obtain vector space embeddings of documents. Similarities between the embeddings of an individual reviewer’s papers and a manuscript are then used to compute manuscript-reviewer match scores and to generate a ranked list of recommended reviewers for each manuscript. Our mainline system uses full-text versions of the reviewers’ papers, which we demonstrate performs significantly better than models based on abstracts alone, the predominant paradigm in prior work. Direct retrieval of reviewers’ papers from a publisher database reduces reviewer burden, ensures up-to-date data, and eliminates the potential for misuse through data manipulation. We also propose an evaluation methodology that addresses hyperparameter selection and enables indirect comparisons with alternative approaches and on prior datasets. Finally, the work contributes a large-scale retrospective reviewer matching dataset and evaluation that we hope will be useful for further research in this field. For the mainline approach, expert judges rated 38% of the recommendations as Very Relevant; 33% as Relevant; 24% as Slightly Relevant; and only 5% as Irrelevant.
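
The matching step described in the abstract can be illustrated with a minimal sketch. It assumes document embeddings have already been produced by some model (for example, a trained doc2vec model) and are available as plain lists of floats. The abstract does not say how per-paper similarities are combined into a match score, so the mean of the top-k cosine similarities used here is an illustrative assumption, not the authors' stated method. The function names, the `top_k` parameter, and the toy vectors are likewise hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine_similarity(a: Vector, b: Vector) -> float:
    """Cosine similarity between two embedding vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def match_score(manuscript: Vector, reviewer_papers: List[Vector], top_k: int = 3) -> float:
    """Aggregate per-paper similarities into one manuscript-reviewer score.

    ASSUMPTION: mean of the top_k highest similarities; the abstract does not
    specify the aggregation rule.
    """
    sims = sorted(
        (cosine_similarity(manuscript, paper) for paper in reviewer_papers),
        reverse=True,
    )
    if not sims:
        return 0.0
    best = sims[:top_k]
    return sum(best) / len(best)


def rank_reviewers(
    manuscript: Vector,
    reviewers: Dict[str, List[Vector]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return (reviewer, score) pairs sorted from best match to worst."""
    scored = [
        (name, match_score(manuscript, papers, top_k))
        for name, papers in reviewers.items()
    ]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Toy three-dimensional embeddings standing in for real document vectors.
manuscript_vec = [1.0, 0.0, 0.0]
reviewer_vecs = {
    "reviewer_a": [[0.9, 0.1, 0.0], [0.8, 0.0, 0.2]],
    "reviewer_b": [[0.0, 1.0, 0.0], [0.1, 0.9, 0.0]],
}
ranking = rank_reviewers(manuscript_vec, reviewer_vecs)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy example `reviewer_a` ranks first because both of that reviewer's paper vectors point in nearly the same direction as the manuscript vector. Averaging only the top-k similarities, rather than all of them, is one way to avoid penalizing reviewers who publish across several topics; other aggregation rules are equally plausible.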

Published in 2022 in IEEE Access, volume 10, pages 21798-21811. doi:10.1109/ACCESS.2022.3151640

Peer review status: Published