Reviewer Recommendations Using Document Vector Embeddings and a
Publisher Database: Implementation and Evaluation
Abstract
We develop and evaluate an automated data-driven framework for providing
reviewer recommendations for submitted manuscripts. Given inputs
comprising a set of manuscripts for review and a listing of a pool of
prospective reviewers, our system uses a publisher database to extract
papers authored by the reviewers, from which a Paragraph Vector (doc2vec)
neural network model is learned and used to obtain vector space
embeddings of documents. Similarities between embeddings of an
individual reviewer’s papers and a manuscript are then used to compute
manuscript-reviewer match scores and to generate a ranked list of
recommended reviewers for each manuscript. Our mainline proposed system
uses full-text versions of the reviewers’ papers, which we demonstrate
performs significantly better than models built on abstracts alone, the
predominant paradigm in prior work. Direct
retrieval of reviewers’ papers from a publisher database reduces
reviewer burden, ensures up-to-date data, and eliminates the potential
for misuse through data manipulation. We also propose a useful
evaluation methodology that addresses hyperparameter selection and
enables indirect comparisons with alternative approaches and on prior
datasets. Finally, this work contributes a large-scale retrospective
reviewer matching dataset and evaluation that we hope will be useful for
further research in this field. Our system is effective in practice: for the
mainline approach, expert judges rated 38% of the recommendations as
Very Relevant; 33% as Relevant; 24% as Slightly Relevant; and only 5%
as Irrelevant.
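
As a rough illustration of the matching step summarized above, the following minimal sketch ranks reviewers for one manuscript from precomputed document embeddings. It is not the system's implementation. The abstract does not specify how per-paper similarities are aggregated into a match score, so the cosine similarity measure, the mean-of-top-k aggregation, the top_k parameter, and all function and reviewer names here are assumptions made for illustration. The toy two-dimensional vectors stand in for embeddings that would come from a trained doc2vec model.

```python
import math
from typing import Dict, List, Sequence, Tuple

Vector = Sequence[float]


def cosine(u: Vector, v: Vector) -> float:
    """Cosine similarity between two embeddings (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


def match_score(manuscript: Vector, papers: List[Vector], top_k: int = 3) -> float:
    """Hypothetical aggregation: mean of the top_k paper-to-manuscript similarities."""
    sims = sorted((cosine(manuscript, p) for p in papers), reverse=True)
    best = sims[:top_k]
    return sum(best) / len(best) if best else 0.0


def rank_reviewers(
    manuscript: Vector,
    reviewer_papers: Dict[str, List[Vector]],
    top_k: int = 3,
) -> List[Tuple[str, float]]:
    """Return (reviewer, score) pairs sorted from best to worst match."""
    scored = [
        (name, match_score(manuscript, papers, top_k))
        for name, papers in reviewer_papers.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy embeddings standing in for doc2vec output (assumed, not real data).
manuscript = [1.0, 0.0]
reviewers = {
    "reviewer_a": [[1.0, 0.0], [0.0, 1.0]],
    "reviewer_b": [[0.0, 1.0]],
}
ranking = rank_reviewers(manuscript, reviewers, top_k=1)
```

With top_k=1 the score reduces to each reviewer's single most similar paper; larger values reward reviewers with several related papers rather than one close match.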