Spam Review Detection: A Systematic Literature Review

In this era of technology, people rely on reviews posted online before buying a product. These reviews matter both to consumers and to those who invest money in products, since both use this information for decision making. This has encouraged spammers to generate spam or fake reviews in order to promote their own products and beat competitors. Spammers have developed many systems that generate spam reviews in bulk within hours. Many techniques and strategies have been designed and recommended to resolve the issue of spam reviews. In this paper, a complete review of existing techniques and strategies for detecting spam reviews is presented. Apart from reviewing the state-of-the-art research studies on spam review detection, a taxonomy of machine learning techniques for spam review detection is proposed. Moreover, the paper focuses on research gaps and future recommendations for spam review identification.

on examples of related input-output pair. Supervised learning techniques that have been used for spam review detection so far are: Rule based classification [5,10], Unified model [2], Logistic Regression [4,11,12], K-nearest neighbor (KNN) [4], Random Forest [4][13][14][15], Decision Trees [16,17], Gradient Descent [4,10], Genetic Algorithm [18], Conceptual Model [19], Time Series [20], Neural Network [21], Deep Neural Network [22], Multinomial Naïve Bayes [9,11,13], N-Gram [13], Hybrid Learning Approach (Active and supervised learning) [23], RNN, CNN [24], and Multilayer Perceptron Model (MLP) [4,24]. Unsupervised learning is a category of machine learning that works on unlabeled datasets. Many unsupervised learning techniques have been used in spam detection, namely Natural Language Processing [6,9][58], Markov Network [25], Neural Auto-encoder Decision Forest [16], and PU Learning [26]. Other than these supervised and unsupervised learning techniques, many other techniques have been used for spam detection, such as Fuzzy Logic [27], Heterogeneous Information Network [28], Hadoop [29], Text Mining [30], Sentiment Analysis [31][32][33][34][35], Cuckoo Search [36][57], Adaptive Binary Flower Pollination [37], and Map Reduce [29].
Spam Review Detection has been the most active area of research in past years that covers all broad. In [2] a classifier has been built based on logistic regression with content characteristics, feedback features, and rating features to identify fake reviews. Earlier studies proposed labeling datasets in two categories: duplicate reviews as spam reviews and the rest of the reviews as legitimate reviews. However, Jindal and Liu identified that many spam reviews were written in a way that looks authentic. Hence, they determined that using the duplication feature to differentiate legitimate reviews from spam reviews is not suited for creating labeled datasets [16].
Hernández F. et al. [26] presented PU Learning, which builds a binary classifier. In PU Learning, two sets are used for training: a set of positive instances (P) and a set of both negative and positive instances without labels (U). The PU Learning technique shows improved results compared to other techniques. Heydari, A. et al. introduce a system for detecting spam reviews using time series. They investigate fake reviews posted at suspicious time intervals. Moreover, they employ rating behaviors, context similarity, and the activeness of people in each time interval to differentiate between spam reviews and legitimate reviews [20].
Luyang, B, W, T., et al. [21] use a Sentence Convolutional Neural Network (SCNN) and a Sentence Weighted Convolutional Neural Network (SWNN) to detect spam reviews. SCNN and SWNN were designed by modifying a document-representation learning model. The time complexity of the SCNN and SWNN models is O(n·d²). The SWNN model gives an accuracy of 86.1% as compared to the basic convolutional neural network. Shreyas Aiyar, N. S. et al. [13] proposed spam review detection using N-grams. Their model improves the accuracy of classification, whereas we have also identified the research gaps in implementing techniques used for spam review detection.
Nidhi A. Patel et al. [33] present the techniques and datasets used for spam review detection. Moreover, they discuss the limitations of datasets, such as the limited number of features and unlabeled datasets. In addition to these, we propose a taxonomy that classifies the existing techniques and approaches so that the most appropriate approach can be identified.
SP. Rajamohana et al. [8] discuss the accuracy of adapted techniques using evaluation metrics, whereas we also discuss the open issues and challenges in the domain of spam review detection.
In this paper, a systematic mapping process has been implemented. This mapping process made it possible to summarize the techniques used for detecting spam reviews. The systematic mapping study focuses on analyzing, classifying, and summarizing the context of research on spam review detection. Moreover, this paper also proposes a taxonomy for spam review detection. The remainder of the paper is organized as follows: Section II describes the research methodology, focusing on research questions, objectives, search strategy, screening of relevant papers, inclusion/exclusion criteria, sources, data extraction, and classification scheme; Section III defines and designs tables of the findings and results obtained from Section II; Section IV discusses the assessments of the research questions; Sections V and VI present the discussion and conclusion.

II. RESEARCH METHODOLOGY
The "Systematic Literature Review" research methodology was selected for this study. The objective of the systematic mapping is to provide an overview of the work that has already been done on detecting spam reviews. It establishes the research evidence if it exists. To complete this study, we used the systematic literature review process explained by Petersen [38]. To write the systematic literature review, we followed the guidelines described by Kitchenham and Charters [39]. The main objective of this study is to propose a taxonomy and explore existing research on detecting spam reviews. The process followed for systematic mapping is shown in Fig. 2.

A. RESEARCH OBJECTIVES
This research consists of the following objectives.
RO1: Characterize and categorize existing techniques in the domain of spam review detection.
RO2: Propose a taxonomy that shows the adopted techniques and approaches used for detecting spam reviews.
RO3: Identify the challenges and research gaps.
RO4: More focused research has been done in the domain of spam review detection.

B. IDENTIFYING/DEFINING RESEARCH QUESTIONS
A basic and important step of a systematic literature review is identifying and defining the research questions.
• The study focuses on gaps in the approaches that were used to address the issue of "spam review detection".
• The research contains detailed information on "spam review detection" techniques.
• The study presents relative information and analysis.
After a detailed study of the literature, the three most important research questions related to spam review detection were formulated; they are shown in Table I.

R-Q3: What information and features have been found in datasets of reviews?
This research question has helped us to identify the datasets and their features. These features help to select the method for detecting spam reviews.

C. CONDUCTING SEARCH
Conducting the search is the second stage of the SLR. In this stage, all papers relevant to the research topic are searched for. The methods defined by the research protocol were used to undertake a specific search of the literature. A search string was defined to collect all papers related to the research topic from scientific databases. The terms and phrases used in the search string were selected after initial searches in which all possible keywords were tested. The goal of this systematic mapping process is to search for and map papers that relate to the technical aspects of spam reviews. The search string used to collect all the related papers is described in Table II. After designing and testing the search string, we selected authentic scientific databases for the searches. To conduct the systematic literature review, we considered peer-reviewed, high-quality papers related to the research topic that were published in workshops, books, conferences, journals, and symposia. Five scientific databases were used for paper retrieval: Science Direct, IEEE Xplore, Springer Link, ACM Digital Library, and Elsevier.

1) SEARCH STRATEGY
In this step, relevant studies were identified for review. The articles were retrieved from five scientific databases: ACM Digital Library, Springer Link, Science Direct, IEEE Xplore, and Elsevier. Google Scholar was also used as an additional source in order to access gray literature in this field, such as white papers and technical reports. The search string was generally defined using Equation 1, where KP represents primary keywords, KS represents secondary keywords, and KA represents additional keywords.
The following search string was used to perform an automatic search in the scientific databases. Fig. 2 illustrates how the search string works.
"Spam" AND ("Fake" OR "Junk") AND ("Review" OR "Opinion") AND ("Identification" OR "Detection" OR "Analytics") AND ("Predictive" OR "Descriptive")
The search string was checked and modified for each scientific database.
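The way the quoted search string is assembled can be sketched in a few lines of Python. This is only an illustration: the function name `build_search_string` and the grouping of the keywords into lists are assumptions made here, and each database may need its own syntax adjustments, as noted above.

```python
def build_search_string(keyword_groups):
    """Join synonyms with OR inside a group and join the groups with AND."""
    clauses = []
    for group in keyword_groups:
        quoted = [f'"{term}"' for term in group]
        if len(quoted) == 1:
            clauses.append(quoted[0])
        else:
            clauses.append("(" + " OR ".join(quoted) + ")")
    return " AND ".join(clauses)


# Keyword groups copied from the search string quoted in the text.
KEYWORD_GROUPS = [
    ["Spam"],
    ["Fake", "Junk"],
    ["Review", "Opinion"],
    ["Identification", "Detection", "Analytics"],
    ["Predictive", "Descriptive"],
]

search_string = build_search_string(KEYWORD_GROUPS)
print(search_string)
```

Keeping the keywords in groups makes it straightforward to add or drop a synonym and regenerate the string for a particular database.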

D. INCLUSION/EXCLUSION CRITERIA
Inclusion criteria refer to the characteristics that a study must have to be considered, and exclusion criteria refer to the characteristics that disqualify a study from being included. Fig. 3 shows the inclusion and exclusion criteria for this systematic mapping study.
Inclusion Criteria
i. Research articles should be peer-reviewed.
ii. Research articles should be related to the search string.
iii. Studies that present spam detection frameworks.
iv. Studies that present techniques used for spam detection.
v. Studies that present case studies related to spam detection.
vi. All the research articles published from 2007 to 2019.
vii. Published literature as chapters of books, books, and technical/non-technical reports.
Exclusion Criteria
i. Title of research articles that were irrelevant or not related to spam detection.
ii. Research articles that were not full research papers, for example tutorials, conference abstracts, essays, thesis or presentation.
iii. Research articles that were not in the English language.
iv. Research articles that were without abstract.
v. Research articles that did not present solutions explicitly.
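As a rough illustration, some of the criteria listed above could be applied automatically to paper metadata before manual screening. The sketch below is an assumption-laden example, not the procedure used in this study: the `Paper` record, its field names, and the title keywords used as a relevance check are all hypothetical, and criteria that need human judgment (for example, whether a solution is presented explicitly) are left out.

```python
from dataclasses import dataclass


@dataclass
class Paper:
    # Hypothetical metadata record; the field names are illustrative only.
    title: str
    year: int
    peer_reviewed: bool
    language: str
    has_abstract: bool
    full_paper: bool


def is_included(paper, first_year=2007, last_year=2019):
    """Apply the metadata-level inclusion/exclusion criteria to one paper."""
    # Assumed relevance check: the title mentions a spam-related term.
    topic_terms = ("spam", "fake")
    on_topic = any(term in paper.title.lower() for term in topic_terms)
    return (
        paper.peer_reviewed
        and on_topic
        and first_year <= paper.year <= last_year
        and paper.language == "English"
        and paper.has_abstract
        and paper.full_paper
    )


candidates = [
    Paper("Detecting fake hotel reviews", 2016, True, "English", True, True),
    Paper("A tutorial on spam filtering", 2015, True, "English", True, False),
    Paper("Pricing strategies in e-commerce", 2018, True, "English", True, True),
]
selected = [p.title for p in candidates if is_included(p)]
print(selected)
```

A filter of this kind only narrows the candidate set; the remaining criteria still require reading the paper.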

D. SEARCH AND SELECTION RESULTS
The results of the search and selection of research papers are shown in Fig. 5. Initially, 6099 papers were retrieved when the designed search string was applied to the five selected scientific databases. The search was performed in three phases: primary search, secondary search, and snowballing. In the first round, papers were included or excluded on the basis of their titles. Examining the titles of all retrieved papers resulted in the selection of 154 papers. A large number of papers were excluded because they were not relevant to the topic. For example, most of the excluded papers discuss the business perspective of spam review detection; hence, they were not part of our study. Moreover, we also studied papers relevant to spam detection other than spam reviews, e.g., spam emails. This review covers the period from 2012 to 2019. The primary search used scientific databases, conference proceedings, journal papers, books, and review papers. In the second round of inclusion or exclusion, the primary search results were assessed by considering the titles, abstracts, and methodology, and in the final round snowballing was used. As a result, 63 research papers were selected for study. The criteria for selecting research papers depended on the defined research questions. The nature of an SLR relies on the determination of significant works. This SLR utilized the criteria defined in Fig

E. KEYWORDING USING ABSTRACT
In this stage of the systematic literature review, after selecting the relevant papers, keywording was done on the basis of the abstracts. We used the keywording process defined by Petersen [38]. This process was completed in two phases. In the first phase, abstracts were examined to identify concepts that reflect the contribution of the studies. In the second phase, a higher-level understanding was developed on the basis of the keywords. These keywords were used to cluster categories.

F. QUALITY ASSESSMENT CRITERIA
In SLR, quality assessment criteria are defined to assess the quality of selected papers after screening. A set of questionnaires has been designed for scoring and measuring the quality of papers selected for studies.

G. DATA EXTRACTION AND CLASSIFICATION SCHEME
The data extraction method was implemented to obtain possible answers to the research questions defined in Table I. Table VI was designed to extract the data required to address the research questions defined in this systematic literature review. Data extraction items DE1 to DE7 collect basic information about the papers: the title of the research paper, the author's name, the author's country, the publication type, the source of the publication, and the place where the publication was held. All other data, from DE8 to DE11, were extracted after studying the paper.

1) PRIMARY SEARCH RESULTS
Primary search results were obtained from the five scientific databases: ACM Digital Library, Elsevier, IEEE Xplore, Science Direct, and Springer. Table IV presents the primary search results obtained after using the search string on the scientific databases. The papers selected for the primary study were extracted using the data extraction sheet. First, an Excel sheet was maintained, which consists of all the metadata of the research papers, books, or articles obtained from the primary search results. Then, the scheme defined under the heading "Data Extraction and Classification Scheme" was used for a secondary search. In the secondary search, we selected by title all the papers that were best suited to our topic. In the next round, we selected papers by reading the abstract and conclusion of the papers obtained from the secondary search. We selected 90 papers after this abstract- and conclusion-based search. We then performed a full-text-based search on those 90 papers and selected 63 final papers, which were used to write this systematic literature review. Fig. 6 shows the number of final publications by year of publication.
The classification criteria were divided into categories established with the help of the selected primary studies. Information for RQ1 is further categorized into the year in which the studies were produced, the publication channels of the selected primary studies, and the citation ratio. Google Scholar was used to obtain the citation count because it is an important feature that reflects the quality of the selected research study. All of this information was collected directly from the respective study sources. RQ2 includes the categories and techniques used for detecting spam reviews. RQ3 includes the information and features of publicly available datasets.
From Table VIII it can be seen that most of the selected research papers were published between 2015 and 2019. Fig. 6 shows the distribution of these publications by year.
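The screening rounds reported in this section (6099 retrieved papers, 154 after title screening, 90 after abstract and conclusion screening, and 63 after full-text screening) can be summarized as a simple funnel. The short sketch below only recomputes retention percentages from those reported counts; the stage labels and the `retention` helper are names chosen here for illustration.

```python
# Paper counts reported in the text for each screening round.
STAGES = [
    ("retrieved by search string", 6099),
    ("after title screening", 154),
    ("after abstract and conclusion screening", 90),
    ("after full-text screening", 63),
]


def retention(stages):
    """Return (stage, count, percent kept relative to the previous stage)."""
    rows = []
    for (_, previous), (name, count) in zip(stages, stages[1:]):
        rows.append((name, count, round(100 * count / previous, 1)))
    return rows


funnel = retention(STAGES)
for name, count, percent in funnel:
    print(f"{name}: {count} papers ({percent}% of previous stage)")
```

The title screening is by far the most selective step, which matches the observation that most retrieved papers were not relevant to the topic.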

Research Goals
The research goals and achievement that is defined in papers E11

Discussion and Conclusion
Major findings of the research or study.

III. QUALITY ASSURANCE
When performing a Systematic Literature Review, quality assurance is the most influential part of selecting high-quality literature, so that accurate and reliable analysis can be produced. Selecting the inclusion/exclusion criteria and well-defined keywords are the most important tasks in the planning phase of a Systematic Literature Review. To accomplish this, the following criteria were defined in order to validate the quality of research studies. In the final set, each publication was assessed for its quality. The quality assessment was performed during the data extraction phase and ensures that the remaining included studies make a valuable contribution to the SLR. To fulfill this task, questionnaires were designed. The questions were written in a way inspired by [4]. These criteria are labeled a, b, c, d, and e. The scores of the quality criteria favor journals over conferences, workshops, and symposia because it is more difficult to publish papers in journals ranked Q1 and Q2 than in a conference, symposium, or workshop. The final score of each paper is calculated by taking the sum of the scores for all five questions. Table VIII depicts the evaluation results based on the criteria defined in Table III.
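The scoring rule described here, a sum over the five criteria a to e, can be written down directly. In the sketch below the per-criterion values are invented for illustration, since the actual scoring scale is given in Table III and is not reproduced in this text; the function name `quality_score` is likewise an assumption.

```python
CRITERIA = ("a", "b", "c", "d", "e")


def quality_score(answers):
    """Sum the scores given to one paper for the five quality criteria."""
    missing = [c for c in CRITERIA if c not in answers]
    if missing:
        raise ValueError(f"missing scores for criteria: {missing}")
    return sum(answers[c] for c in CRITERIA)


# Hypothetical scores for one paper; the real scale is defined in Table III.
example_answers = {"a": 1, "b": 0.5, "c": 1, "d": 0, "e": 2}
example_score = quality_score(example_answers)
print(example_score)
```

Raising an error on a missing criterion avoids silently under-scoring a paper whose assessment is incomplete.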

IV. ASSESSMENTS OF RESEARCH QUESTION
In this section, detailed answers to each research question are given, based on analysis of the 63 selected primary research papers. The detailed discussion of all three research questions follows from analyzing and studying the selected research papers.

A. RQ1 Assessment: Which approaches have been utilized to distinguish spam contents or reviews?
Many techniques have been used to identify spam reviews; most of them are based on machine learning and the rest use semantic analysis. These techniques were mainly classified into Unsupervised Learning (30%), Supervised Learning (20%), Active Learning (17%), Hybrid Learning (7%), Semantic Analysis (15%), and Data Mining (11%). The classification of these techniques is shown in Fig. 7. Each of these techniques has its own domain and limitations. Table IX details the limitations, techniques, and datasets used for detecting spam reviews.

1) INACCESSIBILITY OF THE LABELED DATASETS
The availability of labeled datasets is the main issue in the area of "spam review detection". Only one publicly available dataset, of hotel reviews, was found. However, this dataset has a limited number of features. Researchers typically need labeled datasets in order to design classifiers that identify reviews as spam or legitimate.

2) GROWTH IN DATASETS:
A large number of reviews exist on review-based websites, for example amazon.com, alibaba.com, etc. The number of reviews and reviewers is growing continuously and rapidly. Datasets of this size require high computation power.

3) LIMITATIONS OF DATA ATTRIBUTES
Review datasets that are publicly available have limited features. The lack of dataset attributes limits researchers' ability to detect spam reviews precisely and accurately.

4) DETECTING MULTILINGUAL SPAM REVIEW
A review is user-created content, and users can compose a review in any language of their choice. So far, few researchers have worked on datasets in languages other than English, for example Arabic, Chinese, or Malay. There is a need for in-depth research on the detection of spam in multilingual reviews.

C. RQ3 Assessment: What information and features have been found in datasets of reviews?
This section discusses publicly available datasets and their features used for spam detection. Extracting features from data is called feature extraction. Numerous studies have used various feature extraction methods to extract the most common features or words in reviews.
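As a minimal illustration of feature extraction from review text, the sketch below counts word unigrams and bigrams, in the spirit of the N-gram features mentioned earlier. It is not the method of any particular surveyed study; the tokenization rule and the function names are assumptions made for this example.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into simple word tokens."""
    return re.findall(r"[a-z0-9']+", text.lower())


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as strings."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def extract_features(review, n_values=(1, 2)):
    """Count unigram and bigram occurrences in one review."""
    tokens = tokenize(review)
    counts = Counter()
    for n in n_values:
        counts.update(ngrams(tokens, n))
    return counts


features = extract_features("Great hotel, great staff!")
print(features.most_common(3))
```

Counts like these can be fed to a classifier as a sparse feature vector, one dimension per distinct n-gram.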

1) REVIEW DATASETS:
The availability of a dataset is the starting point of any "spam review detection" research. The key issue in "spam review detection" is the availability of labeled datasets. Existing research shows that only one labeled review dataset is available; however, it contains only review content and offers few other features. "Amazon Mechanical Turk (AMT)" is also used to produce labeled datasets through online workers (called Turkers). According to Mukherjee et al., this way of labeling has not improved the accuracy of "spam review detection" on real datasets. Table X lists the review datasets used by various researchers and shows the total number of reviews, reviewers, and items for each dataset. Not all review datasets are openly available, and researchers commonly use crawlers to gather the required data. Most researchers used datasets from the e-commerce website Amazon.com in their work, as it is the largest e-commerce platform hosting product reviews; the second-largest review datasets come from booking.com, an online hotel booking site, and yelp.com. Researchers working on "spam review detection" rely on the datasets provided by such sites. Existing research shows that few real labeled datasets are available. Thus, there is a need for freely available, labeled, standard review datasets that researchers can use for analyzing and studying the results of various "spam review detection" systems.

V. DISCUSSION
In this section, a detailed discussion of the techniques used for spam review detection is presented. A taxonomy is proposed to sum up the findings and results of the research.

A. PROPOSED TAXONOMY:
Fig. 9 presents the proposed taxonomy, based on analysis of the selected studies. To the best of our knowledge, this work results in a new taxonomy that may help researchers categorize the techniques used for problems related to spam review detection.

Spam Review Detection
Spam reviews can be detected using the following approaches:

Machine learning
Various studies have been done on machine learning techniques. Most of these techniques have been implemented in the area of spam detection, especially for detecting spam emails. Two types of machine learning approaches have been used for spam detection, and these approaches are classified into different techniques. The following techniques have been used for spam detection.

Supervised Learning
In supervised learning, the machine is trained for prediction using labeled data. In other words, some of the data is tagged with the correct answers. The following supervised learning techniques have been used for spam detection.
• Decision Tree Classifiers: A systematic classification approach that builds a classification model from the input dataset. It is used especially for problems that are complex to classify.
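To make the idea of a decision tree classifier concrete, the sketch below trains a one-level tree (a decision stump) by choosing the feature and threshold that minimize the weighted Gini impurity. It is a simplified illustration under stated assumptions: the two numeric features (a rating deviation and a duplicate-text ratio) and the toy data are hypothetical, and a full decision tree would apply the same split search recursively.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def best_split(rows, labels):
    """Find the (feature index, threshold) with the lowest weighted impurity."""
    best, best_impurity = None, float("inf")
    for f in range(len(rows[0])):
        for threshold in sorted({row[f] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[f] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[f] > threshold]
            if not left or not right:
                continue
            impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if impurity < best_impurity:
                best, best_impurity = (f, threshold), impurity
    return best


def majority(labels):
    """Most frequent 0/1 label (ties go to 1)."""
    return int(2 * sum(labels) >= len(labels))


def train_stump(rows, labels):
    """Train a one-level decision tree: (feature, threshold, left, right)."""
    split = best_split(rows, labels)
    if split is None:  # no useful split: always predict the majority label
        m = majority(labels)
        return (0, float("inf"), m, m)
    f, t = split
    left = [y for row, y in zip(rows, labels) if row[f] <= t]
    right = [y for row, y in zip(rows, labels) if row[f] > t]
    return (f, t, majority(left), majority(right))


def predict(stump, row):
    f, t, left_label, right_label = stump
    return left_label if row[f] <= t else right_label


# Hypothetical features: [rating deviation, duplicate-text ratio]; 1 = spam.
X = [[0.2, 0.0], [0.5, 0.1], [3.0, 0.9], [2.5, 0.8]]
y = [0, 0, 1, 1]
stump = train_stump(X, y)
print(stump, predict(stump, [2.8, 0.7]))
```

The learned rule is directly readable as "if feature f is at most t, predict one label, otherwise the other", which is part of the appeal of tree-based classifiers.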

Unsupervised Learning
In unsupervised learning, the machine is trained for prediction using unlabeled data. The following unsupervised learning techniques have been used to detect spam.
• K-means Clustering: A simple unsupervised learning algorithm that solves clustering problems. The dataset is partitioned into a number of clusters, denoted by the letter "k".
• Twice Clustering Technique: Another unsupervised learning technique that has been used for spam detection.
• Neural Networks: A neural network can be defined as a computing system that consists of interconnected nodes, known as neurons. Information is processed by these nodes, which are organized in layers. The following neural network techniques have been used for spam detection.
o Auto Encoders: An unsupervised learning technique used to encode or compress the data and reconstruct it from the "reduced encoded representation".
o Multilayer Perceptron Model: A computational graph made up of layers. The first layer is the input layer, which contains the input features. The second layer is the hidden layer; there may be one or more hidden layers. The third layer is the output layer, which contains the prediction.
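The k-means procedure mentioned in the list above can be sketched in plain Python as follows. This is a minimal illustration, not an implementation from any surveyed study: the initialization by random sampling, the fixed iteration cap, and the toy two-dimensional points are all assumptions made for the example.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)), key=lambda i: sq_dist(point, centroids[i]))


def kmeans(points, k, iterations=20, seed=0):
    """Cluster points into k groups; returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # assumed initialization: k random points
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(p, centroids)].append(p)
        updated = []
        for i, members in enumerate(clusters):
            if members:
                # New centroid is the coordinate-wise mean of the members.
                updated.append(tuple(sum(c) / len(members) for c in zip(*members)))
            else:
                updated.append(centroids[i])  # keep the centroid of an empty cluster
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest(p, centroids) for p in points]
    return centroids, labels


# Toy data: two well-separated groups of 2-D points.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
centroids, labels = kmeans(points, k=2)
print(labels)
```

No labels are used at any point; the grouping emerges only from distances between the feature vectors, which is what distinguishes this from the supervised techniques above.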

Lexicon based learning
It is also used to extract sentiments from the text or document and to make predictions using sentiment analysis. The following lexicon-based learning approaches have been used to detect spam reviews.
o Dictionary based methods
o Corpus based methods
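A dictionary-based method can be illustrated with a very small sentiment lexicon. The sketch below is an assumption-heavy example: the word lists are invented for illustration and are far smaller than any real lexicon, and the resulting polarity score is only one possible feature that a spam review detector might use.

```python
import re

# Tiny illustrative lexicon; real dictionaries are much larger.
POSITIVE = {"good", "great", "excellent", "clean", "friendly"}
NEGATIVE = {"bad", "poor", "terrible", "dirty", "rude"}


def sentiment_score(text):
    """Return a polarity in [-1, 1]: (positive - negative) / matched words."""
    tokens = re.findall(r"[a-z']+", text.lower())
    pos = sum(token in POSITIVE for token in tokens)
    neg = sum(token in NEGATIVE for token in tokens)
    matched = pos + neg
    return 0.0 if matched == 0 else (pos - neg) / matched


review = "Great room, friendly staff, but terrible breakfast"
score = sentiment_score(review)
print(round(score, 2))
```

Corpus-based methods differ in that the word polarities are learned from a collection of texts rather than taken from a fixed dictionary.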

B. OPEN ISSUES AND CHALLENGES
Some open issues and challenges have been identified from the literature, which are given below:
• The main open issue and challenge is the unavailability of labeled datasets. There is only one labeled dataset of hotel reviews, and it has limited features or attributes.
• All the datasets which are publicly available have a limited number of attributes. This results in a lack of accuracy, as more attributes are required to improve the accuracy of the implemented models or algorithms.
• With the passage of time, review datasets are growing rapidly, which in turn requires higher computing power. Sentiment analysis will become more challenging in the domain of spam review detection as datasets grow.
• The feedback on a reviewer's review is not evaluated for spam detection. For example, some websites ask "Did you find this review helpful?" Such responses from other users are not considered for spam review detection.

VI. CONCLUSION
This study presented a systematic literature review of the spam review detection area and highlighted recent research contributions such as various feature engineering approaches, spam review detection techniques, and the measures used for quality assessment. To extract valid evidence, this work uses a review technique that concentrates on the search string, raises research questions, selects papers from scientific databases, and applies the inclusion/exclusion criteria. A total of 6099 papers published from 2012 to December were retrieved based on the search string, and after applying the title-based search, 154 papers were selected. Finally, by reading the full text and snowballing, 63 research papers were selected for further study. Moreover, quality assessment criteria were defined to determine the relevance and validity of the selected publications to the research area.