A Comparative Study of Enhanced Machine Learning Algorithms for Brain Tumor Detection and Classification

Abstract: Advances in Artificial Intelligence (AI) and Machine Learning (ML) can help radiologists diagnose tumors without invasive measures. Magnetic resonance imaging (MRI) is a very useful method for the diagnosis of tumors in the human brain. In this paper, brain MRI images have been analyzed to detect the regions containing tumors and classify these regions into three different tumor categories: meningioma, glioma, and pituitary. This paper presents the implementation and comparison of various enhanced ML algorithms for the detection and classification of brain tumors. A brain tumor is a growth of abnormal cells in the human brain. Brain tumors can be cancerous or non-cancerous, and cancerous (malignant) brain tumors can be life-threatening. Hence, detection and classification of brain tumors at an early stage is extremely important. In this paper, enhanced ML algorithms have been implemented to predict the presence or absence of brain tumors using binary classification and, using multi-class classification, to predict whether a patient has a brain tumor and, if so, to detect the type of tumor. The dataset used for the binary classification task comprises two types of brain MRI images: with tumor and without tumor. Nine ML algorithms, namely Support Vector Machine (SVM), Logistic Regression, K-Nearest Neighbor (KNN), Naïve Bayes (NB), Decision Tree (DT) classifier, Random Forest classifier, XGBoost classifier, Stochastic Gradient Descent (SGD) classifier and Gradient Boosting classifier, have been used to classify the MRI images. A comparative analysis of the ML algorithms has been performed based on performance metrics such as accuracy, recall, precision, F1-score, AUC-ROC and AUC-PR. The Gradient Boosting classifier has outperformed all the other algorithms with an accuracy of 92.4%, recall of 94.4%, precision of 85%, F1-score of 89.5%, AUC-ROC of 97.2% and AUC-PR of 91.4%.
To address the multi-class classification problem, four ML algorithms, namely SVM, KNN, Random Forest classifier and XGBoost classifier, have been employed. In this case, the dataset consists of four types of brain MRI images: glioma tumor, meningioma tumor, pituitary tumor and no tumor. The performances of the ML algorithms have been compared based on accuracy, recall, precision and F1-score. The XGBoost classifier has surpassed all the other algorithms on each of these metrics, producing an accuracy, precision, recall and F1-score of 90% each.


I. INTRODUCTION
The applications of Artificial Intelligence (AI) and Machine Learning (ML) have been expanding at an exponential rate [1], [2], [3], [4], [5], [6], [7], [8], [9], [10]. There has been substantial growth in the applications of AI and ML techniques in various radiological imaging tasks [11], [12]. AI and ML have been playing a major role in uncovering hidden insights into clinical decision making [13], [14]. ML is used in clinical practice including digital pathology and chest, brain, cardiovascular and abdominal imaging. By combining AI and ML with the competency of medical practitioners, it is feasible to enhance productivity and potentially improve accuracy [15], [16], [17].
It is essential to promote early diagnosis of brain tumors because they are the most common cause of cancer-related deaths in children and people up to 40 years of age. Therefore, it is necessary to devise strategies to accelerate early diagnosis of brain tumors. An early diagnosis of a brain tumor implies a faster response in treatment, thereby increasing the survival rates of patients. A process designed to automatically detect, locate and classify brain tumors is desirable. AI and ML have gained prominence in almost every field of decision-making and can be successfully implemented for the detection and classification of brain tumors.
The objective of this paper is to investigate the use of ML classification algorithms to detect the presence of brain tumors and to distinguish between different types of brain tumors, such as glioma, meningioma and pituitary tumors, from brain MRI images. A computer-aided classification method can make brain tumor diagnostics more reliable. The proposed scheme involves a few steps: data collection, data pre-processing (data labeling and image pre-processing), classification based on enhanced ML techniques and, finally, a comparative analysis of the implemented models.

II. RELATED WORK
A lot of research work has been done on the application of Artificial Intelligence (AI) and Machine Learning (ML) in the field of medical imaging. Noreen et al. [18] have proposed the use of two pre-trained deep learning models, i.e., Inception-v3 and DenseNet201, for developing a multi-level feature extraction and concatenation method for the early detection of brain tumors and their classification. First, they have extracted the features from different Inception modules of the pre-trained Inception-v3 model and passed those features to a softmax classifier to perform the classification of the brain tumors. Second, they have used a pre-trained DenseNet201 to extract features from various DenseNet blocks, then concatenated those features and passed them to a softmax classifier to classify the brain tumors. The publicly available dataset that they have used comprises three classes of brain tumors. Their proposed methodology has produced exceptional results and has outperformed the existing state-of-the-art ML and Deep Learning (DL) models for brain tumor detection and classification.
In [19] Naik and Patel have used the decision tree classification algorithm for the detection and classification of brain tumor from MRI images. In the pre-processing step they have used the median filtering process and texture feature extraction technique has been used to extract the features. Their proposed model has exhibited improved efficiency in comparison to the traditional image mining methods. The results that they have obtained have been compared with the Naïve Bayesian classification algorithm. The decision tree classification algorithm has achieved a precision of 100%, Sensitivity of 93%, Specificity of 100% and Accuracy of 96%.
Tandel et al. [20] have proposed a transfer-learning-based AI paradigm using a Convolutional Neural Network (CNN) for brain tumor classification using MRI data. The transfer-learning-based CNN model has been benchmarked against six different ML classification algorithms, namely Decision Tree, Linear Discrimination, Naive Bayes, Support Vector Machine, K-nearest neighbour and Ensemble. Their proposed model has proven to be very useful in multiclass brain tumour grading and has yielded better results in comparison to the other ML models.
Sarhan [21] has presented a computer-aided detection (CAD) technique for the classification of brain tumors in MRI images. The features from the brain MRI images have been extracted by utilizing the Discrete Wavelet Transform (DWT). The extracted features have then been applied to a CNN to classify the input MRI image. His proposed approach has produced an overall accuracy of 98.5%.
Mohsen et al. [22] in their research work have proposed the development of a Deep Neural Network (DNN) classifier for the classification of brain tumors on a dataset comprising 66 brain MRI images of four classes, namely, normal, glioblastoma, sarcoma and metastatic bronchogenic carcinoma tumors. They have combined the classifier with DWT for feature extraction and principal components analysis (PCA). The DNN classifier yielded extremely good results with an average classification rate of 96.97%, average recall of 0.97, average precision of 0.97, average F-Measure of 0.97 and average area under the ROC curve (AUC) of 0.984 across all four classes.
In [23] Rehman et al. have conducted three studies using three architectures of convolutional neural networks (AlexNet, GoogLeNet, and VGGNet) to perform the classification of brain tumors such as meningioma, glioma, and pituitary. They have then explored the transfer learning techniques, i.e., fine-tune and freeze, using MRI slices of the Figshare brain tumor dataset. They have applied data augmentation techniques to the MRI images to generalize the results, increase the dataset samples and reduce the chance of over-fitting. The proposed fine-tuned VGG16 architecture has attained the highest accuracy, up to 98.69%, in terms of classification and detection.

A. Dataset-A (binary classification)
Dataset-A has been used for binary classification. It comprises 982 brain MRI images of patients with tumor and 493 images with no tumor. Thus, a total of 1475 images are present. A collection of 18 brain tumor images from Dataset-A is shown in Fig. 1. The presence of a white spot, as marked by the red circles in Fig. 1, 3, 4, and 5, is an indication of the abnormal growth of tissues in the human brain: there is an aggregation of abnormal cells in some tissues of the brain in the marked section. The tumors in these images are quite critical, and manual classification of these images is rather difficult. So, ML algorithms have been employed for the efficient detection and classification of brain tumors from these MRI images.

IV. STATE-OF-THE-ART MACHINE LEARNING ALGORITHMS
In this section, some of the state-of-the-art ML algorithms that can be used for the detection and classification of brain tumors are discussed.
Support Vector Machine (SVM) is a supervised ML classification algorithm defined by a separating hyperplane. The main purpose of the SVM is to segregate the data in the best possible way [24]. The hyperplane is therefore the frontier which best segregates the two classes.
Logistic Regression is a supervised ML algorithm which is used to predict a binary outcome based on a set of independent variables. The main aim of logistic regression is to find the best fitting model to describe the relationship between the outcome and a set of predictor variables [25].
K-Nearest Neighbor (KNN) is a supervised ML algorithm used to solve classification problems. KNN predicts the class of a given data point by calculating the distance between that point and the other points [26]. The given data point is assigned to the class whose data points are nearest to it. K in KNN refers to the number of points to be selected in the vicinity of the given data point.
Naïve Bayes (NB) is a supervised ML algorithm used mostly for binary classification. It is based on Bayes' theorem with an assumption of independence among the predictors [27]: the NB classifier assumes that the presence of a particular feature in a class is unrelated to the presence of any other feature.
Decision Tree (DT) is a supervised learning model which is used to solve classification problems. DTs learn simple decision rules inferred from the data features and use them to predict the value of a target variable [28].
Random Forest is an ensemble ML algorithm which builds multiple DTs and then merges them together [29]. Hence, it produces results which are more accurate. A single DT is prone to over-fitting, so Random Forest is used to reduce over-fitting. Random Forest can be used to solve both classification and regression problems.
Stochastic Gradient Descent (SGD) is an efficient optimization algorithm which minimizes the cost function by altering the values of the parameters or coefficients of a function [30]. The SGD classifier implements an SGD learning routine that supports various loss functions to perform classification tasks.
Extreme Gradient Boosting (XGBoost) is a member of the family of boosting algorithms. It is an efficient implementation of the Gradient Boosted Trees algorithm, which is a supervised learning method. It is an ensemble ML technique and uses the Gradient Boosting framework for prediction [31].
Boosting is an ensemble learning technique. It combines predictors with low accuracy and converts them into a model with improved accuracy [32]. In gradient boosting, each new predictor corrects the errors made by its predecessors, resulting in a strong model with high accuracy.
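The idea that each new predictor corrects the errors of its predecessors can be illustrated with a minimal sketch. It is not the implementation used in this paper: it assumes a squared-error loss, one-dimensional toy inputs and decision stumps as weak learners, and the names `fit_stump` and `gradient_boost` are hypothetical.

```python
def fit_stump(xs, residuals):
    """Fit a one-split regression stump to the residuals (least squares)."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not left or not right:
            continue  # a split must leave samples on both sides
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        error = (sum((r - left_mean) ** 2 for r in left)
                 + sum((r - right_mean) ** 2 for r in right))
        if best is None or error < best[0]:
            best = (error, threshold, left_mean, right_mean)
    _, threshold, left_mean, right_mean = best
    return lambda x: left_mean if x <= threshold else right_mean


def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Additive model: start from the mean, then repeatedly fit the residuals."""
    base = sum(ys) / len(ys)
    stumps = []
    predictions = [base] * len(xs)
    for _ in range(n_rounds):
        # For squared error, the negative gradient is simply the residual.
        residuals = [y - p for y, p in zip(ys, predictions)]
        stump = fit_stump(xs, residuals)
        stumps.append(stump)
        predictions = [p + learning_rate * stump(x)
                       for x, p in zip(xs, predictions)]
    return lambda x: base + learning_rate * sum(s(x) for s in stumps)


# Toy data (hypothetical): a single feature, label 0 for small values, 1 for large.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = gradient_boost(xs, ys)
predicted_labels = [1 if model(x) >= 0.5 else 0 for x in xs]
```

Each round shrinks the remaining residual, so the combined output of many weak stumps approaches the training labels. Library implementations of gradient boosting for classification typically use a logistic loss and deeper trees.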

V. METHODOLOGY
The proposed methodology is illustrated in Fig. 7.

A. Methodology (binary classification)
In this section, the methodology that has been used to predict whether a patient has brain tumor or not from the brain MRI images in Dataset-A has been described.

1) Data pre-processing
Data labeling: The images of brain tumor have been labeled as '1' and the images with no brain tumor as '0'.
Image pre-processing: The images have been read in grayscale (2D). To build a classifier using ML algorithms, all the images must have the same dimensions, so each image has been resized to 200×200 pixels. For instance, the original image shown in Fig. 8 has a dimension of 630×630 pixels and has been resized to 200×200 pixels. Similarly, every image in the dataset has been resized to 200×200 pixels.
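The resizing step can be sketched as follows. The paper does not state which library or interpolation method was used, so this sketch assumes nearest-neighbour sampling on a plain 2D list and assumes that the resized image is flattened into a feature vector before being passed to a classifier. The function names and the synthetic image are hypothetical.

```python
def resize_nearest(image, new_height, new_width):
    """Resize a 2D grayscale image (list of rows) by nearest-neighbour sampling."""
    old_height, old_width = len(image), len(image[0])
    return [
        [image[row * old_height // new_height][col * old_width // new_width]
         for col in range(new_width)]
        for row in range(new_height)
    ]


def flatten(image):
    """Turn a 2D image into a 1D feature vector, row by row."""
    return [pixel for row in image for pixel in row]


# Synthetic 630x630 grayscale image standing in for a real MRI slice.
original = [[(r + c) % 256 for c in range(630)] for r in range(630)]

resized = resize_nearest(original, 200, 200)
features = flatten(resized)  # 200 * 200 = 40000 values per image
```

With this representation, every image contributes a feature vector of the same length, which is what the classical ML classifiers discussed earlier require.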

2) Splitting the dataset
The entire dataset has been split into training and testing data with a test size of 25%.
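A minimal sketch of such a split is given below. It assumes a random shuffle with a fixed seed and assumes that the test-set size is rounded up, which for the 1475 images of Dataset-A gives 369 test images, in agreement with the total of the confusion-matrix counts reported for Gradient Boosting (119 + 222 + 21 + 7). The function name is hypothetical.

```python
import math
import random


def train_test_split_indices(n_samples, test_fraction=0.25, seed=0):
    """Shuffle sample indices and split them into training and test sets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded for reproducibility
    n_test = math.ceil(n_samples * test_fraction)  # assumption: round up
    return indices[n_test:], indices[:n_test]


train_idx, test_idx = train_test_split_indices(1475)
# 1106 training images and 369 test images
```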

3) ML models used
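The following ML algorithms have been implemented to perform the binary classification task: SVM, Logistic Regression, KNN, NB, DT classifier, Random Forest classifier, SGD classifier, XGBoost classifier and Gradient Boosting classifier.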

B. Methodology (multi-class classification)
The methodology for multi-class classification of the brain MRI images in Dataset-B has been described in this section.

1) Data pre-processing
Data labeling: The images with no tumor have been labeled as '0', images of glioma brain tumor as '1', images of meningioma as '2' and images of pituitary tumor as '3'.
Image pre-processing: Every image in Dataset-B has been resized to 200×200 pixels. For instance, the original image shown in Fig. 9 has a dimension of 350×350 pixels and has been resized to 200×200 pixels.

2) Splitting the dataset
75% of the dataset has been used for training and the remaining 25% has been used for testing purposes.

3) ML models used
The following ML algorithms have been implemented to perform the multi-class classification task: SVM, KNN, Random Forest classifier and XGBoost classifier.

A. Binary classification results
A comparative analysis of the nine algorithms has been done based on the following performance metrics: accuracy, recall, precision, F1-score, AUC-ROC and AUC-PR. For the evaluation of the accuracy, recall, precision and F1-score, the following four attributes have been used in the measurement: true positives, true negatives, false positives and false negatives. The attributes for determining the performance metrics for each of the nine ML algorithms are given in Table I. From the values of the attributes in Table I, it can be observed that the Gradient Boosting classifier has correctly identified 119 samples as true positives and 222 as true negatives. It has misclassified only 21 + 7 = 28 samples.
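As a worked check, the four metrics can be recomputed from these counts using their standard definitions. The sketch below assumes that, of the 28 misclassified samples, 21 are false positives and 7 are false negatives; this assignment reproduces the recall and precision reported for Gradient Boosting. The function name is hypothetical.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard binary classification metrics from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Gradient Boosting counts; the FP/FN assignment is an assumption (see above).
metrics = classification_metrics(tp=119, tn=222, fp=21, fn=7)
rounded = {name: round(value, 3) for name, value in metrics.items()}
# accuracy 0.924, precision 0.85, recall 0.944, f1 0.895
```

These values agree with the accuracy of 92.4%, precision of 85%, recall of 94.4% and F1-score of 89.5% reported for the Gradient Boosting classifier.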
To get a clear picture, the evaluation metrics of all the ML algorithms have been computed and tabulated. The comparative performance analysis of the different ML algorithms based on accuracy, precision, recall and F1-score is presented in Table II. From the test results shown in Table II, it can be inferred that the Gradient Boosting classifier has exhibited the highest accuracy among all the ML models that have been implemented. To enable a clear interpretation of the binary classification models, ROC curves and PR curves have also been plotted. This is described in the following sections.

ROC curves of the different algorithms for performance comparison
The ROC curves corresponding to SVM, Logistic Regression, KNN, NB, DT classifier, Random Forest classifier, SGD classifier, XGBoost classifier and Gradient Boosting classifier are shown in Fig. 10, 11, 12, 13, 14 and the figures that follow. The AUC-ROC scores of all the ML classification algorithms are listed in Table III to enable a comparative performance analysis. The AUC-ROC is an important evaluation metric for binary classification problems. The area under the curve is a measure of the ability of a classifier to distinguish between the two classes. An excellent model should have an AUC close to 1, which indicates that the model has a good measure of separability. The AUC-ROC score of Random Forest exceeds that of Gradient Boosting by 0.003, and the AUC-ROC score of Gradient Boosting is lower than that of XGBoost by 0.001. Because these differences are too small to conclude which classifier is the most accurate in detecting the presence or absence of a brain tumor, the AUC-PR scores of all the algorithms have also been computed. This is described in the following section.
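The interpretation of the AUC-ROC as a measure of separability can be made concrete: it equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative sample. The sketch below computes it by direct pairwise comparison on toy scores. The data and the function name are hypothetical and are not taken from the experiments in this paper.

```python
def roc_auc(labels, scores):
    """AUC-ROC as the fraction of (positive, negative) pairs ranked correctly."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Toy example: 1 = tumor, 0 = no tumor, with classifier scores.
labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
auc = roc_auc(labels, scores)  # 3 of the 4 pairs are ranked correctly
```

A classifier that ranks every positive above every negative obtains an AUC of 1, while random scoring gives about 0.5.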

PR curves for performance comparison among the ML algorithms
The PR curves corresponding to SVM, Logistic Regression, KNN, NB, DT classifier, Random Forest classifier, SGD classifier, XGBoost classifier and Gradient Boosting classifier are shown in Fig. 19, 20, 21 and the figures that follow. Similar to the ROC curve, the PR curve is used for evaluating the performance of binary classification algorithms. The PR curve is constructed by plotting the precision against the recall for a single classifier at various threshold values, and the AUC-PR is the area under this curve. The nearer the AUC-PR score is to 1, the better the performance of the classifier.
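The construction of the PR curve can be sketched by sweeping the decision threshold over the sorted scores and recording recall and precision at each step. The sketch also computes the step-wise average precision, one common summary of the area under the PR curve. It assumes distinct scores (ties are not handled), and the toy data and function names are hypothetical.

```python
def precision_recall_points(labels, scores):
    """(recall, precision) pairs obtained by lowering the threshold step by step."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(labels)
    points = []
    true_positives = 0
    for rank, (_, label) in enumerate(ranked, start=1):
        true_positives += label  # label is 1 for a positive, 0 otherwise
        points.append((true_positives / total_positives, true_positives / rank))
    return points


def average_precision(labels, scores):
    """Sum of precision values weighted by the increase in recall."""
    ap, previous_recall = 0.0, 0.0
    for recall, precision in precision_recall_points(labels, scores):
        ap += (recall - previous_recall) * precision
        previous_recall = recall
    return ap


labels = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]
ap = average_precision(labels, scores)  # 0.5 * 1 + 0.5 * (2/3)
```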
The AUC-PR scores and the weighted average precision (AP) across all the thresholds of the proposed ML algorithms are listed in Table IV. From Table IV, it can be observed that Random Forest has the highest AUC-PR score of 0.946, followed by XGBoost with an AUC-PR score of 0.940 and Gradient Boosting with an AUC-PR score of 0.914. A good classifier should maintain both high precision and high recall across the graph. Thus, it can be inferred that the performances of the Random Forest, XGBoost and Gradient Boosting classifiers have been noteworthy. SVM, Logistic Regression, KNN, NB, DT and SGD have AUC-PR scores of 0.829, 0.837, 0.860, 0.723, 0.832 and 0.804 respectively. While Random Forest has the highest AUC-PR score, the NB classifier has the lowest.
Therefore, it is evident that the performances of the Random Forest, XGBoost and Gradient Boosting classification algorithms have been quite encouraging. The evaluation metric values of these three classifiers have been tabulated in Table V in order to conclude which of them has exhibited the best overall performance. The training time and prediction time of all the ML algorithms have also been evaluated and are reported in Table VI. The test results of the multi-class classification problem are described in the following sections.

B. Multi-class classification results
A comparative analysis of the four ML algorithms has been done based on the following evaluation metrics: accuracy, recall, precision and F1-score. The performance comparison of the proposed algorithms on the basis of these metrics is presented in Table VII. As depicted in Table VII, XGBoost has outperformed the other models in terms of accuracy, recall, precision and F1-score, producing a value of 0.90 for each of them.
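For a multi-class problem, precision, recall and F1-score are computed per class and then averaged. The paper does not state which averaging scheme was used, so the sketch below assumes macro averaging (an unweighted mean over the four classes), with the label encoding 0 to 3 from the methodology. The toy predictions and the function name are hypothetical.

```python
def macro_metrics(y_true, y_pred, classes=(0, 1, 2, 3)):
    """Accuracy plus macro-averaged precision, recall and F1 over the classes."""
    precisions, recalls, f1s = [], [], []
    for c in classes:
        # One-vs-rest counts for class c.
        tp = sum(1 for t, p in zip(y_true, y_pred) if t == c and p == c)
        fp = sum(1 for t, p in zip(y_true, y_pred) if t != c and p == c)
        fn = sum(1 for t, p in zip(y_true, y_pred) if t == c and p != c)
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(classes)
    accuracy = sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)
    return {"accuracy": accuracy, "precision": sum(precisions) / n,
            "recall": sum(recalls) / n, "f1": sum(f1s) / n}


# Toy labels: 0 = no tumor, 1 = glioma, 2 = meningioma, 3 = pituitary.
y_true = [0, 0, 1, 1, 2, 2, 3, 3]
y_pred = [0, 0, 1, 2, 2, 2, 3, 1]
scores = macro_metrics(y_true, y_pred)
```

A weighted average, in which each class contributes in proportion to its number of samples, is a common alternative when the classes are imbalanced.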
In order to visualize the performance of the multi-class classifiers, the AUC-ROC curves of the four ML algorithms have also been plotted.

ROC curves of the multi-class classifiers for performance comparison
The ROC curves corresponding to the SVM, KNN, Random Forest and XGBoost classifiers are shown in Fig. 28, 29, 30 and 31 respectively. The AUC-ROC scores of the multi-class classification algorithms are tabulated in Table VIII. From Table VIII, it can be observed that the AUC-ROC score of XGBoost is the highest, at 0.990. Random Forest has produced an AUC-ROC score of 0.989, followed by SVM and KNN with AUC-ROC scores of 0.931 and 0.899 respectively.
Hence, after comparing the evaluation metrics of all the four ML algorithms it can be concluded that XGBoost classifier has outperformed the other models with accuracy, precision, recall, F1-score and AUC-ROC of 0.90, 0.90, 0.90, 0.90 and 0.99 respectively. Therefore, XGBoost is the best model to accomplish the task of multi-class classification of the images from Dataset-B.

VII. CONCLUSION AND SCOPE OF FUTURE WORK
This paper presents a detailed overview of how ML algorithms can be used for medical image processing. ML has improved and paved the way for efficient diagnosis, recognition and prediction in numerous domains of healthcare, brain tumor detection and classification being one of them. Nine ML algorithms have been used to predict whether a patient has a brain tumor or not based on a dataset comprising brain MRI images. The ML algorithms that have been used are Support Vector Machine (SVM), Logistic Regression, K-Nearest Neighbor (KNN), Naïve Bayes (NB), Decision Tree (DT) classifier, Random Forest classifier, XGBoost classifier, Stochastic Gradient Descent (SGD) classifier and Gradient Boosting classifier. A performance comparison of the different ML algorithms has been conducted based on performance metrics such as accuracy, recall, precision, F1-score, AUC-ROC and AUC-PR. After the evaluation of the test scores, it has been concluded that Gradient Boosting is the best classifier among all the ML classifiers that have been used. Also, multi-class classification has been performed on a different dataset comprising brain MRI images of glioma, meningioma, pituitary tumor and no tumor using the SVM, KNN, Random Forest and XGBoost classifiers. The ML algorithms have been compared based on accuracy, recall, precision, F1-score and AUC-ROC score, and it has been observed that the XGBoost classifier has exhibited the best results.
In future, one of the most important improvements that can be made is adjusting the architecture so that it can be used during brain surgery for classifying and accurately locating the tumor. Detecting tumors in the operating theatre has to be performed in real time; thus, in that case, the improvement would also involve adapting the network architecture to a 3D system. By keeping the network architecture simple, detection in real time can be made possible.