Comparative Analysis of Machine Learning Algorithms for Predicting House Prices
  • Sachith Nimesh Yamannage
Faculty of Science, University of Ruhuna

Corresponding Author: [email protected]

Abstract

This study examined residential property data from the Abbotsford area of Melbourne to identify significant housing trends, regional patterns, and variables affecting home values. The dataset contained neighborhood information, geographic coordinates, transaction details, and property attributes. The study began by identifying and treating missing values using several techniques: mean imputation, forward-fill imputation, machine learning algorithms, and discarding records with missing data. Outliers were found, particularly in the price variable, and boxplots and other visualization tools were used for outlier analysis. Categorical variables were converted to numerical values for further analysis. Univariate and bivariate analyses were carried out to investigate the distributions of numerical variables and to understand relationships between variables, with a focus on correlations with home prices. The quantitative analysis comprised feature engineering, covariance analysis, ANOVA testing, and predictive modeling with regression algorithms such as Random Forest, XGBoost, and Support Vector Machine (SVM). Model performance was assessed with metrics such as Mean Absolute Error (MAE); the results showed that XGBoost was the most accurate predictor of housing prices. The study identified significant factors influencing home prices, including building area, property type, number of rooms, and geographic considerations such as proximity to important sites. Each factor was analyzed in terms of its relative importance, with particular attention to building area and land size. Limitations of the constructed model were noted, including overfitting and the need for further model refinement. The results offer useful information to scholars, politicians, and real estate professionals interested in the dynamics of the housing market.
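
Several of the preprocessing and evaluation steps named in the abstract (mean imputation, forward-fill imputation, boxplot-style outlier detection, and MAE) can be illustrated with a minimal sketch. This is not the author's code: it uses only the Python standard library, the function names are hypothetical, the 1.5 × IQR whisker rule is an assumed convention for boxplot outliers, and the sample prices are invented purely for illustration.

```python
from statistics import mean, quantiles


def mean_impute(values):
    """Replace each None with the mean of the observed (non-None) values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def forward_fill(values):
    """Replace each None with the most recent observed value.

    Leading gaps have nothing to carry forward, so they stay None.
    """
    filled, last = [], None
    for v in values:
        if v is not None:
            last = v
        filled.append(last)
    return filled


def iqr_outliers(values, k=1.5):
    """Return values outside the boxplot whiskers [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is the common boxplot convention, assumed here.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


def mean_absolute_error(actual, predicted):
    """Average absolute difference between actual and predicted values."""
    return mean(abs(a - p) for a, p in zip(actual, predicted))


if __name__ == "__main__":
    # Invented prices (in thousands), not taken from the study's dataset.
    prices = [650.0, None, 720.0, 700.0, None, 4100.0]
    print(mean_impute(prices))
    print(forward_fill(prices))
    print(iqr_outliers([v for v in prices if v is not None] + [680.0, 710.0, 690.0]))
    print(mean_absolute_error([650.0, 720.0], [660.0, 700.0]))
```

In a comparison like the one the abstract describes, the MAE function would be applied to each model's predictions on the same held-out data, and the model with the lowest value would be preferred.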
06 Mar 2024 Submitted to TechRxiv
11 Mar 2024 Published in TechRxiv