Regression Analysis of Predictions and Forecasts of Cloud Data Centre
KPIs using the Boosted Tree Decision Algorithm
Abstract
The National Institute of Standards and Technology defines the
fundamental characteristics of cloud computing as: on-demand computing,
offered via the network, using pooled resources, with rapid elastic
scaling and metered charging. The rapid dynamic allocation and release
of resources on demand to meet heterogeneous computing needs is
particularly challenging for data centres, which process a huge amount
of data characterised by its high volume, velocity, variety and veracity
(4Vs model). Data centres seek to manage this through monitoring and
adaptation, typically reacting to service failures after the fact. We
present a real cloud test bed capable of proactively monitoring cloud
resources and gathering the information needed to make predictions and
forecasts. This contrasts with the reactive monitoring that is the
current state of the art in cloud data centres. We argue that the behavioural
patterns and Key Performance Indicators (KPIs) characterising
virtualised servers, networks, and database applications can best be
studied and analysed with predictive models. Specifically, we applied
the Boosted Decision Tree machine learning algorithm to predict the
KPIs of a cloud server and virtual infrastructure network, obtaining an
R-squared value of 0.9991 at a learning rate of 0.2. This
predictive framework is beneficial for making short- and long-term
predictions for cloud resources.
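
To make the roles of the learning rate and the R-squared score concrete, the following is a minimal sketch of boosted regression on a single KPI. It is not the implementation used in this work. It assumes depth-1 trees (stumps) in place of full decision trees, a synthetic latency-versus-load series in place of test bed measurements, and hypothetical names throughout (fit_boosted_stumps, load, latency). Only the learning rate of 0.2 is taken from the text above.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


def fit_stump(x, residuals):
    """Find the single threshold on x that minimises squared error."""
    best = None
    xs = sorted(set(x))
    for lo, hi in zip(xs, xs[1:]):
        thr = (lo + hi) / 2.0
        left = [r for xi, r in zip(x, residuals) if xi <= thr]
        right = [r for xi, r in zip(x, residuals) if xi > thr]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        err = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or err < best[0]:
            best = (err, thr, left_mean, right_mean)
    return best[1:]  # (threshold, left_mean, right_mean)


def stump_value(stump, xi):
    thr, left_mean, right_mean = stump
    return left_mean if xi <= thr else right_mean


def fit_boosted_stumps(x, y, n_rounds=200, learning_rate=0.2):
    """Gradient boosting for squared loss: each stump fits the residuals."""
    base = sum(y) / len(y)
    preds = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        residuals = [yi - pi for yi, pi in zip(y, preds)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        # Shrink each stump's contribution by the learning rate.
        preds = [pi + learning_rate * stump_value(stump, xi)
                 for xi, pi in zip(x, preds)]
    return {"base": base, "stumps": stumps, "learning_rate": learning_rate}


def predict(model, xi):
    total = sum(stump_value(s, xi) for s in model["stumps"])
    return model["base"] + model["learning_rate"] * total


# Synthetic KPI (assumption, not measured data): latency in ms against
# load level, with a step at load 30 to mimic a saturation point.
load = list(range(60))
latency = [20.0 + 0.5 * x + (15.0 if x >= 30 else 0.0) for x in load]

# Interleaved hold-out: even load levels for training, odd for testing.
train_x, train_y = load[0::2], latency[0::2]
test_x, test_y = load[1::2], latency[1::2]

model = fit_boosted_stumps(train_x, train_y, n_rounds=200, learning_rate=0.2)
r2_train = r_squared(train_y, [predict(model, x) for x in train_x])
r2_test = r_squared(test_y, [predict(model, x) for x in test_x])
print("R-squared (train):", round(r2_train, 4))
print("R-squared (test):", round(r2_test, 4))
```

A smaller learning rate shrinks each tree's contribution, so more boosting rounds are needed to reach the same fit. The hold-out here is interleaved rather than chronological because tree-based models predict piecewise-constant values and cannot extrapolate beyond the range seen in training.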