Abstract
By pushing resources to far-edge servers located in the proximity of
users, edge computing can greatly reduce end-to-end transmission delays.
Task offloading in multi-tier networks consists of deciding which tasks
should be offloaded from the far-edge to the edge and the
cloud. Moreover, the containerization of applications can further reduce
resource and time consumption and, in turn, the latency of such
applications. Even though Kubernetes has become the de facto container
orchestrator, few works have considered the offloading of
containerized applications in Kubernetes clusters spanning from cloud to
far-edge. In this work, the problem of offloading Kubernetes tasks (or
pods) in three-tier networks is formulated and optimized. First, a
utility function is defined in terms of the cumulative weighted pod
response time, and a utility minimization problem with central
processing unit (CPU) constraints is formulated. Based on the optimal
theoretical solution to this problem, a three-tier offloading decision
algorithm (TTODA) is developed. Vertical scaling is considered, and
specific hardware capabilities of each node are taken into account by
setting specific service-level agreements (SLAs) that are fed back to the
algorithm. Numerical results show that TTODA outperforms a typical
Kubernetes quality-of-service (QoS) model based on a first-in, first-served
algorithm (FIFSA) in terms of utility, average
pod response time, and usage of far-edge CPU. Further, TTODA achieves an
excellent trade-off between performance and computational complexity,
and can thus help meet the requirements of latency-sensitive
applications. Moreover, TTODA can easily be extended to scenarios with
joint memory and CPU constraints.