Reliable Provisioning With Degraded Service Using Multipath Routing From Multiple Data Centers in Optical Metro Networks

With the adoption of edge computing, several data centers are available within the footprint of an optical metro network, and contents are replicated in multiple locations. Such a wide content replication offers a unique opportunity to provide better services to users, especially for content-based services, e.g., video delivery. Thus, a service-provisioning scheme can embrace this opportunity to optimize network resource utilization, improve reliability, and achieve lower latency. In this study, we propose a reliable service-provisioning scheme that selects the optimal subset of data centers hosting the desired content and inversely multiplexes a content request over multiple link-disjoint paths. We formulate an integer linear program and develop heuristics for the problem, and use them to solve various complex and realistic network instances. Numerical data show that, compared to conventional service-provisioning schemes such as multipath routing from a single data center or dedicated-path protection, our proposed scheme efficiently utilizes network resources, improves reliability, and reduces latency; hence, it is suitable for the above-mentioned services.

Optical metro networks are evolving from a pure transport infrastructure to a composite network-and-computing ecosystem where new applications and services can be implemented and supported [1]. In particular, with the adoption of edge computing, several data centers (DCs) are now available within the footprint of an optical metro network. Typically, micro DCs are available in metro-access nodes, medium-size DCs are available in metro-core nodes, while hyper-scale DCs are available in core network nodes and communicate with the metro network via metro-core backbone nodes acting as gateways. Hence, with edge computing, contents (e.g., media files, applications, Web services, and documents) are now widely replicated in multiple DCs closer to users to offload core network traffic and to lower latency [2], [3].
Such a wide replication of contents in multiple locations offers a unique opportunity to provide better services to users, especially for content-based services. Among such services, video delivery plays the main role: by 2022, it was projected to account for 79% of the world's mobile data traffic and 82% of all consumer Internet traffic [4]. Other content-based services, such as augmented reality (AR) and virtual reality (VR), are emerging, allowing users to interact intuitively with the environment through six degrees of freedom [5]. These services require high bandwidth, low latency, and reliable connectivity, and are classified as mobile broadband reliable low-latency communication (MBRLLC) in the vision of 6G communications [6]. While some of these services require full protection, others can continue to operate with reduced, i.e., degraded, quality in case of failures and can be served with partial protection (e.g., a video stream can switch to a lower resolution depending on the available bandwidth).
High-capacity optical metro networks are exposed to many threats, such as malicious attacks, equipment failures, human errors (e.g., misconfigurations), and natural and human-made large-scale disasters (e.g., earthquakes, hurricanes, and terrorist attacks). To ensure reliability, protection and restoration schemes are traditionally used. In a protection scheme, extra network resources are reserved when a connection is provisioned. Conventionally, a pair of paths is provided to a connection: one is used to carry traffic during normal operation, referred to as the primary path; the other, referred to as the backup path, is reserved and activated after a failure occurs on the primary path. In a restoration scheme, no extra resources are reserved for the backup path, and the network must react to find an alternative path after a failure occurs on the primary path [7].

1932-4537 © 2023 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission. See https://www.ieee.org/publications/rights/index.html for more information.

Since failures are hard to predict and statistically rare, providing protection in a dynamic network environment, especially full protection against multiple failures, would require massive and economically unsustainable bandwidth overprovisioning. On the other hand, a restoration scheme requires a longer recovery time and provides no guarantee of restoring a disrupted path. This study concentrates on multipath routing, a flexible and resource-efficient protection scheme in which a service request is provisioned over multiple paths by routing part of the requested bandwidth on each path [8], [9]. With respect to baseline multipath routing, we consider the opportunity to route different paths towards different destinations, representing different metro DCs hosting the required content. To illustrate our proposed service-provisioning scheme, in Fig. 1, we consider a dynamic network environment where, at time t, a user at node 1 requests a content replicated in multiple DCs at nodes 4, 5, 6, and 7. In addition, the user requires bandwidth b during normal operation and can tolerate degraded service (i.e., degraded bandwidth) 0.6b in case of a single-link failure. Note that, even though multiple-link/node failures can occur, a single-link failure is still the dominant failure scenario in an optical network [10], [11], [12], [13]. Also, due to the asymmetric traffic characterizing content retrieval, we only consider downstream traffic (i.e., from DCs to the requesting node). Conventionally, a dedicated-path protection (DPP) scheme selects the optimal DC (e.g., the DC at node 4, which is closest to the requesting node) and reserves a pair of primary and backup paths (e.g., p_0 and p_1) for the request [14]. As shown in Fig. 1.a, the bandwidths on the primary and backup paths are b and 0.6b, respectively. In case of a failure on the primary path, the requested degraded service is still guaranteed after the backup path is activated. In total, DPP requires bandwidth 1.6b over the primary and backup paths and occupies network bandwidth 2.2b. Here, we define the network bandwidth as the sum of the bandwidth on each path weighted by the number of hops (i.e., 2.2b = b + 2*0.6b).
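As a minimal illustration, the hop-weighted network-bandwidth metric just defined can be computed as follows; this is a sketch in which the hop counts are assumptions chosen to be consistent with the 2.2b and 1.6b totals quoted for Fig. 1.

```python
def network_bandwidth(paths):
    """Network bandwidth = sum over paths of (path bandwidth) x (hop count)."""
    return sum(bw * hops for bw, hops in paths)

b = 1.0  # normalized requested bandwidth

# DPP (Fig. 1.a): primary path carrying b, backup carrying 0.6b.
# Hop counts (1 and 2) are assumed so that b + 2*0.6b = 2.2b.
dpp_net_bw = network_bandwidth([(b, 1), (0.6 * b, 2)])

# MPSD/MPMD (Figs. 1.b, 1.c): 0.4b, 0.3b, 0.3b over three disjoint paths.
# Assumed hop counts (1, 2, 2) give the quoted 1.6b.
mpmd_net_bw = network_bandwidth([(0.4 * b, 1), (0.3 * b, 2), (0.3 * b, 2)])
```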
Fig. 1.b shows a different protection scheme using multipath routing from a single DC (MPSD) [8], [9], [15], [16], [17]. In this scenario, data from the DC at node 4, closest to the requesting node, are simultaneously transmitted on three link-disjoint paths p_0, p_1, and p_2 with bandwidths of 0.4b, 0.3b, and 0.3b, respectively. In case a failure occurs on a path, the requested degraded service is still fulfilled since the survivable bandwidth remains at least 0.6b. In total, MPSD requires bandwidth b over three paths and occupies network bandwidth 1.6b. In Figs. 1.a and 1.b, DPP and MPSD share a major shortcoming, as they do not provide protection against failures in the source DC. For instance, in case of a failure in the DC at node 4, provisioned services are disrupted.
In Fig. 1.c, we describe a service-provisioning scheme where the user at node 1 is simultaneously served by three different DCs at nodes 4, 5, and 7 on three link-disjoint paths p_0, p_1, and p_2 with bandwidths of 0.4b, 0.3b, and 0.3b, respectively. Hereafter, we refer to this service-provisioning scheme as multipath routing from multiple DCs (MPMD). In case a path is disrupted, MPMD ensures that the survivable bandwidth remains at least 0.6b; hence, the required degraded service is still guaranteed. Compared to DPP and MPSD, MPMD also provides protection against failures in DCs (i.e., the survivable bandwidth remains at least 0.6b if a failure occurs in one of the serving DCs, e.g., at node 4). In total, MPMD requires bandwidth b over three paths and occupies network bandwidth 1.6b. For each content request, MPMD must: 1) select the optimal subset of DCs hosting the desired content, 2) find link-disjoint paths from each selected DC to the requesting node, and 3) allocate bandwidth to each path such that the total requested bandwidth during normal operation and degraded service in case a path is disrupted are fulfilled. In the literature, this service-provisioning scheme is often referred to as an inverse manycast scheme, indicating that the source nodes (i.e., the optimal subset of DCs) must be selected from a larger candidate set [13], [18]. Compared to an inverse multicast scheme, where the source nodes are specified ahead of time, an inverse manycast scheme has greater flexibility in choosing the source nodes from which data (e.g., contents) are retrieved. Here, MPMD exploits the wide replication of a content in multiple DCs to provide better services to users and requires no additional resources (e.g., storage capacity) in each DC. Note that the underlying multipath routing can be implemented with techniques such as virtual concatenation (VCAT) and the link capacity adjustment scheme (LCAS), in which different parts of the same content can be transported on different lightpaths from different DCs [19], [20]. In case one path is disrupted, the total offered bandwidth can be reduced to a degraded level using bandwidth squeezing restoration [21].
In this study, for the first time to the best of our knowledge, we propose MPMD as a reliable service-provisioning scheme to fulfill content requests (e.g., video-on-demand, VR, or AR requests) in a dynamic network environment while guaranteeing strict requirements on bandwidth and improving reliability and latency. The proposed service-provisioning scheme enjoys the benefits of multipath routing, provides protection against network (e.g., link) and content-source (e.g., DC) failures, and uses minimal additional network resources due to the nature of multipath routing. We formulate the MPMD problem as an integer linear program (ILP), develop two scalable heuristics, and use them to solve various complex network instances. Numerical data show that, compared to conventional service-provisioning schemes such as DPP and MPSD, MPMD efficiently utilizes network resources, provides higher reliability, and reduces latency; hence, it is highly suitable for emerging content services.
The rest of this study is organized as follows. In Section II, we review related works. In Section III, we formulate the MPMD problem as an ILP. In Section IV, we develop heuristics for the MPMD problem. In Section V, we perform numerical validation in various scenarios and compare the performance of MPMD to that of reference protection strategies (e.g., MPSD, DPP). We conclude this study in Section VI.

II. RELATED WORKS
Several studies have addressed reliable service provisioning using multipath routing. In [8], [9], [15], the authors developed algorithms for multipath routing in a static scenario in which, for each traffic demand, the source and destination nodes are specified ahead of time. Due to the nature of multipath routing, the algorithms ensure a certain level of degraded service in case of failures on one or several paths. In [16], the authors extended their work in [15] to a dynamic scenario where survivable bandwidth is guaranteed in case a path is disrupted. In [18], the authors studied the problem of routing and wavelength assignment for static manycast demands in wavelength-division multiplexing (WDM) networks. They proposed a solution for upstream traffic in which the requesting node sends data to several nodes in a larger set of candidate nodes using multipath routing and manycast.
The authors in [22] leveraged the concept of content connectivity in DC networks and developed algorithms to place contents at optimal DCs for a static scenario such that K node/link-disjoint paths are guaranteed from the requesting node to the DCs hosting the desired content. In [23], the authors proposed dynamic service provisioning in elastic optical networks (EONs) with hybrid single-/multipath routing (HSMR). They investigated the flexibility of selecting between single-path and multipath routing, and of how the paths between a pair of nodes are computed. Numerical results showed that HSMR with online path computation (OPC) achieves the lowest bandwidth blocking probability among all HSMR schemes. However, HSMR was designed for unicast requests and did not consider survivability or the differential delay of paths. In [24], [25], the authors developed reliable service-provisioning schemes using multipath routing and inverse manycast. However, these solutions target a static scenario in which the paths between each pair of nodes are pre-computed (i.e., offline path computation). This assumption makes them less practical for content-retrieval applications, where requests (e.g., video on demand) arrive, hold, and depart dynamically, the network topology (based on the availability of resources) changes over time, and OPC is more practical.
In this study, our focus is on the dynamic problem for content requests leveraging multipath routing and inverse manycast while ensuring strict requirements on bandwidth, and improving reliability and latency.

III. RELIABLE PROVISIONING WITH DEGRADED SERVICE USING MPMD
In this section, we formally state the MPMD problem and formulate it as an ILP (MPMD-ILP).

A. Problem Statement
In this study, each dynamic content request, θ, is characterized by a tuple θ = (t, n, c, b, m), where t is the arrival time (s), n is the requesting node, c is the desired content (i.e., content ID), b is the requested bandwidth (Mbps), and m is the ratio of survivable bandwidth to requested bandwidth in case a path is disrupted. In other words, if a path is disrupted, survivable bandwidth must remain at least m*b. We consider a graph G_t(V_t, E_t) to represent the network, where V_t is the set of nodes with available computing capacity and E_t is the set of links with available bandwidth at request arrival time. Since content requests arrive, hold, and depart, the sets V_t and E_t can vary over time. Moreover, if the network finds insufficient resources for an incoming content request at its arrival time, the content request is blocked (no queuing). The desired content, c, has size h (GB) and is replicated in multiple DCs denoted by the set of nodes D, D ⊂ V_t, |D| ≥ 2. Without loss of generality, the requesting node does not host the desired content, i.e., n ∉ D. Also, we assume that each selected DC can support only one path. To guarantee degraded service (i.e., bandwidth m*b) in case a path is disrupted, the total offered bandwidth over all paths, b′ (Mbps), can be larger than the requested bandwidth (i.e., b′ ≥ b). Moreover, if a request is offered bandwidth b′, it departs at t′ = t + 8000*h/b′ (s), where 8000*h/b′ is the holding time (s) of the request (the factor 8000 converts the content size h (GB) into Mb). Here, we assume that each path has the same bandwidth and retrieves an equal amount of the desired content (i.e., the holding time is equal at each DC). Note that we consider large contents, so a content's transmission delay is the major contributor to its total delay (and its propagation delay through the network is relatively negligible).
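For instance, the departure time t′ follows directly from the tuple; the sketch below uses hypothetical numbers.

```python
def departure_time(t, h, b_offered):
    """t' = t + 8000*h/b' (s): a content of h GB is 8000*h megabits,
    drained at the total offered bandwidth b' (Mbps) across all paths."""
    return t + 8000.0 * h / b_offered

# Hypothetical request: arrival t = 10 s, content size h = 2 GB,
# total offered bandwidth b' = 400 Mbps -> holding time 40 s.
t_dep = departure_time(10.0, 2.0, 400.0)
```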
In this work, we focus on MPMD's bandwidth efficiency, survivability, and latency, and leave the choice of technologies in the physical layer (e.g., traffic grooming, wavelength assignment, impairment, WDM, and EON) as open research topics. Also, even though the ILP and heuristic algorithms developed in this and the next sections can be applied to any optical network, we analyze their performance using an optical metro network covering an urban area of up to tens of kilometers because, in most content-retrieval applications, a content is replicated in multiple edge DCs close to the requesting node, which is typically a central office (CO). The data from the CO are ultimately delivered to an end user via either a mobile or fixed network. Note that, when MPMD is applied to a backbone network covering thousands of kilometers, constraints on impairments such as noise and nonlinearity must be incorporated in the ILP and heuristic algorithms. In the case of space-division multiplexing (SDM) realized by EONs and multi-core fibers (MCFs), an additional constraint on the survivability of lightpaths (possibly in different fibers on different strands) must be considered [26]. We reserve this topic as a possible extension of our current work.
The MPMD problem can be formally stated as follows. Given the content request θ = (t, n, c, b, m), the network graph at request arrival G_t(V_t, E_t), and the availability of the desired content characterized by its size h and set of hosting DCs D, find: 1) the optimal subset of DCs hosting the desired content (for convenience, we use D_0 to denote this optimal subset, D_0 ⊆ D), 2) link-disjoint paths from each DC in D_0 to the requesting node n, and 3) the bandwidth on each path such that the total requested bandwidth b during normal operation and degraded service m*b in case a path is disrupted are fulfilled.
Noting that multiple objectives cannot be optimized simultaneously, a weighted objective function is defined.The objective is to minimize total bandwidth over all paths, total network bandwidth (weighted sum), and total propagation delay of the paths from each selected DC to the requesting node.
The optimal solution is subject to constraints on the available capacity of each physical link, the maximum differential delay between paths (i.e., the differential delay constraint (DDC), the maximum difference in propagation and processing delay between paths that can be compensated at the receiver [17]), and the survivability of paths.
In the following sections, we develop the MPMD-ILP and the MPMD heuristics. It is worth noting that the algorithms developed in this work handle a single content request; to obtain numerical results for a dynamic network environment, they must be integrated into a dynamic simulation framework.

B. Mathematical Formulation
Inputs: • h: size of the desired content (GB).
Subject to:

In objective function (1), we introduce the scalars α, β, and γ to control the weight of each term. The first summation minimizes the total bandwidth over all paths, the second summation minimizes the total network bandwidth, and the third summation minimizes the total propagation delay of the paths from each selected DC to the requesting node. We obtained numerical results for different values of α, β, and γ, and derived the following observations. First, if α is very large compared to β and γ (e.g., α = 500000, β = 1, and γ = 1), MPMD-ILP tends to find a solution with minimal total bandwidth over all paths to satisfy a content request, while the paths from each selected DC to the requesting node can become longer. These longer paths typically imply that the total propagation delay may not be minimal. Second, if γ is very large compared to α and β (e.g., α = 1, β = 1, and γ = 500000), MPMD-ILP tends to find paths with minimal total propagation delay. As a result, MPMD-ILP normally avoids circuitous paths and uses a lower number of paths (e.g., two paths to the first and second closest DCs) to admit a content request. However, the total bandwidth over all paths must then be larger to guarantee a degraded service level in case a path is disrupted. Third, if β is very large compared to α and γ (e.g., α = 1, β = 500000, and γ = 1), MPMD-ILP tends to find a solution with minimal network bandwidth, while the total bandwidth over all paths might not be minimal. Moreover, since network bandwidth is directly related to the number of hops, minimizing network bandwidth implicitly tends to reduce the total propagation delay. Numerical results also showed that, compared to the other two weight tuples, this weight tuple yields a higher acceptance ratio of incoming requests. For these reasons, in our simulation setup, we set α = 1, β = 500000, and γ = 1.
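A sketch of how the weighted objective trades off the three terms; the per-path tuples and values below are hypothetical, and the real objective (1) is expressed over the ILP's flow variables rather than per-path tuples.

```python
def weighted_objective(paths, alpha, beta, gamma):
    """paths: list of (bandwidth, hop_count, propagation_delay) tuples.
    Term 1: total bandwidth over all paths; term 2: hop-weighted network
    bandwidth; term 3: total propagation delay."""
    total_bw = sum(bw for bw, _, _ in paths)
    net_bw = sum(bw * hops for bw, hops, _ in paths)
    total_delay = sum(delay for _, _, delay in paths)
    return alpha * total_bw + beta * net_bw + gamma * total_delay

# With the chosen weights, the network-bandwidth term dominates the choice.
score = weighted_objective([(0.4, 1, 1.0), (0.3, 2, 2.0), (0.3, 2, 2.0)],
                           alpha=1, beta=500000, gamma=1)
```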
Constraint (2) ensures that each request is simultaneously served by at least two DCs (i.e., multiple DCs). Constraint (3) guarantees that the total bandwidth over all paths for each request is at least b during normal operation. Constraint (4) sets the binary variable w_{d,n} = 1 if the requesting node n uses DC d (i.e., x_{d,n} > 0), where Ψ is a large positive integer, e.g., 10000. Constraint (5) requires that the mapping of the request onto the physical network does not exceed the capacity of each physical link at request arrival time. Constraint (6) enforces flow conservation, in which, for each path, traffic originates from DC d and ends at requesting node n; at a transit node, input traffic is equal to output traffic. Constraint (7) computes a binarization of the integer variable y_{d,n}^{i,j} and assigns it to z_{d,n}^{i,j}. Constraint (8) guarantees that the desired degraded service is satisfied (i.e., survivable bandwidth must remain at least m*b in case a path is disrupted). Constraint (9) restricts the mapping of the request onto the physical network such that a single physical link cannot be shared by two or more paths; in other words, traffic from the DCs in D_0 to the requesting node is carried on link-disjoint paths. Therefore, constraints (8) and (9) strictly enforce survivable bandwidth of at least m*b in case a path is disrupted. Constraint (10) ensures that the differential delay of two distinct paths fulfills the DDC. Since this work focuses on MPMD's bandwidth efficiency, survivability, and latency, we assume that the physical-layer components (e.g., transceivers, transponders, and computing and storage capability) in each node can provide enough throughput, and we omit the constraints on these components from the MPMD-ILP.
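The survivability, disjointness, and delay constraints can be checked on a candidate solution as follows; this is a sketch under assumed data structures, not the ILP itself.

```python
def feasible(paths, b, m, ddc):
    """paths: list of dicts with keys 'bw' (Mbps), 'links' (set of (i, j)
    tuples), and 'delay'.  Mirrors the constraints described above:
    >= 2 serving DCs (2), total bandwidth >= b (3), survivable bandwidth
    >= m*b after any single path failure (8), link-disjointness (9),
    and differential delay <= ddc (10)."""
    if len(paths) < 2:
        return False
    total = sum(p["bw"] for p in paths)
    if total < b:
        return False
    # Worst case: the path with the largest bandwidth is disrupted.
    if total - max(p["bw"] for p in paths) < m * b:
        return False
    all_links = [link for p in paths for link in p["links"]]
    if len(all_links) != len(set(all_links)):  # some link used twice
        return False
    delays = [p["delay"] for p in paths]
    return max(delays) - min(delays) <= ddc

# Fig. 1.c-style solution: 0.4b + 0.3b + 0.3b on link-disjoint paths.
ok = feasible(
    [{"bw": 400, "links": {(4, 1)}, "delay": 1.0},
     {"bw": 300, "links": {(5, 2), (2, 1)}, "delay": 2.0},
     {"bw": 300, "links": {(7, 3), (3, 1)}, "delay": 2.0}],
    b=1000, m=0.6, ddc=5.0)
```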
The number of variables and constraints in the MPMD-ILP increases linearly with the number of physical nodes and the number of physical links, while it increases quadratically with the number of hosting DCs.

IV. HEURISTICS FOR RELIABLE PROVISIONING WITH DEGRADED SERVICE USING MPMD
Since the MPMD-ILP presented in Section III is intractable for large network instances, it is impractical in a dynamic network environment where a fast solution is desirable. In this section, we propose heuristic algorithms for reliable provisioning with degraded service using MPMD. First, we introduce the auxiliary graph used by the heuristics.

A. Auxiliary Graph
To find link-disjoint paths from the DCs hosting the desired content to the requesting node, we introduce an auxiliary graph by leveraging a dummy node and several dummy links.To ensure that the final output is not affected by the addition of the dummy node and links, we assume that the dummy node has unlimited computing capacity and the dummy links have unlimited bandwidth and zero propagation delay.
As shown in Fig. 2, the dummy node (i.e., node 0) is connected to each DC hosting the desired content using dummy links (denoted by dotted lines). The solid and dashed lines represent the network graph at request arrival time (i.e., G_t(V_t, E_t)), in which a user at node n requests a content replicated in DCs d_0, d_1, d_2, and d_3. Here, we use the dashed lines to abstract the real network with more nodes and links. Henceforth, we use G_t^d(V_t^d, E_t^d) to denote the auxiliary network graph, which includes G_t(V_t, E_t), the dummy node, and the dummy links.
One can observe that, to find the link-disjoint paths from the DCs hosting the desired content to the requesting node, we can find the link-disjoint paths from the dummy node to the requesting node on G_t^d(V_t^d, E_t^d). Once the link-disjoint paths are found, the MPMD problem in the previous section can be restated as follows. Given the link-disjoint paths from the DCs hosting the desired content to the requesting node, allocate bandwidth to each path such that, for each content request, the total requested bandwidth b during normal operation and degraded service m*b in case a path is disrupted are fulfilled. The amount of bandwidth allocated to each path must not exceed the available/bottleneck bandwidth of the path, and the differential delay of two distinct paths must satisfy the DDC. In the specific scenario in Fig. 2, the number of link-disjoint paths (i.e., K), which is also the number of serving DCs (i.e., K = |D_0| = 3), is smaller than the number of DCs hosting the desired content (i.e., |D| = 4). In this scenario, the algorithm selects the DCs in ascending order of their distance to the requesting node.
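The construction above can be sketched in a few lines: add a dummy source, give every link unit capacity, run max flow, and decompose the resulting flow into link-disjoint paths. The code below is a self-contained illustration using BFS-based augmentation (Edmonds-Karp, a specific implementation choice; the paper's heuristic uses Ford-Fulkerson); node IDs are assumed to be positive integers so that 0 can serve as the dummy node.

```python
from collections import deque, defaultdict

def link_disjoint_paths(links, dcs, n):
    """Maximum set of link-disjoint paths from the DCs in `dcs` to the
    requesting node `n`.  A dummy node 0 is connected to every DC by a
    dummy link; every link gets unit capacity, so the max flow from 0
    to n equals the number of link-disjoint paths, and the final flow
    decomposes into the actual paths (the dummy hop is dropped)."""
    cap = defaultdict(int)              # residual capacities
    adj = defaultdict(set)
    for u, v in links:                  # each fiber link usable either way
        for e in ((u, v), (v, u)):
            cap[e] += 1
            adj[e[0]].add(e[1])
    for d in dcs:                       # dummy links: node 0 -> each DC
        cap[(0, d)] += 1
        adj[0].add(d)
        adj[d].add(0)
    orig = dict(cap)                    # snapshot of original capacities
    while True:                         # BFS augmenting paths (Edmonds-Karp)
        parent = {0: None}
        queue = deque([0])
        while queue and n not in parent:
            u = queue.popleft()
            for v in adj[u]:
                if v not in parent and cap[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        if n not in parent:
            break                       # no more augmenting paths
        v = n
        while v != 0:                   # push one unit of flow along the path
            u = parent[v]
            cap[(u, v)] -= 1
            cap[(v, u)] += 1
            v = u
    flow = defaultdict(int)             # net flow = original - residual
    for e in orig:
        flow[e] = orig[e] - cap[e]
    paths = []
    while any(flow[(0, d)] > 0 for d in dcs):
        path, u = [], 0
        while u != n:                   # walk positive-flow edges to n
            v = next(w for w in adj[u] if flow[(u, w)] > 0)
            flow[(u, v)] -= 1
            path.append(v)
            u = v
        paths.append(path)              # path[0] = serving DC, path[-1] = n
    return paths
```

On a small hypothetical topology with requesting node 1 and DCs {4, 5} reachable via nodes 2 and 3, this returns the two disjoint paths 4-2-1 and 5-3-1.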

B. Equal-Bandwidth, Maximum-K MPMD (K-MPMD)
In this subsection, we design an algorithm that finds the maximum number of link-disjoint paths from the dummy node to the requesting node and equally allocates bandwidth to each path such that the total requested bandwidth and degraded service are fulfilled. Hereafter, we refer to this service-provisioning scheme as equal-bandwidth, maximum-K MPMD (K-MPMD).
In Algorithm 1, as inputs, θ = (t, n, c, b, m), G_t(V_t, E_t), h, D, Λ_t, Ξ, and Ω denote the tuple representing a content request, the network graph at request arrival time, the desired content size (GB), the set of DCs hosting the desired content, a hash table holding the available capacity of each link in E_t at request arrival time, a hash table holding the propagation delay of each link in E_t, and the maximum differential delay between paths (μs), respectively. As outputs, Algorithm 1 finds K as the number of link-disjoint paths for the content request, D_0 as the set of optimal DCs to serve the request (D_0 ⊆ D, |D_0| = K), P as the list of K link-disjoint paths from each DC in D_0 to the requesting node, b′ as the total offered bandwidth for the request, and t′ as the request departure time (s). Since the total offered bandwidth, b′, is equally allocated to the K link-disjoint paths, the bandwidth on each path is b_p = b′/K.
Algorithm 1 starts by constructing the auxiliary graph G_t^d(V_t^d, E_t^d) with the dummy node and dummy links. In find_paths, we first set the capacity of each link in E_t^d to one bandwidth unit and use the Ford-Fulkerson algorithm to find K as the maximum number of link-disjoint paths from the dummy node to the requesting node [27], [28]. find_paths also returns the flow graph, G_t^K(V_t^K, E_t^K), where V_t^K and E_t^K are the actual nodes and links carrying the flow from the dummy node to the requesting node. If the maximum number of link-disjoint paths is larger than one (i.e., there are enough paths for multipath routing), the algorithm continues to find the minimum offered bandwidth on each path (i.e., b_p, which is rounded up to the nearest integer). The algorithm computes the minimum offered bandwidth on each path (b_p) such that the constraints on the total requested bandwidth (b) and degraded service (m*b) in case a path is disrupted (i.e., there remain K − 1 survivable paths) are fulfilled (line 4).
Algorithm 1: Equal-BW, Max-K MPMD (K-MPMD)

From lines 5 to 16, Algorithm 1 performs a path decomposition loop to find the actual link-disjoint paths from the dummy node to the requesting node and equally allocate bandwidth to each path. Here, we use an augmented Dijkstra algorithm to find the shortest path (p_ρ) from the dummy node to the requesting node on the flow graph (line 6) [29]. In addition to the actual path, the algorithm also finds the available bandwidth of the path (i.e., b_ρ, the bottleneck bandwidth), the optimal DC (i.e., d_ρ, from which the desired content is retrieved), and the total propagation delay of the path (i.e., ξ_ρ). Lines 8 and 9 update the propagation delay of the shortest path (ξ_min) and longest path (ξ_max) in each iteration. From lines 10 to 13, the algorithm verifies that the available bandwidth of the path is enough for the requested bandwidth (b_ρ ≥ b_p), then adds the optimal DC d_ρ to set D_0, appends path p_ρ to list P, and removes the links along path p_ρ from G_t^K(V_t^K, E_t^K). Note that a path traced out from the dummy node on the flow graph always terminates at the requesting node since G_t^K(V_t^K, E_t^K) is an acyclic, directed graph, and the path decomposition loop (lines 5-16) assuredly finds K link-disjoint paths [27], [28]. However, the path decomposition loop provides no backtracking; thus, it offers no guarantee that the total propagation delay of all paths is minimal. In case the available bandwidth of a path is not enough for the request, the content request is not admitted and the algorithm terminates with no result (line 15).
Once the bandwidth on each path is sufficient for the request, from lines 17 to 23, the algorithm verifies that the differential propagation delay between the longest path and the shortest path fulfills the DDC (i.e., ξ_max − ξ_min ≤ Ω); then it subtracts resources (i.e., the capacity used on each link) from the network and returns K, D_0, P, b′, and t′. Lastly, if the DDC is not fulfilled (line 26) or the number of paths from the dummy node to the requesting node is not enough for multipath routing (i.e., K < 2, line 28), the content request is not admitted and the algorithm terminates with no result.
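The bandwidth computation on line 4 is not spelled out above, but a formula consistent with the description (and with the worked numbers in Section IV-C) is b_p = ⌈max(b/K, m*b/(K−1))⌉: the K equal shares must sum to at least b, and any K−1 of them must still sum to at least m*b. A sketch, with the exact form of line 4 being an assumption:

```python
import math

def per_path_bandwidth(b, m, k):
    """Minimum equal bandwidth per path for K-MPMD (assumed form of line 4
    of Algorithm 1): K paths jointly carry b, and the K-1 paths surviving
    a single failure must still carry m*b; rounded up to an integer."""
    if k < 2:
        raise ValueError("multipath provisioning needs at least two paths")
    return math.ceil(max(b / k, m * b / (k - 1)))

# b = 1000 Mbps, m = 0.75 reproduces the example values of Section IV-C:
bp4 = per_path_bandwidth(1000, 0.75, 4)   # 0.25b  per path
bp3 = per_path_bandwidth(1000, 0.75, 3)   # 0.375b per path
bp2 = per_path_bandwidth(1000, 0.75, 2)   # 0.75b  per path
```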

C. Flexible MPMD (F-MPMD)
Since K-MPMD exploits great path diversity by inversely multiplexing a content request over the maximum number of link-disjoint paths, it can relax the required bandwidth on each path. However, it has potential shortcomings: K-MPMD rejects a content request if the available bandwidth on one path is not enough for the requested bandwidth, and it also rejects a content request if the propagation delay of one path does not fulfill the DDC. This approach may not be optimal since the remaining paths (i.e., the paths whose bandwidth and propagation delay do not violate the two above-mentioned constraints) can still be used to admit the content request. In case fewer paths are used, more bandwidth must be allocated to each path. Furthermore, if there exists only one path from the dummy node to the requesting node, the algorithm can still make a best effort to admit the request, albeit with no guarantee of survivability (i.e., no multipath routing).
For illustration, let us consider a scenario where K-MPMD first finds four link-disjoint paths (e.g., p_0, p_1, p_2, and p_3) from four distinct DCs to the requesting node. To fulfill a content request with bandwidth b and m = 0.75, K-MPMD requires the bandwidth on each path to be at least 0.25b, and the differential propagation delay between the longest and shortest paths must fulfill the DDC. In case of insufficient bandwidth on one path (e.g., p_3), the remaining paths (p_0, p_1, and p_2, whose propagation delays fulfill the DDC) can be used to admit the content request with a reserved bandwidth of 0.375b on each path. If we further consider the scenario where the propagation delay of p_2 does not fulfill the DDC, the content request can be admitted using two paths (p_0 and p_1) with a reserved bandwidth of 0.75b on each path. Based on these observations, we derive an alternative, flexible heuristic for the MPMD problem (F-MPMD). Note that, in case of multipath provisioning (i.e., K ≥ 2), F-MPMD strictly enforces survivable bandwidth and the DDC.
As shown in Algorithm 2, F-MPMD shares its inputs, outputs, and most steps with K-MPMD. However, F-MPMD additionally outputs S to indicate whether or not a service provisioning is survivable (S = True if survivable). In contrast to Algorithm 1 (which terminates if the number of paths is not enough for multipath provisioning, i.e., K < 2), Algorithm 2 continues even if there is only one path from the dummy node to the requesting node (line 3). The code block between lines 4 and 9 computes the actual path(s) between the two nodes.
Algorithm 2: Flexible MPMD (F-MPMD)

If there exists only one path from the dummy node to the requesting node and the bandwidth on this path is enough for the requested bandwidth (b_0 ≥ b), the content request is admitted, but provisioning is not survivable (S = False, the default value). Algorithm 2 updates the network and returns the results (lines 10-16). In case the bandwidth on this single path is not enough for the requested bandwidth (b_0 < b), the content request is not admitted and the algorithm terminates with no result (line 18).
If there are two or more link-disjoint paths from the dummy node to the requesting node (line 19), Algorithm 2 sorts the paths in P by total propagation delay in ascending order, sets the shortest path as the reference, and removes the paths whose propagation delay violates the DDC (lines 20-25). If more than one path remains after this removal (line 26), Algorithm 2 sorts the paths in P by available bandwidth in descending order and removes the paths whose bandwidth cannot fulfill the requested bandwidth (lines 27-34). If the number of paths in P still suffices for multipath provisioning after both removals (line 35), the content request is survivably provisioned (S = True); Algorithm 2 subtracts the used resources (i.e., the capacity used on each link) from the network and returns K, D0, P, b, t, and S (lines 36-42). If at any step only one path remains, Algorithm 2 falls back to the single-path case (i.e., returns to line 10, as on lines 44 and 46).
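The two pruning passes can be sketched as below. This is a simplified sketch, not the paper's Algorithm 2: Path and prune_paths are our names, and the fallback to single-path (non-survivable) admission is omitted. The per-path requirement m*b/(K-1) is recomputed each time a path is dropped, mirroring F-MPMD's flexibility:

```python
from dataclasses import dataclass

@dataclass
class Path:
    delay: float      # total propagation delay (us)
    bandwidth: float  # available bandwidth (Mbps)

def prune_paths(paths, b, m, ddc):
    """Drop paths violating the differential-delay constraint (DDC),
    then paths with insufficient bandwidth, recomputing the per-path
    requirement m*b/(K-1) as K shrinks."""
    # DDC filter: keep paths within ddc of the shortest path's delay.
    paths = sorted(paths, key=lambda p: p.delay)
    shortest = paths[0].delay
    paths = [p for p in paths if p.delay - shortest <= ddc]
    # Bandwidth filter: drop the weakest path until every remaining
    # path can carry m*b/(K-1), or until only one path is left.
    paths.sort(key=lambda p: p.bandwidth, reverse=True)
    while len(paths) >= 2 and paths[-1].bandwidth < m * b / (len(paths) - 1):
        paths.pop()
    return paths
```

On the example above (four paths, one delay-violating, one bandwidth-starved, b = 1, m = 0.75), the sketch ends with two usable paths, each required to carry 0.75b.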
In summary, Algorithm 2 is flexible at every step of provisioning a content request: it keeps removing invalid paths (i.e., paths whose propagation delay or bandwidth is insufficient for the request) until it finds the optimal K with which to provision the request.
The heuristics for F-MPSD (i.e., flexible MPSD) and DPP can be obtained directly from F-MPMD if, instead of the dummy node, the closest DC to the requesting node is used in Algorithm 2.

D. Complexity Analysis
In Algorithm 1, since the number of link-disjoint paths K is deterministic (e.g., it must not exceed the nodal degree of the requesting node), the number of iterations of the while loop (lines 5-16) is also deterministic. Likewise, since the number of links on each path is deterministic, so is the number of iterations of the for loops (lines 12-13 and 19-22). Note that constructing a graph from another graph requires linear time, O(|V_t| + |E_t|), which is a non-dominant term. Algorithm 1's time complexity is therefore dominated by the method used to find the maximum flow (line 2) and the method used to find the shortest path (line 6) from the dummy node to the requesting node. In this study, we use the Ford-Fulkerson algorithm to find the maximum flow from the dummy node to the requesting node, whose time complexity is O(|E_t| * f), where f is the value of the maximum flow, and the augmented Dijkstra's algorithm with a min-heap data structure to find the shortest path, whose time complexity is O((|V_t| + |E_t|) log |V_t|). Omitting non-dominant terms and deterministic constants, the time complexity of Algorithm 1 is O(|E_t| * f + K(|V_t| + |E_t|) log |V_t|). The same analysis applies to Algorithm 2 if we omit deterministic constants and non-dominant terms such as the sorting of the paths in P (lines 20, 27).

E. Service Probability
In the previous subsections, we designed algorithms for the MPMD problem against the dominant failure scenario, namely a single-link failure; the algorithms guarantee degraded service against a random single-link failure in an optical metro network. In this subsection, we compare the reliability of F-MPMD, F-MPSD, and DPP from a service-probability perspective. We consider five typical failure scenarios, namely a single-link failure (L1), a double-link failure (L2), a single-DC failure (D1), a double-DC failure (D2), and a single-link plus single-DC failure (L1 + D1), and address the fundamental question: if a content request is already survivably provisioned (i.e., K >= 2) and one of these failure scenarios occurs, what is the probability that the requested degraded service is fulfilled? We define the service probability Π as the probability that the requested degraded service is guaranteed against a specific failure scenario. To simplify the calculations without loss of generality, we assume in this subsection that every link or DC in the optical metro network fails with equal probability.
Since F-MPMD, F-MPSD, and DPP are all designed to guarantee degraded service against a single-link failure, their service probabilities in this scenario are 100%, i.e., Π^{L1}_{F-MPMD} = Π^{L1}_{F-MPSD} = Π^{L1}_{DPP} = 100%. Moreover, in contrast to F-MPSD and DPP, F-MPMD also provides protection against a single-DC failure; hence, Π^{D1}_{F-MPMD} = 100%. For the other failure scenarios, we use E_θ to denote the set of physical links used by content request θ and E^ρ_θ to denote the set of physical links on each path ρ, ρ ∈ {0, ..., K − 1}, and derive the service-probability formulas as follows.
Here, for example, for D2 (i.e., a double-DC failure), we compute the number of combinations (without repetition) that disrupt the requested degraded service (the numerator of the second term in (14), namely C(|D0|, 2), where D0 is the set of serving DCs), the total number of combinations (the denominator of the second term in (14), namely C(|D|, 2), where D is the set of content-hosting DCs), and hence the service probability against this failure scenario, Π^{D2} = 1 − C(|D0|, 2)/C(|D|, 2). In equations (11) to (17), we derive the formulas to compute the service probability for each failure scenario and each protection scheme. We will use these equations to show that, among the three schemes, F-MPMD has a better service probability in most failure scenarios.
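For a survivably provisioned F-MPMD request, a double-DC failure is disruptive only if both failed DCs are serving DCs (a single-DC failure is already covered), so the probability can be evaluated directly with binomial coefficients. A sketch under the equiprobable-failure assumption (the function name is ours):

```python
from math import comb

def service_prob_double_dc(num_serving: int, num_hosting: int) -> float:
    """Pi^{D2} for F-MPMD: 1 - C(|D0|, 2) / C(|D|, 2).
    The request is disrupted only when both failed DCs belong to the
    set D0 of serving DCs, out of the |D| content-hosting DCs."""
    return 1.0 - comb(num_serving, 2) / comb(num_hosting, 2)

# e.g., 2 serving DCs out of 4 content-hosting DCs:
# 1 - C(2,2)/C(4,2) = 1 - 1/6
```

The tabulated values in Section V are averages of this quantity over all admitted requests, whose |D0| varies per request.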

A. Physical Network and Simulation Setup
1) Physical Network:
To evaluate our proposed solutions, we use the Tokyo23 metro network, covering an urban area up to tens of kilometers in diameter, as in Fig. 3 [30]. The Tokyo23 metro network was designed using regional characteristics such as population distribution, locations of local government offices, and railway lines weighted by the number of passengers boarding and alighting at each station. It consists of 43 bidirectional links (i.e., 86 unidirectional 100-Gb/s links) and 23 nodes, with one node located at each ward office building in the Tokyo metropolitan area. We also validated our proposed algorithms on the Milan52 metro network, a Telecom Italia metro-regional reference network with 52 nodes and 101 bidirectional links [31], [32]. Since the results obtained with the Milan52 metro network are in line with those obtained with the Tokyo23 metro network, below we report only the Tokyo23 results, to avoid unnecessarily repeating the same general findings.
2) Experiment Setup: We use the simulation framework in Fig. 4 to simulate the network with 105000, 205000, 305000, 405000, and 505000 content requests whose arrivals follow a discrete Poisson process. We first generate all content requests, enqueue them in a time-priority queue, and start with an empty network (i.e., a network with no active traffic). During the simulation, each content request is dequeued and provisioned using one of the proposed algorithms (MPMD-ILP, K-MPMD, or F-MPMD). If a content request is admitted, the required network resources are reserved for it, and a departure event, with departure time equal to the request's arrival time plus its holding time, is enqueued. If the dequeued event is a departure, the reserved network resources are released. Note that the simulator processes one event at a time (either an arrival or a departure) and proceeds to the next event in the queue as soon as it finishes the current one. We found that the network requires approximately 5000 content requests to reach a steady state, and the numerical results obtained for 105000 and 505000 content requests are comparable. To reduce experiment time, below we report the numerical results for 105000 content requests; the first 5000 requests are discarded, and the acceptance ratio is the ratio of the number of admitted requests to the number of simulated requests (i.e., 100000 in our simulations).
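The event loop just described can be sketched with a single time-ordered heap holding both arrivals and departures. The names (simulate, try_provision, release) are ours; resource bookkeeping and the provisioning algorithm are abstracted behind callbacks, and the warm-up discard of the first 5000 requests is omitted:

```python
import heapq
import itertools

def simulate(arrivals, try_provision, release):
    """arrivals: iterable of (arrival_time, holding_time, request).
    try_provision(request) returns an allocation or None (blocked);
    release(allocation) frees the reserved resources.
    Returns the acceptance ratio eta."""
    tie = itertools.count()  # tie-breaker so the heap never compares payloads
    events = [(t, next(tie), "arrival", (hold, req)) for t, hold, req in arrivals]
    heapq.heapify(events)
    admitted = total = 0
    while events:
        t, _, kind, payload = heapq.heappop(events)
        if kind == "arrival":
            hold, req = payload
            total += 1
            alloc = try_provision(req)
            if alloc is not None:
                admitted += 1
                # Departure time = arrival time + holding time.
                heapq.heappush(events, (t + hold, next(tie), "departure", alloc))
        else:
            release(payload)
    return admitted / total
```

A toy usage: a single link of capacity 2 units, unit-rate requests, so a third simultaneous request is blocked until earlier ones depart.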
In a dynamic network environment, the acceptance ratio of incoming requests as a function of the arrival rate is crucial. In an ideal network with abundant resources (e.g., high-capacity links, fast computing capability in DCs/nodes, and contents replicated in multiple locations), the acceptance ratio η is 100%. In practice, however, network resources are scarce; when content requests arrive at a very high rate, the network becomes congested and some incoming requests are dropped (i.e., η < 100%). In this study, we define the congestion point of a dynamic network as the arrival rate at which the network starts to drop incoming requests, and denote it by η0. To run a network harder with the same input setting, a service-provisioning scheme with a higher η0 is desirable; in a congested network, a scheme with a higher η is preferred since it can admit more requests. In the next sections, we use η0 and η to evaluate our proposed service-provisioning schemes (MPMD-ILP, K-MPMD, and F-MPMD) and compare their performance to that of the reference schemes (F-MPSD, DPP). Also, in Sections V-B, V-C, and V-D1, we consider only content requests that are survivably provisioned (i.e., K >= 2).

B. MPMD-ILP vs. K-MPMD vs. F-MPMD
Let us first compare the performance of MPMD-ILP, K-MPMD, and F-MPMD. We consider a service provider with a catalog of 10000 contents whose size h ranges from 5 GB (e.g., a medium-length HD video) to 1000 GB (e.g., a long VR/AR video). We assume that, for each content request, the desired content is replicated in multiple DCs, i.e., D = {1, 7, 13, 16}. Note that contents in DCs are dynamic, stateful, and require frequent updates (e.g., a content is replicated and synchronized in an edge DC according to its popularity, following a Zipf distribution [2]); however, the synchronization of contents among DCs is not considered in this study. The requested bandwidth b is uniformly selected from discrete values ranging from 200 Mbps (e.g., a stream for a VR head-mounted display worn by the user) to 2000 Mbps (e.g., an uncompressed VR/AR flow). The required level of degraded service for each request, m, is randomly selected from the discrete values m ∈ {0.5, 0.7, 1.0}, where m = 1.0 denotes full protection in case a path is disrupted. The requesting node n is selected, following the population distribution, from the nodes not hosting the desired content. For the DDC, we consider the propagation delay on fibers as the dominant delay in an optical network and set its practical value to 2 ms (e.g., the 6DoF VR immersive-experience use case) [2], [8], [17].
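A request generator for this setting can be sketched as below; the paper gives only the ranges and the set of m values, so the discrete grids (200-Mbps steps for b and the particular content sizes) are our assumptions for illustration:

```python
import random

def sample_request(rng: random.Random) -> dict:
    """One content request drawn from the distributions described above.
    The discrete grids for b and h are assumed; only the ranges and the
    m values come from the experiment setup."""
    b = rng.choice(range(200, 2001, 200))        # requested bandwidth (Mbps)
    m = rng.choice([0.5, 0.7, 1.0])              # degraded-service level
    h = rng.choice([5, 10, 50, 100, 500, 1000])  # content size (GB), assumed grid
    return {"bandwidth_mbps": b, "m": m, "size_gb": h}
```
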
In Fig. 5, we report the acceptance ratios of MPMD-ILP, K-MPMD, and F-MPMD at different request arrival rates. As expected, MPMD-ILP outperforms K-MPMD and F-MPMD. While K-MPMD and F-MPMD start dropping incoming requests around arrival rates of 77 and 98 requests/minute, respectively, MPMD-ILP can run the same network harder and starts dropping incoming requests only around 136 requests/minute. In a congested network (e.g., at arrival rates above 136 requests/minute), compared to K-MPMD and F-MPMD, MPMD-ILP accepts approximately 5% and 12% more incoming requests, respectively. Even though MPMD-ILP outperforms both heuristics, in a dynamic environment F-MPMD, which outperforms K-MPMD, is more desirable since it can find a solution quickly.
In Table II, we report other relevant data for MPMD-ILP, K-MPMD, and F-MPMD. On average, the number of paths per request (K_avg) decreases from K-MPMD (~3.5 paths/request) to F-MPMD (~3.3 paths/request) to MPMD-ILP (~2.6 paths/request), because K-MPMD always finds the maximum number of paths from the dummy node to the requesting node and allocates the requested bandwidth equally across them, whereas MPMD-ILP finds just enough paths and allocates more bandwidth to each path to fulfill a request. Also, since F-MPMD first finds the maximum number of paths from the dummy node to the requesting node and then drops invalid paths (i.e., those whose bandwidth or propagation delay is insufficient for the request), its average number of paths per request is lower than that of K-MPMD. On average, the number of paths per request increases by 32% and 40% from MPMD-ILP to F-MPMD and K-MPMD, respectively.
Another notable observation is that, since the path-decomposition loops in Algorithms 1 and 2 provide no guarantee of finding the paths from each DC in D0 to the requesting node with minimal total propagation delay, the average path propagation delay of MPMD-ILP is lower than that of K-MPMD and F-MPMD. On average, the path propagation delay (ξ_avg) of MPMD-ILP is about 10 μs (or 18%) lower than that of K-MPMD and F-MPMD. As a result, even though MPMD-ILP reserves more bandwidth per request per path (on average, 11.5% more), it uses less total network bandwidth: on average per request, MPMD-ILP uses about 17.4% less network bandwidth than K-MPMD and F-MPMD.
In Table III, we report the average execution time per arriving request (ms) for MPMD-ILP, K-MPMD, and F-MPMD. Our heuristics obtain a solution in a timely manner, considering also that a few milliseconds are initially reserved for each content request to compute a solution and buffer data. The ILP, instead, is not suitable for dynamic content requests but provides a reliable benchmark for the heuristics.

C. F-MPMD vs. F-MPSD vs. DPP
In this subsection, we compare MPMD (using F-MPMD) with two reference protection schemes, namely F-MPSD and DPP. We use the same simulation settings as in Section V-B; since the congestion point of DPP is lower than those of F-MPMD and F-MPSD, we also report numerical data for arrival rates as low as 40 requests/minute.
1) Acceptance Ratio and Latency: In Fig. 6, we report the acceptance ratios of F-MPMD, F-MPSD, and DPP at different request arrival rates. The numerical data show that F-MPMD outperforms F-MPSD and DPP, as it can run the network harder with a higher congestion point. In detail, the congestion points of F-MPMD, F-MPSD, and DPP are 102, 88, and 40 requests/minute, respectively. In a congested network, F-MPMD accepts approximately 4% and 15% more incoming requests than F-MPSD and DPP, respectively.
In Table IV, we report the relevant data for F-MPMD, F-MPSD, and DPP. Since both F-MPMD and F-MPSD rely on multipath routing, they use significantly less bandwidth: compared to DPP, F-MPMD uses on average about 39.5% less bandwidth per request. Among the three service-provisioning schemes, F-MPMD uses the least network bandwidth per request and, compared to F-MPSD, saves up to 30%. Moreover, the average path propagation delay of F-MPMD (ξ_avg) is approximately 20 μs lower than that of F-MPSD, making F-MPMD more suitable for emerging services with stringent latency constraints. Lastly, since DPP uses only two paths (i.e., the first and second shortest paths from the closest DC to the requesting node), its average propagation delay per path is the lowest among the three service-provisioning schemes.
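The bandwidth advantage of multipath provisioning over DPP can be seen with a stylized per-request calculation: DPP reserves the full rate b on both a working and a backup path, while K-path provisioning with degraded level m reserves K*m*b/(K-1) in total. This ignores path-length differences, which also affect the network-wide totals in Table IV; the functions are ours:

```python
def total_reserved_multipath(b: float, m: float, k: int) -> float:
    """Total bandwidth reserved across k link-disjoint paths when any
    k - 1 surviving paths must deliver the degraded rate m*b."""
    return k * m * b / (k - 1)

def total_reserved_dpp(b: float) -> float:
    """DPP reserves the full rate on both the working and backup paths."""
    return 2 * b

# e.g., k = 3, m = 0.5: multipath reserves 0.75b in total vs. 2b for DPP
```
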
2) Service Probability: We now compare the service probability of F-MPMD to those of F-MPSD and DPP. Here, we set the request arrival rate to 40 requests/minute (i.e., η = 100% for F-MPMD, F-MPSD, and DPP, as in Fig. 6) and compute the average service probability per request for each failure scenario using the formulas in Section IV-E. We report the results in Table V. As expected, F-MPMD, F-MPSD, and DPP all guarantee degraded service against a random single-link failure on the physical network (i.e., Π^{L1}_{F-MPMD} = Π^{L1}_{F-MPSD} = Π^{L1}_{DPP} = 100%). Moreover, since F-MPMD offers protection against a single-DC failure, its service probability in this scenario is 100% (Π^{D1}_{F-MPMD} = 100%), while F-MPSD and DPP provide no such guarantee (Π^{D1}_{F-MPSD} = Π^{D1}_{DPP} = 71.3%). We also see that, since DPP uses only two paths, the number of physical links it uses per request is smaller than the number used by F-MPMD and F-MPSD; in other words, the number of double-link failure combinations disrupting DPP is lower than the number of double-link failure combinations disrupting F-MPMD and F-MPSD.

D. Sensitivity Analysis

1) Number of Content Replicas: As shown in Fig. 7, increasing the number of content replicas from NR = 2 to NR = 3 significantly improves the acceptance ratio of incoming requests. In detail, the congestion point increases from 44 to 65 requests/minute, and in a congested network F-MPMD accepts around 25% more incoming requests with NR = 3 than with NR = 2, while the acceptance ratio improves more slowly when the number of replicas is further increased to 4, 5, and 6. In general, adding more content replicas increases the acceptance ratio, but it also implies more synchronization overhead. Only a content provider knows which option is cost-optimal for this trade-off; therefore, the decision on the number of content replicas in a specific network may vary from content provider to content provider.
2) Survivability (Surv) vs. Non-Survivability (Non-Surv): Finally, we show how many more content requests F-MPMD can admit when their survivability is not guaranteed. We report results for three distinct levels of degraded service (m = 0.5, 0.7, and 1.0).
As shown in Fig. 8, the dotted lines denote acceptance ratios when admitted requests must be survivable in case a path is disrupted (i.e., S = True in Algorithm 2), while the solid lines represent acceptance ratios when requests may be non-survivable. Considering the dotted lines, lowering the level of degraded service from 1.0 (i.e., full protection) to 0.7 and to 0.5 increases the congestion point from 87 to 100 and to 107 requests/minute, respectively. The solid lines show how many more content requests F-MPMD can admit when it provides no guarantee of survivability against a single-link failure: by admitting requests without ensuring survivability, F-MPMD can run the same network much harder and only starts dropping incoming requests at arrival rates of 131, 140, and 152 requests/minute for m = 0.5, m = 0.7, and m = 1.0, respectively. In a highly congested network (e.g., an arrival rate of 180 requests/minute), F-MPMD without the survivability guarantee admits approximately 10% more incoming requests than F-MPMD with the guarantee against a single-link failure.

VI. CONCLUSION
We proposed a reliable service-provisioning scheme that inversely multiplexes a dynamic content request over multiple link-disjoint paths from multiple data centers using manycast. We developed an integer linear program and two scalable heuristics for the proposed scheme and used them to solve various complex network instances. Numerical data show that, compared to conventional service-provisioning schemes such as multipath routing from a single data center and dedicated-path protection, our proposed scheme efficiently utilizes network resources, admits more requests, improves reliability, and reduces latency; hence, it is very suitable for emerging content-based services.

Fig. 2. An auxiliary graph with a dummy node (0) and dummy links (dotted lines from node 0 to the DCs hosting the desired content).

Fig. 6. Acceptance ratio of F-MPMD, F-MPSD, and DPP as a function of the request arrival rate.

Fig. 7. Acceptance ratio of F-MPMD as a function of the number of content replicas (NR) at different request arrival rates.

As a result, among the three protection schemes, DPP has the highest service probability against a double-link failure (Π^{L2}_{DPP} = 99.9%). Similarly, compared to F-MPSD, F-MPMD tends to use shorter paths (see ξ_avg in Table IV); hence, it has a higher service probability against a double-link failure (Π^{L2}_{F-MPMD} = 99.4% versus Π^{L2}_{F-MPSD} = 98.9%). In the scenario with two simultaneous failures in two distinct DCs, F-MPMD outperforms F-MPSD and DPP in terms of service probability (Π^{D2}_{F-MPMD} = 60.1% versus Π^{D2}_{F-MPSD} = Π^{D2}_{DPP} = 54.2%) because F-MPMD uses multiple DCs while F-MPSD and DPP use only one DC per request. For the last scenario, where one link and one DC fail simultaneously (L1 + D1), with the same total number of links (|E_θ| in Eqns. (16) and (17)), the number of combinations disrupting F-MPMD is |D0| − 1 times the number of combinations disrupting F-MPSD and DPP. Hence, compared to F-MPSD and DPP, F-MPMD has the lowest service probability in this scenario (Π^{L1+D1}_{F-MPMD} = 94.7%, Π^{L1+D1}_{F-MPSD} = 96.5%, and Π^{L1+D1}_{DPP} = 98.6%).

Fig. 8. Acceptance ratio of F-MPMD as a function of the request arrival rate at three different levels of degraded service.
• D: DCs hosting the desired content, D ⊂ V_t, |D| >= 2.
• Λ_t: hash table representing the capacity of physical links, where each key-value pair (i, j): λ^{i,j}_t, ∀(i, j) ∈ E_t, is link (i, j)'s available capacity at request arrival time (Mbps).
• Ξ: hash table representing the propagation delay of physical links, with each key-value pair (i, j): ξ^{i,j}, ∀(i, j) ∈ E_t, being the propagation delay of link (i, j) (μs).
• Ω: maximum differential delay between paths (μs).
Variables:
• w_{d,n}: a binary variable; w_{d,n} = 1 if requesting node n uses DC d, d ∈ D, and 0 otherwise.
• x_{d,n}: an integer variable denoting the bandwidth reserved on the path from DC d, d ∈ D, to requesting node n.
• y^{i,j}_{d,n}: an integer variable denoting the mapping of the path from DC d, d ∈ D, to requesting node n on physical link (i, j); y^{i,j}_{d,n} = x_{d,n} if the path is mapped on link (i, j) and x_{d,n} > 0, and 0 otherwise.
• z^{i,j}_{d,n}: a binary variable; z^{i,j}_{d,n} = 1 if the path from DC d, d ∈ D, to requesting node n is mapped on physical link (i, j), and 0 otherwise.
In our MPMD-ILP, the numbers of variables and constraints for each request are upper bounded by 2|D|(1 + |E_t|) and 2(1 + |D| + |E_t|) + |D|(|V_t| + |E_t|) + |D|^2, respectively. Here, we use |•| to denote the cardinality of a set.