Placement of High Availability Geo-Distributed Data Centers in Emerging Economies

The data center markets in emerging economies are being built out at a furious pace. When high availability is required, as it always is in the modern digital economy, the placement of geo-distributed data centers may be influenced by factors such as technician shortages and under-developed infrastructure, both of which are typical in emerging economies. Although data center availability in general has been well studied, it remains unclear how rapid and unbalanced economic development in emerging economies may affect the availability of geo-distributed data centers and their cost of ownership. In this paper, we incorporate the unbalanced availability of infrastructure and technicians into the data center placement problem. The problem is first formulated as a mixed integer nonlinear program (MINLP). To solve this potentially large-scale problem, we transform it into a quadratically constrained quadratic program (QCQP) capable of handling heterogeneous workloads. The resulting problem can then be efficiently solved by off-the-shelf optimization toolboxes. With real-life data from China, we show how unbalanced infrastructure development and technician shortages may affect the placement of data centers, and analyze the tradeoff between cost and availability. Our results indicate that technician shortages and unbalanced network infrastructure lead to increased cost and distinct data center placement strategies.

in those countries. According to [2], the ten Emerging Market countries are forecast to grow significantly, at an overall rate of 52.9% over the 2021-2025 period, equivalent to a CAGR of over 13.2%.
Since data center facilities are inevitably subject to various types of risks, high availability geo-distributed data centers are becoming increasingly important in the modern digital economy. These risks include natural calamities such as earthquakes, hurricanes and floods. Risks can also be caused by human factors, such as unintentional misconfiguration or cyber-attacks. A brief disruption in a cloud service may deny service to millions of users and put thousands of businesses on hold, resulting in huge economic losses [3], [4]. It is reported in [6] that temporarily shutting down a top U.S. cloud computing provider could trigger as much as $19 billion in business losses. The need for high availability of cloud data centers thus goes hand in hand with their rapid growth. Furthermore, the diversity of Internet services puts forward new requirements for availability. Different business units have different requirements that correspond to different service levels, generally expressed in terms of availability, QoS, performance, and MTTR (mean time to repair) [7]. Heterogeneous Internet services with diverse SLAs therefore bring new challenges to DC placement. The uneven regional distribution of infrastructure development and technical talent, common in emerging economies, may also have significant impacts on how data centers should be placed. One factor is the unbalanced development of digital infrastructure, such as power supply and telecommunication infrastructure. According to a World Bank report, unreliable and fluctuating power supply can be a major obstacle to developing the digital economy in under-developed regions [5]. In terms of communication infrastructure, networks tend to be deployed in high-revenue, densely populated areas, while some remote areas have no coverage at all. Geographically unbalanced Internet penetration leads to regional differences in network link availability.
Another factor is the unbalanced flow of technical talent between developed and under-developed areas [8], [9]. The lack of technical personnel leads to longer repair times when unexpected outages happen. When data centers are placed, these two factors partly negate the cost advantage of under-developed areas and thus make the planning problem more difficult.
Existing studies plan highly available data centers with the objective of minimizing disaster risk [14] or loss [15]. In this paper, we aim to tackle the placement problem of high availability geo-distributed data centers in the context of emerging economies, where the availability of infrastructure and technical talent may vary significantly across areas. First, we characterize a data center availability model influenced by infrastructure construction and technical talent, and propose a two-site-topology data center planning framework capable of capturing the distinct factors prominent in these countries. Second, we reformulate the model into a structured quadratically constrained quadratic program (QCQP). Finally, we apply real-life data from China to demonstrate what the model can achieve.
The key contributions of this paper include the following:
• For the first time, we incorporate the unbalanced availability of infrastructure and technical personnel into the data center placement problem, and show that both can significantly impact the data center cost of ownership and distribution in emerging economies.
• We first formulate the problem as an MINLP. We then simplify the problem by transforming it into a QCQP and decomposing it by workload type. With this method, the problem can be efficiently solved by off-the-shelf optimization toolboxes such as Gurobi, and globally optimal placement results can be obtained.
The rest of the paper is organized as follows. In Section II, we discuss the related work. In Section III, the system model and assumptions are presented. We formulate the MINLP problem in Section IV. In Section V, we reformulate and decompose the QCQP to simplify the model. In Section VI, we introduce our data sources and instantiate the problem using real-life data from China, followed by numerical results and discussions in Section VII. Finally, we conclude our work in Section VIII.

II. RELATED WORK
It is not cost-effective to harden individual data centers to meet typical availability targets [10], [11]. A Tier-4 DC that provides 99.99% availability is twice as expensive as a Tier-2 DC that provides 99.75% availability. Thus, geo-distributed spare capacity is required to achieve higher availability [12]. For example, Amazon S3 redundantly stores objects across multiple zones [17]. In [13], the authors introduced the architecture of the disaster tolerant data center (DTDC) and showed the availability improvement achieved by the system. Existing related work can be divided into two categories: the first focuses on service availability methods for planning highly available geo-distributed data centers, while the second addresses data center placement strategies.
Service availability: In planning highly available geo-distributed data centers, service availability is the primary concern. In [14], the authors defined an availability model that considers possible disasters, and proposed a disaster-aware data center and content placement framework with the objective of minimizing risk. Dikbiyik et al. [15] focused on risk in optical networks and developed a probabilistic risk model to analyze the loss caused by possible disasters. They also provided a mathematical model and solution to reduce the risk and decrease the loss in case of disasters. Narayanan et al. [12] noted the requirement of geo-distributed spare capacity for high availability and presented an optimization framework jointly considering application availability and latency. Chang et al. [16] defined service availability from the perspective of the labor force available for network maintenance. They adapted Amdahl's law to describe human teamwork and proposed solutions for optimally scheduling labor to satisfy service availability requirements. Shaukat et al. [19] concentrated on both high availability and high energy efficiency; they proposed an Energy-Aware Fault-Tolerant (EAFT) approach and studied the tradeoff between fault tolerance and energy efficiency.
Placement strategy: Data center selection and placement has been addressed in many studies. In [20], the authors formulated the DC placement problem as a multi-objective optimization problem, and used enhanced lexicographic optimization and tabu search to solve it. The authors in [21] studied how renewable energy can be used to provision green data centers and developed a heuristic-based optimization approach. In [10], the authors investigated the major cost components in building and operating data centers, and showed that wise placement of data centers may save millions of dollars; they solved the problem by reformulating it as a linear program (LP), and also provided heuristic approaches. In [23], a dynamic VM placement method focused on minimizing energy and carbon cost was proposed. Their algorithm first selects the data center with the minimum added cost, considering green energy and dynamic PUE; server selection then follows, based on the least increase in server power consumption. Mustafa et al. [18] also presented consolidation techniques to reduce energy consumption along with the resultant SLA violations. Departing from the traditional best-fit decreasing (BFD) algorithm, two new techniques are proposed: Maximum Capacity BFD and Minimum Power BFD. The proposed algorithms outperform others by selecting the best servers with regard to CPU capacity and energy consumption. In [35], a multi-resource based approach was proposed to improve application performance by considering CPU utilization and execution time. The VM placement problem is formulated as a mixed-integer linear program and shown to be effective in achieving load balancing objectives.
Different from previous works, we study the high availability DC placement problem in the context of emerging economies, where economic development across areas is often unbalanced. To the best of our knowledge, this is the first time such factors are considered in this problem. Our study also differs from existing ones in that we transform the MINLP, which may have millions of variables and is hard to solve, into a structured QCQP that can be efficiently solved by methods from previous work [22] or by optimization toolboxes such as Gurobi and CPLEX.

III. SYSTEM MODEL
Fig. 1 shows the architecture of high availability geo-distributed data centers. The architecture comprises two geographically distributed sets of data centers, namely primary DCs and secondary DCs. Requests from different regions are collected through POPs (points of presence) and forwarded to the corresponding primary DC for service. When a primary DC fails, the secondary DC continues to provide service. It is worth noting that in the cloud data center network, some data center nodes play dual roles as primary DC and secondary DC. Let I = {i_1, i_2, . . ., i_m} denote the candidate sites for deploying data centers in distinct geographical regions and J = {j_1, j_2, . . ., j_n} denote the set of user regions. All symbols are summarized in Table I.

A. Service Availability Model
Service availability can be defined as the ratio of the time period during which a repairable object flawlessly operates to its lifetime [16].
Upon the failure of a component, it is repaired and restored to be "as good as new". The status of a network component can be roughly classified as workable or unworkable. The average workable period is referred to as the mean time to failure (MTTF), and the average unworkable period is referred to as the mean time to repair (MTTR). Thus, the availability of a network component can be obtained as

A = MTTF / (MTTF + MTTR). (1)

1) Technical Talent Availability: The availability of technical talent is closely related to MTTR and has a significant impact on service availability. A repair job is assumed to be divided and dispatched to several working groups, so MTTR can be evaluated based on the number of working groups. The relation between MTTR and repair resources is thus simplified by assuming that MTTR is inversely proportional to the number of technical labor units. Let MTTR_k denote the MTTR of an object with k technical labor units allocated; adopting the method in [16], it is defined by MTTR_k = MTTR / k. The availability of a network component with k technical labor units can then be obtained as

A_k = MTTF / (MTTF + MTTR / k). (2)

Service availability is affected by the supply of technical talent. Assume that the availability of a network component with s technical labor units is A_s. When the supply of technical talent changes to t, the availability A_t can be calculated from A_s. Because A_s and A_t follow the same equation (2), with s and t respectively, we can derive the relationship between them:

A_t = (1 - a) A_s / (1 - a A_s). (3)

In (3), a = 1 - t/s represents the technical talent shortage ratio (TTSR) relative to A_s. Due to the huge differences in regional economic development in emerging economies, the TTSR varies across places. We can then obtain the availability A_i corresponding to the local TTSR a_i by (3). Note that other models can be used as well to reflect non-linear relations between labor units and time to repair. For example, the model in (2) may be generalized to MTTR_k = MTTR / k^n, where n is an exponential factor reflecting increasing (n > 1) or decreasing (n < 1) productivity. When this model is changed, the TTSR should be modified accordingly; in this case, a = 1 - (t/s)^n.
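As an illustration, the availability model above (A = MTTF / (MTTF + MTTR), with MTTR_k = MTTR / k^n for k labor units) can be sketched in a few lines of Python. This is a minimal sketch; the function names are ours, not the paper's.

```python
def availability(mttf, mttr, k=1.0, n=1.0):
    """Availability of a component with k technical labor units.

    Uses the staffing assumption MTTR_k = MTTR / k**n; n = 1 recovers the
    linear model, while n > 1 (n < 1) models increasing (decreasing)
    productivity of additional labor units.
    """
    return mttf / (mttf + mttr / k ** n)


def availability_under_shortage(mttf, mttr, ttsr, s=1.0, n=1.0):
    """Availability when the labor supply shrinks from s units to
    t = s * (1 - a), where a is the technical talent shortage ratio."""
    t = s * (1.0 - ttsr)
    return availability(mttf, mttr, k=t, n=n)
```

For instance, with MTTF = 1000 h, MTTR = 10 h and a 50% shortage (a = 0.5), availability drops from about 0.9901 to about 0.9804; the closed-form relation A_t = (1 - a) A_s / (1 - a A_s) reproduces the same value.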
2) Network Link Availability: A connectivity fault of a network link manifests as its inability to transmit messages [24], which can be described by topology reliability. According to [25], the topology reliability of a network link is mainly determined by the availabilities of the elements in the link. The network link is available only when all associated elements operate normally, so network link availability is the product of the availability values of the components along the link. Noticing that unavailability is expected to be very low (U ≤ 10^-3), we adopt the network link availability model

A_L = ∏_e A_e ≈ 1 - Σ_e U_e, (4)

where A_e and U_e = 1 - A_e are the availability and unavailability of each element e along the link. Component failure parameters can usually be obtained from network operators. In particular, network link availability is distance related and can be derived from measured fiber-cut statistics [26]. Given the uneven development of network infrastructure in emerging economies, network link availability varies across regions. Let A_{Lij,s} denote the availability of the physical network link L_ij; its unavailability fluctuates with the level of network infrastructure development.
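The product form of link availability and the small-unavailability approximation can be sketched as follows (hypothetical helper names; a sketch, not the paper's code):

```python
def link_availability(element_availabilities):
    """A link is up only when every element along it is up, so the link
    availability is the product of the element availabilities."""
    a = 1.0
    for a_e in element_availabilities:
        a *= a_e
    return a


def link_unavailability_approx(element_unavailabilities):
    """First-order approximation: for small per-element unavailability
    (U_e <= 1e-3), 1 - prod(1 - U_e) is approximately sum(U_e)."""
    return sum(element_unavailabilities)
```

For three elements with availabilities 0.999, 0.9995 and 0.9998, the exact link unavailability is about 0.0016992, while the approximation gives 0.0017, illustrating why the additive model is adequate in this regime.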
Considering both the physical link availability and the technical personnel factor, the network link availability is calculated by applying (3) to the physical link availability with the local TTSR a_i:

A_{Lij} = (1 - a_i) A_{Lij,s} / (1 - a_i A_{Lij,s}). (5)

3) Data Center Availability: The availability of each data center depends on the level of redundancy in its design. According to [27], data centers are commonly classified into tiers, where each tier implies a different redundancy level and expected availability. For example, Tier I data centers have a single path for power and cooling distribution and reach an availability of 99.67%. On the other hand, Tier IV data centers achieve 99.995% with two active power and cooling distribution paths. When building data centers in emerging economies, the local power supply should also be taken into account, because data centers consume a tremendous amount of electricity every day, which poses huge challenges to the power infrastructure. Due to the differing development of power infrastructure, the availability of power supply varies from location to location. We denote by A_D the designed availability of a data center of a given tier and by A_{Pi} the availability of the local power supply. The data center availability is also influenced by technical talent under the assumption in (3), and can be obtained by

A_{Di} = (1 - a_i) A_D A_{Pi} / (1 - a_i A_D A_{Pi}). (6)

4) Service Availability: The availability of data center i providing service for user region j can be guaranteed by ensuring that both data center i and the network link L_ij are available. Such single-site availability is defined by

A_ij = A_{Di} A_{Lij}. (7)

The availability of a service can be guaranteed by ensuring that at least one of its data centers is available. Therefore, the service availability is the joint single-site availability provided by data centers i and i':

A^s_{jii'} = 1 - (1 - A_ij)(1 - A_i'j). (8)
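The single-site and two-site service availabilities described above can be combined as follows (a sketch assuming independent site failures, consistent with the two-site model in the text):

```python
def single_site_availability(a_dc, a_link):
    """A site can serve region j only if both the data center and the
    network link L_ij are available."""
    return a_dc * a_link


def service_availability(a_primary, a_secondary):
    """Service is available if at least one of the primary/secondary
    sites is available: 1 - (1 - A_primary) * (1 - A_secondary)."""
    return 1.0 - (1.0 - a_primary) * (1.0 - a_secondary)
```

Two sites each offering 0.99 availability yield a joint availability of 0.9999, which illustrates why a two-site topology is far cheaper than hardening a single site to the same target.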

B. Workloads Heterogeneity
Workloads are the basic resource allocation unit in our model. According to [28], [29], Internet services can be classified into five categories: search queries, social networking, e-commerce, media streaming and others. These diverse Internet applications require different mixtures of computing and networking resources [30], and they also have different service availability demands. Therefore, to perform effective planning, we divide the applications into multiple types served by heterogeneous workloads, each with a representative mixture of required resources and a specific service availability requirement. We assume that workloads operate at their maximum capacity and use all allocated resources. Since data centers provide diversified network services, each workload type occupies a different combination of computing and bandwidth resources, and the heterogeneous workloads must meet various availability requirements.
Let W = {w_1, w_2, . . ., w_q} be the set of q heterogeneous workload types that serve the q different types of applications. Let A_w denote the service availability requirement of type w workloads. Let V_w be the number of type w workloads one server can carry; it represents the computing resource required for type w workloads. Also let b_w be the bandwidth requirement of a type w workload. The number of type w workloads needed by user region j can be obtained by

N^w_j = T r_w d^w_j, (9)

where T is the total number of workloads the data centers have to serve, r_w denotes the proportion of each workload type, and d^w_j represents the percentage of type w workloads served in user region j; apparently, Σ_w r_w = 1 and Σ_j d^w_j = 1.
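The per-region demand N^w_j = T · r_w · d^w_j described above, together with its share sanity checks, can be sketched as (hypothetical helper names):

```python
def region_demand(total, type_share, region_share):
    """N_j^w = T * r_w * d_j^w: number of type w workloads needed by
    user region j."""
    return total * type_share * region_share


def check_shares(type_shares, region_shares_by_type, tol=1e-9):
    """Verify sum_w r_w = 1 and, for each type w, sum_j d_j^w = 1."""
    assert abs(sum(type_shares) - 1.0) < tol
    for region_shares in region_shares_by_type:
        assert abs(sum(region_shares) - 1.0) < tol
```

For example, with T = 1,000,000 total workloads, a type carrying 20% of the mix and a region receiving 10% of that type needs 20,000 workloads.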

C. Data Center Network
Data center traffic can be broadly categorized into three types: i) traffic that flows inside a data center (intra-DC traffic); ii) traffic that flows from one data center to another (DC-to-DC, or D2D, traffic); iii) traffic that flows from a data center to end users (DC-to-user, or D2U, traffic). As in [31], we ignore intra-DC traffic, as it has no impact on the planning problem, and incorporate the D2U and D2D networks in our planning framework. D2U traffic is generated by end users when they request services such as web surfing, online search, social networking, e-commerce and the like. These applications are typically sensitive to response time, which in turn may impact revenue [32]. It is thus important to deploy data centers in close proximity to users and to allocate sufficient bandwidth to keep response time low, and therefore to model the network latency between users and potential data centers. Generally, the latency on a network can be estimated as

l_ij = l^0_ij / (1 - U_ij), (10)

where l^0_ij is the latency between data center i and user region j when the network is idle, and U_ij is the fraction (between 0 and 1) of the network throughput being utilized [33].
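The latency model (10) is a one-liner; a small sketch makes its behavior near saturation explicit:

```python
def network_latency(idle_latency, utilization):
    """l_ij = l0_ij / (1 - U_ij): latency grows without bound as the
    link utilization U_ij approaches 1."""
    if not 0.0 <= utilization < 1.0:
        raise ValueError("utilization must lie in [0, 1)")
    return idle_latency / (1.0 - utilization)
```

At 50% utilization the latency doubles relative to an idle link, which is why the planner must leave headroom when sizing D2U bandwidth.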
We assume that data centers are interconnected by dedicated WAN links and use B_ii' to represent the link bandwidth between data centers i and i'. The traffic between data centers, generated from moving data between clouds or copying content to multiple data centers for replication, is usually delay tolerant. Thus, link capacity is allocated to guarantee that the traffic A_ii' between data centers i and i' can be transmitted within a specific time h.

D. Cost Components
In this subsection, we briefly discuss the parameters that are of interest to data center planning studies in general and that will also be used in this paper; detailed values will be given in Section VI. PUE (power usage effectiveness) is an important metric to quantify a data center's power efficiency. According to [23], [36], PUE depends on the data center utilization (IT load) and the outside temperature. Therefore, the data center's PUE can be written as

PUE = f(U_t, H_t), (11)

where U_t and H_t are the data center utilization and outside temperature. Current data centers are commonly measured by the maximum power they are designed to consume. Thus, when calculating the maximum power, we use PUE_max, obtained by setting U_t to 1 and H_t to the maximum temperature. For the running power, PUE_run is calculated by setting U_t to the average utilization and H_t to the average temperature.
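As an illustration only (the fitted model of [36] is not reproduced here), an assumed affine form of PUE(U_t, H_t) shows how PUE_max and PUE_run are obtained; every coefficient below is our assumption, not a value from the paper:

```python
def pue(utilization, temperature, base=1.2, k_load=0.2, k_temp=0.005, t_ref=20.0):
    """Illustrative PUE(U_t, H_t): cooling/power overhead per unit of IT
    load shrinks as utilization rises and grows with outside temperature
    above a reference t_ref. Assumed form, not the model of [36]."""
    return base + k_load * (1.0 - utilization) + k_temp * max(0.0, temperature - t_ref)


# PUE_max: all servers on (U_t = 1) at the site's maximum temperature.
pue_max = pue(1.0, 35.0)
# PUE_run: average utilization (e.g. 0.3) at the average temperature.
pue_run = pue(0.3, 15.0)
```

Note that under this model PUE_run exceeds PUE_max at low utilization, reflecting that fixed overheads are amortized over less IT load.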
Total Cost of Ownership (TCO) of geo-distributed data centers can be split into Capital Expenses (CAPEX) and Operational Expenses (OPEX). CAPEX are the expenses incurred to acquire fixed assets, depreciated over their lifespan. The expenses of purchasing servers, C_S, and internal networking gear, C_IN, belong to CAPEX. These expenses are proportional to the actual number of hosted servers in the data centers and do not depend on location. CAPEX also includes the expenses of land purchase, C_L, and infrastructure building, C_BD1 and C_BD2. These expenses can be estimated from the maximum power consumption of a data center [10], which in turn can be calculated by assuming all servers are up and running at 100% utilization with maximal PUE. Note that C_BD1 and C_BD2 are used to characterize the economies of scale of data center infrastructure building, i.e., the larger the scale, the lower the cost per unit of power. The land price and temperature vary from site to site, so these expenses rely heavily on the location of data centers.
OPEX consists of the expenses incurred during the daily operation of data centers. The maintenance and administration costs, C_M and C_AD, depend on the actual number of hosted servers and the maximum power. The expense of electricity consumption, C_E, can be calculated based on the running power and the local electricity price. Different from the maximum power, the running power is estimated as the average power when servers run at average utilization (a common average utilization in Internet service data centers is 30% [37]) and average PUE. The networking expenses, C_DU and C_DD, depend on bandwidth and distance. The OPEX of data centers varies geographically because of the varied prices, temperatures and distances at different locations.

IV. FORMULATION OF THE PROBLEM
We define X^{w,p}_{ij} as the number of type w workloads that primary data center i provides for user region j, and X^{w,s}_{ij} as the corresponding secondary data center distribution. We assume that the traffic generated by the same type of workload is uniform and proportional to the number of workloads. Let G_w and T_w denote the total traffic volume and the total number of workloads of type w. We adopt a method similar to [34] to calculate the D2D traffic: the traffic between data centers i and i' is proportional to the product of their loads,

A_{ii'} = Σ_w (G_w / T_w^2) (Σ_j X^{w,p}_{ij}) (Σ_j X^{w,s}_{i'j}). (12)

In (12), the traffic generated in primary data center i is proportional to X^{w,p}_{ij} and is distributed to the secondary data centers according to X^{w,s}_{i'j}, so the traffic volume between data centers i and i' is proportional to the product of the two sets of loads. We assume that this traffic should be transmitted within a specific time h. According to [12], the bandwidth between DC pair (i, i') is proportional to the traffic demand. The D2D link bandwidth between data centers i and i' can then be calculated by

B_{ii'} = A_{ii'} / h. (13)

Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
Let B_ij denote the bandwidth between data center i and user region j. The bandwidth utilization can be estimated as

U_ij = (Σ_w b_w X^{w,p}_{ij}) / B_ij. (14)

When we set the maximum latency l_ij, B_ij can be obtained from (10) and (14). The expenses of carrying the two kinds of network traffic are proportional to B_ij and B_{ii'}. The power of a data center is an important parameter for calculating its expenses. As mentioned above, the maximum power P^max_i and the running power P^run_i are the two kinds of power considered. They can be calculated by

P^max_i = S_i p^{s,max} PUE^max_i, (15)
P^run_i = S_i p^{s,run} PUE^run_i, (16)

where S_i = Σ_{j,w} (X^{w,p}_{ij} + X^{w,s}_{ij}) / V_w is the number of servers hosted at site i, and p^{s,max} and p^{s,run} are a server's maximum power and average running power.
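The bandwidth sizing implied by (10) and (14), together with the deadline-based D2D sizing, can be sketched as follows (variable names are ours; a sketch, not the paper's code):

```python
def d2u_bandwidth(traffic_demand, idle_latency, max_latency):
    """Smallest B_ij meeting a latency target: from l0 / (1 - g/B) <= l_max
    it follows that B >= g / (1 - l0 / l_max), where g is the D2U traffic
    demand (the sum of b_w times the workloads placed on the link)."""
    if max_latency <= idle_latency:
        raise ValueError("max latency must exceed the idle latency")
    return traffic_demand / (1.0 - idle_latency / max_latency)


def d2d_bandwidth(traffic_volume, deadline_h):
    """D2D links are sized so that the replication traffic A_ii' completes
    within the deadline h, i.e. B_ii' = A_ii' / h."""
    return traffic_volume / deadline_h
```

For example, a demand of 100 units with a 10 ms idle latency and a 20 ms target needs 200 units of bandwidth, since the link may run at most at 50% utilization.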
The DC placement problem can be formulated as an MINLP. The objective is to minimize the total cost, jointly considering the main cost components of data centers and the computing and bandwidth demands of the different workload types:

min C_S + C_IN + C_L + C_BD1 + C_BD2 + C_M + C_AD + C_E + C_DU + C_DD. (17)

The problem is subject to the following types of constraints.
Workload constraints: For each user region j, the type w workload must be served by both the primary DCs and the secondary DCs:

Σ_i X^{w,p}_{ij} = N^w_j, Σ_i X^{w,s}_{ij} = N^w_j, ∀j ∈ J, w ∈ W. (18), (19)

Capacity constraints: For each candidate site, the size of the data center and its computing resources are limited. (20)
Bandwidth constraints: The bandwidth capacities of the D2U and D2D networks are limited. (21), (22)
Latency constraints: The latency between data centers and user regions must not exceed the maximum latency requirement. (23)
Backup constraints: For the type w workload needed by user region j, the primary DC and the secondary DC must not be placed at the same site. We introduce two sets of binary variables, K^{w,p}_{ij} and K^{w,s}_{ij}, each equal to 1 if the type w workload of user region j is served at data center i as primary or secondary DC, respectively, and 0 otherwise; the backup constraints then require

K^{w,p}_{ij} + K^{w,s}_{ij} ≤ 1, ∀i ∈ I, j ∈ J, w ∈ W. (24)

DC pairs distance constraints: For the type w workload of region j served by data center i as primary DC and data center i' as secondary DC, the distance between the primary DC i and secondary DC i' must not be too small (≥ D_min, a minimum distance limit), to prevent the two data centers from being affected by the same accident, and not too large (≤ D_max, a maximum distance limit), to avoid excessive consistency delay. (25)
Service availability constraints: For each workload type, the primary DC and secondary DC together must satisfy its service availability requirement A_w. (26)

V. REFORMULATION AND DECOMPOSITION OF QCQP
Note that the problem above is an MINLP, containing non-linear constraints and binary variables K^{w,p}_{ij} and K^{w,s}_{ij}. Moreover, it is a large-scale optimization problem whose size depends on the number of user regions, data center candidate locations and workload types; it can have millions of linear and non-linear variables and constraints, making it difficult to solve. Below, we reformulate the problem as a QCQP, taking advantage of the fact that the two sets of optimization variables, X^{w,p}_{ij} and X^{w,s}_{ij}, are mutually exclusive. The resulting QCQP is exact and yields the same optimal solution as the original MINLP, but is solvable with off-the-shelf toolboxes. We then further decompose the QCQP model by workload type for better structural clarity. The decomposed model is highly extensible and can be applied to a wide range of problems, such as facility siting and task scheduling with alternate scenarios.

A. Reformulating the Problem to a QCQP
The backup constraints (24) prevent choosing the same site as both primary DC and secondary DC for a given user region; that is, K^{w,p}_{ij} = 1 and K^{w,s}_{ij} = 1 cannot hold at the same time. According to their definitions, we can reformulate the constraint using only X^{w,p}_{ij} and X^{w,s}_{ij}:

X^{w,p}_{ij} X^{w,s}_{ij} = 0, ∀i ∈ I, j ∈ J, w ∈ W. (27)

Considering that X^{w,p}_{ij}, X^{w,s}_{ij} ≥ 0, we can combine all the backup constraints (27) into a single quadratic constraint:

Σ_w Σ_i Σ_j X^{w,p}_{ij} X^{w,s}_{ij} = 0. (28)

In this way, one quadratic constraint (28) replaces the non-linear backup constraints (24). The DC pairs distance constraints (25) limit the distance between primary and secondary DCs, and also involve the binary variables K^{w,p}_{ij} and K^{w,s}_{ij}. We define these constraints from the opposite direction: if we exclude all primary/secondary DC combinations that are too close or too far, the remaining combinations naturally satisfy the distance constraints. This removes the binary variables and simplifies the constraints, so (25) becomes

X^{w,p}_{ij} X^{w,s}_{i'j} = 0, ∀j ∈ J, w ∈ W, and ∀(i, i') with D_{ii'} < D_min or D_{ii'} > D_max. (29)

Following the same method, we can reformulate the service availability constraints (26). For user region j, if data centers i and i' are chosen as primary and secondary DCs, the service availability can be calculated by (8). To satisfy the availability constraints, we exclude all DC combinations whose service availability cannot reach the requirement. The constraints (26) can then be rewritten as

X^{w,p}_{ij} X^{w,s}_{i'j} = 0, ∀j ∈ J, w ∈ W, and ∀(i, i') with A^s_{jii'} < A_w. (30)

The latency constraints (23) can also be transformed into linear form. Noticing that U_ij ∈ (0, 1) and 1 - U_ij ≥ 0, and based on (10) and (14), the constraints (21) and (23) can be combined into

Σ_w b_w X^{w,p}_{ij} ≤ B_ij (1 - l^0_ij / l_ij), ∀i ∈ I, j ∈ J. (31)

In this way, we replace the non-linear constraints (23), (24), (25), (26) with (28), (29), (30), (31), avoiding the binary variables and non-linear inequalities; the problem now contains only quadratic constraints and linear inequalities.
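The complementarity reformulation (27)-(28) is easy to check numerically: with nonnegative variables, a single scalar equation certifies that no site serves a region as both primary and secondary DC. A sketch with hypothetical helper names:

```python
def backup_constraint_holds(x_primary, x_secondary, tol=1e-9):
    """Check the aggregated constraint (28): the sum over all (i, j) of
    X_p[i][j] * X_s[i][j] equals zero.

    Because every entry is >= 0, the sum vanishes iff each individual
    product vanishes, i.e. the per-pair constraints (27) all hold.
    """
    total = 0.0
    for row_p, row_s in zip(x_primary, x_secondary):
        for p, s in zip(row_p, row_s):
            total += p * s
    return total <= tol
```

For two sites and one region, the allocation X_p = [[3], [0]], X_s = [[0], [5]] satisfies the constraint, while X_s = [[2], [5]] violates it because site 1 would serve the region in both roles.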
Since the objective function in (17) is also quadratic, we finally reformulate the problem as a QCQP.

B. Decomposition of QCQP
Without loss of generality, a QCQP has the form

min x^T P_0 x + q_0^T x
s.t. x^T P_s x + q_s^T x ≤ r_s, s = 1, . . ., S, (32)

where P_0, . . ., P_S are 2mn-by-2mn matrices (for a single workload type) and x ∈ R^{2mn} is the vector of optimization variables.
First, to decompose the problem, we define the vector of optimization variables. Let X^{w,p}_i = [X^{w,p}_{i1} X^{w,p}_{i2} · · · X^{w,p}_{in}]^T be the allocation vector for data center i as a primary DC, with X^{w,s}_i defined analogously for secondary DCs. The optimization variable vector for a single-availability workload, x^w, is defined as

x^w = [(X^{w,p}_1)^T · · · (X^{w,p}_m)^T (X^{w,s}_1)^T · · · (X^{w,s}_m)^T]^T. (33)

Based on x^w, for heterogeneous availability workloads W = {w_1, w_2, . . ., w_q}, we expand the vector x to

x = [(x^{w_1})^T (x^{w_2})^T · · · (x^{w_q})^T]^T. (34)

With the optimization variables defined this way, the linear vector q_0 and the quadratic matrices P_i can be decomposed into several parts, as shown in Fig. 2. For the linear vector, the composition of the coefficient vector corresponds to the optimization variable vector x: the two sets of variables for primary DCs and secondary DCs compose the linear vector of a single-availability workload, and multiple such vectors are combined into the heterogeneous availability workloads vector. As for the quadratic matrices, the diagonal submatrices (area A_R) represent the quadratic coefficients within data centers, while the submatrices in area A_Y are the parts between data centers. The quadratic matrix of a single-availability workload contains two parts, for primary DCs and secondary DCs: the diagonal areas A_R and A_Y are the independent parts for the two sets of DCs, while area A_P is the interrelated part between primary and secondary DCs. Several single-availability workload matrices compose the diagonal submatrices of the heterogeneous availability workloads matrix, and the submatrices in area A_B are the interrelated parts between different kinds of workloads. Through this construction, all components of the problem are included in one general framework, which can be decoupled into single-workload models and then coupled back into the heterogeneous workloads model.
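The Kronecker construction of the coupling matrices, such as K ⊗ E_mn used for the backup constraint (43), can be sketched in pure Python. We assume here that K = [[0, 1], [1, 0]] couples the primary and secondary halves of x and that E_mn is the mn-by-mn identity; both are our reading of the notation, not definitions quoted from the paper.

```python
def kron(a, b):
    """Kronecker product of two matrices given as lists of lists."""
    out = []
    for row_a in a:
        for row_b in b:
            out.append([x * y for x in row_a for y in row_b])
    return out


def identity(n):
    return [[1.0 if r == c else 0.0 for c in range(n)] for r in range(n)]


def quad_form(x, p):
    """Evaluate x^T P x for a dense matrix P."""
    return sum(x[r] * p[r][c] * x[c] for r in range(len(x)) for c in range(len(x)))


# Coupling matrix for m*n = 2 primary and 2 secondary variables:
K = [[0.0, 1.0], [1.0, 0.0]]
P1 = kron(K, identity(2))
```

With x = [X_p; X_s], the form x^T P_1 x equals 2 Σ_k X_p[k] X_s[k], so it vanishes exactly when the backup constraint holds and is positive otherwise.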

1) Decoupling to Single Availability Workload:
Let us now consider the simplest situation, where there is only a single availability type w workload in the problem.
The objective function can be divided into linear parts and quadratic parts. We start with the linear parts. Note that the expense components of servers, internal networking gear and administration are proportional to the number of servers and do not change with location. Their corresponding coefficient vectors contribute to the linear vector q_0 in (32). Since the coefficients of these expenses are constant for the same workload type, we can obtain the vectors q^S_0, q^IN_0 and q^AD_0 in the same way. We define
The expense components of maintenance, land, infrastructure building, and electricity are proportional to data center power and differ across locations. We define $q_u$ as the index vector of user regions and $q_{ps}$ as the index of primary and secondary DCs. As shown in Fig. 2, the linear vector of each expense corresponds to the variable vector $x$, so the vectors of these expense components can be obtained accordingly, where $T$ denotes the location-relevant parameters. We follow the same method to obtain $q^M_0$, $q^{BD_2}_0$, $q^E_0$, and $q^{DU}_0$. Second, we derive the matrices of the quadratic parts of the objective function. The quadratic expense of infrastructure building arises within data centers, and its quadratic matrix can be obtained directly, while the expense of D2D networking is incurred only between primary and secondary DCs, so its quadratic matrices follow accordingly, where $C^{DD}$ is the matrix of $C^{DD}_{ii'}$, $O$ is the $m$-dimensional all-one matrix, $K$ is the matrix indicating that the cost exists only between primary and secondary DCs, and $N^w$ is the workload matrix. The transformation of the constraints is likewise divided into linear and quadratic parts. For the linear constraints, the workload constraints (18), (19) ensure that the requests of all user regions are satisfied; they are transformed accordingly, where $q_d$ is the index vector of data centers. The capacity constraints (20) are the computing resource limits of the data centers and can be obtained similarly. The constraints (21) and (23) are reformulated in (31), where $l^0 = [l^0_{11} \cdots l^0_{1n}\ l^0_{21} \cdots l^0_{2n} \cdots]^T$ is the vector of $l^0_{ij}$. Next, we transform the quadratic constraint parts. The backup constraint (24) is reformulated as (28) and calculated as
$$x^T P^w_1 x = 0, \quad P^w_1 = K \otimes E_{mn} \qquad (43)$$
The DC pair distance constraints (25) are likewise reformulated as (28).
We set $B^D$ as a 0-1 matrix that excludes all DC pairs whose inter-site distance does not qualify; its elements are defined accordingly, and the constraint is transformed into quadratic form. For the service availability constraints (26), the transformation yields a quadratic constraint as in (29). Similarly, we set $B^w$ as a 0-1 matrix that excludes all DC pairs whose combined service availability does not qualify, and the constraint is transformed in the same manner. The D2D bandwidth constraints (22) exist for every pair of distinct data centers $i$ and $i'$. We use $Z$ to represent the index of a data center pair: the elements $Z_{ii'}$ and $Z_{i'i}$ are set to 1, and all others to 0, from which the constraints can be obtained. Thus, we can transform the problem into the form in (32), together with the quadratic constraints (43), (44), (45), and (46).
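The Kronecker product in (43) can be sketched in pure Python. The matrices below are toy stand-ins (a 2x2 selector $K$ and a 2x2 identity in place of $E_{mn}$), chosen only to show the block structure: $K \otimes E$ replicates $E$ into the off-diagonal primary-secondary blocks selected by $K$.

```python
def kron(A, B):
    """Kronecker product of two matrices given as lists of lists."""
    ra, ca = len(A), len(A[0])
    rb, cb = len(B), len(B[0])
    return [[A[i // rb][j // cb] * B[i % rb][j % cb]
             for j in range(ca * cb)]
            for i in range(ra * rb)]

# Toy selector K: couples the primary block with the secondary block only.
K = [[0, 1],
     [1, 0]]
# Toy stand-in for E_mn (the paper's definition applies in the full model).
E = [[1, 0],
     [0, 1]]

P1 = kron(K, E)
for row in P1:
    print(row)
# [0, 0, 1, 0]
# [0, 0, 0, 1]
# [1, 0, 0, 0]
# [0, 1, 0, 0]
```

The zero diagonal blocks confirm that the backup constraint couples only variables belonging to different roles (primary vs. secondary), as intended.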

2) Coupling of Heterogeneous Availability Workloads:
Based on our discussion of the single availability workload case, we can obtain the matrices in (47) for the different kinds of workloads and couple them together to finally reach the form in (32) for heterogeneous availability workloads.
The linear expense part $q_0$ of the objective function can be simply expanded from the per-workload vectors $q^w_0$. In the quadratic expense parts, $P^{BD_1}_0$ represents the quadratic cost coefficients of infrastructure building and is calculated across all heterogeneous availability workloads; that is, it contains both the independent and the interrelated parts of the different workload types. $P^{w,BD_1}_0$ represents the coefficients calculated within the same workload type $w$. For the cost between workload types $w_u$ and $w_v$, $P^{w_u w_v,BD_1}_0$ is calculated differently because of the difference between $V^{w_u}$ and $V^{w_v}$, and the expanded matrix $P^{BD_1}_0$ is constructed from both $P^{w,BD_1}_0$ and $P^{w_u w_v,BD_1}_0$. The D2D networking cost $P^{DD}_0$, however, exists only between primary and secondary DCs of the same workload type $w$, and is expanded accordingly. Because of the heterogeneous availability workloads, the number of workload constraints (18), (19) is multiplied by $q$, the number of workload types, and their matrices are expanded in the same way; take constraint (18) as an example. For the capacity constraints (20), although the number of workload types increases by a factor of $q$, the capacity limit of the DCs remains the same. The quadratic constraints (24), (25), (26) affect only variables of the same workload type and do not couple different workload types, so they can be expanded in the same way as $P^w_1$. Note that since heterogeneous availability workloads must satisfy different service availabilities, the 0-1 matrix $B^w$ also changes with $A^w$. Combining all the expanded matrices as in (47), we finally couple the heterogeneous availability workloads into the form in (32). The model is highly flexible and scalable, and can be solved by optimization toolboxes such as Gurobi and CPLEX. Our proposed framework is general and can be applied to a wide range of problems, such as facility siting and task scheduling, under alternative scenarios and various applications.
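The coupling step above places the per-workload quadratic matrices on the diagonal of the heterogeneous-workload matrix. As a minimal sketch (toy 2x2 matrices, not the paper's coefficients), the following assembles such a block-diagonal matrix; in the full model, the off-diagonal area $A_B$ would then be filled with the inter-workload terms.

```python
def block_diag(blocks):
    """Assemble square blocks on the diagonal of one matrix.
    Off-diagonal entries stay zero (area A_B is filled separately in the paper)."""
    size = sum(len(b) for b in blocks)
    M = [[0] * size for _ in range(size)]
    off = 0
    for b in blocks:
        for i, row in enumerate(b):
            for j, v in enumerate(row):
                M[off + i][off + j] = v
        off += len(b)
    return M

P_w1 = [[1, 2], [2, 1]]      # toy quadratic matrix for workload w1
P_w2 = [[3, 0], [0, 3]]      # toy quadratic matrix for workload w2
P = block_diag([P_w1, P_w2])
for row in P:
    print(row)
# [1, 2, 0, 0]
# [2, 1, 0, 0]
# [0, 0, 3, 0]
# [0, 0, 0, 3]
```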

VI. CASE STUDY - APPLYING THE MODEL TO CHINA
We apply the model to study the placement of high availability geo-distributed data centers in one of the most representative fast-developing economies, China. In this section, we list our data sources, including availability, workload, network, and other data.
We select 20 provinces across China as user regions. According to [34], the amount of workload is influenced by both the number of network users (population) and the economic level (GDP); we therefore distribute the workloads among the user regions in proportion to population and GDP. A POP node in each user region collects and maps requests. We select as data center candidate sites the locations where data centers already exist, according to [38].

A. Service Availability Model Setup
TTSR is an important metric in our service availability model; it indicates the shortage of technical personnel available to repair faults and restore service, and is affected by the differing economic development of each region. To some extent, the per capita income level reflects a region's economic development and also implies its ability to attract technical talent. We therefore assume that TTSR is proportional to the per capita income of each region, which we collect from [39]. The availability of the power supply differs significantly across areas and meteorological environments; we obtain the power supply reliability data for the relevant regions from [40]. According to [41], the typical network link availability for a 250-mile end-to-end connection is about 99.97%; we adopt the same ratio to calculate the availability of network links of other lengths. Furthermore, network link availability is also influenced by the development level of network infrastructure in each region. To capture the uneven development of network infrastructure across regions, we introduce LAER (Link Availability Error Range), use network infrastructure penetration to reflect its impact, and assume that network infrastructure penetration is proportional to network link availability within the LAER. We obtain the network infrastructure penetration data of each region from [39].
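The link-availability model above can be sketched as follows. One plausible reading of "the same ratio" is that the 99.97% figure applies per 250-mile segment, so availability decays exponentially with length; both that reading and the penetration-to-LAER mapping below are modeling assumptions, not formulas stated in the paper.

```python
# Hedged sketch of the link-availability model.
# Assumption 1: availability(d) = 0.9997 ** (d / 250), reusing the 250-mile ratio.
# Assumption 2: penetration in [0, 1] shifts availability within +/- LAER.

def link_availability(distance_miles, penetration=1.0, laer=0.0):
    base = 0.9997 ** (distance_miles / 250.0)
    return base + laer * (2 * penetration - 1)

print(round(link_availability(250), 6))              # the quoted 0.9997
print(round(link_availability(500), 6))              # two segments: ~0.9994
print(round(link_availability(500, 0.3, 0.0003), 6)) # low penetration pulls it down
```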

B. Requirements for Workloads
We divide heterogeneous availability workloads into four service level agreement tiers, with service availability requirements of 0.9, 0.99, 0.999, and 0.9999. According to the Cisco Cloud Index [42], we let the default bandwidth requirement be 2 Mbps per workload. In terms of workload density, a single server is assumed to be capable of carrying 5 workloads by default. To obtain the workload traffic, we take the traffic data from [42] and calculate the average traffic for each type of workload.

C. Network Topology and Cost
The inter-datacenter network is based on the topology of the China Telecom network [43]. In this paper, we focus on wide-area network propagation latency, as it largely accounts for the user-perceived latency and outweighs other factors such as queuing or processing delays inside data centers [44]. The latency estimate depends on utilization and distance: the idle-network latency $l^0_{ij}$ can be approximated from the distance between data center $i$ and user region $j$. Empirical studies have demonstrated that 1 km of distance incurs a propagation latency of approximately 0.02 ms [45].
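With the 0.02 ms/km figure from the text, the idle-network latency estimate reduces to a one-line function (the linear form is the text's stated approximation):

```python
# Idle-network propagation latency: ~0.02 ms per km of distance [45].
def idle_latency_ms(distance_km):
    return 0.02 * distance_km

print(idle_latency_ms(1000))   # a 1000 km link adds about 20 ms
```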
Leased line expenses for data centers vary greatly depending on three key factors: the location of the connections, the total amount of bandwidth, and the length of the leased line. According to [46], the price of a 40 Gbps leased line ranges from $2M to $6M for a 20-year long-term lease. In this paper, we adopt a similar pricing model, using the same ratio to calculate the cost of leased lines; the cost increases by $1K for every 1 km increase in distance.
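A minimal sketch of this pricing model follows. The low-end $2M base for 40 Gbps and the linear scaling of the base price with bandwidth are assumptions; the paper only states the $2M-$6M range and the $1K-per-km increment.

```python
# Hedged leased-line pricing sketch (20-year lease).
# Assumed: low-end $2M for 40 Gbps as the base ratio; linear in bandwidth.
BASE_PRICE_PER_GBPS = 2_000_000 / 40     # $/Gbps over the lease term

def leased_line_cost(bandwidth_gbps, distance_km):
    return BASE_PRICE_PER_GBPS * bandwidth_gbps + 1_000 * distance_km

print(leased_line_cost(40, 100))   # 40 Gbps over 100 km: base + $100K
```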

D. Geographic Related Statistics and Other Costs
Temperature, land, and electricity prices are the geography-specific data considered in the model. We obtain the temperature information from [39]; the maximum and average temperature data are used to calculate the PUE as described in (11). We also obtain the annual land purchase data from [39] and calculate the average price for each data center candidate site. We obtain the electricity prices from the local official websites of each location. These geographically diverse parameters make the cost of data centers vary greatly across locations.
The cost of building a data center, including purchasing and installing its cooling and power delivery infrastructure, is typically computed as a function of its maximum power [47]. According to [10], small data centers (≤10 MW) incur higher per-Watt costs than large ones (>10 MW); the cost is set to $15 (small) and $12 (large) per Watt. In this paper, to reflect the economies of scale in data center construction, we assume the cost per Watt is a linear function of the maximum power, with the linear function fitted to the data in [10]. The building cost is amortized over 12 years, the expected lifetime of a data center. We use the methods in [10], [21] to calculate the other data center costs. The land required by a data center is computed from its maximum power; typically, the coefficient is around 6K SF per Megawatt [47]. Our default servers are Dell PowerEdge R610 machines with 4 processor cores running at 2.66 GHz and 6 GBytes of RAM. These servers consume a maximum of 260 Watts, and an average of 200 Watts at 30% utilization. The cost is around $2000 per server, amortized over 4 years. Any data center needs personnel to operate and maintain it; following [37], our default maintenance cost is $0.05 per Watt per month. In addition, our default operational costs assume that each administrator can manage 1K servers at an average salary of $100K per year.
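The per-server figures above combine into a simple monthly cost, sketched below. Using the server's maximum power for the maintenance term is an assumption; the paper states the $0.05/Watt/month rate but not which power figure it multiplies.

```python
# Monthly per-server cost from the stated defaults.
SERVER_PRICE = 2000          # $, amortized over 4 years
SERVER_MAX_W = 260           # Watts (maximum; using max here is an assumption)
MAINT_PER_W_MONTH = 0.05     # $ per Watt per month [37]
ADMIN_SALARY = 100_000       # $ per year, one administrator per 1000 servers

server_monthly = SERVER_PRICE / (4 * 12)          # amortized hardware
maint_monthly = MAINT_PER_W_MONTH * SERVER_MAX_W  # maintenance
admin_monthly = ADMIN_SALARY / 12 / 1000          # administration share

total = server_monthly + maint_monthly + admin_monthly
print(round(total, 2))    # 63.0  ($/server/month under these assumptions)
```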

VII. NUMERICAL RESULTS AND DISCUSSION
With the data collected above, we show numerical results on how data center cost varies with different service availability requirements. We also illustrate the joint placement of data centers under heterogeneous service availability. Based on the special circumstances of emerging economies, we investigate how technical talent and network infrastructure may influence data center placement.

A. Service Availability and Cost
In the service availability simulation, we vary the service availability requirement between 0.9 and 0.9999 and obtain the placement of DCs. Fig. 3 shows how the cost changes with the service availability requirement. When the requirement is 0.9 or 0.99, it can be satisfied by the single data center site model, which can be solved with our previous work [48] to obtain the DC placement. We find a 7.1% increase in cost when the service availability requirement changes from 0.9 to 0.99. At 0.999, however, because of the under-developed infrastructure and the lack of technical support in emerging economies, the single data center site model can no longer reach the requirement, so we use the two data center sites replication model. Compared to 0.99, the cost increases dramatically, by 59.5%, when the requirement changes to 0.999. The increase is caused by the two-site replication model, which needs double redundancy to guarantee service availability; the cost thus rises sharply at the transition from the single-site to the two-site model. When the requirement reaches 0.9999, the cost increases by a further 46.5% compared to 0.999, a steep rise within the two-site model itself. Fig. 4 shows the single-site availability distribution between each user region and its requested DCs under the different service availability requirements. At 0.9, the selected data center sites are widely distributed, because the requirement is easy to reach and the locations with lower cost are preferred. When the requirement increases from 0.9 to 0.99, candidate sites with higher single-site availability are chosen to satisfy it.
This explains the increase in cost: the selection of data center sites becomes more stringent, and sites with lower single-site availability are no longer chosen. However, when the service availability requirement rises further to 0.999, the range of the availability distribution expands again, and candidate sites with even lower single-site availability are selected to serve users. This is because the two-site replication model allows combinations of lower single-site availability locations to meet the requirement. When the requirement reaches 0.9999, the range of the availability distribution narrows once more as the requirement tightens.
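The effect described above can be seen in a toy replication calculation (hypothetical site availabilities and costs, not the paper's data): with two independent replicas, the service fails only if both sites fail simultaneously, so a pair of sites that individually miss 0.999 can jointly exceed it, and the cheapest such pair wins.

```python
from itertools import permutations

# Hypothetical (availability, annualized cost) per candidate site.
sites = {
    'A': (0.995, 100),
    'B': (0.990, 80),
    'C': (0.980, 60),
}

def joint_availability(a1, a2):
    # Down only if both replicas are down at once (independence assumed).
    return 1 - (1 - a1) * (1 - a2)

def best_pair(requirement):
    """Cheapest primary/secondary pair meeting the availability requirement."""
    feasible = [(sites[p][1] + sites[s][1], p, s)
                for p, s in permutations(sites, 2)
                if joint_availability(sites[p][0], sites[s][0]) >= requirement]
    return min(feasible) if feasible else None

# Neither B (0.990) nor C (0.980) alone reaches 0.999, but the pair does,
# and it is cheaper than any pair involving the high-availability site A.
print(best_pair(0.999))    # (140, 'B', 'C')
```

This mirrors the observation that the two-site model re-admits lower-availability, lower-cost locations that the single-site model had ruled out.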

B. Joint Placement of Heterogeneous Service Availability
The proposed model is capable of simultaneously planning heterogeneous service availability workloads. Fig. 5 shows the cost savings of the joint placement of heterogeneous service availability. We solve the placement for combinations of two availability classes in different proportions, and compare the cost with the single high availability placement at 0.9999. As the ratio of the lower service availability class 0.999 grows from 0.2 to 0.8, the cost savings range between 10% and 20%. By contrast, when the lower class is 0.99, the cost savings are even higher, reaching 25%-57%.
We compare the cost saving components for the two service availability combinations in Figs. 6 and 7. Fig. 6 shows the cost savings of the land, building, maintenance, energy, and network components at different ratios, compared to the single high service availability placement of 0.9999. Since we deploy the 0.99 low availability service with the single data center site model, the cost savings mainly come from redundancy reduction, and the saving of each component grows as the proportion of low service availability increases. When the low service availability is 0.999, however, both classes are deployed with the two-site model, and the savings mainly stem from the relaxed availability requirement on part of the service, which loosens the site selection and allows some lower-cost data centers to be chosen. Accordingly, Fig. 7 shows that most of the savings come from the building and energy components: candidate sites with lower energy prices and lower temperatures are prioritized, contributing most of the cost savings.

C. DC Placement Versus Technical Talent
TTSR is indispensable in our service availability model, as it captures the shortage of technical support in each region. We use the maximum TTSR to measure the degree of this shortage, and the TTSR of each region is proportional to its per capita income within the maximum TTSR range. Fig. 8 shows the single-site availability distribution under different maximum TTSR ranges. As the maximum TTSR range increases, the overall availability exhibits a decreasing trend, and the spread of availabilities widens. As single-site availability deteriorates, data centers that previously met the service availability requirement no longer satisfy it, so data centers with higher availability, and possibly higher cost, must be selected instead. This is also why the cost increases with the maximum TTSR, as shown in Fig. 9, which illustrates how the cost varies with the maximum TTSR under different levels of LAER. The lack of technical personnel brings a huge cost increase to high service availability placement in emerging economies: for example, with a 0.6 maximum TTSR range, the cost increases by 18.4% even without considering LAER.
From Fig. 10, we can see the detailed changes among the data center sites as the maximum TTSR grows. When the shortage of technical personnel is small, most data center sites still meet the availability requirement, and data centers can be deployed centrally at lower cost. As the shortage increases, more and more sites no longer satisfy the availability requirement, and the locations with smaller TTSR are selected. For example, Guangzhou and Shanghai, two developed cities, are selected to serve the majority of workloads when the maximum TTSR grows to 0.6.

D. DC Placement Versus Network Infrastructure Penetration
Network infrastructure is another important factor in our model. Fig. 11 shows the single-site availability distribution under different LAER values. Similarly, increasing LAER reduces the overall single-site availability and widens its spread. From Fig. 12, we learn that excessive regional differences in network infrastructure also significantly increase the cost of placing high availability data centers: with 0.6% LAER, an additional 16.8% cost is needed even without considering TTSR. Fig. 13 shows the exact changes in the planning results as LAER changes. Geo-distributed data centers tend to be deployed centrally when LAER is too small to influence availability. As LAER increases, reduced link availability means the service availability requirement can no longer be met; shortening the distance between data centers and user regions is an effective way to raise network link availability and meet the requirement again. Data centers then begin to be split into smaller data centers placed closer to the user regions.

VIII. CONCLUSION
In this paper, we propose a model for optimally planning high availability geo-distributed data centers in emerging economies. We incorporate the unbalanced availability of infrastructure and technical talent into the problem. We formulate the problem as an MINLP and transform it into a QCQP that is able to handle workloads with heterogeneous availability. We apply the model to China with real-life data. We analyze the placement of data centers under service availability requirements from 0.9 to 0.9999 and find that the increases in redundancy and availability can cause the cost to rise dramatically. Our study reveals that, using our method, the joint placement of heterogeneous availability services can achieve considerable cost savings of up to 57%. The unbalanced spread of technical talent leads to a cost increase of up to 18.4%, while the unbalanced development of network infrastructure leads to a cost increase of 16.8%. We further illustrate the detailed placement changes and show the tradeoff between cost and availability.
Our study brings attention to an increasingly significant problem that has not been addressed in previous studies. It is worth noting that our formulation takes into account only long-term factors in data center planning; micro factors, such as per-workflow or diurnal traffic dynamics, are not considered. It will be interesting to examine how these micro factors may affect the placement of highly available data centers in emerging economies.