Link-delay-aware Reinforcement Scheduling for Data Aggregation in Massive IoT

Over the past few years, the use of wireless sensor networks in a range of Internet of Things (IoT) scenarios has grown in popularity. Since IoT sensor devices have restricted battery power, a proper IoT data aggregation approach is crucial to prolong the network lifetime. To this end, current approaches typically form a virtual aggregation backbone based on a connected dominating set or maximal independent set to utilize independent transmissions of dominators. However, they usually have a fairly long aggregation delay because the dominators become bottlenecks for receiving data from all dominatees. This paper analyzes the problem of time-efficient data aggregation in multichannel duty-cycled IoT sensor networks. We propose a novel aggregation approach, named LInk-delay-aware REinforcement (LIRE), which leverages the active slots of sensors to explore a routing structure with pipeline links and then schedules all transmissions in a bottom-up manner. The reinforcement schedule accelerates the aggregation by exploiting the unused channels and time slots left over at every scheduling round. LIRE is evaluated in a variety of simulation scenarios through theoretical analysis and performance comparisons with a state-of-the-art scheme. The simulation results show that LIRE reduces the aggregation delay by more than 80% compared to the existing scheme.

data aggregation, internet of things, multichannel, duty cycle, wireless sensor networks.

I. INTRODUCTION
Today's infrastructure systems, such as smart homes, smart grids, smart water networks, and intelligent transportation, connect our world in ways we never imagined. The Internet of Things (IoT) connects such systems through the use of sensors. Sensors distributed in any environment connect to form a network; such a sensor network measures and collects data from terrestrial, underground, underwater, or even body areas [1]. Sensor networks collect raw data, which must be analyzed and stored, but the data volume becomes enormous. The data aggregation technique addresses this issue by combining similar data and reducing redundancy in the sensed data [2]. The technique improves energy efficiency, increases network longevity, enhances bandwidth utilization, and minimizes data transmission latency.
As they are powered by nonrechargeable batteries, sensor nodes have limited energy. Although sensor nodes are always awake to listen to the transmissions, they do not always send and receive data from their neighbors. Being awake while idling consumes a significant amount of energy.
Furthermore, when aggregating data, sensors relay data in a multihop network where data are aggregated and forwarded by intermediate sensor nodes until they reach the sink or the base station. Therefore, energy conservation is a major issue for extending the lifetime of a network. Although emerging energy-harvesting technology for sensor nodes continues to advance, it still faces challenges such as implementation complexity and low harvesting efficiency [3]. Another notable way to save energy is to use the duty cycle technique, but this comes at the cost of increased data aggregation latency [4,5].
To date, many aggregation scheduling schemes use a single channel. However, sensor nodes can use multiple channels on different frequencies to send and receive data. Using multiple channels increases the number of concurrent collision-free transmissions, which improves the time efficiency of data aggregation [6,7,8,9]. These schemes aim to reduce the data aggregation delay in always-on sensor networks. Although their performance degrades when they are applied to duty-cycled networks, duty cycling conserves energy, so the lifespan of the network can be extended.
In conventional delay-efficient data aggregation (DDA) solutions, a typical aggregation structure is assumed to be built using a connected dominating set (CDS) or shortest-path tree (SPT) [10]. After that, data scheduling is based on sender-receiver pairs in this structure. To avoid collisions, dominatees send data to their dominators sequentially in a CDS-based structure [10,11,12,13]. The CDS-based approach works well in sparse, small networks, but it becomes a problem in dense networks. When the number of dominatees exceeds the number of dominators, the dominatees become a burden to the dominators. In the SPT-based approach, a bottleneck may occur at the sink or base station, where many nodes concentrate [14]; the problem is more serious in dense networks. In these types of trees, only primary collisions are handled, which wastes time slots. As a result, scheduling in those tree structures may not result in adequate delay performance.
The DDA problem in multichannel duty-cycled IoT sensor networks is investigated in this article. We propose a LInk-delay-aware REinforcement scheduling (LIRE), which uses the active slots of sensor nodes to explore the aggregation tree with pipeline links, to overcome the limitations of current approaches. After that, the link-delay-aware scheduling scheme augments the number of scheduling candidates by considering all leaf nodes of the network at each iteration and then invokes a reinforcement scheduling scheme. The pipeline links are used in the reinforcement scheme to maximize transmissions in a single working period. The main contributions of this paper are summarized as follows.
1) We propose a link-delay aggregation tree construction approach in which receivers adopt children according to their wake-up times. We present the motivation for the new approach, a step-by-step description of the tree construction process, and a comparison of the resulting children distribution with that of the reference scheme. We also analyze how the aggregation tree's children distribution affects the scheduling of nodes.
2) We propose a reinforcement scheduling scheme that schedules additional nodes on the channels and time slots of the current working period that the candidate nodes scheduling scheme leaves unused. To schedule more nodes, the proposed scheme examines each slot of the current working period one by one. This greedy scheduling strategy reduces the aggregation delay even further.
3) We provide detailed examples throughout the article. In addition, we perform extensive simulations to evaluate the aggregation delay of our proposed scheme and compare it with an existing scheme. The results show that the proposed scheme achieves a considerably lower aggregation delay.
The rest of the paper is organized as follows. In Section II, we go over the related work. The network model and problem formulation are presented in Section III. Then, in Section IV, we demonstrate our proposed scheme with illustrated examples and algorithm analysis. After that, the performance evaluation of our proposed scheme is provided in Section V. Finally, we conclude our work and discuss the future direction in Section VI.

II. RELATED WORK
IoT applications fall into two types: massive IoT applications, which require a large number of low-energy, low-cost devices with small data volumes, and critical IoT applications, which require very high data availability with low latency and high reliability. IoT sensor devices are known for their low battery power and limited computation and communication resources [15]. For IoT devices to last longer, energy harvesting is essential. Although energy harvesting technology has attracted attention recently, these systems have certain drawbacks: the amount of harvested energy is low, the harvesting system is inefficient, and the energy source may be inaccessible where the energy is to be harvested [16]. As a result, demand for energy-efficient scheduling methods remains high. Duty cycling, in which a sensor node is regularly put into sleep mode when there is no transmission, is a well-known approach to conserving energy [17]. However, because fewer sensor nodes are awake to transmit data, this mechanism increases transmission latency and reduces throughput.
The authors in [18] prove that the Delay-efficient Data Aggregation scheduling (DDA) problem is NP-hard and propose an approximation algorithm with a latency bound of (∆ − 1)R, where ∆ is the maximum node degree and R is the communication range. Another algorithm based on maximal independent sets (MIS), with a latency bound of 23R + ∆ − 18, is proposed in [19]; it offers a near-constant approximation with a much lower latency than the one in [18]. Xiaohua Xu et al. [20] present three algorithms with latency 15R + ∆ − 4, 2R + O(log R) Another time-efficient approach is proposed in [21], the latency of the proposed scheme is upper-bounded by ( 2π arccos( 1 (D <= 2R) and ∆ is the maximum node degree. Later that same year, XiaoHua Xu et al. [23] propose an improved algorithm with an upper bound on the data aggregation latency of 16R + ∆ − 14, where R is the communication range. Another distributed algorithm, named DICA, is proposed in [24] with a latency of at most ( 2π arccos( 1 1+ ) + 3)R + ∆ − 4; DICA intertwines tree construction and node scheduling to lower the aggregation delay. In [25], the authors propose the FAST algorithm, which aims to reduce the latency to 12R + ∆ − 2.
However, all the works discussed above solve the DDA problem using one channel for each sensor node to communicate. Several studies address this issue using multiple channels. The authors in [26] design a scheduling algorithm with a schedule length of O(∆(T) log n) and the smallest number of channels required to remove all collisions, where ∆(T) is the maximum node degree on the tree T and n is the number of nodes in the network.
Another method for the DDA problem in multichannel multihop sensor networks is proposed in [27]; the approximation algorithm has a latency upper bound of nearly (α + 11β), where α and β are constants that, together with the communication range r, define the collision range αr and the carrier sensing range βr. Ghods et al. [28] design a scheme named MC-MLAS to reduce the data aggregation delay; it uses multiple channels and performs aggregation tree construction and scheduling simultaneously. A cluster-based distributed data aggregation scheduling scheme is proposed in [29], which aims to reduce the data aggregation delay in multichannel, multipower sensor networks. Nguyen et al. [30] employ a distributed collision-avoidance scheduling algorithm, named DCAS, to minimize latency for data aggregation in WSNs.
The studies listed above attempt to reduce the data aggregation latency for IoT sensors in always-on networks. In such networks, the sensor nodes are always awake even though they do not collect data all of the time. The sensor nodes waste energy by staying awake without performing any task, so the network lifetime becomes shorter. The duty cycle mechanism conserves energy by putting sensors into sleep mode when they are idle. The DDA problem in multichannel duty-cycled sensor networks is investigated in [13,31,32]. Yu and Li [13] consider the DDA problem in duty-cycled WSNs for the first time; they show a CDS-based approximation algorithm whose aggregation latency is nearly constant and does not exceed (15R where R is a lower bound on the aggregation delay, T is the working period length and ∆ is the maximum degree in the communication graph G. The work in [31] proposes the first distributed algorithm for duty-cycled WSNs without considering the routing structure; the algorithm schedules nodes and generates an aggregation tree simultaneously. To reduce the aggregation delay, Jiao et al. [32] propose two maximal independent set-based algorithms based on two new conflict graph concepts. The authors in [33] claim that routing structure algorithms greatly affect aggregation latency. Neither SPT nor CDS is sufficient to construct delay-efficient aggregation trees. Therefore, in this research, we propose a new aggregation tree construction approach that uses the active slots of nodes. We minimize the latency between senders and receivers when constructing the aggregation tree to increase the number of transmissions in a working period, thereby reducing the total aggregation delay.
To date, revolutionary technologies such as machine learning and blockchain have been widely applied to operate and manage IoT sensor data automatically and efficiently. The authors in [34] use fuzzy logic to find the shortest path for data aggregation; the proposed method improves throughput and conserves energy in the network. Data aggregation scheduling schemes for sensor nodes based on Q-learning are proposed in [35,36], aiming to improve energy efficiency and obtain a longer network lifetime. Wang et al. [37] design a blockchain-based scheme for edge-computing-empowered IoT that obtains high throughput, low transaction latency, and energy efficiency subject to a restriction on the security level. In this study, however, we use a traditional approach to investigate the DDA problem in multichannel duty-cycled IoT sensor networks. So far, this problem has been studied only in [32], so we compare our approach with that scheme in extensive simulation scenarios.

III. NETWORK MODEL AND PROBLEM FORMULATION

A. Network model and assumptions
We model a WSN as an undirected graph G = (V, E) where V is the set of sensor nodes and E is the set of communication links. An omnidirectional antenna with a fixed transmission range is installed on each node in the network. The sensor nodes operate in a half-duplex mode so that they can only send or receive the data from others in a certain slot. A communication link (u, v) between sender u and receiver v belonging to E exists when the Euclidean distance between them is less than or equal to the transmission range R, and u and v are neighbors.
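The link rule above can be sketched in a few lines of Python. This is only an illustrative sketch: the function name `build_communication_graph`, the node labels, and the coordinates are assumptions made here, not part of the paper's simulator.

```python
import math


def build_communication_graph(positions, R):
    """Return adjacency sets of the undirected graph G = (V, E).

    positions maps a node label to its (x, y) coordinates; an edge (u, v)
    exists when the Euclidean distance between u and v is at most R.
    """
    neighbors = {u: set() for u in positions}
    nodes = list(positions)
    for i, u in enumerate(nodes):
        for v in nodes[i + 1:]:
            if math.dist(positions[u], positions[v]) <= R:
                neighbors[u].add(v)
                neighbors[v].add(u)
    return neighbors


# Hypothetical three-node deployment with transmission range R = 5.
positions = {"s": (0.0, 0.0), "a": (3.0, 4.0), "b": (10.0, 0.0)}
graph = build_communication_graph(positions, R=5.0)
```

Here `s` and `a` are exactly 5 units apart, so they are neighbors, while `b` is out of range of both.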
For WSNs, a duty-cycled mechanism is used, in which sensor nodes switch between active and dormant modes. We divide the scheduling time into working periods of equal length. A working period is further divided into L slots, L ∈ N, indexed from 0 to L − 1, where a slot is long enough for one transmission. In duty-cycled networks, a sensor node is active in some slots and sleeps in the others during each working period to conserve energy. Under this mechanism, a node can receive data from others only in its active slots but can wake up at any time slot to transmit data. In this paper, each sensor node is active at α randomly chosen slots (one or more) per working period, α ∈ N, 0 < α < L. The duty-cycle ratio of a node is α/L.
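As a minimal sketch of this duty-cycle model, the snippet below draws α distinct active slots per node and computes the duty-cycle ratio. The helper names and the use of a seeded generator are assumptions for illustration; the paper does not specify how the random slots are drawn.

```python
import random


def assign_active_slots(nodes, L, alpha, seed=None):
    """Give every node alpha distinct active slots drawn from 0..L-1."""
    if not 0 < alpha < L:
        raise ValueError("alpha must satisfy 0 < alpha < L")
    rng = random.Random(seed)  # seeded only to make the sketch repeatable
    return {u: set(rng.sample(range(L), alpha)) for u in nodes}


def duty_cycle_ratio(alpha, L):
    """Fraction of a working period during which a node is active."""
    return alpha / L


active = assign_active_slots(["s", "a", "b"], L=10, alpha=2, seed=1)
ratio = duty_cycle_ratio(2, 10)
```

With L = 10 and α = 2, every node is awake for 20% of each working period.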

B. Problem formulation
The data aggregation problem aims to collect data from all nodes to the sink. The data aggregation structure is typically modeled as a tree rooted at the sink, with each child node transmitting data to its parent in the tree. An intermediate node receives data from its children then aggregates with its data. To send to the sink node, the aggregated data is compressed into a single packet. The data aggregation scheduling process, which assigns channels and time slots to nodes in the network, completes when all sensor nodes except the sink node are allocated channels and assigned time slots.
Let G = (V, E) be the communication graph of a sensor network, where V is a set of vertices and E is a set of edges between vertices. Based on the communication graph, we construct an aggregation tree T = (V T , E T ) consisting of the sensor nodes V T and the edges E T . Let s(u) = [f (u), t(u)] denote the transmitting schedule of node u, meaning that node u is scheduled to transmit data at time slot t(u) on channel f (u). Given f available channels and a working period divided into L slots, the minimum latency aggregation schedule in a multichannel duty-cycled network is formulated as follows.
Find a data aggregation schedule S that minimizes the total data aggregation delay D such that:
1) All nodes in the network except the sink s are scheduled.
2) A parent node sends its data after its child nodes.
3) Two nodes that have the same parent are not scheduled at the same time, to avoid a primary collision.
4) Different available channels or different working periods are allocated during scheduling to avoid secondary collisions.
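The constraints above can be checked mechanically. The following sketch is one possible validator under stated assumptions: the function name `schedule_violations`, the dictionary-based data layout, and the small example tree are hypothetical, time slots are treated as global indices (so the slot inside a working period is `t % L`), and a secondary collision is taken to mean that a concurrent sender on the same channel is a neighbor of another sender's parent.

```python
def schedule_violations(parent, neighbors, active, schedule, L, num_channels):
    """Return a list of human-readable constraint violations (empty if valid).

    parent[u] is the parent of u in the tree (None for the sink);
    schedule[u] = (channel, global_time_slot) for every non-sink node.
    """
    sink = next(u for u, p in parent.items() if p is None)
    problems = []
    for u, p in parent.items():
        if u == sink:
            continue
        if u not in schedule:
            problems.append(f"{u} is not scheduled")
            continue
        f_u, t_u = schedule[u]
        if not 1 <= f_u <= num_channels:
            problems.append(f"{u} uses an unavailable channel")
        if t_u % L not in active[p]:
            problems.append(f"{p} is asleep when {u} transmits")
        if p != sink and p in schedule and schedule[p][1] <= t_u:
            problems.append(f"{p} transmits before its child {u}")
    senders = [u for u in schedule if parent.get(u) is not None]
    for i, u in enumerate(senders):
        for x in senders[i + 1:]:
            (f_u, t_u), (f_x, t_x) = schedule[u], schedule[x]
            if t_u != t_x:
                continue
            if parent[u] == parent[x]:
                problems.append(f"primary collision: {u}, {x}")
            elif f_u == f_x and (parent[u] in neighbors[x]
                                 or parent[x] in neighbors[u]):
                problems.append(f"secondary collision: {u}, {x}")
    return problems


# Hypothetical four-node tree: A and B are children of sink S, C is a child of A.
parent = {"S": None, "A": "S", "B": "S", "C": "A"}
neighbors = {"S": {"A", "B"}, "A": {"S", "C"}, "B": {"S"}, "C": {"A"}}
active = {"S": {4, 7}, "A": {2, 8}, "B": {1, 5}, "C": {0, 3}}
good = schedule_violations(parent, neighbors, active,
                           {"C": (1, 2), "A": (1, 4), "B": (1, 7)}, 10, 1)
bad = schedule_violations(parent, neighbors, active,
                          {"C": (1, 2), "A": (1, 4), "B": (1, 4)}, 10, 1)
```

In the second schedule, A and B share the parent S and the same slot, so the checker reports a primary collision.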
IV. LINK-DELAY DATA AGGREGATION SCHEDULING
Table I presents all notation and abbreviations used throughout the paper.

A. Motivation
The recent solution uses an MIS to route sensory data to the sink for sensors with multiple active slots. Each dominator must collect data from all of its neighbors. Because they have so many neighbors, these dominators become bottlenecks in aggregating data in a dense network.
If nodes with asynchronous wake-up slots are improperly scheduled, the total aggregation delay may increase. Duty-cycle awareness has proved much more effective than MIS or CDS [32]. The calculation of the waiting time between two nodes is non-deterministic when nodes have multiple active slots. In this paper, we propose an algorithm that allows us to take advantage of duty-cycled wake-up slots, in which the link-delay value is calculated based on a differential of
DRAFT October 30, 2021
Fig. 2 shows a partial tree construction using the MIS and link-delay approaches. Sink node S considers adopting children from its neighbors. Each node is active at two slots in a working period of length L = 10. The MIS-based approach (Fig. 2a) assigns S as the parent of all nodes A, B, and C. Since node S is active at two slots and can only receive data from one node per slot, it takes two working periods for S to collect data from the other nodes. Fig. 2b shows that node S has two children and node A has one child. Intuitively, node S collects data from its children in one working period when the child node of B transmits its data in a time slot ahead.
In subsection IV-B, we show how to build the aggregation tree, and in subsection IV-C, we show how to create the scheduling scheme.
Suppose sender u finishes its aggregation at slot τ u and wants to transmit data at slot τ v , which is an active slot of receiver v. The minimum time it must wait, denoted as w(τ u , τ v , L), is defined as follows.
\[ w(\tau_u, \tau_v, L) = \begin{cases} \tau_v - \tau_u, & \text{if } \tau_v > \tau_u, \\ \tau_v - \tau_u + L, & \text{otherwise.} \end{cases} \]
That is, if the active slot of u is ahead of the receiver's receive slot, u can transmit data to v in the same working period; otherwise, it needs to wait until the next working period.
Such waiting time on a link is an important factor in aggregation tree construction, as shown in [38]. However, the waiting time for a link is not deterministic in a duty-cycled network where a node is active at several slots per working period. We define the link-delay on a link (u, v) with respect to the active slot τ v ∈ A(v) of node v as follows.
\[ d(u, \tau_v, L) = \min_{\tau_u \in A(u)} w(\tau_u, \tau_v, L), \]
where A(u) and A(v) are the sets of active slots of nodes u and v, respectively, and L denotes the length of a working period.
Herein, d(u, τ v , L) is the minimum of all possible waiting times w(τ u , τ v , L) between u and v if v wants to receive data at slot τ v . Assuming that the sender u always has data ready, it must wait until the receiver v is active to send its data. If the active slot of the receiver is larger than the active slot of the sender, i.e., τ v > τ u , the sender transmits its data to the receiver in the same working period. If the receiver's active slot is smaller than the sender's, the sender must wait until the next working period to send data at the receiver's active slot.
Since the sensor nodes are active at multiple slots in a working period, d(u, τ v , L), which is the minimum link delay for sender u to wait until receiver v is active to send its data, needs to be determined between two nodes. At v's active slot τ v , all the link delay values with sender u's active slots A(u) are calculated, then the minimum value selected among those values is the final delay value of the link. Using the link-delay as a strategy, we can construct pipeline transmissions where we maximize the number of nodes transmitting data in a working period.
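The waiting time and link-delay described above can be sketched as two small Python functions. The function names are illustrative assumptions; following the text, equal slots are treated like the "otherwise" case and wait a full working period. The sample values reproduce the case used later in the paper of a sender active at slots 2 and 6 and a receiver slot 7 with L = 10.

```python
def wait_time(tau_u, tau_v, L):
    """Slots sender u waits from its slot tau_u until receiver slot tau_v."""
    if tau_v > tau_u:
        return tau_v - tau_u      # same working period
    return tau_v - tau_u + L      # receiver slot has passed: next period


def link_delay(active_u, tau_v, L):
    """Minimum waiting time over all active slots of the sender u."""
    return min(wait_time(tau_u, tau_v, L) for tau_u in active_u)


same_period = wait_time(2, 7, 10)      # 7 - 2
next_period = wait_time(6, 4, 10)      # 4 - 6 + 10
delay = link_delay({2, 6}, 7, 10)      # min(5, 1)
```

The link-delay keeps only the best sender slot, which is why the pair (6, 7) determines the value in the last line.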
To establish a parent-child pair between two neighbor nodes during the tree construction phase, the algorithm uses link-delay.
B. Link-delay tree construction
Algorithm 1 presents the aggregation tree construction process. It takes the communication graph G(V, E), including sink s, and the working period length L as inputs. The output is the aggregation tree T with node set V T and |E T | links. V T is the set of nodes that belong to the tree; initially, V T contains only the sink s (line 1). The tree continues to grow until it contains all of the graph's sensor nodes (line 3). Starting from the highest slot in L, if a tree node v is active at the current slot τ and has at least one neighbor u ∈ N (v) (lines 4-6), the algorithm computes the link-delay value d between active slot τ of node v and each such neighbor u, and selects the neighbor x with the smallest link-delay value as the child of v. The parent of x is set to v, and x is added to the set V T (lines 7-10). The algorithm ends when V T includes all sensor nodes.
Fig. 3: Link-delay tree construction process.
We take Fig. 3 as an example to illustrate the link-delay tree construction process. A sample network topology (Fig. 3a) consists of 12 sensor nodes in which sink S collects data from the other nodes in the network. In a working period of length L = 10, each node is assumed to be active at two slots, and at most |F| = 1 channel can be used. The algorithm first adds node S to the aggregation tree. It starts with one of the slots at which S is active, e.g., slot 7. At active slot 7 of S, the link-delay values are calculated one by one from the neighbors of S, namely nodes A, B, and C. As C is active at slots 2 and 6, the waiting times between C and S for these two active slots are 7 − 2 = 5 and 7 − 6 = 1, respectively, so d(C, S 7 , 10) = 1 is the link-delay value of link CS. Similarly, the link-delay values of links AS and BS are 2 and 6, respectively. Node C is chosen as a child of S and added to the aggregation tree because it has the smallest link-delay value among the neighbors of S (Fig. 3b).
The algorithm then moves on, slot by slot in decreasing order, to slots 5 and 6, where nodes G and F are sequentially added to the tree. At slot 4, the aggregation tree has four nodes, two of which, S and F, are active at this slot. The algorithm checks these nodes one by one: S adopts A as its child, and F adopts K as its child, since these links have the smallest link-delay values compared to the other links, as shown in Fig. 3c. The algorithm continues in the same way until all nodes in the network are added to the tree (Fig. 3d).
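The construction loop can be sketched as follows. This is not the paper's Algorithm 1 verbatim: the function names, the alphabetical tie-breaking, the rule that only nodes already in the tree when a slot starts may adopt a child in that slot, and the five-node topology are assumptions for illustration. The active slots of S, A, B, and C are chosen so that the link-delays at slot 7 match the values quoted above (2, 6, and 1).

```python
def link_delay(active_u, tau_v, L):
    # ((tau_v - tau_u - 1) % L) + 1 equals tau_v - tau_u when tau_v > tau_u
    # and tau_v - tau_u + L otherwise, i.e. the waiting time on the link.
    return min(((tau_v - tau_u - 1) % L) + 1 for tau_u in active_u)


def build_link_delay_tree(neighbors, active, sink, L):
    """Return parent pointers of a tree grown from the sink by link-delay."""
    parent = {sink: None}
    while len(parent) < len(neighbors):
        grew = False
        for tau in range(L - 1, -1, -1):          # highest slot first
            awake = sorted(v for v in parent if tau in active[v])
            for v in awake:
                candidates = [u for u in neighbors[v] if u not in parent]
                if not candidates:
                    continue
                # Adopt the neighbor with the smallest link-delay to slot tau.
                child = min(candidates,
                            key=lambda u: (link_delay(active[u], tau, L), u))
                parent[child] = v
                grew = True
        if not grew:
            raise ValueError("graph is disconnected or a node is never active")
    return parent


# Hypothetical topology; G and its slots are invented for this sketch.
neighbors = {"S": ["A", "B", "C"], "A": ["S"], "B": ["S"],
             "C": ["S", "G"], "G": ["C"]}
active = {"S": {4, 7}, "A": {1, 5}, "B": {1, 9}, "C": {2, 6}, "G": {3, 5}}
tree = build_link_delay_tree(neighbors, active, sink="S", L=10)
```

In this run, S adopts C at slot 7 and A at slot 4, C adopts G at slot 6, and B is adopted by S at slot 7 in the second pass over the slots.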

C. Aggregation scheduling
After the aggregation tree is constructed, the candidate nodes scheduling scheme is applied. The algorithm finds the node u in candidate set V C to add into the selected nodes set SN τ at slot τ ∈ [0, L − 1] for which |I τ u | is smallest. Because we want to minimize secondary collisions in a particular slot, we choose the node that has the smallest number of nodes causing a secondary collision with it. As a result, the number of nodes that can transmit data simultaneously without colliding is maximized. The node added to SN is removed from V C and from I slot , slot ∈ L and slot ≠ τ .
Candidate nodes scheduling algorithm (excerpt, lines 14-17):
14 for each τ ∈ SN τ do
15   Sort SN τ based on |I τ u | (non-increasing order)
16   for each u ∈ SN τ do
17     c(u) ← color of node u with respect to SN τ
This process continues until all the nodes in set V C are removed (lines 4-13). Following that, the selected nodes in SN are assigned channels and time slots one by one. For each τ , the nodes in SN τ are sorted in non-increasing order, and channels are then assigned using a coloring method. At slot τ , if a node has a secondary collision with other (adjacent) nodes in set SN τ , it is assigned a color c(u) that differs from the colors of its conflicting nodes. The channel f (u) assigned to that node is calculated as (c(u) mod |F|) + 1 (lines 14-19).
Fig. 4 illustrates the process of scheduling nodes after the tree construction phase. As shown in Fig. 4a, the tree has three leaf nodes, from which we can find three links, namely HD, IF and JK.
The algorithm then finds the interference set of each link at each active slot of its receiver node.
Herein, at the active slot τ = 0, nodes D and K are active. A secondary collision occurs at D if H and J transmit data at the same time. Therefore, interference set I 0 H contains {J}. No collision occurs when J transmits data, so interference set I 0 J = ∅. Similarly, the interference sets for the other active slots are shown in Fig. 4b. Calculating the number of interference nodes in the corresponding set I yields the selected nodes set SN with respect to the receiver nodes' active slots. Starting with node H in the candidate set V C , since |I 0 H | = 1 and |I 6 H | = 0, the algorithm adds node H to set SN 6 . Interference set I 0 J is updated, but it remains empty because link HD does not interfere with link JK. H is then removed from V C .
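The coloring step and the channel formula f(u) = (c(u) mod |F|) + 1 can be sketched as a greedy coloring. This is an assumed realization, since the paper does not spell out the coloring method: the function name, the symmetrization of the interference sets, and the tie-breaking are illustrative. The example uses a hypothetical two-channel setting in which H and J are both selected for slot 0, whereas in the single-channel walk-through above H is moved to slot 6 instead.

```python
def assign_channels(interference, num_channels):
    """Greedy coloring of the nodes selected for one slot.

    interference[u] is the set of selected nodes whose transmissions would
    collide with u's at u's receiver. Colors start at 0 and the channel is
    (color mod num_channels) + 1.
    """
    # A collision in either direction forces different colors.
    conflicts = {u: set() for u in interference}
    for u, others in interference.items():
        for x in others:
            if x in conflicts:
                conflicts[u].add(x)
                conflicts[x].add(u)
    # Non-increasing order of interference-set size, ties broken by label.
    order = sorted(interference, key=lambda u: (-len(interference[u]), u))
    color = {}
    for u in order:
        used = {color[x] for x in conflicts[u] if x in color}
        c = 0
        while c in used:
            c += 1
        color[u] = c
    return {u: (c % num_channels) + 1 for u, c in color.items()}


# I_H = {J} and I_J = {} as in the slot-0 example, with two channels assumed.
channels = assign_channels({"H": {"J"}, "J": set()}, num_channels=2)
```

Because of the modulo, this sketch only guarantees distinct channels for conflicting nodes while the number of colors does not exceed the number of channels; handling the overflow case (for example by deferring nodes to another slot or working period) is not modeled here.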
Following a similar process to select nodes in candidate set V C at specific slots, nodes J and I are chosen at slots 0 and 1, respectively, and added into SN 0 and SN 1 . If a node has the same number of nodes in its interference sets at several slots, the algorithm selects the earliest such active slot when adding the node to the selected nodes set SN . When available time slots remain after these nodes have been scheduled, more nodes can be scheduled.
The reinforcement schedule then schedules nodes on the channels and time slots left unused during the current working period, removes the scheduled nodes, and returns the updated tree. At each slot τ of working period w, a set containing all unused channels at slot τ is first created. If at least one unused channel is available, the algorithm creates a reinforcement nodes set V R of nodes that can be scheduled at slot τ of working period w (lines 1-6). The nodes added to V R satisfy three conditions:
1) They are leaf nodes.
2) The receiver of a node in this set is active at slot τ .
3) The receiver of a node in this set is not yet scheduled to receive data from any other node in time slot τ .
For each node u in set V R , an interference set I u is obtained, consisting of the nodes that cause a secondary collision with u at time slot τ (lines 7-12). The algorithm then selects the nodes with the fewest interference nodes in their sets I, so that as many nodes as possible are scheduled concurrently. Each selected node is added to the selected nodes set SN and removed from V R . Nodes are added to SN as long as |I| is not larger than |F | (lines 13-22). After that, we apply a coloring method c to the nodes in SN : the channel of node u is f (u) = (c(u) mod |F |) + 1, and the transmitting time slot of node u is t(u) = (w − 1) · L + τ (lines 24-27). Then node u and the link (u, p(u)) from u to its parent are removed from the tree T (V T , E T ) (lines 28-29).
Fig. 5 shows the reinforcement schedule on the updated tree after Algorithm 3 is applied in the first working period. After they are scheduled, nodes H, I and J are removed from the tree. The current working period is w = 1. Algorithm 4 starts checking unused channels at each time slot in this first working period. At slots τ = 0 and τ = 1, there are no unused channels since some nodes are already scheduled at these time slots. At slots τ = 2 and τ = 3, the channel is unused, but no nodes satisfy the three conditions for being added to the reinforcement nodes set V R . The algorithm continues with slot τ = 4 of the current working period w = 1: no channel is in use, and link KF satisfies the three conditions, that is, K is a leaf node, the receiver F is active at this slot, and F does not receive data from any other node in this time slot, as shown in Fig. 5a. Node K is added to the reinforcement nodes set V R , and the interference set of node K at slot 4 is I 4 K = ∅ since only K is in V R . Node K is then added to the selected nodes set SN 4 at slot 4.
A coloring method c is applied to set SN 4 , in which c(K) = 0, so the channel of node K is f (K) = (0 mod 1) + 1 = 1 and the transmitting time slot of K is t(K) = (1 − 1) · 10 + 4 = 4. Then, node K and link KF are removed from the tree. In the current working period, the same procedure is followed for later time slots. The sample network's final schedule is shown in Fig. 5b; the scheduling takes two working periods to complete.
Algorithm 4: Reinforcement schedule
Input: G(V, E), T (V T , E T ), w, L, F
Output: Nodes scheduled on unused channels and time slots in a working period
Sort SN based on |I u | in a non-increasing order
The selected nodes set SN is obtained based on the constructed interference sets I and the active slots of nodes in the candidate set. Then, a coloring method c is applied to the nodes in the selected nodes set SN to ensure that two interfered links, which start from two nodes in the selected nodes set and end at their parents, are assigned different colors. The colors are then used to assign different channels and time slots to these nodes. Scheduled nodes are removed from the tree at the end of each iteration.
Algorithm 4 schedules nodes in the tree on the available channels at each specific time slot. First, the algorithm finds the set V R of reinforcement leaf nodes to be scheduled. Then, for each node in V R , it constructs an interference set I. A selected nodes set SN is obtained based on the interference sets I. Finally, a coloring method c is applied to SN so that two nodes that form two interfered links receive different colors, and different channels are assigned to these nodes. As a result, the nodes scheduled by LIRE are collision-free.
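The three admission conditions for the reinforcement nodes set and the time-slot formula can be sketched as follows. The function names, the `busy_receivers` argument, and the three-node tree fragment are assumptions made for illustration; only the link KF, the active slot 4 of F, and the values w = 1 and L = 10 come from the example above.

```python
def reinforcement_candidates(parent, active, busy_receivers, tau):
    """Leaf nodes whose receiver is active at slot tau and still free.

    parent maps each remaining tree node to its parent (None for the sink);
    busy_receivers holds the nodes already scheduled to receive at slot tau.
    """
    has_child = {p for p in parent.values() if p is not None}
    return sorted(
        u for u, p in parent.items()
        if p is not None               # the sink never transmits
        and u not in has_child         # condition 1: u is a leaf
        and tau in active[p]           # condition 2: receiver is awake
        and p not in busy_receivers    # condition 3: receiver is free
    )


def global_slot(w, tau, L):
    """Transmitting time slot t(u) = (w - 1) * L + tau, with w starting at 1."""
    return (w - 1) * L + tau


# Hypothetical remaining tree fragment after H, I and J have been scheduled.
parent = {"S": None, "F": "S", "K": "F"}
active = {"S": {4, 7}, "F": {1, 4}, "K": {0, 5}}
at_slot_4 = reinforcement_candidates(parent, active, set(), tau=4)
at_slot_2 = reinforcement_candidates(parent, active, set(), tau=2)
t_K = global_slot(w=1, tau=4, L=10)
```

At slot 4 only K qualifies, because F still has a child and therefore is not a leaf, and the resulting time slot agrees with t(K) = 4 in the example.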

V. PERFORMANCE EVALUATION
We use Python to evaluate the performance of our proposed scheme, LIRE (LInk-delay-aware REinforcement scheduling). We compare our scheme with the existing algorithm NDAS (Novel Data Aggregation Scheduling) [32] in various network settings. By varying the number of channels, the number of nodes, the length of the working period, and the number of active slots, we compare LIRE's performance with that of the reference scheme in terms of data aggregation time. The simulation settings are presented in Table II.
working periods to aggregate data from all nodes whereas NDAS takes 73 working periods. The transmissions in the LIRE scheme appear to be concentrated in the first few working periods, with up to 9 transmissions in a working period. In later working periods, from 35% to 80% of the total working periods, almost every slot has 3 transmissions on average. With the NDAS scheme, the transmissions concentrate in the first 40% of the total working periods, with up to 6 transmissions.
The NDAS scheme leaves many slots unused in each working period since it uses only 1 or 2 slots to schedule nodes. Because a node with many children is active at only 2 slots, it can receive data from at most two children in each working period, so the other slots in that working period are unused. To receive data from its other children, the node must wait until the next working period.
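The bottleneck argument above reduces to simple arithmetic. As a rough sketch, assuming that every child has its data ready at the start and that a parent receives from exactly one child per active slot, the number of working periods a parent needs is the ceiling of its number of children divided by its number of active slots. The function name is an assumption for illustration.

```python
import math


def working_periods_to_collect(num_children, active_slots_per_period):
    """Working periods a parent needs to hear from all of its children."""
    return math.ceil(num_children / active_slots_per_period)


# Sink with three children and two active slots, as in the MIS example of Fig. 2a.
periods_small = working_periods_to_collect(3, 2)
# A denser parent with 20 children and the same two active slots.
periods_dense = working_periods_to_collect(20, 2)
```

The estimate ignores secondary collisions and children that are not yet ready, so it is a lower bound on what a dense parent experiences under these assumptions.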

A. Impact of duty cycle
We compare the aggregation delay of the NDAS and LIRE schemes by changing the number of active slots from 2 to 7, as shown in Fig. 8. LIRE schedules nodes from the bottom up. When the number of active slots is small, the algorithm schedules nodes starting from the leaf nodes and takes advantage of the pipeline transmissions formed in the tree construction phase by selecting nodes in the same pipelines, so the number of transmissions in a working period increases. However, when the nodes are active at more slots, the nodes selected for scheduling become effectively random and unrelated to the pipelines. The number of transmissions in a working period therefore decreases, which means that the aggregation delay is higher when the number of active slots increases.
We conduct experiments in which the nodes in a network are active at 2 slots in a working period of length L = 10. Fig. 9a presents the impact of the number of nodes on the aggregation delay. We change the network size from 50 to 1000 nodes, with the number of channels m = 1 (NDAS-1 and LIRE-1) and m = 2 (NDAS-2 and LIRE-2). When the network gets denser, both NDAS and LIRE need more time to schedule nodes, so the aggregation delay of both schemes increases. However, LIRE achieves a lower aggregation delay than NDAS, and its slope is gentler than that of NDAS as the number of nodes increases from 50 to 1000. The reason is that the LIRE scheduling algorithm takes advantage of the pipeline transmissions created during the tree construction phase, which affects the nodes with fewer active slots. Moreover, scheduling more nodes on the unused channels and time slots of the whole working period reduces the aggregation delay further; LIRE reduces the aggregation delay relative to NDAS by up to 59.23% (1 channel) and 55% (2 channels).
The impact of the number of channels on the aggregation delay is shown in Fig. 9b. [...] period or at different working periods. Therefore, this solution does not leverage the increasing number of channels. LIRE, on the other hand, schedules nodes from a tree with a relatively even distribution of children. Therefore, it is useful to allocate different channels to these nodes, which would otherwise incur secondary collisions. When the network is denser, say 500 nodes, this allocation is even more effective.
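The percentage improvements quoted in this section can be read as relative reductions in aggregation delay. The sketch below shows that calculation under this assumed interpretation; the function name `delay_reduction_percent` and the delay values in the example are hypothetical and are not taken from the simulation results.

```python
def delay_reduction_percent(baseline_delay: float, new_delay: float) -> float:
    """Relative reduction of new_delay with respect to baseline_delay, in percent.

    Assumed interpretation: 100 * (baseline - new) / baseline.
    """
    if baseline_delay <= 0:
        raise ValueError("baseline_delay must be positive")
    return 100.0 * (baseline_delay - new_delay) / baseline_delay


# Made-up delays (in time slots), for illustration only:
# a baseline of 400 slots against 100 slots is a 75% reduction.
print(delay_reduction_percent(400, 100))  # 75.0
```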

C. Impact of working period length
We perform simulations with network sizes of 200, 500, and 1000 nodes. For each network size, we compare the aggregation delay of NDAS and LIRE using 1 channel (NDAS-1 and LIRE-1) and 2 channels (NDAS-2 and LIRE-2) while changing the working period length from 10 to 70. The number of active slots of each node is fixed at 2. Fig. 10 shows how the length of the working period affects the aggregation delay of LIRE and NDAS. LIRE has a lower aggregation delay than NDAS in all cases. The aggregation delay increases as the working period length increases. However, the aggregation delay of LIRE increases only slightly, while that of NDAS increases dramatically. The results show that LIRE using 2 channels reduces the aggregation delay relative to NDAS by up to 82.65%, 84.88%, and 85.75% when N = 200, 500, and 1000 nodes, respectively. When the working period length is large, nodes are active at a few slots placed randomly in the working period, so pipeline transmissions in LIRE become longer within a single working period. This means that LIRE schedules more nodes in one working period, while NDAS waits for subsequent working periods to schedule nodes that have the same parent.
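The effect of pipeline links on delay can be illustrated with a simplified model of one packet travelling along a leaf-to-sink path. This is a sketch under stated assumptions, not the LIRE algorithm: each hop's receiver listens in one given slot, a packet moves on within the same working period only if the next receiver's slot is strictly later, and collisions, channels, and aggregation at parents are ignored. The function name `path_delay_slots` is hypothetical.

```python
def path_delay_slots(receive_slots, period_length):
    """Time slots needed to forward one packet along a path.

    receive_slots[i] is the slot (0 .. period_length - 1) in which the
    receiver of hop i listens. Simplifying assumption: if the next
    receiver's slot is not strictly later than the current one, the
    packet waits for the next working period.
    """
    if not receive_slots:
        return 0
    if any(not 0 <= s < period_length for s in receive_slots):
        raise ValueError("each slot must lie in [0, period_length)")
    periods_waited = 0
    previous = -1
    for slot in receive_slots:
        if slot <= previous:
            periods_waited += 1  # missed the slot, wait one working period
        previous = slot
    return periods_waited * period_length + receive_slots[-1] + 1


# Increasing slots form a pipeline that finishes in one working period.
print(path_delay_slots([1, 3, 6, 8], 10))  # 9
# Decreasing slots force a wait at every hop after the first.
print(path_delay_slots([8, 6, 3, 1], 10))  # 32
```

In this model, each missed slot costs a full working period, so the penalty for a path that is not pipelined grows with the working period length, whereas a pipelined path finishes within one period regardless.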

D. Impact of starting active slot in aggregation tree construction
The aggregation delay is also influenced by the starting active slot chosen when building the aggregation tree. We conduct simulations to compare the aggregation delay when starting from the highest and the lowest active slot values with 1 channel (From high active slot-1, From low active slot-1) and 2 channels (From high active slot-2, From low active slot-2) for network sizes of 200, 500, and 1000 nodes, as shown in Fig. 11. In this experiment, the number of active slots is changed from 2 to 7 in a working period of length L = 10; nodes in the network are active at 2 slots. The results show that starting from the highest active slot of the sink achieves a lower aggregation delay than starting from the lowest active slot, i.e., by at most 6.13% (for 1 channel), 10.5% (for 2 channels), and 11.88% (for 2 channels).

VI. CONCLUSION AND FUTURE WORK
In this paper, we study the DDA problem in multichannel duty-cycled IoT sensor networks.
LIRE scheduling is a novel aggregation approach that consists of two phases. First, we present a new top-down approach to construct the aggregation tree, which leverages the gap between nodes' active slots. Second, the scheduling algorithms schedule nodes in the tree in a bottom-up manner, starting from the leaf nodes. The reinforcement algorithm improves node scheduling by utilizing unused channels and time slots left by the main algorithm after each scheduling round. Across extensive simulation scenarios, the results show that LIRE substantially outperforms the best existing algorithm that addresses this problem in multichannel duty-cycled IoT networks, reducing the aggregation delay by up to 81%. In future work, we intend to apply machine learning to this problem and to consider energy efficiency together with the delay efficiency of IoT networks.