Active Neighbor Exploitation for Fast Data Aggregation in IoT Sensor Networks

Fast data aggregation is crucial for facilitating critical Internet of Things services as it enables the collection of sensory data within strict volume and time constraints. Over the past decades, the data aggregation scheduling problem for minimum latency has garnered significant research attention. Existing approaches to this problem typically schedule all data transmissions based on an aggregation tree, which is constructed without secondary interference. However, such interference can introduce delays when scheduling a transmission from a node to its parent in the tree. To this end, this study proposes an approach called active neighbor exploitation (ANEX) that enables sensor nodes to switch their parents by identifying active neighbors for potential connectivity, irrespective of the receivers established in the tree. Additionally, the scheme prioritizes scheduling nodes with the fewest unscheduled active neighbors, thereby allowing for more concurrent transmissions. ANEX is evaluated through theoretical analysis and extensive simulations under various scenarios. The results demonstrate that ANEX achieves up to 86% faster aggregation compared to the state-of-the-art approach while maintaining an equivalent time complexity.


Van-Vi Vo, Duc-Tai Le, Syed M. Raza, Moonseong Kim, and Hyunseung Choo, Member, IEEE

Index Terms: Coloring method, data aggregation, duty cycle, Internet of Things (IoT), multichannel, wireless sensor networks (WSNs).

I. INTRODUCTION
Fast data aggregation plays a crucial role in various Internet of Things (IoT) applications, where an efficient collection of sensory data under specific volume and time constraints is essential. Consequently, the research community has focused on addressing the minimum latency data aggregation scheduling (MLAS) problem. This problem centers on scheduling the sensor nodes in a network to transmit data to a base station or sink while minimizing interference and achieving the lowest possible aggregation latency. To extend the lifespan of sensor nodes, the duty-cycling technique has been widely adopted. By alternating between active and dormant states, duty cycling conserves energy and enables prolonged network operation [1]. Although duty cycling in wireless sensor networks (WSNs) is effective for energy conservation, it introduces longer dormant periods, resulting in increased data aggregation delays [2].
To address this challenge, multichannel technology is promising for reducing aggregation time because it enables interference-free simultaneous transmissions [3], [4], [5], [6]. Despite extensive research on the MLAS problem in single-channel and always-active multichannel networks [7], [8], few studies have focused specifically on multichannel duty-cycled IoT sensor networks (MC-DC-MLAS) [6], [9]. This research gap arises from the unique characteristics and complexities introduced by using multiple channels for communication while adhering to duty-cycling schedules. More precisely, the challenge lies in the suboptimal aggregation delays observed in prior two-phase solutions, which involve aggregation tree construction and node scheduling. Previous approaches adhere strictly to the sender-receiver pairs established during tree construction, which limits how effectively they can optimize data aggregation. Consequently, in this study, we address the challenges presented by the MC-DC-MLAS problem and propose a novel approach to overcome the suboptimal aggregation delays observed in previous two-phase solutions.
This study aims to achieve low data aggregation latency in multichannel duty-cycled IoT sensor networks by proposing an active neighbor exploitation (ANEX) scheme. ANEX first constructs an aggregation tree using our earlier work [10] such that any transmission link between two sensor nodes has a minimal sleep delay. At each time step, dynamic scheduling is applied, allowing eligible senders to adjust their receivers by leveraging active neighbors, regardless of the predetermined receivers established during the tree construction phase. Additionally, the scheme prioritizes scheduling nodes with the fewest unscheduled active neighbors, enabling more concurrent transmissions in each time step. The combination of the minimal sleep delay tree construction and the dynamic scheduling algorithm in ANEX results in significantly faster data aggregation for duty-cycled IoT sensors using multichannel technology.

2327-4662 © 2024 IEEE. Personal use is permitted, but republication/redistribution requires IEEE permission.

In summary, the major contributions of this research are as follows.
1) A novel aggregation tree construction method based on the time difference between active slots of neighboring sensors is proposed. This method establishes sender-receiver pairs in the tree when the sleep delay between them is the smallest.
2) A dynamic scheduling algorithm for duty-cycled IoT sensors using multichannel technology is proposed. This algorithm allows sender nodes to switch to new parents for prompt scheduling by discovering active neighbors for potential connectivity, regardless of the predetermined receivers during the tree construction phase. Furthermore, the scheme prioritizes scheduling nodes with the fewest unscheduled active neighbors, increasing the number of concurrent transmissions.
3) We conduct comprehensive simulations to evaluate the performance of the ANEX algorithm in terms of aggregation delay and compare it with state-of-the-art approaches. The experimental results demonstrate that the ANEX algorithm achieves low aggregation latency in multichannel duty-cycled IoT. Additionally, we analyze the impact on aggregation time of the initial aggregation tree construction method in the first phase and of the type of node scheduling in the second phase.

The remaining sections of this article are structured as follows. Section II provides a review of related works. In Section III, we present the network model, assumptions, and problem formulation. The ANEX scheduling scheme is described in Section IV. In Section V, we present the simulation results and compare the proposed scheme with well-known and state-of-the-art approaches. Finally, we conclude our work and outline future research directions in Section VI.

II. RELATED WORK
IoT devices are equipped with small sensors that collect data, interact with the environment, and communicate with each other. However, owing to their limited computational capabilities and power constraints, extending the lifetime of these sensor nodes for environmental sensing and data communication is a critical challenge in the literature. Data aggregation has emerged as a promising technique to address this problem. Numerous studies have investigated the application of data aggregation in sensor networks to achieve energy efficiency [11], [12], [13], [14], [15] or time efficiency, specifically in the context of MLAS [6], [16], [17], [18], [19], [20]. Regardless of the specific objective, the ultimate aim of this research is to minimize data aggregation latency and thereby enhance the lifespan of sensor networks.
In always-on WSNs, sensor nodes are powered by batteries and remain active to continuously sense the environment or communicate with other nodes. The MLAS problem in always-on sensor networks has been extensively studied [16], [17], [21], [22], [23]. The MLAS problem was first proven to be NP-hard in [21]. The authors proposed a (Δ − 1)-approximation algorithm, where Δ + 1 represents the maximum node degree within the network. Xu et al. [22] introduced a distributed scheduling scheme with a latency bound of 16R + Δ − 14, where R denotes the transmission radius and Δ represents the maximum node degree in the network. This approach significantly improved upon the previous best data aggregation scheme [16], which had a latency bound of 24D + 6Δ + 16 timeslots, where D signifies the network diameter (D ≥ 2R). Another distributed algorithm proposed in [17] integrates the tree construction and node scheduling phases to enhance aggregation time efficiency. In the tree construction phase of this scheme, the goal is to maximize the available choices for parent selection for each node, thus optimizing timeslot reuse in the subsequent node scheduling phase. Similarly, a distributed algorithm proposed in [23] constructs the aggregation tree and schedules nodes simultaneously, achieving faster data aggregation with an upper bound of 12R + Δ − 2.
In duty-cycled networks, sensor nodes alternate between active and sleeping modes to conserve energy. However, in such networks, data transmission is delayed because sensor nodes can only receive data when they are in the active mode. Even if the sender nodes are ready to transmit data, they have to wait for the receivers to wake up for communication. Consequently, the aggregation time in duty-cycled networks is prolonged. The DC-MLAS problem was initially explored in [24], where it was proven to be NP-hard. The proposed algorithm, based on a connected dominating set (CDS), aims to minimize the aggregation delay with an approximation time not exceeding (15R(G, s) + Δ − 3)|W|, where R(G, s) denotes the radius of the graph G with sink s, Δ represents the maximum node degree in the communication graph, and |W| is the length of the working period. Subsequently, significant research efforts have been dedicated to the DC-MLAS problem [18], [25], [26]. Ha et al. [25] proposed a time-efficient scheduling scheme that utilizes a balanced shortest path tree as the aggregation tree and an interference-free node schedule. Another scheme was introduced in [18], which employs a CDS to construct the aggregation tree in the initial phase. The scheduling scheme then prioritizes the earliest available transmissions for fast data aggregation toward the sink. However, the aforementioned schemes strictly adhere to the predetermined aggregation tree during node scheduling. If the aggregation tree is not constructed optimally, numerous timeslots are wasted. Hence, Nguyen et al. [26] presented a dynamic scheduling algorithm that first constructs an aggregation tree based on the link delay between nodes. Subsequently, the algorithm schedules nodes using a break-and-join strategy, thereby achieving a lower aggregation time.
Although recently proposed solutions have addressed the issue by concurrently constructing the tree and performing node scheduling [19], [20], or dynamically scheduling nodes [26], these solutions assume that the entire network operates on a common communication channel. Additionally, these solutions assume that the sensor nodes function in very low-duty-cycled networks, where they are active only once in a working period. To overcome these limitations, researchers in [3], [4], and [5] have proposed algorithms for data aggregation in multichannel environments. Utilizing multiple channels enables an increased number of concurrent interference-free transmissions in each time schedule, thereby further reducing the aggregation delay. Bagaa et al. [3] presented a time-efficient aggregation scheduling approach called DAS-MC, which integrates the aggregation tree construction and node scheduling phases. The scheme assigns parents to nodes in the aggregation tree to maximize parent selection and timeslot reuse. Nguyen et al. [4] introduced an extended relative interference graph to represent transmission interference and proposed a distributed algorithm that addresses the MC-DC-MLAS problem. Another distributed approach presented in [5], named DEDAS-MC, initially employs an optimal scheduling strategy to mitigate interference and minimize aggregation latency on a given tree. Subsequently, a distributed algorithm constructs an optimal aggregation tree using the Markov approximation method.
However, the aforementioned multichannel aggregation scheduling methods are designed for always-on networks only. Therefore, in [6] and [9], researchers present novel data aggregation scheduling schemes specifically tailored for multichannel duty-cycled IoT sensor networks. Jiao et al. [9] proposed two schemes, namely, EDAS and NDAS. Both schemes consist of two phases: 1) aggregation tree construction based on a maximal independent set (MIS) and 2) node scheduling based on candidate active conflict graphs (CACGs) and feasible active conflict graphs (FACGs). These graphs illustrate the relationship of transmissions within one working period, allowing the schemes to avoid interference during scheduling. NDAS improves upon EDAS by considering the most recently scheduled working periods to schedule additional nodes, thereby reducing the aggregation delay. However, the use of an MIS for tree construction can create bottlenecks for dominators in dense networks, as reported by [27].
Consequently, our previous work [6] introduced a novel approach for constructing the aggregation tree in situations where sensors are active at multiple timeslots within a working period. This scheme leverages the waiting time of receivers when they are active with their neighbors, resulting in a balanced number of child nodes for nonleaf nodes throughout the aggregation tree. Additionally, the constructed tree comprises pipeline links and incorporates a reinforcement scheduling algorithm that allocates more nodes to unused channels and timeslots in each scheduling round. Although these solutions have demonstrated good performance, they are limited by the fact that the node scheduling scheme strictly adheres to the predetermined sender-receiver pairs established during the tree construction phase. To address this limitation, this research adopts the tree construction approach proposed in [10]. This approach is based on the sleep delay between sensor nodes, ensuring that every sender-receiver pair has a minimum sleep-delay value. Moreover, a dynamic scheduling approach is employed, allowing senders to dynamically change their receivers for earlier data transmissions. Through the combination of the new tree construction approach and dynamic scheduling, our proposed scheme achieves fast data aggregation and outperforms state-of-the-art algorithms for multichannel duty-cycled IoT sensors.

A. Network Model and Assumptions
A static multichannel duty-cycled WSN is deployed in a specific area, consisting of |V| sensor nodes, including a sink node. Each sensor node is equipped with an omnidirectional antenna for wireless communication. We assume that two sensor nodes can communicate if their Euclidean distance is within the communication range d, signifying that they are neighbors. The sensor network is represented as an undirected graph, where each sensor node represents a vertex, and an edge connects two vertices to indicate a communication link between neighboring nodes. This graph is connected, ensuring the existence of a routing path from any node in the network to the sink. The sink node is responsible for aggregating data from all the nodes. Each sensor can switch among m orthogonal channels, denoted as F = {f_1, f_2, ..., f_m}, for data transmissions.
In our system, we adopt time-division multiple access as the MAC protocol. Time is divided into slots, and these timeslots are synchronized throughout the network using existing methods, such as Tiny-sync [28], SPiRT [29], and R4Syn [30]. Sensor nodes utilize a half-duplex transmission protocol, meaning they cannot transmit and receive data simultaneously. We assume that all sensors in the network employ a duty-cycle mechanism. Each sensor node independently switches between sleeping (inactive) and active modes. The entire time is divided into working periods consisting of the same number of |W| slots, denoted as W = {0, 1, 2, ..., |W| − 1}. During data aggregation, the sensory data at the relay or sink nodes is obtained by applying a full aggregation function, such as the maximum, minimum, sum, or average function [31]. Considering the characteristics of such data aggregation, all data packets transmitted during the aggregation operation have the same size. Thus, the period for data transmission remains constant; it is referred to as a timeslot and denoted as τ. In our work, we assume that a timeslot τ is sufficient for a round-trip data packet transmission, encompassing both data and ACK packets [32]. Each sensor node is active for a randomly and independently selected number of α slots and remains in the sleeping state for the remaining slots within a working period to conserve energy. Therefore, α < |W|. Based on this, the duty cycle is defined as the ratio of the α active slots to the working period length |W|, expressed as α/|W|. The sleeping and active pattern of each sensor repeats in every working period throughout its entire lifetime. Note that the terms "slot" and "timeslot" are used interchangeably throughout this article.
In duty-cycled networks, sensors only receive data during their active slots and can wake up at any time to transmit data based on the active slots of the receivers. Other slots are left idle to conserve energy. Fig. 1 illustrates a comparison of the data aggregation process between always-on and duty-cycled networks. In this scenario, node S receives data from four senders: A, B, C, and D. In the always-on mode, S receives data from the four senders in four consecutive timeslots. However, in the duty-cycled mode, assuming that S is active during slots 0 and 3 (α = 2) in a working period of length |W| = 5, S requires two working periods to collect data from the four senders sequentially. As a result, the aggregation process is completed in nine timeslots, specifically at the fourth slot of the second working period.
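The arithmetic of the Fig. 1 scenario can be checked with a short sketch. The function name `receive_completion_time` and its parameters are illustrative and not part of the original scheme; the sketch assumes that the receiver serves exactly one sender per active slot and that all senders are ready from slot 0.

```python
def receive_completion_time(active_slots, period_len, num_senders):
    """Number of timeslots elapsed when the last sender has been served.

    Assumption (illustrative): one sender is served per active slot of the
    receiver, and every sender is ready to transmit from timeslot 0.
    """
    active = set(active_slots)
    if not active or num_senders <= 0:
        raise ValueError("need at least one active slot and one sender")
    served = 0
    t = 0
    while True:
        # The active pattern repeats in every working period.
        if t % period_len in active:
            served += 1
            if served == num_senders:
                return t + 1  # slots are 0-indexed, so t + 1 slots have elapsed
        t += 1


# Duty-cycled receiver S: active at slots 0 and 3, |W| = 5, four senders.
print(receive_completion_time({0, 3}, 5, 4))
# Always-on receiver: active in every slot of the working period.
print(receive_completion_time(range(5), 5, 4))
```

Under these assumptions, the duty-cycled case finishes at the fourth slot of the second working period, while the always-on case needs only four consecutive slots.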
We adopt the interference model described in [33], which considers that sensor nodes cannot simultaneously transmit and receive data, due to their half-duplex mode. We denote the interference range of a sensor node as d_I, where d_I ≥ d. Interference occurs when one of the following two cases arises.
1) Primary interference occurs when two senders share the same receiver and simultaneously send data to it. This interference occurs at the common receiver. Fig. 2(a) illustrates an instance of primary interference at node A caused by nodes B and C transmitting data to A simultaneously.
2) Secondary interference occurs when a receiver is in the process of receiving data from a sender and inadvertently overhears data from another neighboring node that is transmitting data to a different receiver. This results in interference at the former receiver.
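The two interference cases can be expressed as simple predicates over pairs of transmissions. The following is a minimal sketch, not the model of [33] itself: a transmission is a hypothetical `(sender, receiver)` tuple, `pos` maps node names to 2-D coordinates, `d_i` plays the role of the interference range d_I, and both transmissions are assumed to occur in the same timeslot on the same channel.

```python
import math


def primary_interference(tx1, tx2):
    """Two distinct senders target the same receiver in the same slot."""
    (s1, r1), (s2, r2) = tx1, tx2
    return s1 != s2 and r1 == r2


def secondary_interference(tx1, tx2, pos, d_i):
    """A receiver overhears the sender of the other transmission.

    Assumption (illustrative): both transmissions use the same channel, and a
    receiver overhears any sender located within the interference range d_i.
    """
    (s1, r1), (s2, r2) = tx1, tx2
    if r1 == r2:
        return False  # this case is primary interference, handled separately
    return (math.dist(pos[s2], pos[r1]) <= d_i
            or math.dist(pos[s1], pos[r2]) <= d_i)


pos = {"A": (0, 0), "B": (1, 0), "C": (0, 1), "X": (0, 2), "D": (5, 0), "E": (6, 0)}
print(primary_interference(("B", "A"), ("C", "A")))              # same receiver A
print(secondary_interference(("B", "A"), ("C", "X"), pos, 1.5))  # A overhears C
print(secondary_interference(("B", "A"), ("D", "E"), pos, 1.5))  # far apart
```

Placing two transmissions on different channels is what lets a scheduler run them in the same slot despite a positive secondary-interference check.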

B. Problem Statement
Upon network deployment, the sink node is assumed to have superior capabilities and comprehensive knowledge about the network. Once a potential data aggregation schedule is computed, the sink disseminates this schedule information by broadcasting it to all nodes within the network [34]. Each node, except the sink, is assigned a unique channel and timeslot for data transmission, ensuring that concurrent transmissions are interference-free. Leveraging the data aggregation schedule, sensor nodes transmit their data, which is collected and stored in the sink. At (intermediate) relay nodes, all aggregated data is packed into a packet before being forwarded to the next hop. Note that the size of the aggregated data remains constant throughout the network. Consequently, a relay node can receive data from multiple senders (with each sender transmitting data only once) and transmit the aggregated data toward the sink in a single transmission.
Let G(V, E) represent the communication graph of duty-cycled sensor nodes, where V denotes the set of nodes, including the sink denoted as s, and E represents the set of communication links among the sensors. In the context of the MC-DC-MLAS problem, we define the data aggregation schedule as follows.
Definition 1: A data aggregation schedule contains a sender set S = {S_{t,f}}, where S_{t,f} is the set of nodes scheduled at timeslot t on channel f.

We aim to derive an aggregation schedule S that minimizes the data aggregation delay D. We define the data aggregation delay as the number of timeslots required for the sink to aggregate data from all nodes in the network without interference. Such an aggregation schedule satisfies several conditions, among them that the transmissions of scheduled nodes to their respective parents do not interfere with each other and that the sink s aggregates all data from the other nodes in the network in D timeslots.

Therefore, the MLAS problem in a multichannel duty-cycled IoT sensor network aims to determine a collision-free aggregation schedule that minimizes the time required to aggregate data from duty-cycled sensors at the sink, while adhering to the protocol interference model. Formally, the problem can be formulated as follows.
Input:
1) G(V, E), the undirected graph consisting of the set V of sensor nodes (including the sink s) and the set E of communication links.
2) A(u) ∀u ∈ V, the active slots of any node u in the network.
3) W, the working period.
4) F, the set of available channels that can be allocated to the sensor nodes.
Output: The aggregation schedule S = {S_{t,f}}, which contains all nodes in the network, except the sink s, scheduled with their corresponding timeslots and channels such that the data aggregation delay D is minimum.

IV. PROPOSED SCHEME
In this section, we begin by highlighting the advantages of our proposed scheme through a motivational example. Subsequently, we present our scheme, which comprises aggregation tree construction and dynamic scheduling approaches. To demonstrate the functionality of the algorithms within the proposed scheme, we provide an illustrative example using a sample network topology. All the notations and abbreviations used throughout this article are described in Table I.

A. Motivation
We commence with a motivational example that compares fixed scheduling and dynamic scheduling on a predefined aggregation tree, as depicted in Fig. 3. We consider a network topology consisting of five sensor nodes, with node S being responsible for collecting data from the others. Each sensor node is randomly active at two slots within a working period of length |W| = 5. Moreover, a channel set F = {1, 2} is available for sensor allocation. With an optimal scheduling scheme applied to the predefined SPT tree, node S collects aggregated data from other nodes in six timeslots (two working periods), as depicted in Fig. 3(a).
In contrast, as illustrated in Fig. 3(b), at timeslot 0, node B can transmit data to node C because C is active at that time. Consequently, B breaks the parent-child relationship with S and forms a relationship with C at timeslot 0. However, when both B and D transmit data simultaneously, interference occurs at node A. As a result, B and D transmit data to their respective parents using different channels (D uses channel 1 and B uses channel 2) at timeslot 0. With this scheduling scheme, node S requires four timeslots (one working period) to aggregate data from the other nodes.

B. Minimum Sleep Delay Tree Construction
In the aggregation tree construction phase, directional child-parent pairs are established to denote data transmissions, where the child node sends data to its parent. Such a child-parent pair is also referred to as a sender-receiver pair throughout this article. We define a sleep delay metric between two nodes' active slots as follows.
The sleep delay metric, initially presented in [6], represents the time interval required for a transmission between sender u and receiver v ∈ N(u) at their active slots τ_u ∈ A(u) and τ_v ∈ A(v) in a working period W. The sleep delay measures the time that node u waits until receiver v is active so that u can transmit data to v. Node u transmits data to v in the same working period when the active slot of v comes later than the active slot of u, whereas u must wait for a new working period to send data to v if v is active before u. Intuitively, minimizing the sleep delay between sender u and receiver v also reduces the sleep delay that other nodes spend waiting for their transmissions. As a result, the total latency is reduced. Consequently, we define the minimal sleep delay between sender u and receiver v as follows.
Definition 2: Given a working period W, the minimal sleep delay between sender u and receiver v ∈ N(u) is defined as the smallest sleep delay value between u and v obtained after calculating the gaps between their active slots.

Algorithm 1 (Minimum Delay Tree Construction) presents the construction of the aggregation tree based on the minimal sleep delay. The algorithm takes the communication graph G(V, E), which includes the sink node s, and the working period W as inputs. The output is a complete aggregation tree T(V_T, E_T). Initially, the sink node s is added to the node set V_T, and the set of transmission links E_T is empty (line 1). The network is then divided into R + 1 layers, L_0, L_1, ..., L_R, based on the hop distance to the sink. Layer 0 contains only the sink node. The algorithm proceeds in a top-down manner, starting from layer L_1. Each node u in layer L_i determines the minimal sleep delay with each neighbor v ∈ L_{i−1} ∩ N(u) in the upper layer, using (2). The neighbor that yields the smallest value of minimal sleep delay with u is selected as the parent of u, and u is added to the aggregation tree (lines 4-7). This process is repeated for each node in each layer until all nodes are included in the aggregation tree.
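The layered parent selection can be sketched as follows. This is a hedged illustration rather than the authors' Algorithm 1: the function names are hypothetical, the tie between equal delays is broken by neighbor ID, and a sender and receiver that share the same slot are assumed to wait one full working period (the prose only covers the "later" and "earlier" cases).

```python
from collections import deque


def sleep_delay(tau_u, tau_v, period_len):
    """Slots that u waits from its active slot tau_u until v's slot tau_v."""
    gap = (tau_v - tau_u) % period_len
    # Assumption: identical slots cost a full working period, not zero.
    return gap if gap != 0 else period_len


def minimal_sleep_delay(active_u, active_v, period_len):
    """Smallest sleep delay over all pairs of active slots of u and v."""
    return min(sleep_delay(a, b, period_len)
               for a in active_u for b in active_v)


def build_min_delay_tree(adj, active, sink, period_len):
    """Return {child: parent}; each node picks the upper-layer neighbor
    with the smallest minimal sleep delay (ties broken by neighbor ID)."""
    # Split the network into layers by hop distance to the sink (BFS).
    layer = {sink: 0}
    queue = deque([sink])
    while queue:
        x = queue.popleft()
        for y in adj[x]:
            if y not in layer:
                layer[y] = layer[x] + 1
                queue.append(y)
    parent = {}
    # Top-down, layer by layer, nodes in lexicographical order.
    for u in sorted(layer, key=lambda n: (layer[n], n)):
        if u == sink:
            continue
        upper = [v for v in adj[u] if layer[v] == layer[u] - 1]
        parent[u] = min(
            upper,
            key=lambda v: (minimal_sleep_delay(active[u], active[v], period_len), v),
        )
    return parent


adj = {"s": ["a", "b"], "a": ["s", "c"], "b": ["s", "c"], "c": ["a", "b"]}
active = {"s": {0, 2}, "a": {1, 2}, "b": {0, 4}, "c": {2, 3}}
print(build_min_delay_tree(adj, active, "s", 5))
```

In this toy network, node c can reach b one slot after its own active slot 3 (b is active at slot 4), whereas the best gap toward a is three slots, so c selects b as its parent.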
Fig. 4 illustrates an example of the aggregation tree construction process, assuming a sample network comprising 15 sensor nodes [Fig. 4(a)]. Each node is randomly active at two slots in a working period of length |W| = 5 when it is deployed. In other words, each node randomly selects two active slots in the range [0, 4]. The algorithm first divides the network into layers based on the hop distance of nodes to the sink S and adds S to the aggregation tree. Nodes in layer L_1 have only one neighbor in the previous layer, i.e., the sink S, so they all adopt S as their parent and are added to the aggregation tree one by one in lexicographical order (Fig. 4). Following the same process for the remaining nodes in the network, a complete tree is obtained, as depicted in Fig. 4(d).

C. Node Scheduling
The node scheduling phase determines the time for sensor nodes to transmit data without interference, based on the constructed tree, by allocating channels and assigning timeslots to them. Algorithm 2 takes the aggregation tree T(V_T, E_T), including the sink node s, and the working period W as inputs. It iteratively identifies candidate senders and schedules eligible nodes selected from these candidates at each timeslot until all nodes, except the sink, have been allocated channels and assigned timeslots. A set SCH is initialized as empty, serving as storage for scheduled nodes. The node scheduling process begins with t = 0, where the transmitting timeslot of all nodes in the network, except the sink, is set to None (lines 1-3). For each slot τ ∈ W, the set of candidate senders CA_τ is determined to enable feasible scheduling. A node is added to the candidate senders set CA_τ if it satisfies the following conditions.
Definition 3: A sensor node u ∈ V_T is chosen as a candidate sender at slot τ if it satisfies the following conditions.
1) Node u has not been scheduled yet.
2) Node u is either a leaf node or has all its children already scheduled, ensuring that u has collected the data of all its children before it begins transmitting the aggregated data.
3) At least one unscheduled neighbor of u is active at the current slot to receive the data.

The selection of candidate senders based on the number of unscheduled neighbors affects the number of simultaneous transmissions because a transmission formed by a candidate sender that has more unscheduled neighbors is more likely to collide with other transmissions. As a result, the algorithm prioritizes selecting nodes from the candidate senders for scheduling based on their weight. The weight of a node u ∈ CA_τ is defined as the number of unscheduled neighbors, i.e., nodes in N(u) ∩ (V_T \ SCH), that are active at the current slot τ. The nodes in CA_τ are sorted in nondecreasing order of their weight, ensuring that nodes with smaller weight are scheduled first if no data interference is detected.
After constructing the aggregation tree presented in Fig. 4, we apply Algorithm 2 to determine the candidate senders set that can be scheduled at timeslot t = 0. Starting at slot τ = 0, the candidate senders set CA 0 is determined based on the three defined conditions.In Fig. 5, CA 0 is depicted as G, H, I, J, and M. The weights of these nodes are calculated based on their unscheduled neighbors that are active at slot 0. For example, G has three unscheduled neighbors, but only N is active at slot 0, so the weight of G is 1 (i.e., G.weight = 1).Similarly, H and I also have the same weight.J has four neighbors (A, B, I, M), but only three of them (A, B, I) are active at slot 0, resulting in a weight of 3 for J (i.e., J.weight = 3).The weight of M is 3, as illustrated in Fig. 5.
Sets S_τ and R_τ are created to store the scheduled senders and their respective receivers. Algorithm 3 is then triggered to select scheduled nodes from CA_τ. The inputs to the algorithm include the candidate senders set CA_τ, scheduled senders S_τ, respective receivers R_τ, working period W, channels set F, scheduling time t, and the current active slot τ. The algorithm returns the sets of scheduled nodes S_τ and their respective receivers R_τ at the current slot τ, dynamically changing the connections to increase simultaneous transmissions. The algorithm begins by initializing the sets EL and P to store eligible nodes and their respective receivers, respectively. These nodes can form interference-free transmissions, meaning they can be scheduled at the same timeslot without collisions. A node u is selected from CA_τ only if u does not belong to P: if u has been chosen as the parent of any node in EL during dynamic scheduling, u is discarded as a candidate sender. A neighbor v of u is chosen as a candidate receiver if it satisfies the following two conditions. 1) Node v is active at the current slot and has not been scheduled to receive any data at this slot (line 4). 2) Node v does not belong to the set P, to avoid interference with other eligible nodes that have already selected nodes in P as their receivers (line 5). The algorithm allows node u to select a neighbor as its receiver regardless of whether u was already assigned a parent in the aggregation tree construction phase or has no parent at all. If the neighbor node v satisfies the two conditions above and has the smallest weight among the neighbors of u, it is selected as the new receiver for node u (lines 6-10). Nodes u and v are added to the eligible nodes set EL and the respective receiver set P (lines 11 and 12). If the initial parent of node u, determined during the tree construction phase, has all its children already scheduled or has become a leaf node, it is added to the candidate senders set CA_τ for possible scheduling at the current timeslot.
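The receiver choice for a single sender can be sketched as below. This is an illustrative reading of the two conditions, not the paper's pseudocode: the function name, the argument layout, the tie-break by node ID, and the weight values for C, K, and N are assumptions. The check that a receiver has not yet been scheduled to send follows the argument used in the proof of Theorem 1.

```python
def choose_receiver(u, adj, active, slot, already_sent, busy_receivers, weight):
    """Pick a receiver for sender `u` at `slot`, or None if none qualifies.

    A neighbor v qualifies if it is active at `slot`, has not already
    transmitted its own data, and is not already receiving in this slot
    (neither in R_tau nor in P, both folded into `busy_receivers`).
    Among qualifying neighbors the one with the smallest weight wins;
    ties are broken by node ID (an assumption).
    """
    options = [v for v in adj[u]
               if slot in active[v]
               and v not in already_sent
               and v not in busy_receivers]
    if not options:
        return None
    return min(options, key=lambda v: (weight.get(v, 0), v))


# Node G from the example: tree parent C is asleep at slot 0, N is awake.
adj = {"G": {"C", "K", "N"}}
active = {"C": {1}, "K": {2}, "N": {0}}   # placeholder slots for C and K
weight = {"C": 2, "K": 1, "N": 1}          # placeholder weights

receiver = choose_receiver("G", adj, active, slot=0,
                           already_sent=set(), busy_receivers=set(),
                           weight=weight)
```

Here G selects N instead of its tree parent C, mirroring the parent switch shown in Fig. 6(a). If N were already claimed by another sender, the call would return None and G would wait for a later slot.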
After examining all nodes in the candidate senders set CA_τ at slot τ, we obtain a complete set of eligible nodes EL and the respective receivers set P. For each node in the eligible senders set EL and its respective receiver in P, the algorithm applies a coloring method c by assigning the same number (starting from 1) to nodes that can transmit data simultaneously without interference. Initially, a transmission formed by sender node u in EL to its parent is marked with a color value c(u) based on the sender's ID. Then, each transmission formed by a sender in EL and its parent is examined against the already colored transmissions to check for any secondary interference. If interference occurs, the transmission is marked with a different color; if not, it is marked with the same color as the already colored transmission. Depending on the number of available channels |F|, channels are assigned to nodes in EL according to the color numbers assigned to the nodes (lines 16-19). If the color number of a node exceeds the number of available channels, that node is not scheduled. The nodes in EL transmit data at the same timeslot t (line 20). The scheduled senders set S_τ and respective receivers set R_τ at slot τ are updated by adding the scheduled nodes and their respective receivers (line 21). After receiving the updated senders and receivers sets from Algorithm 3, Algorithm 2 updates the scheduled nodes set SCH by adding the scheduled nodes S_τ. The scheduling time t increases by one each time the scheduled nodes are updated. The node scheduling is considered complete when the number of scheduled nodes in SCH is equal to the number of nodes in the tree, excluding the sink. Continuing with the previous example, after determining the candidate senders CA_0 = {G, H, I, J, M} using Algorithm 2, Algorithm 3 determines the eligible senders EL for data transmissions starting at timeslot 0.
Let us start with node G. After the tree construction phase, G initially adopts C as its parent. However, as C is inactive at slot 0 while a neighbor N is active, G changes its parent from C to N [Fig. 6(a)]. A similar process is applied to the other nodes in CA_0, resulting in the set EL = {D, G, H, I, J} and the respective receivers set P = {S, N, F, B, A}, as displayed in Fig. 6(b). Here, D originally does not belong to the candidate senders set CA_0, but as H switches its parent from D to F, D becomes a leaf node and is added to the set CA_0, adopting the sink S as its parent. The coloring method is then applied to all transmission links formed by eligible nodes and their respective parents. Each link is examined individually: GN is colored as 1, followed by HF and IB being colored as 1, as no interference occurs when they simultaneously transmit data. However, when J sends data to A, interference occurs at B if JA and IB transmit data at the same time, so link JA is colored as 2. Link DS is colored as 3 because it causes collisions at A and B if D transmits data simultaneously with J or I. Thus, we obtain the coloring for the transmission links formed by eligible senders and their respective receivers at timeslot 0 [Fig. 6(c)]. The scheduling time is incremented, and the node scheduling process continues until all nodes in this sample network are assigned channels and timeslots, as exhibited in Fig. 6(d).
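The coloring walk-through above can be reproduced with a first-fit sketch. This is an assumed implementation, not the paper's listing: the function names are hypothetical, and the adjacency below is a partial reconstruction containing only the edges that the example states or implies (for instance, J neighbors B, and D neighbors A and B). Two links are treated as conflicting when the sender of one is a neighbor of the receiver of the other, which is the secondary-interference condition of Fig. 2(b).

```python
from collections import defaultdict


def interferes(link_a, link_b, adj):
    """True if the two links cause secondary interference on one channel."""
    (u, v), (x, y) = link_a, link_b
    return x in adj[v] or u in adj[y]


def color_links(links, adj):
    """First-fit coloring: give each link the smallest color (from 1)
    that no interfering, already colored link holds."""
    colors = {}
    for link in links:
        c = 1
        while any(colors[o] == c and interferes(link, o, adj) for o in colors):
            c += 1
        colors[link] = c
    return colors


def assign_channels(colors, num_channels):
    """Color k maps to channel k; links whose color exceeds |F| are deferred."""
    return {link: c for link, c in colors.items() if c <= num_channels}


# Partial adjacency consistent with the worked example (an assumption).
edges = [("G", "N"), ("G", "C"), ("H", "F"), ("H", "D"), ("I", "B"),
         ("J", "A"), ("J", "B"), ("J", "I"), ("J", "M"),
         ("D", "S"), ("D", "A"), ("D", "B")]
adj = defaultdict(set)
for a, b in edges:
    adj[a].add(b)
    adj[b].add(a)

links = [("G", "N"), ("H", "F"), ("I", "B"), ("J", "A"), ("D", "S")]
colors = color_links(links, adj)
two_channel_schedule = assign_channels(colors, num_channels=2)
```

With this adjacency, GN, HF, and IB share color 1, JA takes color 2, and DS takes color 3, matching the text. With |F| = 2 the link DS is deferred, since its color exceeds the number of channels.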

D. Analysis
Theorem 1: Algorithm 3 (dynamic schedule) preserves the aggregation tree of the initial graph.
Proof: We prove this theorem by contradiction. Assume that the scheduling scheme completes the data aggregation for the network and that a sequence of sensor nodes v_0, v_1, . . ., v_k exists such that p(v_{i−1}) = v_i for i ∈ {1, . . ., k}, and p(v_k) = v_0; that is, the next node in the sequence is the parent of the previous node, and the last node has the first node as its parent, forming a cycle. When scheduling node v_0 to transmit data to its receiver v_1, the algorithm assigns a nonnegative integer to v_0, i.e., t(v_0) ≥ 0. Upon selecting a candidate receiver v for a sender u, Algorithm 3 verifies at line 4 that the receiver v has not yet been scheduled to send data, i.e., t(v) = None. This contradicts p(v_k) = v_0, because node v_0 must be excluded when the algorithm checks the scheduled timeslots of the candidate receivers of v_k. Hence, the theorem is proven.
Theorem 2: The schedule determined by ANEX is interference-free.
Proof: We prove this theorem by induction. Assume that ANEX completes scheduling all sensor nodes in the network at timeslot t + 1 without interference.
Proof: ANEX consists of two phases: 1) aggregation tree construction and 2) node scheduling. In the tree construction phase (Algorithm 1), the algorithm first divides the network into layers (line 2), which requires O(|V|) time. Then, for each layer in the network, the algorithm detects a sender for every node based on the sleep delay between them. This process requires O(|W||V|^2) time. Therefore, combining the above steps, the whole process of constructing the aggregation tree takes O(|W||V|^2) time.
In the node scheduling phase (Algorithms 2 and 3), at each slot τ ∈ W, Algorithm 2 determines a candidate set CA_τ, which consists of unscheduled nodes whose children are all scheduled (or which are leaf nodes) and which have at least one unscheduled and active neighbor at the current slot τ. Determining the candidate sender set requires O(|V|Δ) time, where Δ is the maximum node degree in graph G. Then, the algorithm calculates the weights of the nodes in CA_τ one by one, which requires O(|V|Δ) time. Thereafter, the algorithm requires O(|V|log|V|) time to sort them. At this point, the time complexity of these functions over the working period W is O(|W||V|Δ + |W||V|log|V|).
Algorithm 3 iterates over the candidate sender set CA_τ, requiring at most O(|V|) iterations. Inside the for loop, the algorithm first applies the parent-changing strategy (lines 3-7), in which it checks the neighbors of each node to switch to a new parent and decides whether the previous parent can be added to the candidate sender set CA_τ; this process requires O(Δ) time. Then, the algorithm adds the nodes to the eligible node set EL and the corresponding parents to P, and decides whether the old parent can be added to the candidate sender set CA_τ (lines 8-15); this step takes O(Δ) time. Finally, the algorithm allocates channels and timeslots to eligible nodes, which takes O(|F|) time. Thus, the whole dynamic schedule algorithm requires O(|V|Δ + |F|) time. The execution time of Algorithm 3 for the whole working period is O(|W||V|Δ + |W||F|). Finally, the time complexity of ANEX after combining all the algorithms is O(|W||V|^2 + |W||F|).
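The way the three bounds combine can be written out explicitly. The labels T_tree, T_Alg2, and T_Alg3 are introduced here only for illustration; the simplification uses Δ ≤ |V| − 1 and log|V| ≤ |V|.

```latex
\begin{align*}
T_{\mathrm{tree}} &= O\!\left(|W||V|^{2}\right),\\
T_{\mathrm{Alg2}} &= O\!\left(|W||V|\Delta + |W||V|\log|V|\right),\\
T_{\mathrm{Alg3}} &= O\!\left(|W||V|\Delta + |W||F|\right),\\
T_{\mathrm{ANEX}} &= T_{\mathrm{tree}} + T_{\mathrm{Alg2}} + T_{\mathrm{Alg3}}
  = O\!\left(|W||V|^{2} + |W||F|\right),
  \quad\text{since } \Delta \le |V|-1 \text{ and } \log|V| \le |V|.
\end{align*}
```

The tree-construction term dominates the scheduling terms, so only the channel-allocation term |W||F| survives alongside it.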

V. PERFORMANCE EVALUATION
In this section, we conduct extensive simulations to evaluate the effectiveness of our proposed scheme. In terms of data aggregation latency, we compare the performance of our proposed scheme, the ANEX scheduling scheme, with the most recent scheme [6], which introduces a time-efficient data aggregation scheduling approach called link-delay-aware reinforcement scheduling scheme (LIRE), in multichannel duty-cycled IoT sensor networks. We then evaluate the impact of initial aggregation trees by applying various tree construction approaches to the same node scheduling algorithm. Furthermore, we compare the data aggregation latency between the dynamic scheduling algorithm, where the parent changing strategy is applied to senders during scheduling regardless of the established receivers in the tree, and the fixed scheduling algorithm, where sender-receiver pairs determined by the aggregation tree remain unchanged. Minimizing data aggregation latency in the MC-DC-MLAS problem has a direct impact on energy efficiency and network lifetime. Therefore, our evaluation primarily focuses on comparing the data aggregation delay with state-of-the-art approaches.
We use the NetworkX library in Python to evaluate the performance of our proposed scheme. The network size ranges from 100 to 1000 sensor nodes deployed in a fixed area of 100 m × 100 m. The communication range and interference range of sensor nodes are set to be equal, i.e., d = d_I. We vary the number of available channels from 2 to 7. Additionally, we vary the working period length from 10 to 70 slots and the number of active slots in a working period from 1 to 7. The sink, denoted as s, is randomly deployed in the network area and is responsible for aggregating data from other nodes. The default simulation settings are provided in Table II. To measure the data aggregation latency, each parameter is varied in turn while the others are kept fixed. Only the parameters that are changed are mentioned in the following subsections, whereas the unchanged parameters adhere to the simulation settings table. We generate 100 duty-cycled sensor networks randomly for each parameter variation, and the results are obtained by averaging the outcomes from these 100 network topologies.
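A random duty-cycled topology of the kind described above could be generated as in the following sketch. It uses only the Python standard library as a stand-in for the NetworkX setup, and it is not the authors' simulation code: the function name, the default communication range d = 20, the default of two active slots, and the seed handling are assumptions, since the Table II defaults are not reproduced here.

```python
import math
import random


def generate_network(n, side=100.0, d=20.0, period=10, alpha=2, seed=0):
    """Place n nodes uniformly in a side x side area and link every pair
    within communication range d. Each node is active in `alpha`
    randomly chosen slots of a working period of `period` slots."""
    rng = random.Random(seed)
    pos = {i: (rng.uniform(0, side), rng.uniform(0, side)) for i in range(n)}

    adj = {i: set() for i in range(n)}
    for i in range(n):
        for j in range(i + 1, n):
            if math.dist(pos[i], pos[j]) <= d:
                adj[i].add(j)
                adj[j].add(i)

    active = {i: set(rng.sample(range(period), alpha)) for i in range(n)}
    sink = rng.randrange(n)  # the sink is deployed at random
    return pos, adj, active, sink


pos, adj, active, sink = generate_network(50, seed=1)
```

Averaging a metric over 100 topologies then amounts to calling the generator with 100 different seeds. The sketch does not check that the resulting graph is connected, which a full experiment would presumably need to do.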

A. Impact of Active Slot
We deploy network sizes of 200, 500, and 1000 sensor nodes in the same network area. For each network size, the parameters follow Table II, except for the number of active slots of each sensor node, which varies from 1 to 7 in a working period of length |W| = 10. This variation corresponds to duty cycles ranging from 10% to 70%. In Fig. 7, we compare the aggregation delay between LIRE and ANEX using one channel (LIRE-1 and ANEX-1) and two channels (LIRE-2 and ANEX-2). The use of two channels helps sensor nodes avoid secondary interference during data transmission, resulting in improved results compared to using only one channel in all cases.
In both the one-channel and two-channel scenarios, ANEX achieves faster data aggregation than LIRE for network sizes of 200, 500, and 1000 sensor nodes. LIRE performs best when the duty cycle is approximately 20%-40% and exhibits worse performance at duty cycles lower or higher than that range (i.e., 10% or more than 50%). In contrast, ANEX outperforms LIRE at a duty cycle of 10% and continues to excel as the duty cycle increases up to 70%, reaching an improvement of up to 85% in the 500-node network using one channel and up to 86% using two channels at a duty cycle of 70%. This superior performance can be attributed to ANEX exploring active neighbors that qualify for data transmissions at every timeslot, irrespective of the child-parent pairs determined during the aggregation tree construction phase. By employing a receiver-changing strategy, ANEX enables more concurrent transmissions at each timeslot, resulting in faster data aggregation. Moreover, as the duty cycle increases, ANEX requires a shorter period to find qualified active neighbors for data transmission at each timeslot. Consequently, the performance of ANEX improves further with high-duty-cycled sensor nodes, and its advantage over LIRE grows as the duty cycle increases. This is confirmed by analyzing one specific case of a 500-node network in which each node is active at seven slots in a working period |W| = 10 and uses two channels. Applying ANEX and LIRE to schedule transmissions in this network, the results show that ANEX takes only 30 timeslots (i.e., three working periods) to complete data aggregation at the sink, while LIRE takes 276 timeslots. Fig. 8 presents the number of transmissions in each timeslot of the two schemes. ANEX has more concurrent transmissions in every timeslot throughout the three working periods, i.e., up to 37 transmissions at timeslot 0, whereas LIRE has some parallel transmissions (from 10 to 13 transmissions) in the first few working periods and leaves many timeslots with a single transmission or none. Therefore, LIRE needs considerably more time to complete the data aggregation, while ANEX finishes it ∼9× faster.

B. Impact of Network Size and Number of Channels
In this section, we assess the impact of network size and the number of channels on data aggregation delay. First, we vary the network sizes from 100 to 1000 sensor nodes in increments of 100 nodes while maintaining the other parameters constant. Fig. 9(a) illustrates the aggregation delay of LIRE and ANEX using one channel (LIRE-1 and ANEX-1) and two channels (LIRE-2 and ANEX-2). In both schemes, the aggregation delay increases with network size, which is expected, as data from more sensor nodes must be aggregated, resulting in a longer aggregation time. The use of multiple channels allows sensor nodes to be allocated so as to avoid secondary interference during scheduling. For sparse networks (100-300 nodes), the effectiveness of using multiple channels in ANEX compared to a single channel is limited because secondary interference occurs less frequently. However, as the networks become denser, the utilization of multiple channels (ANEX-2) significantly improves aggregation delays, with up to a 46% reduction compared to using a single channel (ANEX-1) in a network of 1000 nodes. Although both LIRE and ANEX experience an increase in aggregation delay with network size, ANEX exhibits a more gradual increase and consistently outperforms LIRE, achieving up to 72% improvement using one channel in a 300-node network and up to 76% improvement using two channels in an 800-node network. This superiority can be attributed to ANEX's ability to identify additional concurrent transmissions based on the active timeslots of receivers and its flexibility to change receivers regardless of the predetermined sender-receiver pairs established by the aggregation tree.
Thereafter, we vary the number of channels used for node scheduling from 1 to 7 while setting the remaining parameters for network sizes of 200, 500, and 1000 sensor nodes, as depicted in Fig. 9(b). As the number of channels increases, both LIRE and ANEX achieve lower aggregation delays, as more parallel transmissions can be performed per timeslot without interference. Moreover, when the networks are denser, the use of multiple channels proves more effective in both schemes, as it enables the avoidance of secondary-interfered transmissions by operating on different channels. ANEX consistently outperforms LIRE across all cases of varying the number of channels and different network sizes.
In sparse networks (|V| = 200), ANEX demonstrates significant improvements in aggregation delay compared to LIRE, achieving up to a 70% reduction when using one channel.However, as sparse networks experience minimal interference, using more than three channels in ANEX does not provide any further delay reduction.In contrast, LIRE continues to improve its aggregation delay as the number of channels increases, resulting in ANEX outperforming LIRE by approximately 29% when using seven channels to allocate sensor nodes.In dense networks, ANEX maintains its superior performance over LIRE, achieving up to a 75% reduction in aggregation delay (using two channels) for a network size of 500 sensor nodes, and up to 79% reduction (using four channels) for a network size of 1000 sensor nodes.This advantage stems from ANEX's dynamic scheduling approach, where eligible nodes for transmissions are selected based on the number of neighboring nodes that may cause interference.By prioritizing nodes with the fewest interfering neighbors, ANEX effectively reduces delay and enhances data aggregation efficiency in dense networks.

C. Impact of Working Period Length
Subsequently, we investigate the impact of working period length on the aggregation delay for network sizes of 200, 500, and 1000 sensor nodes. While maintaining the simulation settings from Table II, we vary the working period length |W| from 10 to 70 in increments of 10, which reduces the duty cycle of the sensor nodes from 20% to 3%. Fig. 10 demonstrates that the aggregation delay of both LIRE and ANEX increases as the duty cycle decreases. With a lower duty cycle, sensor nodes spend more time in the inactive mode, leading to increased waiting time for data transmissions by other nodes. The results in Fig. 10 indicate that ANEX using two channels displays minimal improvement compared to using one channel across all network sizes. This is because, in longer working periods, sensor nodes have different active timeslots, allowing them to transmit data without requiring multiple channels.
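The quoted duty-cycle range follows directly from the ratio of active slots to working period length. The short sketch below assumes two active slots per working period, which is the value implied by the 20% figure at |W| = 10; the function name is hypothetical.

```python
def duty_cycle(alpha, period):
    """Fraction of a working period during which a node is active."""
    return alpha / period


# Duty cycle in percent for |W| = 10, 20, ..., 70 with alpha = 2 (assumed).
cycles = {w: round(100 * duty_cycle(2, w), 1) for w in range(10, 71, 10)}
```

The mapping runs from 20.0% at |W| = 10 down to 2.9% at |W| = 70, which the text rounds to 3%.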
In terms of performance comparison with LIRE, ANEX outperforms LIRE in low-duty-cycle networks, with a margin that depends on the network size. In sparse networks, ANEX achieves up to a 70% improvement using one channel and up to a 60% improvement using two channels. However, when both schemes use two channels for scheduling, their performance becomes similar due to LIRE's aggregation tree construction, which balances the number of children among sensor nodes, enabling more concurrent transmissions. Additionally, in low-duty-cycled and sparse networks, where only a few neighbors are active simultaneously, the impact of ANEX's dynamic scheduling is less pronounced. In dense networks (|V| = 500 and |V| = 1000 sensor nodes), ANEX continues to outperform LIRE, achieving up to a 79% improvement with one channel and up to a 76% improvement with two channels in data aggregation time.

D. Impact of Initial Aggregation Tree
We investigate the impact of different initial tree construction approaches on the aggregation delay while maintaining the same dynamic scheduling. In this section, we evaluate the MIS, breadth-first search (BFS), and depth-first search (DFS) tree construction methods, comparing them to our proposed approach. We focus on high-duty-cycle networks with varying numbers of active slots from 1 to 7 in a working period of length |W| = 10, while maintaining the remaining parameters from the simulation settings table.
Fig. 11 displays the aggregation delay results for different initial trees.Overall, the DFS-based approach performs the worst while ANEX demonstrates superior performance compared to the other approaches.Specifically, ANEX improves the MIS-based, BFS-based, and DFS-based approaches by up to 18%, 7%, and 57%, respectively, at a duty cycle of 10%.As sensor nodes become more active, all schemes deliver similar performance, as dynamic scheduling can effectively explore active neighbors and increase the number of concurrent transmissions.
Thereafter, we set the number of active slots to 2 and vary the working period length from 10 to 70 in increments of 10, resulting in a reduction of the duty cycle from 20% to 3%. Fig. 12 illustrates the aggregation delay results for these low-duty-cycle scenarios. As the working period length increases, the aggregation delay of all schemes increases owing to the longer periods of inactivity for the sensor nodes. Among the different approaches, the DFS-based approach exhibits the highest aggregation delay, whereas ANEX delivers the best performance. Specifically, ANEX improves on the MIS-based, BFS-based, and DFS-based approaches by up to 21%, 9%, and 62%, respectively, at a network size of 500 nodes with a working period length of |W| = 70.

E. Impact of Dynamic Scheduling Strategy
In this section, we evaluate the impact of dynamic scheduling by modifying ANEX to exclude Algorithm 3, which is responsible for dynamic scheduling. In this modified version, Algorithm 2 does not invoke Algorithm 3 (line 12). Instead, after selecting the candidate senders set CA_τ at slot τ, Algorithm 2 proceeds with checking possible interference to identify eligible nodes and then schedules them. The child-parent (sender-receiver) pairs determined during the tree construction phase remain fixed throughout the entire scheduling phase. The fixed scheduling algorithm is implemented as follows, starting from line 12 in Algorithm 2.
1) Each node in CA_τ and its corresponding receiver are added one by one into the eligible senders set EL_τ and the respective parents set P at slot τ, ensuring that each receiver in P has only one sender in EL_τ. This is done to avoid primary interference during scheduling.
2) Using the number of available channels, a coloring method is applied to nodes in EL_τ. Transmissions from nodes in EL_τ to their respective parents in P that would cause secondary interference are marked with different colors. Nodes in EL_τ are assigned different channels if they are marked with different colors, and the same channel if they are marked with the same color. All these nodes are assigned to the same timeslot τ.
We first investigate the loads of intermediate nodes (dominators) in different aggregation trees. The loads of these dominators greatly impact the aggregation time because a node can receive data from only one sender in a timeslot. In other words, the aggregation delay increases because of the bottlenecks occurring at the intermediate nodes with more children. We construct different aggregation trees and then collect the load statistics of the dominators (nonleaf nodes) for the same network of 500 nodes in an area of 100 m × 100 m, where sensors are active at α = 2 slots of a working period of length |W| = 10. Fig. 14 presents the loads of dominators in the MIS-based tree [9], the link-delay-based tree [6], the sleep-delay-based tree (presented in Section IV-B), and the final aggregation tree obtained after applying Dynamic scheduling. The figure displays only the dominators of the trees, since we want to investigate the impact of dominator loads on data aggregation delay. The central point of each circle represents a dominator, and the size of the circle represents how many senders (children) the dominator has. In other words, the larger the circle, the more dominatees the dominator aggregates data from.
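The dominator loads discussed above can be tallied directly from a child-to-parent map. The following is a minimal sketch under assumed names; the toy tree is hypothetical and is not taken from Fig. 14.

```python
from collections import Counter


def dominator_loads(parent):
    """Number of children per nonleaf node, given a child -> parent map."""
    return Counter(parent.values())


# Hypothetical aggregation tree rooted at sink "S".
parent = {"A": "S", "B": "S", "C": "A", "D": "A", "E": "A", "F": "B"}

loads = dominator_loads(parent)
max_load = max(loads.values())
```

Because a dominator receives from only one child per timeslot, a node with load k needs at least k of its own active timeslots to drain its children under fixed scheduling, which is why the largest loads bound the achievable delay.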
In Fig. 14, the MIS-based and sleep-delay-based trees have many high-load dominators, so scheduling schemes that follow these trees produce fairly high aggregation delays, as claimed in [6] and Section V-E. To increase concurrent data transmissions and reduce the burden on high-load dominators, the link-delay-based tree [6] and ANEX produce aggregation trees in which the number of dominators increases and these dominators adopt a nearly equal number of senders. With the combination of a proper aggregation tree construction approach and an effective scheduling scheme, the total aggregation delay is significantly reduced.
Fig. 13(a) illustrates the aggregation delay results obtained by ANEX with and without dynamic scheduling using 1 channel (Dynamic scheduling-1, Fixed scheduling-1) and 2 channels (Dynamic scheduling-2, Fixed scheduling-2) while varying the network size from 100 to 1000 nodes in increments of 100 nodes. As the number of nodes increases, the aggregation delay of all schemes increases because more nodes must be scheduled. In both cases of channel usage, Dynamic scheduling outperforms Fixed scheduling and is more effective in dense networks. For example, at a network size of 100 nodes, Dynamic scheduling achieves 39% and 42% faster aggregation than Fixed scheduling using 1 channel and 2 channels, respectively. At a network size of 1000 nodes, Dynamic scheduling improves on Fixed scheduling by up to 71% and 83% using 1 channel and 2 channels, respectively. The reason for the effectiveness of Dynamic scheduling in dense networks is that, at any given timeslot, a higher number of neighboring nodes are active simultaneously. This allows ANEX to easily switch the parents of sensor nodes for faster aggregation. Fig. 13(b) demonstrates that around 60% of nodes changed their parents for faster aggregation in a network size of |V| = 100 nodes. As the network size increases, particularly beyond |V| = 400 nodes in the same area, the Dynamic scheduling scheme becomes even more effective, with over 90% of the nodes changing their parents during the scheduling phase in networks ranging from |V| = 400 to 1000 nodes. The sleep-delay-based algorithm produces an aggregation tree containing high-load dominators, as presented in Fig. 14, so the senders of these dominators have to wait for their turn to transmit data. Within the same network area, increasing the number of deployed sensors increases the network density, and therefore the node degrees in dense networks are high. During the data scheduling process, at a given timeslot, the scheduling algorithm tries to maximize concurrent transmissions: regardless of the predetermined sender-receiver pairs in the sleep-delay-based tree, the candidate senders select receivers that are active at the current slot. In a dense network, they have more candidate receivers, so they easily switch to new receivers for prompt data transmissions.
To illustrate the impact of Dynamic scheduling, we analyze the number of children for a network size of |V| = 500 sensor nodes before and after applying Dynamic scheduling using 2 channels, as indicated in Fig. 13(c). Initially, after the tree construction phase, there are nodes with up to 14 children. With Fixed scheduling alone, these nodes experience bottlenecks, resulting in high aggregation delays. However, after applying Dynamic scheduling, sensor nodes switch their parents to enable earlier transmissions, leading to a more balanced distribution of children among sensor nodes, with an average of eight children. These statistics, as depicted in Fig. 13(b) and (c), explain why Dynamic scheduling achieves faster aggregation.

F. Simulation With Intel Berkeley Research Lab Network Settings
Intel Berkeley Research Lab (IBRL) sensor data [36] is collected from 54 sensors deployed in the IBRL between 28 February 2004 and 5 April 2004, as shown in Fig. 15(a).

TABLE III ENERGY CONSUMPTION MODEL
Mica2Dot sensors with weatherboards collected timestamped topology information, along with sensory data values once every 31 s.Data was collected using the TinyDB in-network query processing system, built on the TinyOS platform.
With the real locations of the Mica2Dot sensors deployed in the IBRL, we simulate them using the NetworkX library in Python with the communication range d = 10 m, as presented in Fig. 15(b). Herein, we assume that node 1 is responsible for aggregating data from the others. By default, we simulate the network using two channels (|F| = 2), each sensor is active at two slots (α = 2) in a working period of length |W| = 10, and the communication range of the nodes is d = 10 m. We apply the ANEX scheme and compare its performance with the LIRE [6] and NDAS [9] schemes in terms of aggregation delay by varying the number of active slots α [Fig. 16(b)] and the communication range d [Fig. 16(c)]. The ANEX scheme achieves the best performance in both cases. When the number of active slots is 2 (a 20% duty cycle) and the transmission range is 10 m, the ANEX scheme takes a total of 24 timeslots for node 1 to collect data from the other nodes, while LIRE and NDAS need 173 and 53 timeslots, respectively.
According to [37], the energy consumption of Mica2Dot sensors is measured and calculated as presented in Table III. Herein, the data rate is 12.4 kb/s and the data packet size is assumed to be 1000 bits. As claimed by [38], the energy consumed for channel switching is negligibly small. We assume that the initial power E_i of every node is 3 W. We calculate the lifetime of the network when applying different data scheduling schemes. In one data aggregation round, each sensor node transmits data only once, so the number of data transmissions of a node is N_t = 1. The number of data receptions depends on the number of children of the node, N_r = |C|, where |C| is the number of children that the node has. The number of times that the sensor is idly active in one aggregation round is N_idle = α(D/|W|) − |C|, and the number of times that the sensor is in sleep mode is N_sleep = (|W| − α)(D/|W|) − 1. Putting these together, the total energy consumption of a sensor in one data aggregation round is computed as in (4). From (4), we calculate the network lifetime when different scheduling schemes are used. The network lifetime is defined as the period during which the network operates before the first sensor runs out of battery, where n is the number of data aggregation rounds and E_i is the initial energy of the sensor node. We summarize the values for calculating the energy consumption of the sensors in Table IV and apply them to calculate the energy consumption in one aggregation round E and the number of aggregation rounds using different scheduling schemes as follows.
Based on the results calculated above, NDAS consumes the most energy among the three schemes, while ANEX consumes the least energy for data scheduling. With the initial energy E_i = 3000 mW, the NDAS scheme can aggregate data for only one round, the LIRE scheme for nine rounds, and the ANEX scheme for up to ten rounds.
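The per-round activity counts defined above can be turned into a small calculator. This is a sketch under stated assumptions, not the paper's equation: the weighted-sum form of the round energy is assumed, the per-activity costs are placeholders rather than the Table III values, D/|W| is evaluated as plain division exactly as printed, and all function names are hypothetical.

```python
def activity_counts(num_children, delay, period, alpha):
    """Counts N_t, N_r, N_idle, N_sleep for one aggregation round.

    delay is the aggregation delay D in timeslots, period is |W|,
    alpha is the number of active slots per working period.
    """
    periods = delay / period
    return {
        "tx": 1,
        "rx": num_children,
        "idle": alpha * periods - num_children,
        "sleep": (period - alpha) * periods - 1,
    }


def round_energy(counts, cost):
    """Assumed form: sum of count x per-activity cost over all activities."""
    return sum(counts[k] * cost[k] for k in counts)


def lifetime_rounds(initial_energy, energy_per_round):
    """Whole aggregation rounds the most loaded sensor can sustain."""
    return int(initial_energy // energy_per_round)


# Placeholder costs in arbitrary units (NOT the measured Table III values).
cost = {"tx": 1.0, "rx": 0.8, "idle": 0.5, "sleep": 0.01}

counts = activity_counts(num_children=3, delay=20, period=10, alpha=2)
energy = round_energy(counts, cost)
rounds = lifetime_rounds(100.0, energy)
```

Because N_idle and N_sleep both grow with D, a scheme with a shorter aggregation delay spends less energy per round under this model, which is the mechanism behind the ordering of the three schemes reported in the text.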
The operational viability of ANEX in real IoT sensor networks is confirmed through experimentation using a proof-of-concept testbed consisting of three Raspberry Pi boards and ultrasonic sensors. The initial results have confirmed that ANEX achieves faster data aggregation than a fixed scheduling scheme owing to its dynamic parent switching capability.

VI. CONCLUSION
This study proposes a centralized data aggregation scheduling scheme, the ANEX scheduling scheme, which consists of two main phases. The first phase involves constructing a minimal sleep delay tree, which establishes sender-receiver pairs based on the active time difference between neighboring sensor nodes. The sender-receiver pairs are determined by minimizing the sleep delay between a node and its neighbor in the upper layer. This proposed tree approach works effectively on duty-cycled networks where the sleep delay metric is well defined. In the second phase, a node scheduling algorithm is applied at each time step. Initially, candidate senders are selected based on the number of unscheduled neighbors, regardless of the active slots of their receivers. Among the selected candidates, a dynamic scheduling algorithm is employed. This algorithm allows senders to switch to new parents that are active at the current time or have fewer unscheduled neighbors. The parent switching mechanism only operates on duty-cycled networks and works more efficiently in dense networks using multiple channels. This approach enables sensor nodes to transmit data earlier to their new parents, increasing the number of concurrent transmissions and resulting in faster data aggregation.
The effectiveness of ANEX in achieving efficient data aggregation is demonstrated through theoretical analysis and extensive simulations, particularly in terms of aggregation delay compared to a modern approach. ANEX consistently achieves significantly lower data aggregation latency, especially in dense networks using at least two channels, where the improvement is up to 86% compared to the most recent scheme addressing the MC-DC-MLAS problem. In the future, we plan to explore the application of machine learning algorithms to solve the MLAS problem for both battery-powered and battery-free IoT sensors. Additionally, we aim to address energy-efficient data aggregation in IoT to prolong the network lifetime.

Fig. 2. Two types of interference in wireless communication (d I = d).(a) Primary interference at A. (b) Secondary interference at A.

Fig. 2(b) provides an example of this interference scenario. Node A has two neighbors, namely, B and C. While node A is receiving data from sender B, node C transmits data to its receiver D. Due to wireless communication, node A inadvertently overhears the data being transmitted by node C. If both B and C transmit data to their respective receivers (A and D) simultaneously, interference occurs at node A.

Fig. 3. Motivation example of dynamic scheduling. (a) Fixed scheduling based on the SPT tree needs six timeslots. (b) Dynamic scheduling, in which node B switches its parent from S to C at timeslot 0, needs four timeslots.

Fig. 4. Minimum sleep delay tree construction process. (a) Communication graph. (b) Tree construction for node B in layer $L_2$. (c) Tree construction for node G in layer $L_2$. (d) Complete tree.
(b)]. Fig. 4(c) presents nodes in layer $L_2$ examined individually in lexicographical order. Starting with node B, it has two neighbors in the previous layer, namely, A and D. The minimal sleep delays between B and these neighbors are $d_{\min}(B, A, |W|) = d_{\min}(B, A, 5) = 1$ and $d_{\min}(B, D, |W|) = d_{\min}(B, D, 5) = 1$. As $d_{\min}(B, A, 5) = d_{\min}(B, D, 5) = 1$, B adopts A as its parent by the lexicographical order. Node F adopts K as its parent, as K is its only neighbor in the previous layer. The minimal sleep delay values between node G and its previous-layer neighbors are $d_{\min}(G, C, 5) = 1$ and $d_{\min}(G, K, 5) = 2$. Therefore, G adopts C as its parent.
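The parent selection in this example can be reproduced with a short sketch. Two things are assumptions here, not details taken from the paper: the sleep delay is modeled as the forward distance modulo $|W|$ from a sender's active slot to a receiver's active slot (with a distance of 0 counted as a full period), and the active slots below are invented values chosen only so that the delays match those quoted in the text.

```python
def sleep_delay(sender_slots, receiver_slots, period):
    """Minimal forward wait from any sender slot to any receiver slot.

    Assumed model: delay = (b - a) mod period, with 0 treated as a
    full working period.
    """
    best = period
    for a in sender_slots:
        for b in receiver_slots:
            d = (b - a) % period
            if d == 0:
                d = period
            best = min(best, d)
    return best


def choose_parent(node, upper_neighbors, active_slots, period):
    """Pick the upper-layer neighbor with the smallest sleep delay.

    Ties are broken lexicographically, as in the worked example.
    """
    return min(
        upper_neighbors,
        key=lambda v: (sleep_delay(active_slots[node], active_slots[v], period), v),
    )


W = 5  # working period length |W| from the example
# Hypothetical active slots, chosen to reproduce the quoted delays.
active_slots = {"A": {1}, "D": {1}, "C": {3}, "K": {4},
                "B": {0}, "F": {1}, "G": {2}}

parent_B = choose_parent("B", ["D", "A"], active_slots, W)
parent_F = choose_parent("F", ["K"], active_slots, W)
parent_G = choose_parent("G", ["K", "C"], active_slots, W)
```

With these assumed slots, B ties between A and D at delay 1 and takes A by name order, F takes its only candidate K, and G takes C (delay 1) over K (delay 2).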

Fig. 6. Dynamic scheduling process. (a) Dynamic parent selection applied for node G. (b) Eligible senders with their respective receivers at timeslot 0. (c) Coloring method applied for transmission links formed by eligible senders with their respective receivers. (d) Complete schedule with parent changing strategy of all nodes.

Fig. 9. Aggregation delay with varying (a) number of nodes and (b) number of channels.

Fig. 9(a) illustrates the aggregation delay of LIRE and ANEX

Fig. 13. Effects of dynamic scheduling on aggregation delay. (a) Aggregation delay of dynamic and fixed scheduling approaches. (b) Percentage of nodes changing parents. (c) Node degrees before and after dynamic scheduling.

Fig. 16. Aggregation delay with the varying number of active slots and communication ranges. (a) Load distribution of dominators. (b) Effect of active slots. (c) Effect of communication range.
By induction, we prove that ANEX schedules all nodes in the network interference-free.
Theorem 3: The computation time complexity of the proposed approach ANEX is at most $O(|W||V|^2 + |W||F|)$.
1) At timeslot $\tau = 0$, Algorithm 2 first computes a candidate sender set $CA_0$. Then, the algorithm triggers Algorithm 3 to obtain eligible sender nodes $EL$ and corresponding receiver nodes $P$ for scheduling. Primary interference is avoided in transmissions between nodes in $EL$ and nodes in $P$, i.e., $\forall v \in P,\ \exists!\, u \in EL \mid p(u) = v$. Then based on the channel set $F$, Algorithm 3 applies [...] $t$, i.e., $CA_{t+1} \cap CA_t = \emptyset$. Algorithm 3 eliminates primary interference and obtains the eligible sender set $EL$. After that, the algorithm applies the coloring method on nodes in $EL$ to avoid secondary interference. Therefore, the set of scheduled nodes at timeslot $\tau = t + 1$ is interference-free and distinct from the set of scheduled nodes at timeslot $\tau = t$.
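The role of the coloring step in the argument above can be illustrated with a generic greedy channel assignment. This sketch is an assumption-laden stand-in, not the coloring method of Algorithm 3: it takes links that are already free of primary interference, treats channels as colors, and postpones any link for which no conflict-free channel remains. The name `assign_channels` and the toy topology are hypothetical.

```python
def assign_channels(links, neighbors, num_channels):
    """Greedily assign a channel to each (sender, receiver) link.

    Two links conflict (secondary interference) if one link's receiver
    is a neighbor of the other link's sender. Conflicting links must
    use different channels; links that cannot be colored are left out
    of the result, i.e., postponed to a later timeslot.
    """
    def conflicts(l1, l2):
        (s1, r1), (s2, r2) = l1, l2
        return r1 in neighbors[s2] or r2 in neighbors[s1]

    assigned = {}
    for link in links:
        # Channels already taken by links that conflict with this one.
        used = {ch for other, ch in assigned.items() if conflicts(link, other)}
        free = [ch for ch in range(num_channels) if ch not in used]
        if free:
            assigned[link] = free[0]
    return assigned


# Toy topology (assumed): A neighbors B and C; C also neighbors D.
neighbors = {"A": {"B", "C"}, "B": {"A"}, "C": {"A", "D"}, "D": {"C"}}
links = [("B", "A"), ("C", "D")]

one_channel = assign_channels(links, neighbors, 1)
two_channels = assign_channels(links, neighbors, 2)
```

With a single channel only B→A is scheduled and C→D is postponed, whereas with two channels both links transmit in the same timeslot on different channels, which is consistent with the observation that the scheme benefits from multiple channels.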

TABLE IV
STATISTICS VALUES FOR ENERGY CONSUMPTION IN DIFFERENT SCHEDULING SCHEMES

applying different scheduling schemes in Table IV based on Fig. 16(a)-(c).