Machine Learning for Disseminating Cooperative Awareness Messages in Cellular V2V Communications

This paper develops a novel Machine Learning (ML)-based strategy to distribute aperiodic Cooperative Awareness Messages (CAMs) through cellular Vehicle-to-Vehicle (V2V) communications. Each vehicle employs an ML algorithm to forecast its future CAM generation times; the vehicle then autonomously selects the radio resources for message broadcasting on the basis of this forecast. This action is combined with a careful analysis of the radio resources available for transmission, which identifies the subchannels where collisions might occur so that they are not selected. Extensive simulations show that the CAMs' temporal pattern is predicted with excellent accuracy. Exploiting this knowledge in the radio resource assignment strategy, and carefully identifying idle resources, makes it possible to outperform the legacy LTE-V2X Mode 4 in all respects.


I. INTRODUCTION AND STATE OF THE ART
Present days witness an increased and widespread sensitivity to road safety and sustainable transport. Day 1 safety applications are already present on vehicles, to increase space awareness and grant the car and its driver more time to react to unexpected situations. Safety will be further improved by upcoming applications, whose distinctive feature is to rely on vehicular communications. The onset of vehicular networking represents a major turning point, as it lays the basis for Day N services, where fully autonomous and cooperative driving turn into reality, and the goal of secure and more environment-friendly transport is accomplished.
In the field of vehicular communications, Long Term Evolution Vehicle-to-Everything (LTE-V2X) is the current cellular standard, and Mode 4 represents the baseline approach for safety services, as its communications occur with no network assistance. The performance of Mode 4 distributed radio resource selection and scheduling has been investigated by numerous works [1]-[6]. Recently, some investigations outlined that LTE-V2X falls short when dealing with aperiodic, unpredictable packet flows [7], [8], and also struggles when transmitting aperiodic messages of variable size [10]. New Radio (NR)-V2X, the LTE-V2X evolution in the fifth generation (5G) of cellular networks, inherits the majority of LTE-V2X core choices and, as a consequence, the question of how to schedule aperiodic traffic remains unanswered.
The current work intends to offer a contribution in this domain, taking a fresh look at the problem of aperiodic safety message dissemination. It concentrates on the main traffic type that LTE-V2X was designed to deliver, namely, Cooperative Awareness Messages (CAMs), application-layer packets standardized by the European Telecommunications Standards Institute (ETSI), and it proposes to harness Machine Learning (ML) to effectively broadcast such messages.
As a matter of fact, ML has recently stirred an unprecedented interest and consensus in numerous wireless settings. This major branch of artificial intelligence is often seen as the appropriate tool to pick the lock of complex problems encountered in, e.g., radio resource allocation and optimization; with no ambition for completeness, [11]-[13] represent captivating examples in the field. The survey in [14] offers an excellent portrayal of the recent ML applications to the specific domain of vehicular networks. However, to the authors' knowledge, none of the studies in the field has explored the adoption of ML in V2V safety communications.
This study proposes to interpret the aperiodic CAM sequence as a series of sub-sequences that are periodic over a short time scale, and to rely on ML to forecast the sub-sequence length and periodicity. Then, the idea is to tailor LTE-V2X radio resource reservations so as to fit the period and length of the single sub-sequence forecast by ML, reducing the risk of future collisions. Additionally, the identification of free radio resources performed by LTE-V2X (and NR-V2X) is modified and made more effective. The main outcomes of this work are as follows:
• ML achieves an excellent accuracy in predicting the temporal patterns of CAMs;
• the new, ML-enhanced scheduling of resources outperforms LTE-V2X Mode 4 in all respects, warranting higher rates of packet delivery, fewer collisions and better channel utilization.
When the literature on CAM distribution is explored, it is worth recalling [15], which determined an accurate broadcasting threshold and broadcasting interval, as well as [16], which proposed a novel triggering condition based on the road radius, assumed as a risk indicator. Moreover, the authors of [17] explored a simple mechanism to confine the queueing delays suffered by CAMs when they coexist with different traffic types. In the above-referenced papers, the authors intervened beforehand, modifying the generation pattern of CAM traffic.
As opposed to such contributions, the current work forecasts when next CAMs will be generated in accordance with the ETSI EN 302 637-2 standard [18] and reserves resources accordingly, modifying the LTE-V2X scheduling in a very effective manner.
An alternative approach was taken in [19], where CAMs were compressed to reduce the channel load, and in [20], where the LTE-V2X resource allocation was tuned to the different sizes that data packets exhibit. In the last two references, the issue of radio resource assignment was considered; yet, in the former contribution, the authors themselves pointed out that compressing and decompressing is time consuming; as regards the latter paper, it is observed that the size of CAM messages is not known a priori and can vary widely, which prevents an effective adoption of the second solution. On a different note, both [21] and [22] considered message delivery for cooperative awareness, but focused on Carrier Sense Multiple Access/Collision Avoidance (CSMA/CA), the access strategy adopted in the Medium Access Control (MAC) sublayer of 802.11p. Namely, [21] considered a simplified, periodic model for CAM traffic and leveraged full-duplex transceivers; [22] highlighted the impact of realistic mobility patterns on the 802.11p operation. On the contrary, our contribution is centered on LTE-V2X, the competing standard; it is the latter that represents the actual term of comparison when assessing the behavior of the newly proposed, ML-based scheduling scheme.
Additional references are represented by [23] through [26]: the authors of [23] examined LTE-V2X Mode 3, hence the scenario where the eNodeB controls the allocation of resources to V2V communications; the authors of [24] investigated a centralized multicast/broadcast approach too. Conversely, our solution is totally decentralized, as LTE-V2X Mode 4 mandates. The study in [25] addressed the design of V2V communications and employed the sub-6 GHz band exclusively for the control plane, whereas the data plane was positioned at mmWave frequencies. Similarly to [25], [26] considered mmWave communications and allowed for multihop transmissions among vehicles. On the contrary, the current investigation is sub-6 GHz centered and examines single-hop transmissions, adhering to the standard guidelines for cellular vehicular communications. Within this framework, [7] and [8] already highlighted how the non-ideal periodicity of packet generation affects the operation and performance of C-V2X in LTE; 5G vehicular communications were studied in [9] and similar conclusions were drawn with reference to aperiodic traffic. Here, our former studies are continued and a new research path is opened, as:
• it is asked whether ML can help in serving aperiodic traffic in LTE-V2X, given the latter is a recognized benchmark for safety communications in a vehicular environment;
• a largely positive answer is provided.
The LTE-V2X standard is therefore enhanced with a mechanism to predict when CAMs will be generated and when to reserve radio resources on the time-frequency grid of LTE.
The rest of the paper is organized as follows. In Section II, the main features of LTE-V2X are recalled, along with the challenges that the standard Mode 4 faces in the dissemination of aperiodic traffic. The generation rules of CAMs and their intrinsic aperiodicity are also discussed. In Section III, the ML-based policy to accommodate aperiodic CAM traffic on the time-frequency grid that LTE-V2X adopts is presented in detail. In Section IV, the metrics to evaluate the performance of any radio resource assignment strategy in a vehicular environment are introduced. In Section V, the simulation results are presented and the conclusions are drawn in Section VI.

A. LTE-V2X in Release 14
The LTE-V2X solution for vehicular communications has been standardized by the Third Generation Partnership Project (3GPP) in Release 14. Also known as Cellular Vehicle-to-Everything (C-V2X), this technology was mainly designed to disseminate CAMs, Decentralized Environmental Notification Messages (DENMs) and Basic Safety Messages (BSMs), and therefore to allow the development of a first, fundamental set of safety applications. In order to support vehicular communications in both in-coverage and out-of-coverage scenarios, LTE-V2X introduced two different resource allocation schemes known as Mode 3 and Mode 4. Mode 3 delegates the selection of collision-free radio resources to the evolved Node B (eNB), which coordinates the assignment of resources to all vehicles under cellular coverage. However, safety-critical applications cannot depend on the availability of the cellular infrastructure; hence, Mode 4 has been designed to allow vehicles to select resources via an autonomous and distributed approach that requires no eNB assistance.
In LTE-V2X Mode 4, vehicles communicate over a 10 or 20 MHz wide channel located in the 5.9 GHz Intelligent Transport System (ITS) band. At the physical layer, Orthogonal Frequency Division Multiplexing (OFDM) is employed with a fixed 15 kHz subcarrier spacing, and transmission resources are arranged over the time-frequency resource grid exemplified in Fig. 1. The time unit is the subframe, whose duration is $t_s = 1$ ms, whereas the basic frequency unit is the Resource Block (RB), 180 kHz wide. A group of adjacent RBs within the same subframe is called a subchannel. In LTE-V2X, every packet is encapsulated within a Transport Block (TB), whose transmission requires a different number of subchannels, depending on the TB size. Moreover, the transmission of each TB is complemented by the corresponding Sidelink Control Information (SCI), which contains decoding-critical information and is transmitted over two RBs, which are frequency-adjacent to the associated TB.
In Release 14, the Mode 4 resource allocation mechanism has been mainly tailored to serve periodic traffic. This is manifest in the Sensing-based Semi-Persistent Scheduling (SSPS) algorithm that the vehicles adopt for the distributed selection of transmission resources. The outcome of the SSPS mechanism is the selection, and reservation, of a collision-free Single-Subframe Resource (SSR), defined as the set of subchannels able to accommodate the transmission of the TB and of its associated SCI. Let us indicate the vehicle that needs to transmit a message and runs the SSPS algorithm as the ego-vehicle. The steps that it goes through are the following:
1. List creation: in the first phase, the ego-vehicle focuses on the Candidate Single-subframe Resources (CSRs) included within the selection window, $W$. As Fig. 1 indicates, the selection window is the interval that goes from the time the packet is ready for transmission up to its latency deadline, dependent on the Packet Delay Budget (PDB). The ego-vehicle exploits the channel status information collected during the previous 1000 subframes, the so-called sensing window $S$, to learn which resources in $W$ are already reserved by other vehicles. The ego-vehicle therefore builds a list, $L_1$, removing from the selection window the CSRs that satisfy the following two conditions: (i) the ego-vehicle has received an SCI indicating that the CSR will be used by another vehicle; (ii) the Reference Signal Received Power (RSRP) averaged over the RBs of the examined CSR is higher than a given threshold. Such threshold is a configurable parameter and its value is iteratively increased by 3 dB until list $L_1$ includes at least 20% of the initial CSRs. Last, the ego-vehicle builds a second list, $L_2$, including the top 20% of the CSRs in $L_1$ with the lowest average Received Signal Strength Indicator (RSSI). The RSSI value is averaged in a periodic fashion over the 10 previous occurrences of the examined CSR, equally spaced by 100 ms.
2. Resource Selection and Reservation: in the second phase, the ego-vehicle randomly selects an SSR from list $L_2$ and also randomly sets the reselection counter $C_{resel}$ in $[C_{min}, C_{max}]$, indicating the consecutive number of times the resource will be reserved. For a packet periodicity $T \geq 100$ ms, $C_{min} = 5$ and $C_{max} = 15$ [27]. The time interval between consecutive reservations is termed Resource Reservation Interval (RRI), and it matches the packet generation period $T$, RRI $= T$. After each transmission, the reselection counter is decremented by one; when it expires, the SSPS algorithm is invoked again with probability $1 - P$, $P \in [0, 0.8]$, as indicated in [28].
Once the SSR has been selected, the ego-vehicle broadcasts the TB and the SCI, the latter including the RRI value. Neighboring vehicles are informed that the ego-vehicle intends to employ the same SSR for the next transmission after RRI  ms, and avoid using that resource. If the ego-vehicle does not maintain the current reservation when the reselection counter expires, it notifies others by setting the RRI in the SCI equal to 0. Fig. 1 visually summarizes the relevant elements of the SSPS algorithm.
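The two SSPS phases recalled above can be illustrated with a minimal Python sketch. It is not the standard's reference procedure: the dictionary keys, the function names and the choice of sizing $L_2$ as 20% of the initial CSRs are assumptions made only for illustration, and the RSSI average is taken as already computed.

```python
import random


def build_candidate_lists(csrs, rsrp_threshold_dbm, keep_fraction=0.2):
    """Return (l1, l2, final_threshold) for the CSRs in the selection window.

    csrs is a list of dicts with hypothetical keys:
      'id', 'reserved' (an SCI announced its use), 'rsrp_dbm', 'avg_rssi_dbm'.
    """
    if not csrs:
        raise ValueError("the selection window holds no candidate resources")
    min_size = keep_fraction * len(csrs)
    threshold = rsrp_threshold_dbm
    while True:
        # A CSR is excluded only if BOTH conditions hold:
        # it is reserved by another vehicle and its RSRP exceeds the threshold.
        l1 = [c for c in csrs
              if not (c["reserved"] and c["rsrp_dbm"] > threshold)]
        if len(l1) >= min_size:
            break
        threshold += 3.0  # raise the threshold by 3 dB and repeat
    # Assumption: L2 holds as many CSRs as 20% of the initial candidates,
    # picked among those in L1 with the lowest average RSSI.
    l2_size = max(1, round(min_size))
    l2 = sorted(l1, key=lambda c: c["avg_rssi_dbm"])[:l2_size]
    return l1, l2, threshold


def select_resource(l2, rng, c_min=5, c_max=15):
    """Randomly pick an SSR from L2 and draw the reselection counter."""
    return rng.choice(l2), rng.randint(c_min, c_max)
```

The loop always terminates for finite RSRP values, because a sufficiently high threshold excludes no CSR at all.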

B. Impact of Aperiodic Traffic on Mode 4
When periodic traffic is examined, the RRI setting is a simple task, as the RRI has to match the packet generation period T . Depending on the value of the reselection probability P , Mode 4 is forced to select new resources only when C resel expires; following the vocabulary in [10], this is an event termed counter reselection throughout this work. Note that the number of counter reselections a vehicle performs depends on the reselection probability P and on the average C resel value.
However, when aperiodic traffic is considered, additional and unforeseen resource reselections can be triggered. Specifically, when resources are reserved with an RRI larger than the current packet inter-arrival time, then the so-called latency reselections [10] occur.
The situation is exemplified in Fig. 2(a): here, it is assumed that at $t_{gen1}$ an incoming packet triggers a counter reselection: the next two SSRs are reserved at $t_{res1}$ and $t_{res2}$, with $t_{res2} = t_{res1} + RRI$. Then, let the next packet be generated at $t_{gen2}$, with $t_{gen2} < t_{res2}$ but $t_{res2} - t_{gen2} > PDB$; it follows that the reserved resource is not able to cope with the packet latency deadline. Therefore, a latency reselection is triggered at $t_{gen2}$, and a new set of subchannels is selected and reserved at time $t_{res3}$, replacing the original reservation. Latency reselections should be avoided as much as possible, as they increase the probability of packet collisions.
Aperiodic traffic is also responsible for the phenomenon of unused reservations [10], which are observed when resources are reserved with an RRI lower than the current packet inter-arrival time. This circumstance is illustrated in Fig. 2(b), where the packet generated at $t_{gen1}$ triggers the reservation of resources at $t_{res1}$, $t_{res2}$ and $t_{res3}$, with $t_{res2} = t_{res1} + RRI$ and $t_{res3} = t_{res1} + 2RRI$. However, the second packet is generated at $t_{gen2} > t_{res2}$, hence leaving the reservation at $t_{res2}$ idle.
Unused reservations negatively affect Mode 4 performance in two different ways: first, a fraction of the overall system capacity is wasted, as the reserved resources are not utilized by either the ego-vehicle or the neighboring vehicles. Second, as shown in Fig. 2(b), the unused reservation at $t_{res2}$ does not allow the ego-vehicle to broadcast the corresponding SCI and announce the next reservation at $t_{res3}$; the resources employed by the ego-vehicle at $t_{res3}$ are therefore sensed as free by nearby users, increasing the risk of packet collisions.
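The two conditions illustrated in Fig. 2 can be condensed into a short Python sketch. The function name and the returned labels are hypothetical; treating a packet generated exactly at the reserved instant as served is also an assumption, since the text does not discuss that boundary case.

```python
def classify_reservation(t_gen_ms, t_res_ms, pdb_ms):
    """Classify the next reservation against the next packet generation time.

    t_gen_ms: generation time of the next packet
    t_res_ms: time of the next reserved resource
    pdb_ms:   Packet Delay Budget
    """
    if t_gen_ms > t_res_ms:
        # The packet arrives after the reserved resource: the reservation is idle.
        return "unused_reservation"
    if t_res_ms - t_gen_ms > pdb_ms:
        # The packet would wait longer than its latency deadline allows.
        return "latency_reselection"
    return "served"


# Example with RRI = 300 ms and a first reservation at 10 ms,
# so that the next reservation falls at 310 ms.
t_res2 = 10 + 300
outcomes = [classify_reservation(t, t_res2, pdb_ms=100) for t in (150, 250, 400)]
```

In the example, the packet generated at 150 ms would have to wait 160 ms, beyond the 100 ms budget, whereas the one generated at 400 ms leaves the reservation at 310 ms unused.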
To summarize, the RRI configuration is a key element for the proper operation of Mode 4 SSPS mechanism. Ideally, the RRI value should match the time pattern of the traffic profile, therefore varying over time. However, this task cannot be accomplished when aperiodic traffic is considered, and the inevitable mismatch can severely affect Mode 4 communication effectiveness. In this regard, the authors in [7] and [10] showed that the performance of LTE-V2X is degraded to a remarkable extent, when aperiodic messages are considered.
To the authors' knowledge, NR-V2X has not identified a solution to cope with aperiodic traffic. The question of how to accommodate such traffic type therefore remains open and it is addressed by the current work in the case of aperiodic CAM dissemination. To this aim, the next Section will elaborate on CAMs; the goal is to substantiate that CAMs are aperiodic, but their occurrence pattern can be forecast.

C. ETSI-Generated CAM Sequences
CAMs are facility-layer packets devised to regularly broadcast and exchange information among vehicles, and between vehicles and the roadside infrastructure. They represent the fundamental elements to build road safety and traffic efficiency applications [18]. When initially investigating LTE-V2X performance, CAM occurrences were modeled as periodic packets [29], a choice that perfectly suits the use of Mode 4. However, the standard algorithm for the generation of CAMs released by ETSI [18] indicates that the inter-arrival time between consecutive messages, $T_{GenCAM}$, is variable. Its duration strongly depends on the vehicle dynamics: if the vehicle modifies its trajectory, or if its speed or acceleration/deceleration are sufficiently high, then $T_{GenCAM}$ shortens and CAMs become more frequent. In greater detail, the ETSI algorithm defines the upper and lower bounds for $T_{GenCAM}$, namely:
$$T_{GenCAM\_Min} \leq T_{GenCAM} \leq T_{GenCAM\_Max},$$
where $T_{GenCAM\_Min} = 100$ ms and $T_{GenCAM\_Max} = 1000$ ms, the latter also representing the default value for the generation period. Within such limits, CAMs are triggered depending on the transmitting vehicle dynamics, which have to be sampled every $T_{CheckGenCAM}$ milliseconds, $T_{CheckGenCAM} \leq T_{GenCAM\_Min}$. The typical setting is $T_{CheckGenCAM} = T_{GenCAM\_Min} = 100$ ms. Specifically, a new CAM shall be immediately generated every time one of the following conditions is met [18]:
• the absolute difference between the current heading and the heading included in the previous CAM is greater than 4°;
• the absolute difference between the current speed and the speed included in the previous CAM is greater than 0.5 m/s;
• the time elapsed since last CAM generation is equal to or greater than $T_{GenCAM\_Max}$.
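As an illustration, the triggering conditions listed above can be expressed as a small Python check, to be evaluated at every $T_{CheckGenCAM}$ sampling instant. This is a sketch limited to the conditions quoted in the text, not a full implementation of [18]; the function names are hypothetical, and the wrap-around handling of the heading (e.g., 358° versus 1°) is an assumption.

```python
T_GEN_CAM_MIN_MS = 100
T_GEN_CAM_MAX_MS = 1000


def heading_difference_deg(a_deg, b_deg):
    """Smallest absolute angle between two headings (assumed wrap-around at 360)."""
    d = abs(a_deg - b_deg) % 360.0
    return min(d, 360.0 - d)


def cam_triggered(heading_deg, speed_mps, last_heading_deg, last_speed_mps,
                  elapsed_ms):
    """Return True if a new CAM has to be generated at this sampling instant.

    last_* are the values included in the previous CAM;
    elapsed_ms is the time since the previous CAM was generated.
    """
    if elapsed_ms < T_GEN_CAM_MIN_MS:
        return False  # lower bound on the inter-arrival time
    return (
        heading_difference_deg(heading_deg, last_heading_deg) > 4.0
        or abs(speed_mps - last_speed_mps) > 0.5
        or elapsed_ms >= T_GEN_CAM_MAX_MS
    )
```

A vehicle at standstill satisfies only the third condition, which is why its CAMs are spaced by 1000 ms.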
Besides T GenCAM specifications, [18] details the mandatory and optional fields in a CAM, allowing for variable size messages. The rules of the standard lead to CAM traffic which in most of the cases exhibits aperiodic inter-arrival times and variable CAM sizes. The last remark is well documented by the experimental survey in [30], which offers an analysis of CAMs collected during actual test-drives. The study reveals that CAM inter-arrival time often changes from one message to the next, that its distribution is very diverse, and heavily dependent on the drive scenario (urban, suburban, or highway). Similar conclusions hold for the size variability of CAMs. The correlation between CAM inter-arrival times and the vehicle behavior is exemplified in Fig. 3, where the temporal sequence of T GenCAM values, i.e., a CAM trace, and the vehicle speed are reported as a function of the T GenCAM sample index. These patterns refer to a vehicle moving along a straight trajectory, that initially decelerates until a complete stop, at T GenCAM index = 4, and then starts to accelerate again from T GenCAM index = 23, causing T GenCAM to accordingly vary. Fig. 3 shows that when the vehicle decelerates (accelerates) in the first (last) portion of the CAM trace, CAMs are more frequently issued. On the other hand, T GenCAM settles at 1000 ms when the vehicle stops, in the central portion of the trace. Here, variations in heading, position, or speed are not sufficiently large for generating a CAM before the timeout condition, T GenCAM Max = 1000 ms, occurs. Such a simple, yet exemplary instance is extracted from a wider measurement campaign we performed in different settings [31].
The strong correlation between CAM inter-arrival times and vehicle dynamics suggests that the adoption of ML can be beneficial to predict the temporal evolution of CAM sequences and, in turn, lead to an effective reservation policy of radio resources. Indeed, a carefully chosen set of input features, that the vehicle locally retrieves, can be used to feed an ML algorithm, producing the desired outcome, i.e., when next CAMs are likely to occur.
The following Section will therefore illustrate a novel approach to deliver aperiodic CAMs, removing the intrinsic inefficiencies that plague the original Mode 4.

III. THE PREDICTIVE RESERVATION FRAMEWORK
Subsection II-B highlighted the mismatch between aperiodic traffic and the periodic reservation strategy of Mode 4, causing the undesired phenomena of latency reselections and unused reservations. Moreover, Sec. II-C dwelled on the aperiodicity of actual CAM sequences, suggesting that their temporal evolution can be successfully predicted.
The key proposal of this paper is therefore the following: adopt ML to forecast what the next T GenCAM value will be, and how many occurrences of it will appear. Next, exploit ML prediction to set: (i) the resource reservation interval RRI; (ii) the reselection counter C resel , whose value is no longer randomly chosen, rather, it exactly matches the number of occurrences forecast by ML.
Additionally, the current study significantly intervenes in the list creation phase of the original SSPS algorithm. As better explained in the next subsection, it builds a more reliable list of available candidate resources than the one produced by the legacy SSPS.

A. Modified SSPS Implementation
In our proposed solution, resources are drawn from list $L_1$, as opposed to list $L_2$. As a matter of fact, in the presence of aperiodic traffic, $L_2$ is not as meaningful as when vehicles periodically generate packets. It is no coincidence that NR-V2X will no longer use $L_2$ [32]. Moreover, our proposal sets the selection window $W = 100$ ms, the minimum CAM inter-arrival time, to avoid broadcasting out-of-sequence messages. To better understand the last statement, recall that CAM inter-arrival times can take on any value in the $\{100, 200, \ldots, 1000\}$ ms set; hence, if the selection window $W$ is wider than 100 ms, the $(j+1)$-th CAM might be transmitted before the $j$-th, an event that has to be prevented.
An additional and meaningful modification concerns the list creation phase of the original SSPS algorithm. Given the CAM selection window W is 100 ms wide and that the RRI is dynamically determined via ML, observe that not all ongoing reservations fall within W and are spotted by the ego-vehicle. It follows that the original SSPS list creation mechanism loses effectiveness, increasing the risk of packet collisions.
For this reason, we propose a new version of the SSPS process leading to the creation of list $L_1$, that we name the look-ahead SSPS version. This SSPS reworking requires that the SCI also includes the current $C_{resel}$ value, in addition to RRI. It is a minimal modification with respect to the choice of the legacy algorithm, necessitating very few bits. Yet, it remarkably extends the collision-avoidance capability of the original SSPS algorithm, as the numerical results will show. As a matter of fact, if the ego-vehicle exploits the knowledge of the reselection counter of nearby vehicles, it can detect potential collisions for any possible combination of the resource reservation intervals used by itself and by its neighbors. To further clarify such enhanced capability of identifying collisions, Figs. 4(a)-(c) exemplify the SSPS operation in three different scenarios. In these figures, the candidate resources examined by the ego-vehicle are represented in green, the subchannels in use by the generic neighboring vehicle are indicated in red, the selection window in yellow. Furthermore, $RRI_{TX}$ indicates the resource reservation interval adopted by the ego-vehicle, whereas $RRI_{RX}$ represents the reservation interval of the generic nearby vehicle, heard by the ego-vehicle in the SCI it receives. In Fig. 4(a), the candidate resource examined by the ego-vehicle is immediately excluded, as it coincides with the reservation placed by the nearby vehicle inside the selection window. The collision is avoided in the case exemplified in Fig. 4(b) too, as the ego-vehicle also verifies if any of its future reservations outside of the selection window coincides with the very next resource reserved by the nearby vehicle. Fig. 4(c) portrays an instance where the reservation heard by the ego-vehicle is not included within its selection window and $RRI_{RX}$ is lower than $RRI_{TX}$.
In this case, the original SSPS algorithm cannot detect the future collision, as the ego-vehicle is exclusively aware of the first reservation from the neighboring vehicle, after $RRI_{RX}$ ms, and therefore does not exclude the examined resource. Here, the future collision would be spotted only if the ego-vehicle additionally knew the remaining number of ongoing reservations, i.e., the current $C_{resel}$ value of the nearby vehicle, in addition to the periodicity of ongoing transmissions. Our look-ahead version of SSPS proposes to exploit the $C_{resel}$ knowledge and performs this further check. Therefore, it creates a smaller, yet more reliable $L_1$ list, detecting and avoiding all the potential collisions exemplified in Figs. 4(a)-(c).
Transactions on Vehicular Technology
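The difference between the legacy check and the look-ahead check can be sketched in Python as follows. The sketch works on the time axis only and assumes that both vehicles would occupy the same subchannels; the function names are hypothetical, and counting the next reservation among the $C_{resel}$ remaining ones is an assumption.

```python
def reservation_times(first_ms, rri_ms, count):
    """Instants of `count` consecutive reservations spaced by `rri_ms`."""
    return {first_ms + k * rri_ms for k in range(count)}


def legacy_check(candidate_ms, rri_tx_ms, c_resel_tx, next_rx_ms):
    """Legacy behaviour: only the very next neighbor reservation is known."""
    ego = reservation_times(candidate_ms, rri_tx_ms, c_resel_tx)
    return next_rx_ms in ego


def look_ahead_check(candidate_ms, rri_tx_ms, c_resel_tx,
                     next_rx_ms, rri_rx_ms, c_resel_rx):
    """Look-ahead behaviour: the neighbor's C_resel, carried in the SCI,
    exposes all of its remaining reservations."""
    ego = reservation_times(candidate_ms, rri_tx_ms, c_resel_tx)
    neighbor = reservation_times(next_rx_ms, rri_rx_ms, c_resel_rx)
    return bool(ego & neighbor)


# A case in the spirit of Fig. 4(c): the neighbor's next reservation (150 ms)
# lies outside the 100 ms selection window, and RRI_RX < RRI_TX.
missed = legacy_check(50, 300, 3, 150)
caught = look_ahead_check(50, 300, 3, 150, 100, 5)
```

In this example the ego-vehicle would transmit at 50, 350 and 650 ms, while the neighbor holds reservations at 150, 250, 350, 450 and 550 ms: the overlap at 350 ms is visible only to the look-ahead check.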

B. Machine Learning to Predict CAM Sequences
When the proposed strategy enters the resource selection and reservation phase, the first step that the ego-vehicle accomplishes is to forecast through ML the very next $T_{GenCAM}$ value, as well as the length of the next sequence of identical $T_{GenCAM}$ inter-arrival times. To do so, ML explores a large set of CAM traces to identify correlation patterns between some user-defined input features and the CAM traces. Then, such knowledge is leveraged to anticipate future CAM inter-arrival times [33]. In this work, the set of input features taken into account are:
• trajectory, current speed and position of the ego-vehicle;
• current speed and position of the vehicle immediately preceding the ego-vehicle.
We choose to predict the next CAM inter-arrival time through the k-Nearest Neighbors (KNN) ML algorithm, an instance-based learning technique used for both regression and classification problems. KNN simply stores the training data without attempting to infer a general structure out of them. Moreover, KNN is inherently designed for multi-class problems and its classification consists in assigning the input features the most common label, i.e., next predicted $T_{GenCAM}$ value, among the k nearest neighbors.
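To make the classification rule concrete, a minimal KNN classifier can be written in a few lines of plain Python. This is only a sketch: the Euclidean distance, the value of k, the two-dimensional feature vectors and the toy training samples are all assumptions chosen for illustration, and they do not reproduce the feature set or the traces used in this work.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the most common label among the k training samples nearest to `query`.

    train_features: list of numeric tuples (hypothetical feature vectors)
    train_labels:   list of T_GenCAM values in ms, one per training sample
    """
    ranked = sorted(
        (math.dist(x, query), label)
        for x, label in zip(train_features, train_labels)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Made-up toy data: (ego speed, preceding vehicle speed) in m/s.
features = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3),
            (10.0, 9.0), (11.0, 10.0), (12.0, 11.0)]
labels = [1000, 1000, 1000, 100, 100, 200]

slow = knn_predict(features, labels, (0.1, 0.1))
fast = knn_predict(features, labels, (11.0, 10.0))
```

No model is fitted: the training samples are simply stored and ranked by distance at prediction time, which is what makes KNN an instance-based technique.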
The second action of the ego-vehicle is to dynamically set the (RRI, C resel ) pair employed by the SSPS strategy in accordance to the ML forecast.
In greater detail, whenever SSPS triggers the resource selection and reservation phase, Algorithm 1 is invoked. The algorithm exploits KNN to predict the very next $T_{GenCAM}$ value, $T_{GenCAM_1}$, and sets RRI equal to it, that is, RRI $= T_{GenCAM_1}$, while $C_{resel}$ is initially set equal to 1. Then, as long as the next predicted inter-arrival time $T_{GenCAM_{i+1}}$ coincides with the previous $T_{GenCAM_i}$, the algorithm keeps incrementing the estimate of the reselection counter $C_{resel}$. Furthermore, when the KNN outcome indicates that more than 3 consecutive CAM inter-arrival times will display the same value, the actual reselection counter value is randomly selected within the $[cntr_{min}, cntr_{max}]$ interval, where $cntr_{min} = 3$ and $cntr_{max}$ is the current $C_{resel}$ estimate. This expedient avoids repeated packet collisions on resources reserved by different vehicles, a circumstance that might occur when vehicles generate CAMs with the same periodicity, e.g., in a congested intersection. The output of Algorithm 1 is finally used to set RRI and the reselection counter $C_{resel}$ that indicates how many times the selected resource is reserved. Note that there is a maximum allowed value for $C_{resel}$, indicated by $N$. Moreover, observe that inequality $C_{resel} \geq 1$ reveals that at least one reservation has to be placed.
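The logic described above can be sketched in Python as follows. The sketch is a reading of the textual description, not a transcription of Algorithm 1: the function name is hypothetical, and the predictions are assumed to be available as a list of successive $T_{GenCAM}$ forecasts rather than obtained by repeated KNN calls.

```python
import random


def set_rri_and_counter(predicted_t_gen_cam_ms, n_max, rng=None):
    """Return the (RRI, C_resel) pair derived from the forecast sequence.

    predicted_t_gen_cam_ms: non-empty list, first element is T_GenCAM_1
    n_max: maximum allowed value for C_resel (N in the text)
    rng:   optional random.Random instance, for reproducibility
    """
    rng = rng or random
    rri = predicted_t_gen_cam_ms[0]
    c_resel = 1  # at least one reservation is always placed
    for next_value in predicted_t_gen_cam_ms[1:]:
        if next_value != rri or c_resel >= n_max:
            break
        c_resel += 1
    if c_resel > 3:
        # Randomize within [cntr_min, cntr_max] = [3, current estimate],
        # so that vehicles with the same periodicity do not stay aligned.
        c_resel = rng.randint(3, c_resel)
    return rri, c_resel
```

For instance, the forecast `[300, 300, 200]` yields RRI = 300 ms and two reservations, whereas a long run of identical 100 ms forecasts yields a counter drawn between 3 and the run length, capped by `n_max`.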
The overall flowchart of the proposed solution, termed KNN-look ahead from now onward, is reported in Fig.5.

IV. KEY PERFORMANCE INDICATORS
When assessing the performance of a vehicular radio access solution, there are several Key Performance Indicators (KPIs) that are worth being considered.
One of the most widely adopted is the Packet Reception Ratio (PRR). Its definition relies on the notion of distance slice; the $i$-th distance slice is defined as the set of transmitter-receiver distances that fall within the $(a_i, b_i]$ range, $a_i = i \cdot 20$ m and $b_i = (i + 1) \cdot 20$ m. For the $i$-th slice, the PRR is defined as [29]:
$$PRR_i = \frac{\sum_{j=1}^{M} X_i^j}{\sum_{j=1}^{M} Y_i^j},$$
where $X_i^j$ indicates the number of vehicles within the $i$-th slice that successfully received the $j$-th packet, $Y_i^j$ is the number of vehicles within the $i$-th slice when the $j$-th packet was transmitted and $M$ denotes the number of packets generated during the simulation. The PRR is a reliability indicator, quantifying the probability that the message being broadcast by a vehicle can be heard at a given distance slice.
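As an illustration, the per-slice PRR can be computed from a flat log of reception outcomes. The sketch below assumes that each log entry is a (transmitter-receiver distance, success flag) pair, one per packet and per potential receiver; this log format and the function names are hypothetical.

```python
import math
from collections import defaultdict


def slice_index(distance_m, width_m=20.0):
    """Index i of the slice (i*width, (i+1)*width] containing the distance."""
    if distance_m <= 0:
        raise ValueError("distance must be strictly positive")
    return math.ceil(distance_m / width_m) - 1


def packet_reception_ratio(log, width_m=20.0):
    """Map each slice index to (successful receptions) / (potential receivers)."""
    received = defaultdict(int)
    present = defaultdict(int)
    for distance_m, success in log:
        i = slice_index(distance_m, width_m)
        present[i] += 1
        received[i] += int(success)
    return {i: received[i] / present[i] for i in present}


# Made-up log: three entries fall in slice 0, two in slice 1.
log = [(10.0, True), (20.0, True), (15.0, False), (25.0, True), (40.0, False)]
prr = packet_reception_ratio(log)
```

Accumulating the counters over all entries corresponds to summing the numerator and the denominator over the packet index before taking the ratio.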
An additional standard-compliant reliability indicator is the Packet Inter-Reception (PIR). For a given transmitter-receiver pair, the PIR is defined as the time between two consecutive successful receptions of packets belonging to the same application flow, assuming the transmitter-receiver distance is within the (0, D] range. Usually, its Cumulative Distribution Function (CDF) is provided, considering all transmitter-receiver pairs involved in the simulation.
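A short Python sketch shows how PIR samples and their empirical CDF can be obtained for one transmitter-receiver pair. It assumes that the timestamps of the successful receptions within the $(0, D]$ range are already available as a sorted list; the function names are hypothetical.

```python
def pir_samples(reception_times_ms):
    """Time between consecutive successful receptions of the same flow."""
    return [later - earlier
            for earlier, later in zip(reception_times_ms, reception_times_ms[1:])]


def empirical_cdf(samples, x):
    """Fraction of samples that are less than or equal to x."""
    if not samples:
        raise ValueError("at least one sample is required")
    return sum(s <= x for s in samples) / len(samples)


# Made-up reception instants: the packet expected at 200 ms was lost.
samples = pir_samples([0, 100, 300, 400])
```

A lost packet shows up as a longer PIR sample (200 ms in the example), which shifts the CDF to the right.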
Two additional KPIs are the Propagation Losses Ratio (PLR) and the Collision Losses Ratio (CLR). For the $i$-th slice, the PLR is defined as
$$PLR_i = \frac{PL_i}{PL_i + CL_i + SR_i}$$
and similarly, the CLR value is determined as
$$CLR_i = \frac{CL_i}{PL_i + CL_i + SR_i},$$
where:
• $PL_i$ is the number of packets that were lost due to poor propagation conditions within the $i$-th slice, i.e., the packets that did not collide, but experienced a Signal-to-Noise Ratio (SNR) not sufficient for the correct decoding of either the TB or its associated SCI;
• $CL_i$ is the number of packets lost within the $i$-th slice because of a collision, i.e., the packets that were caught in a collision and whose reception failed because the Signal-to-Interference-plus-Noise Ratio (SINR) did not allow a correct decoding of either the TB or the SCI;
• $SR_i$ is the number of successfully received packets within the $i$-th slice.
In the following, subscript $i$ will be omitted, unless strictly necessary.
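The two ratios can be computed as in the following Python sketch, which assumes that both are normalized by the total number of packets in the slice, i.e., the sum of propagation losses, collision losses and successful receptions. The function name is hypothetical.

```python
def loss_ratios(propagation_losses, collision_losses, successful_receptions):
    """Return (PLR, CLR) for one distance slice.

    Assumption: both ratios share the denominator PL + CL + SR.
    """
    total = propagation_losses + collision_losses + successful_receptions
    if total == 0:
        raise ValueError("no packets were observed in this slice")
    return propagation_losses / total, collision_losses / total


# Made-up counters for one slice.
plr, clr = loss_ratios(propagation_losses=5, collision_losses=15,
                       successful_receptions=80)
```

Under this normalization, the PLR, the CLR and the fraction of successfully received packets add up to one for every slice.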
We observe that the PLR measures the fraction of radio resources that could not be successfully employed because of errors introduced by poor propagation conditions. As such, it is influenced by the choices performed at the physical layer, by the channel model adopted in the geographical area that is being examined, and by the CAM size.
On the contrary, the CLR indicates to what extent harmful collisions could not be avoided, and it is therefore dictated by the radio resource assignment strategy.
A parameter also worth monitoring is the Channel Busy Ratio (CBR), which is defined as follows: given the n-th subframe, the CBR is the fraction of subchannels whose RSSI exceeds a given threshold over subframes [n−100, n−1]. The CBR is relevant to understand the load currently present on the radio channel. Additional metrics specific to LTE-V2X are [10]:
• the Latency Reselections Ratio (LRR), defined as the fraction of message transmissions that triggered a latency reselection over the total number of transmitted messages;
• the Unused Reservations Ratio (URR), defined as the fraction of unused reservations over the total number of resource reservations that were performed;
• the Counter Reselection Ratio (CRR), defined as the fraction of message transmissions that triggered a resource reselection due to the depletion of the reselection counter over the total number of transmitted messages.
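The CBR definition can be sketched as a simple count over the sensing window. The data layout (`rssi_dbm[s][c]` holding the RSSI measured in subframe `s` on subchannel `c`) and the function name are assumptions made for this illustration only.

```python
def channel_busy_ratio(rssi_dbm, n, threshold_dbm, window=100):
    """CBR at subframe n: share of (subframe, subchannel) samples in
    [n - window, n - 1] whose RSSI exceeds threshold_dbm."""
    if n < window:
        raise ValueError("not enough past subframes to fill the window")
    busy = 0
    total = 0
    for s in range(n - window, n):
        for value in rssi_dbm[s]:
            total += 1
            if value > threshold_dbm:
                busy += 1
    return busy / total


# Toy trace: 100 subframes, 4 subchannels, only subchannel 0 is occupied.
trace = [[-80.0, -100.0, -100.0, -100.0] for _ in range(100)]
cbr = channel_busy_ratio(trace, n=100, threshold_dbm=-89.9)
```

In this toy trace exactly one subchannel out of four is above the threshold in every subframe, so a quarter of the samples count as busy.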

A. Physical and Medium Access Control Layer Configuration
As regards the Physical (PHY) and Medium Access Control (MAC) layers, this work relies on the custom ns-3 C-V2X module first introduced by the authors in [6] and finalized in [7]. The development of the simulator adheres to 3GPP Release 14 and Release 16 specifications and features all the elements that characterize Mode 4 communications. Vehicles have been configured to transmit their messages over the 10 MHz wide channel located in the 5.9 GHz ITS band, with 15 kHz subcarrier spacing. The 10 MHz channel is divided into 4 subchannels that consist of 12 RBs each, assuming adjacent transmission of the TB and of its associated SCI. The size of CAM messages, indicated by X, is fixed to either 190 or 470 bytes, which are the smallest and the largest statistically relevant sizes of CAMs [30]. Vehicles transmit their packets using QPSK modulation with 0.7 code rate, therefore mapping the 190 and 470 byte-long packets into 1 and 2 subchannels, respectively. The transmission power is set to 23 dBm and the receiver sensitivity to −90.4 dBm. As in [10], the RSRP threshold is −140 dBm. The PHY layer impairments introduced by the radio channel are captured using the 5G-compliant error model presented by the authors in [7]. In greater detail, shadowing is modeled via a lognormally distributed random variable and small-scale fading is evaluated using two different Clustered Delay Lines (CDL) corresponding to the Line-Of-Sight (LOS) and Non-Line-Of-Sight (NLOS) scenarios, as detailed in [29]. The Packet Error Rate (PER) curves for the TB carrying the actual CAM and for the associated SCI are reported in [7].
B. Outcomes
1) Suburban setting: The first set of results refers to some outskirts of the Italian city of Modena, which we classify as an example of a suburban setting. Here, microscopic vehicular mobility has been simulated through SUMO [34]. The examined road topology is reported in Fig. 6 and it has been imported into SUMO from OpenStreetMap [35]. The area is approximately 2.5 km wide and 3 km long. Vehicles have been randomly generated at the area edges and have been assigned random trajectories that traverse the entire topology. The average vehicular density is 42 vehicles/km and the vehicle speed varies in the [50, 100] km/h interval, depending on traffic conditions and on the vehicle speedFactor, a SUMO parameter that defines the maximum velocity of each vehicle as a function of the lane speed limits.
We have additionally developed a set of custom tools based on the SUMO Traffic Control Interface (TraCI) [36], to extract the elements that characterize the behavior of every vehicle, namely, heading, position, and speed; these elements were collected with a periodicity equal to T GenCAM Min = 100 ms. The tools allowed us to generate CAM messages in accordance with the rules set by the ETSI algorithm recalled in Sec. II-C. For every car, we also recorded the position and speed of the preceding vehicle, to complete the set of input features used by ML. As required by Algorithm 1, these features fed a real-time implementation of the KNN algorithm, to predict the longest sequence of T GenCAM values with the same periodicity. The number k of KNN nearest neighbors was set to 3.
The dataset of CAM traces was collected from a total of 6800 vehicles during 20 minutes of SUMO simulation. The least represented T GenCAM values in the dataset were oversampled using the Synthetic Minority Oversampling TEchnique (SMOTE) [37]. All input features were further normalized using min-max normalization, i.e., their range of values was re-scaled between 0 and 1. The training set and the test set were generated employing a 70-30 split ratio.
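The two preprocessing and classification steps mentioned above, min-max normalization and a KNN vote with k = 3, can be illustrated with a few lines of standard-library Python. This is a toy sketch and not the implementation used in this work: the two features (speed and heading change), the six training samples and the function names are hypothetical, and the SMOTE oversampling step is not shown.

```python
import math
from collections import Counter


def fit_min_max(rows):
    """Return the per-feature minimum and maximum of the training rows."""
    columns = list(zip(*rows))
    return [min(c) for c in columns], [max(c) for c in columns]


def apply_min_max(row, lo, hi):
    """Rescale one row to [0, 1] using the training minimum and maximum."""
    return [(v - l) / (h - l) if h > l else 0.0 for v, l, h in zip(row, lo, hi)]


def knn_predict(train_x, train_y, query, k=3):
    """Majority vote among the k training points closest to the query."""
    ranked = sorted(zip(train_x, train_y), key=lambda p: math.dist(p[0], query))
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical features: [speed in km/h, heading change in degrees].
raw_x = [[50, 0], [55, 1], [60, 0], [90, 5], [95, 6], [100, 4]]
labels_ms = [400, 400, 400, 200, 200, 200]  # toy T GenCAM classes

lo, hi = fit_min_max(raw_x)
train_x = [apply_min_max(r, lo, hi) for r in raw_x]

fast = knn_predict(train_x, labels_ms, apply_min_max([92, 5], lo, hi))
slow = knn_predict(train_x, labels_ms, apply_min_max([52, 0], lo, hi))
```

The scaling parameters are fitted on the training rows only and then reused for each query, so that the test data does not leak into the normalization.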
First, Figs. 7(a)-(c) delve into the ability of the KNN algorithm to predict the upcoming sequence of CAM messages, reporting the confusion matrix for three different values of the T GenCAM index i, i = 1, 5 and 10. The confusion matrix is a two-dimensional matrix indexed with the true and predicted class labels, and it is commonly used to visualize the performance of a classification algorithm. Fig. 7(a) reports the confusion matrix of KNN for i = 1, that is, when KNN forecasts the next T GenCAM value. Figs. 7(b) and (c) show the confusion matrix when KNN predicts the fifth (i = 5) and the tenth (i = 10) T GenCAM value, respectively. These figures reveal that KNN is able to accurately forecast the first T GenCAM value, and that the degradation in predicting the fifth and the tenth T GenCAM values is modest.
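For readers less familiar with this tool, the sketch below builds a confusion matrix and derives the accuracy from it. The class labels and the predictions are toy values chosen for the example and do not reproduce the results of Fig. 7; the function names are hypothetical.

```python
def confusion_matrix(true_labels, predicted_labels, classes):
    """matrix[r][c] counts samples of true class r predicted as class c."""
    index = {label: k for k, label in enumerate(classes)}
    matrix = [[0] * len(classes) for _ in classes]
    for t, p in zip(true_labels, predicted_labels):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of samples that lie on the main diagonal."""
    correct = sum(matrix[k][k] for k in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total


classes_ms = [100, 200, 300]  # toy T GenCAM classes
y_true = [100, 100, 200, 200, 300, 300]
y_pred = [100, 200, 200, 200, 300, 100]
cm = confusion_matrix(y_true, y_pred, classes_ms)
acc = accuracy(cm)
```

A perfect predictor yields a purely diagonal matrix; off-diagonal entries show which T GenCAM values are confused with which.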
In the next set of figures, the focus shifts to the performance of the proposed KNN-look ahead solution. First, Fig. 9 shows the propagation losses ratio PLR as a function of the distance D between the transmitting and the receiving vehicle. Solid lines refer to X = 470 bytes, dashed lines to X = 190 bytes. Recall that the PLR measures the number of packets that were lost because of poor propagation conditions over the total; as a matter of fact, these curves do not depend on the resource assignment strategy, but are exclusively determined by the PHY layer choices and by the CAM size. So, when the radio propagation environment is more hostile (e.g., greater D values) and the CAM size is larger, the PLR increases. For these curves, as well as for the results shown next, a sufficient number of simulation runs has been executed to obtain tight 95% confidence intervals. To avoid border effects, the results have been collected only from the central area of the setting; this corresponds to the green-shaded area in Fig. 6.
In the following figures, the proposed approach is compared against the original SSPS algorithm with RRI = T GenCAM Min = 100 ms; the latter is a convenient setting, as it guarantees that CAMs gain access to the channel without generating any latency reselections. Adhering to the standard, our SSPS implementation randomly chooses the actual C resel value in [5, 15]. In accordance with [2], we set P = 0, that is, every time the counter expires, the vehicle has to select a new transmission resource with probability 1 − P = 1.
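The counter-driven reselection rule can be sketched as follows. This is a simplified illustration and not the simulator code: it assumes that the counter is decremented once per transmission and that a fresh counter is drawn each time it expires; the function names are hypothetical.

```python
import random


def draw_reselection_counter(rng):
    """Draw C resel uniformly among the integers in [5, 15]."""
    return rng.randint(5, 15)


def keeps_resource(rng, p_keep=0.0):
    """On counter expiry, keep the current resource with probability P."""
    return rng.random() < p_keep


def count_reselections(num_transmissions, rng, p_keep=0.0):
    """Count counter-triggered reselections over a run of transmissions."""
    reselections = 0
    counter = draw_reselection_counter(rng)
    for _ in range(num_transmissions):
        counter -= 1
        if counter == 0:
            if not keeps_resource(rng, p_keep):
                reselections += 1
            counter = draw_reselection_counter(rng)
    return reselections


rng = random.Random(0)
with_p0 = count_reselections(150, rng, p_keep=0.0)  # P = 0, as in the text
with_p1 = count_reselections(150, rng, p_keep=1.0)  # resource never released
```

With P = 0 every counter expiry forces a new selection, so over 150 transmissions the number of reselections lies between 10 (all counters equal to 15) and 30 (all counters equal to 5).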
To quantify how effective the KNN choice is within the ML domain, the Ground Truth (GT) benchmark is considered: this benchmark exploits the a priori knowledge of the CAM sequences to assign radio resources and place reservations that perfectly match the actual CAM sequences.
Figs. 10(a) and (b) report the PRR curves for the original SSPS mechanism with RRI = 100 ms (black curve, diamond markers), the curves obtained when the KNN-look ahead proposal is adopted (blue curve, circle markers), as well as the PRR values corresponding to the ideal GT benchmark (red line, circle markers). When the CAM size is 190 bytes, Fig. 10(a) indicates that our proposal guarantees an attractive improvement, and Fig. 10(b) reveals that the gain becomes significant when a larger size (X = 470 bytes) is considered, that is, when the load on the radio channel increases. Both figures also reveal that the KNN-look ahead approach attains a performance that is very close to the GT benchmark, i.e., to the ideal performance.
The existence of a CLR "floor" is justified by observing that, even if every vehicle were able to perfectly forecast its CAM transmission requirements over time and to select resources accordingly, its selection could nevertheless coincide with the choice performed by other vehicles. This phenomenon is intrinsic to the distributed nature of the channel access mechanism and cannot be further reduced, unless a total redesign of the radio access technique is undertaken.
The effectiveness of the KNN-look ahead approach is further evidenced by the values provided in Table I, […] values of these ratios for the original SSPS mechanism with RRI = 100 ms and for the GT benchmark. The table shows that SSPS with RRI = 100 ms guarantees no latency reselections, as it respects the most stringent delay requirement, but the URR climbs to 0.61. At the other end of the scale, the GT benchmark perfectly eliminates unused reservations and latency reselections. The proposed solution lies in between, being able to significantly reduce the unused reservations ratio from 0.61 to 0.12. However, this improvement is achieved at the expense of a non-zero fraction of latency reselections. It is worth noting that, as long as they are not prevalent, latency reselections do not have the same negative impact on communication reliability as unused reservations. For the sake of completeness, the Counter Reselection Ratio CRR is also reported in the last column of the table: as expected, its value increases for the proposed solution and even more for the GT benchmark, as reselections become more frequent to track T GenCAM variability.
Next, Fig. 13 reports the Probability Mass Function (PMF) of the T GenCAM values observed in the suburban scenario. It is interesting to note that the PMF mainly concentrates around two values, 200 ms and 300 ms. As they are not integer multiples, the SSPS algorithm with RRI set equal to 100 ms is not very effective in detecting potential collisions. This explains why we observed fairly low PRR values for it.
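A back-of-the-envelope check helps relate the two dominant T GenCAM values to the unused reservations produced by a fixed RRI of 100 ms. The sketch below assumes an idealized behavior in which a resource is reserved every RRI and only one reservation out of T GenCAM / RRI actually carries a CAM, ignoring reselections; the function name is hypothetical.

```python
from fractions import Fraction


def unused_reservation_ratio(rri_ms, t_gencam_ms):
    """Idealized share of reservations left unused when CAMs are generated
    every t_gencam_ms but a resource is reserved every rri_ms."""
    if t_gencam_ms % rri_ms != 0:
        raise ValueError("t_gencam_ms is expected to be a multiple of rri_ms")
    return 1 - Fraction(rri_ms, t_gencam_ms)


at_200 = unused_reservation_ratio(100, 200)
at_300 = unused_reservation_ratio(100, 300)
```

Under these assumptions, half of the reservations are wasted when T GenCAM = 200 ms and two thirds when T GenCAM = 300 ms; the URR of 0.61 reported for SSPS with RRI = 100 ms falls between these two values.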
Given the a posteriori knowledge that the PMF reported in Fig. 13 provides, Fig. 14 shows the PRR attained by the legacy SSPS strategy when its reservation periodicity RRI is set so as to match the first or the second most frequently observed T GenCAM value; that is, RRI = 300 ms (dashed […]
Fig. 13. T GenCAM PMF, suburban scenario
To further complete the assessment picture, Table II reports the CBR levels observed in the suburban scenario. The CBR of the generic vehicle has been computed every 0.2 seconds; the values have been time averaged over the central portion of the simulation time and finally averaged over all vehicles. The RSSI threshold to discriminate between a busy and an idle channel is set to a value 0.5 dB greater than the receiver sensitivity level, that is, to −89.9 dBm. The CBR values reported in Table II reveal the magnitude of the channel load increase due to a larger packet size. Moreover, the CBR is not only useful for assessing the amount of traffic on the communication channel. Given a specific setting, the CBR also reflects the effectiveness of the adopted access strategy: a more accurate scheduling mechanism maximizes the use of the available transmission resources, resulting in larger CBR values. This is the case encountered here, where the KNN-look ahead approach achieves higher CBR values than SSPS with RRI = 100 ms.
2) Highway setting: We also considered a second setting, termed highway, represented by a 5 km-long highway trunk, where six 4-meter wide lanes are deployed. Adhering to the specifications in [29], the vehicles' speed is 70 km/h and the vehicular density is 120 vehicles/km. For these numerical choices, Fig. 16 compares the PRR of the proposed KNN-look ahead solution to the PRR of the SSPS algorithm with RRI = 100 ms and to the GT upper bound, for the most demanding CAM size X = 470 bytes.
The figure shows that the KNN-look ahead approach (blue line, circle markers) leads to a remarkable improvement in the PRR performance with respect to the original SSPS mechanism (black line, diamond markers), achieving PRR levels very close to the GT benchmark (red line, circle markers). It is however known that SUMO reveals some limits in the highway set-up: the constant speed and the nearly straight vehicular trajectories lead to an almost constant CAM inter-arrival time, T GenCAM = 300 ms. The same behavior was observed when the vehicular speed varies within the [70, 140] km/h range: here too, T GenCAM is nearly constant and equal to 200 ms. We have overcome this simulation hurdle by employing one of the empirical models for the generation of CAM messages that were proposed in [38]. These models are derived from real-world traces of CAM traffic collected on a highway trunk [30], for different implementations of the ETSI algorithm by two Original Equipment Manufacturers (OEMs), Volkswagen and Renault. They consist of m-th order Markov sources that model: (i) CAM size and T GenCAM variability; (ii) CAM size variability only; (iii) T GenCAM variability only. We adopted the model that captures CAM temporal variability, drawn from the Volkswagen CAM traces, setting m = 5. For this model, the average T GenCAM value is 330 ms, close to the constant T GenCAM value characterizing the SUMO implementation at 70 km/h constant speed. With the help of this analytical tool, we associated a specific CAM trace with every vehicle.
Fig. 18. CLR as a function of D, highway scenario, CAM trace Markov model
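To make the notion of an m-th order Markov source concrete, the sketch below samples a T GenCAM trace in which the next value depends on the last m values. It is only an illustration: the order (m = 2 instead of the m = 5 adopted here), the two-value alphabet and all transition probabilities are made-up placeholders and are not the empirical parameters of [38].

```python
import random


def sample_markov_trace(transitions, initial_state, length, rng):
    """Sample a sequence from an m-th order Markov source.

    transitions maps a tuple holding the last m values to a dictionary
    {next_value: probability}; m is the length of initial_state.
    """
    state = tuple(initial_state)
    trace = []
    for _ in range(length):
        options = transitions[state]
        values = list(options)
        weights = [options[v] for v in values]
        nxt = rng.choices(values, weights=weights, k=1)[0]
        trace.append(nxt)
        state = state[1:] + (nxt,)  # slide the memory window by one value
    return trace


# Placeholder second-order model over two T GenCAM values (in ms).
toy_model = {
    (200, 200): {200: 0.8, 400: 0.2},
    (200, 400): {200: 0.5, 400: 0.5},
    (400, 200): {200: 0.6, 400: 0.4},
    (400, 400): {200: 0.3, 400: 0.7},
}
trace = sample_markov_trace(toy_model, (200, 200), 50, random.Random(1))
```

Each vehicle would be given its own trace, for instance by seeding the generator differently per vehicle.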
Unfortunately, such empirical models have no notion of vehicle dynamics, so they do not provide the input features the KNN algorithm requires. Nonetheless, the reproduction of highway CAM traces allows us to determine the GT performance, and therefore to assess the maximum improvement that ML can achieve. In this respect, Fig. 17 concentrates on the PRR performance considering two different packet sizes, X = 190 bytes (dashed lines) and X = 470 bytes (solid lines). Adopting the same choice of colors and markers as in Figs. 10(a)-(b), the black curves correspond to the original SSPS implementation with RRI = 100 ms, whereas the red curves refer to the GT benchmark, identifying the PRR upper bound. The significant improvement achieved by the GT solution with respect to the original SSPS mechanism is evident and becomes remarkable when X = 470 bytes is considered. The original SSPS performance drops below 0.6 when D ≥ 450 m, whereas the GT settles at PRR = 0.85. Fig. 18 is the counterpart of Fig. 17 on the (CLR, D) plane. This figure further highlights the enhanced collision-avoidance capability of the ML-based strategy with respect to the standard-compliant solution, which increases for increasing packet sizes. Its superiority is substantiated by the CBR values reported in Table III. The first column of the […] to 0.39 when moving from SSPS with RRI = 100 ms to the GT benchmark. Likewise, the CBR rises from 0.49 to 0.61 in the second column, which refers to X = 470 bytes, once more highlighting the significant impact of T GenCAM predictions on the selection of collision-free resources. The PMF of the T GenCAM samples generated in the highway scenario is shown in Fig. 19. As in the suburban setting, the PMF mainly concentrates around two values, here 200 ms and 400 ms. Finally, Fig. 20 reports the PIR CDF when D = 520 m and X = 470 bytes.
Here too, the GT benchmark provides an upper bound to the PIR achievable performance, highlighting the maximum amount of improvement with respect to the original SSPS reservation strategy.
It is worth observing that the implementation of the proposed approach on an actual vehicle is feasible, as the input features that KNN employs can be easily retrieved. The ego-vehicle position can be obtained via the Global Navigation Satellite System (GNSS), its speed can be measured by in-vehicle sensors, and the use of on-board lidars and radars can offer accurate estimates of the position and speed of the preceding vehicle. Vehicle trajectory prediction is a widely investigated topic in the industrial and the academic world, and algorithms like the one reported in [39] can estimate the ego-vehicle trajectory in an accurate manner.
As regards the introduction of ML, we showed that a simple technique such as KNN leads to a remarkable performance improvement with respect to the original LTE-V2X Mode 4. The selection of a more sophisticated ML algorithm, although possible, would only lead to incremental improvements and to unnecessary complexity.

VI. CONCLUSIONS
In this paper, an ML-based solution has been proposed to distribute aperiodic CAMs to vehicles. The approach relies on a limited set of features, that each vehicle employs to forecast its next CAM generation times. The ML outcome is combined with a careful selection and reservation of the radio resources available for transmission. The simulation results indicate that the proposed KNN-look ahead solution achieves an excellent accuracy and that the new strategy outperforms the legacy 3GPP V2V approach for all metrics.

AE:
After careful consideration of the reviewers' comments, the decision has been made not to publish this paper in the IEEE Transactions on Vehicular Technology. Based on the reviewers' comments and my own opinion, the publication of the paper in its present form is not recommended. Reviewer 1 has concerns about the novelty of the work as per the existing state-of-the-art. Reviewer 2 has concerns about the motivations of the work. Furthermore, reviewer 3 also has found issues in describing the use of ML in the scheme. In addition to the reviewer's comments, I strongly recommend to compare the proposed scheme with recent related works as mentioned in the paper.

Authors response to the Associate editor
Dear editor, thank you for allowing a resubmission of our manuscript, with an opportunity to reply to the reviewers' comments. We revised the manuscript, striving to address the remarks of all reviewers in a careful manner.
In summary:
• the novelty of our approach and a critical comparison between its contributions and the existing works were provided;
• the Introduction was significantly modified, to better motivate our work. Several new paragraphs were introduced, to provide a critical review of the state of the art;
• the body of references was revised and increased;
• the rationale behind the adoption of Machine Learning (ML) was provided.
Moreover, the manuscript was revised by a native English speaker, to remove grammatical errors and awkward sentence structures.
We also addressed the reviewers' concerns on a point-by-point basis, as reported below.
We are therefore re-submitting the revised manuscript as a new submission and do look forward to receiving your feedback soon.
Best regards,
There is no related work in this paper which is why it is hard to convince oneself about the outperformance of this scheme. Therefore, thorough related work analysis is needed and the need for 'yet another' CAM dissemination scheme should be justified.
Reviewer #1 concern #2: "Performance evaluation metrics should be discussed."
Author response: The performance evaluation metrics that are relevant for the vehicular broadcasting scheme are defined in Section IV. The metrics employed for Machine Learning have been newly introduced in Subsection V-B, page 8.
Reviewer #1 concern #3: "The contributions should be clearly mentioned (also in the light of the existing works)." Author response: We thank the reviewer for his/her valuable comment. We significantly modified the Introduction, to clearly state the contributions of our work. Several examples of existing works were newly added, and a critical comparison against our research was provided.
Reviewer #1 concern #4: "There are typos and grammatical mistakes in the paper that should be removed and corrected, respectively." Author response: Urged by the reviewer's remark, we had the manuscript carefully revised by a professional language editing service. All the typos and grammatical mistakes were corrected.

Reviewer: 2
In this manuscript, the authors have proposed a machine learning-based approach to disseminating aperiodic cooperative awareness messages in V2V communication. The topic is interesting and timely. However, I have some concerns which I am listing below.
Reviewer #2 concern #1: Why there is a need for an ML-based approach in data disseminating in V2V communication is not explained properly. Authors should clearly mention how the heterogeneity of vehicles is going to affect the overall system decision.

Author response:
When broadcasting aperiodic traffic, like CAMs, Machine Learning plays a central role. Its accurate prediction of future CAM inter-arrival times allows the ego-vehicle to reserve resources using the optimal configuration of the Resource Reservation Interval (RRI) and of the reselection counter. This is explained in the last paragraph of Subsection III-B. It follows that the adoption of Machine Learning minimizes the latency reselections and the unused reservations, which are the main sources of performance degradation when aperiodic traffic is considered. As a matter of fact, latency reselections should be avoided as much as possible, as they ultimately increase the probability of packet collisions. Unused reservations do not allow vehicles to correctly announce their reserved resources, which are therefore sensed as free by neighboring users; in this case too, the probability of packet collision increases. These two effects are discussed in Subsection II-B.
As regards the impact of the heterogeneity of vehicles, the larger the number of vehicles employing the ML-based approach, the better the overall system performance will be. If some vehicles were not to employ the KNN algorithm for the optimal configuration of the RRI and Cresel parameters, latency reselections and unused reservations would inevitably increase; in turn, the probability of packet collisions would increase.
Reviewer #2 concern #2: Authors should clearly mention the existing state-of-the works in this particular area. Along with that, clearly mention what are novel contributions that authors have made in this work. Is introducing the ML approach is the only contribution?
Author response: As suggested by the reviewer, the revised manuscript clearly mentions the existing state-of-the-art works and the novel contributions we provided. Our work leverages Machine Learning for predicting the CAM patterns and it improves the performance of the standard-compliant SSPS mechanism used for selecting and reserving resources. In addition, we propose to include the reselection counter value within the Sidelink Control Information (SCI) and to avoid the creation of the second list, L2, during the list creation phase. These modifications further improve the collision-detection capability of the SSPS mechanism.
Reviewer #2 concern #3: In Section II, What is an ego-vehicle? Further, during list creation, are the vehicles aware of the total number of neighboring vehicles beforehand? How the effect of vehicle movement of its list creation. The authors should clearly state the system model/scenario they have considered for their analysis to improve the readability of the paper.
Author response: In literature, the "ego-vehicle" identifies the connected vehicle whose behavior is of primary interest. As such, in our work, the term "ego-vehicle" is used to indicate the vehicle that selects and reserves new resources using the SSPS mechanism.
Reviewer #2 concern #5: In Section II-C, the authors mentioned that the inter-arrival time between consecutive messages is variable and its duration strongly depends on vehicle dynamics. What do the authors mean by vehicle dynamics?
Author response: In Subsection II-C, the term "vehicle dynamics" refers to the heading, position, and speed variations of the vehicle generating the Cooperative Awareness Messages (CAMs). In Subsection II-C, we better clarified this point.
Reviewer #2 concern #7: The strong correlation between CAM inter-arrival times and vehicle dynamics suggests that the adoption of ML can be beneficial to predict the temporal evolution of CAM sequences. Justify the reason behind it properly. In the proposed ML-based approach, what are the input parameters to the model. Does the packet collision parameter taken into consideration?
Author response: Machine Learning explores the training data to identify correlation patterns between vehicle dynamics and CAM traces. Then, such knowledge is challenged on the test data, where the algorithm forecasts future CAM generation times from the input parameters. This clarification was added in the first paragraph of Subsection III-B. The input parameters of the model are:
• trajectory, current speed and position of the ego-vehicle;
• current speed and position of the vehicle immediately preceding the ego-vehicle.
Such parameters were originally listed in the first paragraph of Subsection III-B. In the revised manuscript, we resorted to explicit items, highlighted by bullets, to make them more evident. Yes, packet collisions are taken into consideration, both when the original SSPS and the proposed solution are considered, as they affect the overall system performance.
Reviewer #2 concern #8: To show the effectiveness of the proposed algorithm, the authors must compare it with any existing state-of-the-art schemes.
Author response: The existing state-of-the-art scheme is represented by the legacy SSPS standard mechanism, and we compared our results against it throughout the paper. As commented in the revised Introduction and state-of-the-art, no specific solutions exist to satisfactorily serve real, aperiodic CAM traffic in C-V2X.

Reviewer: 3
In this paper, the authors proposed a novel Machine Learning (ML)-based method to distribute aperiodic Cooperative Awareness Messages (CAMs). By k-Nearest Neighbors (KNN) ML algorithm to predict the temporal pattern of CAMs to reduce sub-channels collision and ultimately improve Packet Reception Ratio (PRR). The topic is interesting and meaningful. However, some issues still need to be revised in this paper. Our comments are as follows:
Reviewer #3 concern #1: Some sentences contain grammatical and spelling mistakes. The article needs careful editing by someone with technical English editing expertise paying particular attention to English grammar, spelling, and sentence structure.
Author response: We thank the reviewer for his/her careful reading of the manuscript. The manuscript has been revised and all the grammatical and spelling mistakes were removed.
Reviewer #3 concern #3: In this paper, the SSPS process improved by the KNN algorithm, which effectively improves the PRR, but lacks the explanation process of the theoretical part, and it is recommended to supplement the theoretical part of the analysis.
Author response: The reason why the KNN algorithm guarantees an improvement is that:
1. it retrieves the trajectory, speed, and position of the ego-vehicle, and also the speed and position of the vehicle preceding the ego-vehicle;
2. from these input features, it predicts what behavior the ego-vehicle will have in the near future.
Like any machine learning technique, it learns from the training set and then its performance is evaluated on the test set. In this study, the former is made of 70% of the CAM trace dataset, the latter of the remaining 30%. In Section III-B, a reference was added to provide the reader with a link to the theoretical basis behind the approach.
Reviewer #3 concern #4: Why use the KNN algorithm to predict RRI and Cresel, but not other ML algorithms to achieve this process? It suggests a comparative test of other algorithms in the simulation section.
Author response: We kindly bring to the reviewer's attention that KNN achieves excellent results. Its accuracy and macro-F1 metrics are well above 0.9, as Figs. 7(a)-(c) and Fig. 8 highlight. We decided to use KNN because it is fast to train and easy to understand, also for people not familiar with Machine Learning. The results demonstrate that its predictions are extremely precise. We could have used a different, more sophisticated ML algorithm to identify the optimal configuration of RRI and Cresel; yet, this would have only led to incremental improvements. Moreover, the aim of our article is to show how to improve the resource allocation process of LTE-V2X Mode 4 via the prediction of CAM inter-arrival times, and we demonstrated that even the adoption of a simple ML algorithm achieves the result. We elaborated on this point in the revised manuscript, last paragraph of Subsection V-B.
Reviewer #3 concern #5: The total references number is only 23. More recent studies are encouraged to add into this paper, such as: Latency and Reliability of mmWave Multi-hop V2V Communications under Relay Selections. Author response: We thank the reviewer for the suggestion. We extended the literature review reported in the Introduction, including additional references, as well as articles that have been published after our initial submission.
Reviewer #3 concern #6: In the highway scene, due to the constant speed between the vehicles, the CAM arrival time is not much different. The improved KNN algorithm used in this paper cannot achieve a good result. Comparing the results of the Ground Truth (GT) benchmark and the traditional SSPS process, there is still room for optimization in this process. Can other input features be added to KNN to optimize experimental results? Or can we adopt a new SSPS process in a separate scenario?
Author response: We do not agree with the reviewer's remark that the improved KNN algorithm cannot achieve a good result in the highway setting. In detail, in the highway scenario, the proposed Machine Learning-based algorithm is analyzed considering two different CAM generation models. In the first case, CAMs are generated exploiting the vehicular mobility traces obtained through SUMO simulations. Here, the proposed KNN-based algorithm achieves an impressive performance, very close to the Ground Truth (GT) benchmark levels, as Fig. 16 reveals. No room is left for optimization and it is pointless to consider additional input features. In the second case, realistic CAM patterns are generated employing the mathematical models presented in [EMP-MODELS]. Unfortunately, such models have no notion of vehicle dynamics: in other words, they do not provide the vehicle trajectory, speed, and in general the input features that the ML forecast needs. Therefore, the performance of the KNN algorithm, or of any alternative ML approach, cannot be assessed. Nonetheless, such CAM traces allow us to determine the Ground Truth performance, that is, the maximum theoretical improvement that Machine Learning can achieve.
Reviewer #3 concern #7: In the highway scenario, this article does not give a solution on how to improve the SSPS reservation strategy?
Author response: Again, we respectfully disagree with the conclusion of the reviewer. We bring to the reviewer's attention that our article does provide a solution on how to improve the SSPS reservation strategy in any type of scenario, including the highway. As mentioned in the response to concern #6, Fig. 16 shows that the proposed approach leads to a significant performance improvement. Figs. 17 and 18 also indicate that the maximum theoretical improvement is remarkable. The more accurate the employed ML algorithm is, the closer the predictive reservation approach will get to the GT benchmark. We added a comment in Subsection V-B to better clarify this and the previous point.