Novel One Item Network Coding Vectors

Herein, the problem of network coding coefficients overhead is revisited. A novel approach, based on modular arithmetic and prime numbers and influenced by the Chinese remainder theorem (CRT), is proposed to reduce the coefficients overhead by augmenting the data packet with only one coefficient item of at most four bytes. The proposed approach successfully addresses all the limitations of the previous methods, including the limitations on the generation size and the density of the packets in the generation, recoding on the intermediate nodes, and creating innovative coding vectors. Probabilistic theoretical analysis and experimental work show a superior performance of the proposed approach in terms of coefficients overhead ratio, download time, throughput, and packet drop rate. This evaluation has considered three types of networks: wireless sensor networks for the Internet of Things, conventional wireline Ethernet, and Peer-to-Peer content distribution, with network coding operations running over GF(2^16). The results also show that all the other metrics are improved, e.g., throughput reaches 1.54 packets/round and the number of dropped packets is reduced by at least 60%.

In RLNC, each coded packet carries the coded payload plus a vector of coefficients (ζ), a.k.a. the coding vector, as a header. The size of the coding vector depends on the Galois Field (GF) size, the generation size, and the coding vector representation.
On the receiver side, the n×n coefficients matrix, a.k.a. the transfer matrix, must be received along with n coded packets to solve a system like (1) and recover the original packets. The total coefficients overhead of RLNC is n log2(q) bits per packet and n^2 log2(q) bits per generation, where q is the GF size. For example, if the generation size is 512 packets and the coefficients are drawn from GF(2^8), then the coding vector is 512 bytes, which is more than one third of the MTU.
Example 3: Consider a P2P content distribution network that uses 16KB as a packet size and shares a 10GB file. Usually in such a scenario the best generation size is 20MB [5], so the file is segmented into 10GB/20MB = 512 generations. Each block in the generation is 16KB, so the number of blocks in a generation is 20MB/16KB = 1280 blocks. Thus, the coefficients matrix is 1280 × 1280 per generation and, if the coefficients are drawn from GF(2^8) and each is represented as 1 byte, the total coefficients overhead is 1280 × 1280 bytes × 512 = 800MB for the file.
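The arithmetic of this example can be checked directly (a quick sketch; binary units, 1KB = 1024 bytes, are assumed):

```python
# Sanity-check the P2P overhead figures of Example 3 (binary units).
KB, MB, GB = 1024, 1024**2, 1024**3

file_size = 10 * GB
generation_size = 20 * MB
block_size = 16 * KB

generations = file_size // generation_size   # number of generations
blocks = generation_size // block_size       # blocks per generation

# One byte per GF(2^8) coefficient, an n x n matrix per generation:
overhead_bytes = blocks * blocks * generations

print(generations, blocks, overhead_bytes // MB)  # 512 1280 800
```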

A. Methods in Literature
Many approaches have been proposed in the literature to address the problem of network coding coefficients overhead. Most of them are based on sparse coding vectors, i.e., each coding vector contains only a small number of nonzero coefficients, and thus the coding vectors can be compressed.
Mahdi J. et al. [3] propose the first scheme for compressing coefficient vectors, which is based on error-correcting codes and parity-check matrices. The scheme considers m nonzero coefficients, where m << n, and results in a coefficients overhead equal to 2m per packet.
S. Li et al. [6] improve on the previous scheme by utilizing an erasure-coding method and adding a segment ID to the header to enumerate the IDs of the m nonzero coefficients. The overall compression rate is improved and the coefficients overhead becomes m + n/log2(q). Further, the coefficients overhead becomes m + log2(n)/log2(q) when a list decoding [7] method is used.
However, to achieve the latter rate, a GF of large size and more complicated decoding algorithms are required. With an ordinary GF and efficient decoding algorithms, the coefficients overhead becomes close to the expression given in (2).

June 10, 2021 DRAFT

Danilo G. et al. [8] propose another sparse method that only allows selecting coefficients from a small set, Q, of GF(q) elements. Specifically, Q is a subset of the primitive elements of GF(q) and can be agreed upon among the network nodes in advance. Thus, instead of sending the m coefficient values as a length-n coding vector, the sender can send only a bit vector that contains the coefficients' indexes in Q and their positions in the coding vector. The overall coefficients overhead of this scheme is m(log2|Q| + log2 n) bits.
Ye. L. et al. [9] propose a method that splits the coefficients overhead into two vectors. The first contains the m coefficient values; the second is an n-bit vector of 0's and 1's. On the receiver side, the 1's are replaced with the m coefficients. The overall coefficients overhead of this scheme is m log2(q)/8 + n/8 bytes.
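The split described above can be sketched as follows (helper names are hypothetical):

```python
def compress(vec):
    """Split a sparse coding vector into its m non-zero values
    plus an n-entry 0/1 vector marking their positions."""
    values = [c for c in vec if c != 0]
    bitmap = [1 if c != 0 else 0 for c in vec]
    return values, bitmap

def decompress(values, bitmap):
    """Receiver side: put the m values back at the 1 positions."""
    it = iter(values)
    return [next(it) if b else 0 for b in bitmap]

vec = [0, 7, 0, 0, 3, 0, 1, 0]          # n = 8, m = 3
values, bitmap = compress(vec)
assert decompress(values, bitmap) == vec
# Overhead: m*log2(q)/8 bytes of values plus n/8 bytes of bitmap.
```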
A scheme closely related to our work is proposed in [10]. It reduces the coefficients overhead per packet to one symbol based on Vandermonde matrices: in a Vandermonde matrix, an entire row can be generated from a single seed. However, this scheme has two apparent limitations. First, because of the cyclic characteristic of the finite field, the generation size is restricted to n ≤ |GF(q)|, which means that in practical scenarios the generation size cannot exceed 32 packets. Second, recoding on intermediate nodes is highly likely to generate linearly dependent coding vectors. Another scheme, which uses a single seed transferred from the sender to the receiver, is proposed in [11]. However, this method requires synchronization among the participating nodes and completely lacks recodability on the intermediate nodes.
Niloofar Y. et al. [12] propose revolve codes (ReC), which are based on drawing a common element from a large GF, namely q ≥ 2^16, to guarantee a low probability of linear dependency, and another n elements from GF(2^2). The small elements are interpreted as one item t and sent along with the large element as a compressed coefficients vector. The receiver generates the complete coefficients vector by repeatedly shifting the small-elements item and each time adding the result to the large element. The overall coefficients overhead of this method is (log2 q + n·t).
Most of the above studies [3], [8], [10]-[12] are unreliable in terms of recoding, generating linearly independent vectors, and/or scalability; namely, they do not support a wide range of generation sizes. Moreover, the most reliable of these studies [6], [9] are sparse, which usually requires more complex recoding strategies to retain the structure of the sparse code [9], [12].
In contrast, this paper proposes a novel method that is fully dense; the proposed method incurs smaller overhead and performs more reliably than the sparse coding methods in most cases. We also believe it is the most energy-efficient option for small devices, e.g., sensors.

B. Contributions and Paper Organization
In this paper we address the previously mentioned problems and provide the following contributions:
-We propose a new method, based on modular arithmetic and prime numbers, referred to as One Item Network Coding Vectors (OINCV). The method efficiently compresses the coefficients vectors to a tiny constant that often does not exceed 4 bytes.
-The proposed method is dense, recodable, and reliable in the sense that it can be a real alternative to RLNC without any limitations.
-The proposed method is scalable and it can be used over a wide range of applications and devices, i.e., from wireless sensors networks to high performance computers.
-Both theoretical analysis and simulation are provided to prove the correctness and features of the proposed method.
-Theoretically, the proposed method is compared with all state-of-the-art studies in terms of coefficients overhead ratio and encoding cost.
-Experimentally, the proposed method is compared with all reliable studies in terms of coefficients overhead ratio, download time, throughput, and packets drop rate. This evaluation has considered three types of networks: wireless sensors network for Internet of things, conventional wireline Ethernet, and P2P content distribution.
The rest of the paper is organized as follows. Section II presents important preliminaries and the design model. The theoretical analysis and the proof of correctness are detailed in Section III, while the experimental work and results are presented in Section IV. Section V provides the conclusion and outlines future work.

II. PRELIMINARIES AND DESIGN MODEL
Let P be a prime and Z_P = {0, 1, ..., P − 1}. If addition and multiplication are defined in Z_P modulo P, then Z_P forms the prime GF of order P, GF(P). On the other hand, GF(2^r), or GF(q) where q = 2^r and r ≥ 2, is referred to as the extended GF. Elements in the extended GF are represented as polynomials of degree r − 1, and arithmetic operations are defined modulo an irreducible primitive polynomial P(X) of degree r; such primitive polynomials are analogous to the prime numbers in Z.
Herein, both prime and extended GFs are used. GF(2^r) and GF(q) are used interchangeably, and when GF(P) and GF(q) are mentioned together, P is always the largest prime number less than q. Also, P always denotes this prime, while the notation p_i refers to the other primes in GF(P).
The Chinese remainder theorem (CRT) is used to solve a set of congruences in one variable with different moduli that are pairwise coprime.
The proof and details about this theorem can be found in [13], [14].
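As an illustration, the classical CRT construction solves such a system in one pass (a standard sketch using Python's built-in modular inverse):

```python
from math import prod

def crt(residues, moduli):
    """Solve x ≡ r_i (mod m_i) for pairwise-coprime moduli;
    the solution is unique modulo the product of the moduli."""
    M = prod(moduli)
    x = 0
    for r, m in zip(residues, moduli):
        Mi = M // m                      # product of the other moduli
        x += r * Mi * pow(Mi, -1, m)     # pow(Mi, -1, m): inverse of Mi mod m
    return x % M

# Sun Tzu's classic instance: x ≡ 2 (mod 3), x ≡ 3 (mod 5), x ≡ 2 (mod 7)
print(crt([2, 3, 2], [3, 5, 7]))  # 23
```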
Definition 1 (Code Density [15]): The ratio of non-zero elements in a coefficients vector relative to the entire size of the vector, i.e., d = m/n, where m is the number of non-zero coefficients. Full density can be achieved by choosing a coefficient for each cell in the coefficients vector from a uniform random distribution. On the other hand, sparse density can be achieved by allowing several zeros to be positioned in the coefficients vector. Full density is desirable as it yields a high degree of innovation amongst packets, but it results in higher coefficients overhead.
In practice, sparsity is widely used as it allows the coefficients vector to be compressed and hence yields reduced coefficients overhead. However, the higher the sparsity, i.e., the closer the code density gets to 0, the higher the risk of obtaining linearly dependent vectors, which leads to even worse overhead. Herein, we set the code density to 0.4, i.e., m = n/2.5.
In what follows, we propose a dense coding vectors technique referred to as OINCV based on modular arithmetic and prime numbers. It incurs only 2, 3 or at most 4 bytes as an overhead per packet.
A. Preliminary Step: Generating the Primes
For a generation of size n, n primes are generated independently in each network node such that the n generated primes are identical in all the nodes.
Definition 2 (Generation Moduli Set (GMS)): For a network coding generation of size n that runs GF operations, e.g., addition and multiplication, over GF(2^r), the generation moduli set is defined as the set of the largest n − 1 consecutive primes picked from GF(P) together with P itself, such that GMS = {p_1, p_2, ..., p_{n−1}, P}.
For example, and without loss of generality, the GMSs for a 4-packet generation operating over different GFs are listed in Table I. The reasons behind selecting the largest n primes are twofold: 1) to keep the nodes synchronized, namely, all the nodes generate the same GMS without the need to transfer the primes; and 2) to leverage the items of the coding vector, namely, the modulo-primes operation yields more item combinations within a coding vector, thereby reducing the probability of linearly dependent vectors. We emphasize that GF(P) is used to create the GMS and to generate or regenerate the coefficients vectors, while the operations needed by network coding are still performed over GF(q).
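Because the GMS is fully determined by the generation size and the field, every node can derive it locally with no exchange of primes; a sketch (trial division suffices at these magnitudes):

```python
def is_prime(k):
    if k < 2:
        return False
    if k % 2 == 0:
        return k == 2
    i = 3
    while i * i <= k:
        if k % i == 0:
            return False
        i += 2
    return True

def gms(n, r):
    """Largest n consecutive primes not exceeding 2^r - 1;
    the largest of them plays the role of P."""
    primes, k = [], 2**r - 1
    while len(primes) < n:
        if is_prime(k):
            primes.append(k)
        k -= 1
    return sorted(primes)

# 4-packet generation over GF(2^8): GMS = {233, 239, 241, 251}, P = 251
print(gms(4, 8))
```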

B. Encoding
Definition 3 (Traversed Coefficient (I)): An integer selected randomly by the sender such that it is 2, 3, or 4 bytes long; it is transmitted as a header of the coded packet and used to generate the entire coefficients vector on the receiver side.
Definition 4 (Allowed Selection Range (ASR)): The set from which the traversed coefficients are selected.
Encoding begins by generating a proper GMS that matches the generation size. Thereafter, the sender selects a traversed coefficient I from the ASR and constructs a vector of integers as follows: [I mod p_1, I mod p_2, ..., I mod p_{n−1}, I mod P]. These vector elements are GF(P) integers; therefore, they can easily be transformed to the corresponding GF(q) elements, so that the coding coefficients vector becomes ζ = [ζ_1, ζ_2, ..., ζ_n]. To compose the coded packet, each coefficient from ζ is multiplied by its corresponding plain packet and the packets are then combined, exactly as in RLNC. However, instead of sending the complete coding coefficients vector ζ, the sender sends only the traversed coefficient I along with the payload. For example, assume a generation size of 32 packets and RLNC applied over GF(2^8). Fig. 1 shows how OINCV reduces the network coding overhead from 32 bytes to only 2 bytes.
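To make the construction concrete, a minimal sketch is given below. It assumes the GMS {233, 239, 241, 251} for n = 4 over GF(2^8) and an AES-style GF(2^8) multiply with the reduction polynomial 0x11B; the paper does not commit to a specific polynomial, so both are illustrative assumptions:

```python
GMS = [233, 239, 241, 251]   # assumed GMS: n = 4 over GF(2^8), P = 251

def coding_vector(I, gms=GMS):
    """Expand the traversed coefficient I into the full vector ζ."""
    return [I % p for p in gms]

def gf256_mul(a, b, poly=0x11B):   # assumed reduction polynomial
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def encode(packets, I):
    """Combine the plain packets exactly as RLNC would, but send
    only I (2 bytes here) instead of the n-byte coding vector."""
    zeta = coding_vector(I)
    coded = [0] * len(packets[0])
    for c, pkt in zip(zeta, packets):
        coded = [x ^ gf256_mul(c, b) for x, b in zip(coded, pkt)]
    return I, coded

print(coding_vector(300))  # [67, 61, 59, 49]
```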

C. Decoding
Once the receiver receives a coded packet, it uses the traversed coefficient to construct the coefficients vector ζ in the same way the sender does. The receiver keeps filling the coefficients matrix until it has full rank, and then solves the system using Gaussian elimination to recover the original generation packets. The steps are presented in Fig. 2.

D. Recoding
An intermediate node, e.g., a router, recodes the incoming packets by randomly selecting two packets out of all the received packets and two new coding coefficients (ê_1 and ê_2) from GF(2^8).
The intermediate node then multiplies each coefficient by its corresponding packet's payload.
Thereafter, it linearly combines the new payloads to produce the outgoing packet. Two packets are sufficient since the proposed method is fully dense. The overhead part is composed by attaching the newly selected coefficients (ê_1, ê_2), along with the original traversed coefficients (I_1, I_2) of the two selected packets, to the outgoing packet as shown in Fig. 3(a). The reason for attaching the coefficients rather than combining them is that if (ê_1, ê_2) were multiplied into (I_1, I_2), then (I_1, I_2) would be almost impossible to recover on the receiver side. Thus, attaching these two coefficients reduces the extra overhead from n to a maximum of 10 bytes, as will be shown shortly.
When the receiver receives the recoded packet, it uses the traversed coefficients to construct the coding coefficients vectors as shown in the previous subsection. Next, each attached coding coefficient ê_i is multiplied by the coefficients vector generated from its corresponding I_i, and the two coding vectors are linearly combined to produce one vector. Finally, this vector is added as a row to the coefficients matrix. The complete process is illustrated in Fig. 3(b). The payload part of the recoded packet is not touched on the receiver side and is added as is to the coded packets column.
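The recoding step can be sketched in the same spirit (again assuming the 0x11B reduction polynomial for GF(2^8); function names are hypothetical). The payloads are combined, while (ê_1, ê_2, I_1, I_2) are attached rather than merged, so the receiver can still regenerate both original coding vectors:

```python
def gf256_mul(a, b, poly=0x11B):   # assumed AES-style GF(2^8) multiply
    r = 0
    while b:
        if b & 1:
            r ^= a
        a <<= 1
        if a & 0x100:
            a ^= poly
        b >>= 1
    return r

def recode(coded1, coded2, e1, e2):
    """coded_i = (I_i, payload_i). Combine the two payloads with the
    freshly drawn coefficients e1, e2 and attach (e1, e2, I1, I2)."""
    (I1, pay1), (I2, pay2) = coded1, coded2
    out = [gf256_mul(e1, a) ^ gf256_mul(e2, b) for a, b in zip(pay1, pay2)]
    return (e1, e2, I1, I2), out

header, payload = recode((300, [5, 9]), (777, [2, 4]), e1=3, e2=1)
print(header, payload)  # (3, 1, 300, 777) [13, 31]
```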
Two essential questions about the proposed method arise. First, since the servers are not co-located and have no coordination, and the traversed coefficients are selected randomly, two different servers selecting the same traversed coefficient would construct two identical coefficients vectors and, as a consequence, two linearly dependent vectors. Hence, what is the probability that two or more servers select the same traversed coefficient? Second, can two different traversed coefficients, I_1 and I_2, generate the same coefficients vector after the modulo operation over the GMS? These questions are addressed and analyzed in the next section.

III. THEORETICAL ANALYSIS AND PROOF OF CORRECTNESS
In this section, we provide theoretical analysis and prove the correctness of the proposed method. In addition, we compare the proposed method with all the previous studies in terms of coefficients overhead and encoding cost.

A. Analysis of network coding coefficients overhead
If recoding is not required, then the coding coefficients overhead incurred by OINCV is a very tiny constant per packet regardless of the generation size, namely, 2, 3, or 4 bytes per packet and 2n, 3n, or 4n bytes per generation. In the case of recoding, the analysis considers that a recoding node adds at most 10 extra bytes per packet, as described in Section II-D. The method in [10] has less coefficient overhead when the generation size is 16, but it is prone to a high probability of linear dependency amongst coded packets if recoding is required. In addition, that method is viable only for n ≤ 32 [10].

B. Analysis of selecting different values from ASR
To decode a generation of n packets successfully, OINCV needs n distinguishable traversed coefficients to be selected from the ASR. We now examine the probability of selecting n distinguishable traversed coefficients.
Proposition 1: The probability that n traversed coefficients selected from an ASR are distinguishable (P_dist) is given by the product of the per-selection probabilities; since the selections are independent events, the overall probability of selecting distinguishable traversed coefficients from the ASR for a generation of size n is P_dist = ∏_{i=0}^{n−1} (1 − i/|ASR|). Proof: The proof is constructive, by substituting in (5) and using the observations from Fig. 6.
Moreover, all these calculations are supported by simulation, as shown in Fig. 6.
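Since the traversed coefficients are drawn independently and uniformly from the ASR, the probability that all n of them are distinct follows a standard birthday-style product; the sketch below assumes that reading:

```python
def p_distinct(n, size):
    """Probability that n independent uniform draws from a set of the
    given size are pairwise distinct (birthday-style product)."""
    p = 1.0
    for i in range(n):
        p *= (size - i) / size
    return p

# n = 32 traversed coefficients drawn from ASR = [252, 2^16 - 1]
asr_size = 2**16 - 252
print(p_distinct(32, asr_size))   # close to 1, so collisions are rare
```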

C. Analysis of generating two identical coding vectors
We showed in the previous subsection that the probability of selecting two identical traversed coefficients can be made to approach 0. In this subsection, we address the following two questions: (1) can two different traversed coefficients modulo the same GMS generate the same coding coefficients vector? (2) what is the probability that two or more different coding coefficients vectors are linearly dependent? Step 2: let the vector ζ generated from I_1 be written as a system of simultaneous linear congruences: x ≡ ζ_1 (mod p_1), x ≡ ζ_2 (mod p_2), ..., x ≡ ζ_n (mod P).
Since p_1, p_2, ..., P are pairwise coprime, by the CRT the above system has a unique solution, which is I_1. As a consequence, I_2 cannot generate ζ unless I_2 = (p_1 × p_2 × ... × P) + I_1, but since this value is far larger than I_1, it is outside the ASR.
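The CRT argument can be verified exhaustively for a small GMS: over the entire ASR, no two traversed coefficients produce the same vector, because a collision would require the two values to differ by the product of the moduli:

```python
from math import prod

gms = [233, 239, 241, 251]            # GMS for n = 4 over GF(2^8)
M = prod(gms)                          # two I's collide only if they differ by M
assert M > 2**16 - 1                   # so no collision is possible inside the ASR

seen = {}
for I in range(252, 2**16):            # the whole ASR [252, 2^16 - 1]
    v = tuple(I % p for p in gms)
    assert v not in seen               # no two I's generate the same ζ
    seen[v] = I
print(len(seen))  # 65284 distinct vectors
```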
Finally, for the sake of completeness, we analyze the probability that two or more different coefficients vectors generated by the proposed method are linearly dependent.
It is well known that an n × n matrix over GF(2^r) has full rank with high probability for large r, i.e., r ≥ 8, as specified by [16]: ∏_{i=1}^{n} (1 − (1/2^r)^i). Since the coefficients vector for the proposed method is generated by the modulo operation over the GMS, no coefficient value can exceed P − 1. Consequently, the n × n matrix values can only be selected from GF(2^r) − {P, P + 1, ..., 2^r − 1}, whose size and element diversity are identical to GF(P). This in turn reduces the selection diversity. However, for P the largest prime smaller than 2^r, GF(P) contains at least 98% of the GF(2^r) elements when 8 ≤ r ≤ 16. Therefore, an n × n matrix over GF(P) is full rank with almost the same high probability.
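This invertibility fraction over GF(P) can be evaluated numerically; the sketch below uses the classical product form for random matrices over a prime field:

```python
def full_rank_prob(P, n):
    """Fraction of n x n matrices over GF(P) that are invertible:
    prod_{i=1..n} (1 - P^(-i))."""
    p = 1.0
    for i in range(1, n + 1):
        p *= 1.0 - P ** (-i)
    return p

# For P = 251 (largest prime below 2^8) the probability stays near 0.996
# for any practical n, so a random matrix loses full rank only rarely.
print(full_rank_prob(251, 32))
```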
Lemma 1: An n × n coefficients square matrix over GF(2^r) whose elements are chosen from GF(P) is full rank with high probability, given by ∏_{i=0}^{n−1} (1 − P^i/P^n).

Although a single operation over GF(2^16) is more expensive than the same operation over GF(2^8), the encoding cost of the proposed method is always less than that of RLNC and RLNC-like methods. This is because the number of operations is halved, and most software and hardware implementations use lookup tables for multiplication, which makes the time per operation almost the same even when the field sizes differ.

IV. EXPERIMENTAL WORK AND RESULTS

Experimentally, the proposed method is compared with the method of S. Li et al. [6] and the method of Ye. L. et al. [9].
The evaluation considers the following performance metrics: coding coefficients overhead, measured in KB; number of dropped packets on the receiver side; download time, measured in number of rounds; and throughput, measured in packets per round. The remaining details needed for each network scenario are given in the respective subsections.

A. Wireless Sensors Network Case Study
In this scenario, we consider the following settings:
-Data of sizes 16KB, 32KB, 64KB, and 128KB are shared by two servers with one client that is zero or one hop away from the servers. The servers share their packets generation by generation, employing a sequential scheduling policy. The client sends an acknowledgement only when it receives all the generation's packets.
-We consider the Zigbee protocol [17], whose packet size is 128 bytes; since its header is 28 bytes, the payload is only 100 bytes.
-We assume a generation size of 32 packets [2] and a link bandwidth of 2 packets/round.
Let the channel loss probability be ε = 0.2 for data packets, while the loss probability for acknowledgements is α = 0.1.
Based on these settings, we can specify the packet payload and coefficients overhead, and compute accordingly into how many generations the data must be divided for each of the considered methods. For instance, when the data file size is 16KB and the RLNC method is used, the coding coefficients overhead per packet equals the generation size, i.e., 32 bytes, and thus the remaining 68 bytes are for the payload. To determine the number of generations, the data size is first divided by the payload to get the number of packets, 16KB/68B = 241 packets, and the number of packets is then divided by the generation size, 241/32 = 8 generations. For the methods proposed in [6] and [9], the coefficients overhead is first calculated according to (2) and (3), respectively, and the other specifications are then determined. For OINCV, the coefficients overhead is 2 bytes, since the GMS can be generated from GF(251), and therefore I can be selected from [252, 2^16 − 1]. Table III summarizes the specifications for all methods. Figure 7(a) shows the coefficients overhead results in KB; the results indicate that the proposed method always incurs significantly less overhead, ranging from 0.43 KB to 3.5 KB, while the overhead of the next best method is considerably higher. Figure 7(b) shows the download time results in rounds. Since the proposed method has the minimum overhead, the packet payload carries more data for OINCV relative to the other coding methods, as shown in Table III.
This means the data file is split into fewer packets and generations, and hence the download time is lower. Figure 7(c) shows the throughput results measured in packets/round. Although the bandwidth of the channel is 2 packets/round, the maximum (ideal) throughput is 1.6 packets/round since ε = 0.2. The OINCV throughput is 1.48 packets/round, which is greater than that of the other methods and very close to the ideal throughput. Lastly, Figure 7(d) shows the number of dropped packets on the receiver side. Since OINCV always has the minimum number of generations, and thus fewer acknowledgements to be sent, it always has the minimum number of dropped packets.
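The packet and generation arithmetic above can be reproduced directly (a sketch; the OINCV numbers are derived under the same 2-byte-overhead assumption and are illustrative):

```python
import math

ZIGBEE_PAYLOAD = 100                    # bytes left after the 28-byte header

def split(data_bytes, coeff_overhead, n=32):
    """Packets and generations needed for a given per-packet overhead."""
    payload = ZIGBEE_PAYLOAD - coeff_overhead
    packets = math.ceil(data_bytes / payload)
    return packets, math.ceil(packets / n)

print(split(16 * 1024, 32))   # RLNC: 32-byte vector -> (241, 8)
print(split(16 * 1024, 2))    # OINCV: 2-byte I      -> (168, 6)
```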
Because the overhead of OINCV is constant and, in contrast to the other methods, not affected by the generation size, it turns out that a generation size of 32 is not the optimal choice for OINCV. Indeed, OINCV can work with a generation size of 100 packets and perform better, at the added cost of one extra byte of overhead per packet.

B. Recoding
In this subsection, we repeat the previous experiments with the same settings, except that an access point (AP) now exists between the servers and the receiver, and thus recoding is required. We assume that the links between the servers and the AP, and the one between the AP and the receiver, have a loss probability of ε = 0.2.

C. Ethernet Protocol Network
In this scenario, we consider the following settings:
-Files of sizes 50MB, 100MB, 250MB, and 500MB are shared by 4 servers with one client that is zero hops away from the servers. The servers share their packets generation by generation.
-We consider the Ethernet MTU packet size, which is 1500 bytes, and an upload rate for each server of at most 10 packets/round.
-We assume generation sizes of 256 and 512 packets.
Based on these settings and as shown in the previous section, Table V presents the specifications of packet payload, coefficients overhead, and number of generations for each network coding method when the generation size is 256. Figure 10(a) shows the coefficients overhead results in MB; the overhead of OINCV does not exceed 1 MB for all the considered file sizes, while the next best method's overhead reaches 50 MB. Figure 10(b) shows the download time results in rounds. Since the proposed method has the minimum overhead, again fewer packets and generations are needed and hence the download time is lower. Figure 10(c) shows the throughput results measured in packets/round. The throughput of OINCV is 39.5 packets/round, which is the highest relative to the other methods and the closest to the ideal throughput of 40 packets/round. Figure 10(d) shows the number of dropped packets on the receiver side. Since OINCV always has the minimum number of generations, and thus fewer acknowledgements to be sent, it always has the minimum number of dropped packets.
For the generation size of 512, the coefficients overhead of OINCV remains fixed at a maximum of 1 MB, while it almost doubles for all the other methods. The download time and throughput are improved for OINCV; on the other hand, these metrics degrade for the other methods. The number of dropped packets on the receiver side improves for all the methods because the number of generations is reduced and thus fewer acknowledgements need to be shared.
These observations are depicted in Fig. 11.

D. Large Scale P2P Content distribution network
In this scenario, we are interested in measuring the overall coefficients overhead, and the following settings are considered:
-We consider a packet size of 16KB.
-We assume a generation size of 1280 packets [5], link bandwidth of 8 MB/round that is shared evenly among the servers such that each server can upload 2 MB/round, while the receiver download rate is set to 8 MB/round.
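Under these settings, the total OINCV overhead can be bounded with a quick calculation (a sketch assuming the worst-case 4-byte traversed coefficient per packet and binary units; the RLNC figure matches Example 3):

```python
KB, MB, GB = 1024, 1024**2, 1024**3

file_size = 10 * GB
packets_per_gen = 1280                 # generation size
generations = file_size // (packets_per_gen * 16 * KB)

rlnc_overhead = packets_per_gen**2 * generations      # 1 byte per coefficient
oincv_overhead = 4 * packets_per_gen * generations    # at most 4 bytes per packet

print(rlnc_overhead // MB, oincv_overhead // MB)  # 800 2
```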
Based on these settings, the coefficients overhead results in MB are depicted in Fig. 12.

Proof of Lemma 1: We build up the n × n matrix from scratch. The first row can be any n-tuple over GF(P) excluding the zero vector, so there are P^n − 1 choices. The second row must be linearly independent of the first; since there are P vectors linearly dependent on the first row, there are P^n − P choices. For the third row to be linearly independent of the first two, there are P^n − P^2 choices, and so on. In general, for the i-th row to be linearly independent of the previous i − 1 rows, there are P^n − P^{i−1} choices. A matrix built this way has linearly independent rows and is thus invertible. Hence, the number of invertible matrices is
(P^n − 1) × (P^n − P) × ... × (P^n − P^{n−1}) = ∏_{i=0}^{n−1} (P^n − P^i).
The total number of n × n matrices over GF(P) is P^{n×n}. Dividing the former by the latter gives
∏_{i=0}^{n−1} (P^n − P^i) / P^{n×n} = ∏_{i=0}^{n−1} (1 − P^i/P^n),
which is the probability stated in the lemma.