The Associative Multifractal Process: A Novel Model for Computer Network Traffic Flows

A novel constructive mathematical model based on the multifractal formalism is presented in order to accurately characterize the localized fluctuations present in the traffic flows of today's high-speed computer networks. The proposed model aims to analyze self-similar second-order time series representative of traffic flows in terms of their roughness and impulsivity.


I. INTRODUCTION
In high-speed computer network traffic modeling, the robust characteristics present in most network structures, such as active users, applications, protocols, and topology [1]-[4], are considered. To obtain a parsimonious model of the traffic, the description of localized fluctuations [5], [6], which are very sensitive to the conditions of the network [7], is avoided. It is argued that the traffic observation scales can be extrapolated [4] because real-life time series show lower and upper cut-off scales [1], [7], sometimes separated by several orders of magnitude [7]. This point of view adopts, as an axiom, that it is not worth considering dichotomous behaviors [8], and even less theories focused on the alteration of transversal phenomena as a result of their exacerbated structural reductionism.
Traffic modeling derived from conventional telephone systems rests on two assumptions: independence between the arrival times of successive frames and exponentially distributed durations in the use of resources [1], [8].
These assumptions restrict the stochastic processes to a Poisson universe. These considerations have been useful to designers and analysts in capacity planning and in predicting system performance. However, in many real-world cases [7] the results predicted from a queue analysis [9] differ significantly from the performance actually observed, and this discrepancy originates in the fact that traffic flow processes frequently show far-reaching variations at different timescales [1], [7], [8].
Poisson models [10], which have no memory or only short-range memory, exhibit traffic bursts only on much smaller timescales [7], [8]. As a result, overly optimistic performance forecasts are obtained, because distributions with finite variance are used to characterize the periods of presence and absence of packet bursts [7].
Two modelling streams coexist: a conventional one, based on Poisson processes, and a self-similar one that accepts long-range dependence (LRD) as an inherent characteristic of data traffic flows in current high-speed computer networks [4], [6].
Self-similarity and LRD are not synonymous and formally one condition does not necessarily imply the other, but from the point of view of traffic modelling, self-similar processes are characterized by their invariance to scale changes and their ability to exhibit long-range correlations, that is, LRD [11].
The advantages of the LRD approach over the conventional one are evident [8], especially with regard to the more than questionable assumption of independence between successive departures and arrivals in computer network traffic flows. However, a small number of studies show a lack of consensus about the scope of applicability of self-similar models and the impact that LRD has on the performance of communication systems [1]. Their conclusions should be carefully analyzed, since they reveal a fundamental question for multifractal analysis, which can be stated as follows: since traditional queuing models are not capable of capturing self-similarity [7], their validity for predicting performance would be supported by demonstrating that self-similarity does not have a truly measurable impact on it, and even more so if it is shown that self-similar models fail to consider the impact of important individual parameters in each particular network or communications system [1], [7].
There is another, even more critical, question that points directly to the core of self-similar modelling: the validity of the Hurst exponent (H) as a descriptor of traffic LRD. In [12] it is shown, through an exhaustive queue analysis applied to a representative time series of Ethernet traffic traces, that H does not provide an accurate prediction of queue performance for a given LRD traffic. Furthermore, if the original series is broken down into smaller ones, the behavior of H is monotonic with respect to the presence or absence of packet bursts, which implies that H does not serve to characterize the importance of the groupings within the total traffic either, thus dismissing aggregation as a method of analysis [1], [12].
G. Millán, Member, IEEE, G. Lefranc, Senior Member, IEEE, and R. Osorio-Comparán
These questions find an answer in the analysis of small-scale behavior [7], [8], which, quite the opposite of large-scale analysis, considers the localized fluctuations in the samples of the time series representative of the traffic flows and does not discard them in favor of more robust singularities [7].
Specifically, the approach uses the multifractal formalism, understood as a set of coordinated fractal varieties capable of characterizing the evolution of a phenomenon that exhibits self-similarity at different observation scales, to study traffic flows [8].

II. STRUCTURE OF THE PROPOSED MODEL
The original On/Off model provides detailed information on the generation of traffic from both individual and aggregated sources [13].
A generalization of its properties is then proposed through the introduction of the multifractal On/Off model concept.

A. The On/Off Model
An On/Off process alternates between two states: On, during which a source generates traffic at a rate $A_j$, and Off, during which the source does not generate traffic.
Then, with $X_j$ and $Y_j$ denoting the durations of the $j$th On and Off states, respectively, a conventional On/Off process is defined by expression (1), in which the following elements appear:
• $S_j$: Regenerative point [14]. It indicates the occurrence of the $j$th On state and is defined by (2), where $S_0$ is the beginning of the observation time and it is accepted that $S_0 \geq 0$.
• $I_j(t)$: Indicator function; $I_j(t) = 1$ for $t \in [l_1, l_2)$, where $l_1$ and $l_2$ are fixed values according to the On states.
In the On/Off model, the $X_j$ and the $Y_j$ are each assumed to be independent and identically distributed according to a hyperbolic tail distribution that may or may not have finite variance.
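As an illustration only, the alternation between On and Off states can be sketched in Python. The sketch assumes Pareto-distributed durations (one instance of a hyperbolic tail), a constant rate during every On state, $S_0 = 0$, and that each On period is immediately followed by one Off period. The function names and the parameters `alpha_on`, `k_on`, `alpha_off`, and `k_off` are hypothetical and are not taken from the defining expressions of the model.

```python
import random


def pareto_sample(alpha, k, rng):
    """Draw one Pareto variate with P{X > x} = (k / x)**alpha for x >= k.

    Inverse-transform sampling; u lies in (0, 1], so the result is >= k.
    """
    u = 1.0 - rng.random()
    return k / u ** (1.0 / alpha)


def simulate_on_off(n_cycles, alpha_on, k_on, alpha_off, k_off, rate=1.0, seed=0):
    """Return a list of (start, end, rate) tuples, one per On period.

    Assumption of this sketch: cycle j starts at S_j, stays On for X_j,
    then Off for Y_j, so that S_{j+1} = S_j + X_j + Y_j with S_0 = 0.
    """
    rng = random.Random(seed)
    t = 0.0
    periods = []
    for _ in range(n_cycles):
        x = pareto_sample(alpha_on, k_on, rng)    # On duration X_j
        y = pareto_sample(alpha_off, k_off, rng)  # Off duration Y_j
        periods.append((t, t + x, rate))
        t += x + y
    return periods


def rate_at(periods, t):
    """Traffic rate at time t: the On rate inside [start, end), else 0."""
    for start, end, rate in periods:
        if start <= t < end:
            return rate
    return 0.0
```

A call such as `simulate_on_off(1000, 1.5, 1.0, 1.5, 1.0)` produces a sample path whose On and Off periods both have infinite variance, which is the regime of interest in the remainder of the section.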
The Hurst exponent of an On/Off model according to (1) is given by the expression [15]
$$H = \frac{3 - \min\{\alpha_0, \alpha_1\}}{2}, \quad (4)$$
where $\alpha_1$ and $\alpha_0$ are the tail indices of the durations of the On and Off states, respectively. If the durations of the On and/or Off states have finite variance, their tail index is taken to be 2 when (4) is applied. The original On/Off process given by (1) is self-similar if $1/2 < H < 1$, which implies that the duration of at least one of the states (On or Off) must present a hyperbolic tail distribution with infinite variance.
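The following minimal sketch shows how the Hurst exponent could be evaluated from the two tail indices. It assumes the relation $H = (3 - \min\{\alpha_0, \alpha_1\})/2$, which is the form commonly reported for On/Off sources with heavy-tailed durations, together with the rule stated above that a finite-variance state contributes a tail index of 2. The function names are hypothetical.

```python
def hurst_from_tail_indices(alpha_on, alpha_off):
    """Hurst exponent of an On/Off source from its two tail indices.

    Assumed relation: H = (3 - min(alpha_on, alpha_off)) / 2.
    A state with finite variance (tail index above 2) is treated as
    having index 2, so it never pushes H below 1/2.
    """
    a_on = min(alpha_on, 2.0)
    a_off = min(alpha_off, 2.0)
    return (3.0 - min(a_on, a_off)) / 2.0


def is_self_similar(hurst):
    """Self-similarity condition used in the text: 1/2 < H < 1."""
    return 0.5 < hurst < 1.0
```

For example, tail indices 1.5 (On) and 1.8 (Off) give $H = 0.75$, whereas two finite-variance states give $H = 0.5$ and the process is not self-similar in the sense above.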

B. The On/Off Multifractal Model
Let $\psi(t)$ be a conventional On/Off process given by (1). It is said that $\psi(t)$ is an Associative Multifractal Process, $\psi_{AM}(t)$, if the durations of the On and Off states, $X_j$ and $Y_j$, respectively, are hyperbolically distributed with density functions $f_1(x)$ and $f_0(x)$, respectively, such that (5) holds, where $f_0(t) = f_1(t) = 0$ for $t < 0$, and the time means of the On and Off states, both finite, are given by $\mu_1 = E\{X_j\}$ and $\mu_0 = E\{Y_j\}$.

1) Comments About (5)
In the statement of (5) it is necessary to emphasize that
1. The expected value of a stochastic process governed by (5) is given by $E\{\psi_{AM}(x)\} = \mu_1/(\mu_0 + \mu_1)$.
4. According to [16], the Power Spectral Density (PSD) for (1) is given by (7), where $F_0(j\omega)$ and $F_1(j\omega)$ are the characteristic functions of $f_0(x)$ and $f_1(x)$, respectively, and $\delta(\cdot)$ is the Dirac function.
Thus, for (5), replacing $E\{\psi_{AM}(x)\}$ in (7) according to the first consideration yields (8), a result that highlights the coherence between the structure of the $\psi_{AM}(t)$ processes and a well-known theory such as the multifractal theory [17], in addition to accounting for the flexibility of the original On/Off approach of [13] as far as aggregate traffic representation is concerned.
5. An important subclass of distributions that show regular variation are the hyperbolic-tailed distributions, which have a survival function according to
$$F_S(x) \sim x^{-\alpha} G(x), \quad \text{when } x \to \infty, \quad (9)$$
where $G(x)$ is a slowly varying function and $0 < \alpha < 2$.
With the above, for (5) it is verified that if the durations of the On and Off periods are Zipf distributed with reliability functions given by $F_{S1}(x; \alpha_1, k_1)$ and $F_{S0}(x; \alpha_0, k_0)$, respectively, then, for $\omega \to 0^+$, expression (10) follows from (8) [18]. It is possible to simplify (10) without loss of generality to obtain (11), where $\alpha = \min\{\alpha_0, \alpha_1\}$.
7. From [19] it is known that a random process $X$ with finite second-order statistics is stationary with LRD in the sense of its autocorrelation if its autocorrelation function, $R_{xx}(k)$, satisfies asymptotically that
$$\lim_{k \to \infty} \frac{R_{xx}(k)}{C_A k^{-\beta}} = 1, \quad (12)$$
where $0 < \beta < 1$, with $\beta = 2 - 2H$, and $C_A > 0$.
An equivalent definition of LRD, based on the process spectrum, is described in [20]: a process has LRD if its spectrum satisfies (13).
8. Then, an On/Off process given by (5) with Pareto distributed On and Off periods with tail indices $\alpha_1$ and $\alpha_0$, respectively, is LRD in the sense of (13) if (14) holds, with $H$ given by (4).
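The first and seventh comments can be checked numerically with a short sketch. It assumes Pareto durations with survival function $(k/x)^{\alpha}$ for $x \geq k$, whose mean is $\alpha k/(\alpha - 1)$ for $\alpha > 1$, and it uses the relations $E = \mu_1/(\mu_0 + \mu_1)$ and $\beta = 2 - 2H$ quoted above. All function names are hypothetical.

```python
def pareto_mean(alpha, k):
    """Mean of a Pareto variable with P{X > x} = (k / x)**alpha, x >= k.

    The mean is finite only for alpha > 1.
    """
    if alpha <= 1.0:
        raise ValueError("the Pareto mean is infinite for alpha <= 1")
    return alpha * k / (alpha - 1.0)


def on_fraction(mu_on, mu_off):
    """Expected value mu_1 / (mu_0 + mu_1): long-run fraction of time On."""
    return mu_on / (mu_on + mu_off)


def lrd_beta(hurst):
    """Autocorrelation decay exponent, beta = 2 - 2H."""
    return 2.0 - 2.0 * hurst


def is_lrd(beta):
    """LRD in the autocorrelation sense requires 0 < beta < 1."""
    return 0.0 < beta < 1.0
```

With $\alpha_1 = 1.5$, $k_1 = 1$ the mean On duration is 3; if the mean Off duration is 9, the source is On one quarter of the time. A Hurst exponent of 0.75 gives $\beta = 0.5$, which satisfies the LRD condition.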

III. THE ASSOCIATIVE MULTIFRACTAL PROCESS MODEL
In Section II the mathematical foundations that make up the structure of the model were presented (see (5)). It remains to associate a $\psi_{AM}(x)$ process with the sources that make up a high-speed computer network.

A. General Modeling Condition
With $X_j$ and $Y_j$ denoting the durations of the On and Off states, the following conditions are assumed for modeling purposes:
1. The processes $X_j$ and $Y_j$ [21] are independent and identically distributed with reliability functions given by $F_{S1}(x; \alpha_1, k_1)$ and $F_{S0}(x; \alpha_0, k_0)$, with $1 < \alpha_1 < 2$ and $1 < \alpha_0 < 2$, respectively.
2. The traffic generation rate, $A_j$, during the processes $X_j$ is independent and identically distributed according to a Bounded Pareto distribution with reliability function $F_{SB}(x; \alpha_B, k_B)$, independent of the durations of the On and Off states.

B. Previous General Comments
It is necessary to establish the following guidelines, which are valid in the general context of the model:
1. The use of heavy-tail distributions for the durations of the On and Off states is based on the observations of [21], which suggest that the On and/or Off states can be very long with high probability.
2. If the durations of the On and Off states are considered to be heavy-tail distributed with finite variance, the superposition of many sources will behave as short-range dependent traffic, which conflicts with the precepts and conclusions of [22], and therefore with the basis of this research.
3. The Bounded Pareto distribution, in terms of its reliability function, is defined in [18] by (15), where $u_B(\cdot)$ is the unit step function and $B$ is the cut-off limit imposed on the random variable. Furthermore, from [18], the Bounded Pareto distribution has the probability density function (16), where $f_{SB}(\cdot)$ and $F_{SB}(\cdot)$ are the Pareto probability density and reliability functions, respectively, and $\delta(\cdot)$ is the Dirac delta function.
4. The existence of the cut-off limit $B$ causes the Bounded Pareto distribution, in contrast to the Pareto distribution, to have finite variance.
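The effect of the cut-off limit can be illustrated with a small sampling sketch. It assumes one possible reading of the Bounded Pareto distribution, in which Pareto samples are clipped at $B$ so that a point mass sits at $B$ (this would be consistent with the Dirac term mentioned in the third guideline); a renormalized truncation on $[k, B]$ would be an alternative construction. The function names are hypothetical.

```python
import random


def bounded_pareto_sample(alpha, k, bound, rng):
    """Draw a Pareto(alpha, k) variate and clip it at the cut-off `bound`.

    Assumption of this sketch: clipping (not renormalizing), which leaves
    a point mass of probability (k / bound)**alpha at `bound`.
    """
    u = 1.0 - rng.random()          # u in (0, 1]
    x = k / u ** (1.0 / alpha)      # unbounded Pareto variate, x >= k
    return min(x, bound)


def atom_probability(alpha, k, bound):
    """Probability mass placed at the cut-off under the clipping assumption."""
    return (k / bound) ** alpha


def population_variance(values):
    """Variance with denominator n; finite for any bounded sample."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n
```

Because every sample lies in $[k, B]$, the variance can never exceed $(B - k)^2/4$, whatever the tail index, which is the finite-variance property stated in the fourth guideline.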

C. Traffic Generated by a Source
Consider a source that alternates between On and Off states governed by (5). During the On states the source generates traffic at a constant rate $A_j$, and during the Off states it remains silent.
Then, the probability density function of the process $\psi_{AM}(t)$ of (6), under the modeling conditions established in Subsection A, is defined by (17), with $\mu_1$ and $\mu_0$ defined in (5).

D. Traffic Generated by Multiple Sources
1) Recapitulation. General Modeling Considerations
In the approach of (1) it is specified that $A_j$ is the rate at which the $j$th source generates traffic during the On states, whose duration is determined by the $X_j$ process.
This idea is later reflected in (5), giving rise to (6), under the following three key modelling assumptions:
1. The transmission rate in the individual On states is constant for the different compound On states, that is, $A_j = C$, $\forall j$, which leads to (6).
2. The durations of the On and Off states, given by $X_j$ and $Y_j$, respectively, are heavy-tail distributed with density functions $f_1(x)$ and $f_0(x)$, respectively, which gives rise to the associative multifractal process $\psi_{AM}(x)$ from (5), where it is agreed that $f_0(t) = f_1(t) = 0$ for $t < 0$.
3. The means of the durations of the On and Off states, $\mu_1$ and $\mu_0$ ($\mu_1 = E\{X_j\}$, $\mu_0 = E\{Y_j\}$), respectively, are finite, and therefore the expected value of a process governed by (5) is given by $E\{\psi_{AM}(t)\} = \mu_1/(\mu_0 + \mu_1)$.
The previous assumptions, in conjunction with the definition of a Bounded Pareto distribution in terms of its reliability function with upper cut-off limit $B$ (see (15)), give rise to (17) under the general modeling conditions given in Subsection A; this expression governs the probability density function of the $\psi_{AM}(x)$ model for a single source.
The model for a source in terms of its density function (see (17)) determines the regime that each source follows to inject traffic into the network. Then, for the case of multiple sources, it is necessary to consider the capacity of the link over which the composite traffic flows from $n$ sources are sent.
2. It is assumed that the traffic generation rate is constant in the individual On states, for the different compound On states, that is, $A_j = c$, $\forall j$.
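The role of the link capacity in the multiple-source case can be illustrated with a slotted-time sketch. It assumes $n$ independent sources with Pareto On and Off durations rounded to whole slots, the same constant rate for every source during On states, and all sources starting in the On state; these simplifications, the function names, and the parameters are assumptions of the sketch and are not part of the model definition.

```python
import random


def pareto_sample(alpha, k, rng):
    """Pareto variate with P{X > x} = (k / x)**alpha for x >= k."""
    u = 1.0 - rng.random()  # u in (0, 1]
    return k / u ** (1.0 / alpha)


def source_activity(n_slots, alpha_on, alpha_off, k, rng):
    """Per-slot 0/1 activity of one source (1 while the source is On)."""
    activity = []
    on = True  # simplification: every source starts in the On state
    while len(activity) < n_slots:
        dur = pareto_sample(alpha_on if on else alpha_off, k, rng)
        # Round to whole slots and never run past the observation window,
        # which also guards against extremely long heavy-tailed periods.
        n = min(max(1, round(dur)), n_slots - len(activity))
        activity.extend([1 if on else 0] * n)
        on = not on
    return activity


def aggregate_load(n_sources, n_slots, rate, alpha_on, alpha_off, k=1.0, seed=0):
    """Per-slot offered load of n_sources superposed On/Off sources."""
    rng = random.Random(seed)
    total = [0.0] * n_slots
    for _ in range(n_sources):
        activity = source_activity(n_slots, alpha_on, alpha_off, k, rng)
        for i, a in enumerate(activity):
            total[i] += rate * a
    return total


def overload_fraction(load, capacity):
    """Fraction of slots in which the offered load exceeds the capacity."""
    return sum(1 for v in load if v > capacity) / len(load)
```

Comparing `overload_fraction` for several capacities gives a rough picture of how often the composite flow from the $n$ sources would exceed the link; a capacity of `n_sources * rate` can never be exceeded under these assumptions.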