Investigation of Key Parameters to Improve Denoising Images based on Diffusion Equation

—Denoising is as a vital step in image enhancement applications. While Perona and Maliks classic diffusion equation has shown remarkable performance in the partial differential equation-based image denoising techniques, a large amount of noise remains the same as impact noise. In recent years, new models were proposed with the goal of denoising the images while preserving the edges. The smoothing ﬁlters, wavelet transform coefﬁcients, and adaptive parameters have been used in the new models to address the drawbacks of previous approaches while at the same time improving their performance. This paper investigates the effect of a number of different parameters in diffusion-based algorithms and proposes an aggregated formula for modifying methods that enhance the image due to the power of the noise, image content, and user requirements. Furthermore, a monotonic decreasing convergence rate curve based on only the noisy image was proposed that its slope could determine the stop moment of the algorithm so that no additional inﬂuence occurred to avoid blurring. Results show that hybrid methods can have a good performance in objective and subjective quality assessment.

in image denoising is to make the trade-off between removing noise and maintaining edges. In recent decades, methods based on variation and partial differential equations (PDEs) have gained popularity for image restoration. These methods got very successful results in eliminating noise. The principal intuitive method for removing noise from an available image is to retrieve the good estimation of original image. The noise removal technique based on anisotropic diffusion equations which were proposed in 1990 by Perona and Malik, is usually taken as a basic work on PDEs for image restoration [7]. It is commonly believed that the Perona-Malik(PM) model provides a potential algorithm for image segmentation, noise removal, edge detection, and image enhancement. For many years, many articles analyzed the capability and drawbacks of the PM method to propose modified methods [8], [9], [10], [11], [12], [13], [14]. In recent years some hybrid models based on the PM method are proposed that show remarkable results [15], [16], [17], [18], [19], [20], [21]. For a gray-scale image in continous domain, Perona and Malik proposed an edge-controlled diffusion operator [7]: ∂u ∂t = div(g(∇u)∇u), (x, y, t) ∈ Ω × (0, ∞) u(x, y, 0) = f (x, y), (x, y) ∈ Ω (1) where f (x, y) is noisy available image, symbols div and ∇ are divergence and gradient operator with respect to spatial spatial coordinates in a domain Ω ⊂ R × R, respectively. u(x, y, t) is initialized using noisy image and the main goal is to get closer to the original image over time. g(∇u) is a non-negative function on the magnitude of local image gradient ∇u which controls the diffusion rate and has such properties as g(0) = 1, and g(∇u) → 0, as |∇u| → ∞. Two kernel that commonly used are: g(∇u) = 1 1 + (|∇u| /K) 2 (2) where K refers to the gradient threshold of the diffusion coefficient function to control edge detection and the amount of image influence. The kernel g(∇u) always named as Diffusion Coefficient Function (DCF). It is important note that, the lower the threshold level, the more final residual noise but less blurry, and the higher the threshold level, less the residual noise in the final result but more opaque.
In discrete domain, PM algorithm has used four directions (North, South, East, and West) to computing gradients and so it used Eq. 4 instead of Eq. 1, (note: in some cases for summarizing the formula x is used instead of (m, n): where τ is used for regularization and always considered as 1 4 . The term g (∇u) .∇u of Eq. 4, in one direction with respect to gradient magnitude, |∇u|), is shown considering Eq. 2 and 3 in Fig. ??. As can be seen, as the magnitude of the gradient increases, the value of the function goes to zero to prevent the image from changing at important edges. There is also protection of perfectly smooth areas. However, since using Eq. 3 leads to zero, rapidly, always Eq.2 is used.
∇u r−1 (x) in space domain could be done in eight directions to provide more improvement. In the form of eight directions, gradient is done in Eq. 4 besides other four diagonal direction as NorthWest, NorthEast, SouthWest and SouthEast as following: which using 8 directions requires adjusting τ = 1 8 . Recently, the other methods based on the fractional diffusion equation were proposed in [22], [23]. Method of [22] could be used in medical images, however, the fractional equations result in a very complicated set of equations in multiple dimensions, which are hard to solve. The authors split diffusion equations using Alternative Direction Implicit (ADI) method that leads to be solved efficiently with the tridiagonal matrix algorithm. As well, Liao and Feng proposed the time-fractional diffusion equation to effectively remove the noise of image [23]. Their algorithm has too much computation complexity as for example, when they operate the implicit model in the image with 256 × 256 pixels, it costs three hours but still cannot iterate once. So, they divided an image into blocks and applying the block method. However, their numerical scheme costs 30 minutes to run five times. Therefore, we decided to investigate in section II, the important options would be defined to optimize parameters for each noisy image to restore that, tried to be able to integrate all the algorithms into one comprehensive format and analyzed with simulations in III. We concluded this paper in section IV.

II. STUDY THE KEY EFFECTIVE POINTS
What we want from the process is to achieve three main objective goals: • Eliminate noise in image as much as possible; • Preserve the edges of the image and prevent details from fading; • Enhance the natural visual quality of the image. we suggested a comprehensive formula to adjust the smoothing and noise removal according to its intensity and all optional parameters. Initially, the corrupted images is transferred to the Y CbCr or other color space which its brightness is in one channel and two others belongs to color components. The algorithm could be applied to three-channel independently. However, we observed for simplicity, in the case of white additive noise, it is enough efficiency to apply algorithm only to the brightness component. Other channels correspond to the color components that are retained. The noise power of the image obtained from channel Y can be estimated using visibility or some formula such as total power of noisy image and its difference with an ordinary safe image. Due to the impact of noise, especially under heavy noise pollution, computation of the gradient is not robust to outliers. Hence noise still remains in the regions with gradients greater than the threshold level and, over-diffusion cause blurring in edges. To overcome this drawback some research works have been done especially in the recent decade. So, we started to define the key effects step by step to receive a complete formula using diffusion equation in image processing. At first, we write Eq. 6 as an initial modified formula: The most important factor influencing the penetration coefficient in Eq.6 is the function ψ(·), which in its simplest form is introduced in Eqs.2 and 3 named g(·) in the elementary method. This function was defined as a Diffusion Coefficient Function (DCF). ψ(∇u r−1 ) is a weighted function that allows us to change pixel values of the image. In this procedure, there are some options to consider the effect of noise on edges and smooth areas in the image. Certainly, noise could result in the detection of mistake edges, so in the computation of DCF parameters, researchers suggested considering gradients after smoothing the degraded image to reduce the noise effects in the detection of edges. Therefore, the function ψ(·) could be defined generally as Eq.7.
ψ(∇u r−1 (x)) = 1 In this equation, all options of ∇u that were explained in section I just on u r−1 (x), could be applied on u r−1 (x) * h(x) that h(x) is a smoothing operator. Therefore, there are several parameters to choose in the process. Obviously, considering Eq.7, in the initial PM method, this filter is replaced by the impulse function. In the following all items are explained.
A. Threshold Level K K is the threshold level which indicates edges with a larger value are retained and the rest move smoothly under the influence of diffusion. In uniform areas where the image gradient is low and the noise is inconsistent, the DCF, ψ(∇u r−1 ) should be greater than where the edge is. However, highpower noises may be selected and maintained as the edge by incorrectly selecting the threshold value K. So, choosing the threshold value K should be done based on the noise intensity; more intensity, larger values of K. However, there is a trade-off between the threshold value, the noise remaining, and the blur of the final result. In areas with a gradient lower than the threshold, the diffusion coefficient increases to more smoothing occur. Therefore, if the image has small details, such as a forest area, tree leaves, and or areas like zones 1-3 in Fig. 4a, they may be smoothed out in this way. Proposing values of 10 < K < 25 based on the initial observation of the noise image and brief knowledge of its contents can be appropriate.

B. Using Smoothing Filter
In 1992, for the first time, Catte et al. proposed an improved version of the PM algorithm based on the use of a smoothing filter to ensure that noisy edges are not mistakenly placed as the main edges [11]. They used the gaussian function with suitable variance. However, the amount of smoothing can also affect the main edges. In this way, the problem of not eliminating high power noise in flat areas was somewhat corrected. This is because the amplitude of the noise is reduced by the gaussian filter and maybe reaches a smaller range than the threshold level, thus leads increasing the penetration coefficient. In 1994, the exponential filter as h(x) = β 2 e −β|x| which is more accurate in detecting the edges of an image [24], was used instead of the gaussian filter, which showed better performances [12]. Fig. 2, shows the comparison of gaussian and exponential filters and their gradients. However, users can use different smoothing filters. Due to the low width and high amplitude in the center of the exponential filter, this eliminates low noise power and preserves more detail than the gaussian one. However, if noise intensity is suggested high since the gaussian filter is wider than the exponential one, will be a better choice for the smoothing step. It should be noted that increasing σ in gaussian and decreasing β in exponential smoother could result to excessive smoothing and blurring of the resulted image. Although in the case of very high noisy case, using large σ with gaussian filter was unavoidable.

C. Exponent variable α
The other option which can be studied is parameter α used in Eq. 7. This parameter was included in the initial algorithm by two. However, some other researchers considered 0 < α ≤ 1 for entire regions and 1 ≤ α ≤ 2 for between areas, edges. To sharpening the edges and enhance the texture of an image, Gue et al. [25] proposed an adaptive PM (APM) model based on variable exponent to control the diffusion rate. Gue et al. used a variable α based on the edge indicator that acts differently in proportion to the regions as: where, K α is a new threshold value which should be larger than K. If K α is considered smaller than K, at the very small values ∇u, coefficient α becomes saturated using Eq. 8 and remains at the same constant two as the original PM method.
To describe the application conditions for different gradient values, we divide the range of gradient values into several ranges: Therefore, when ∇u is larger than the threshold level K α (like in the image edge position), the variable α value tends to two, then DCF in Eq. 7 tends to zero to preserve edges with no influence. Otherwise (inside regions), which ∇u is smaller than K, α tends to zero and thus increases DCF in Eq. 7 to more diffuse to remove noise. For areas in which gradient is between K and K α , diffusion weight ψ(∇u) would be in the middle range. Fig. 3 shows influence function ψ(∇u).∇u using the constant (α = 2) and adaptive coefficient α, Eq. 8 with and without the use of the smoother. Since the smoother reduces the magnitude of the gradient, we suggested the effect of the smoother by multiplying the gradient value by 0.8 to simulate smoothness mode. It is clear, which in the case of smoothing, threshold levels increase by 10/8. Fig. 3 clearly shows that the use of the adaptive parameter α allows to increases the penetration rate of the areas of the image where the gradient is closer and larger than the threshold level K. The range of increment of the influence function is related directly to the value of the second threshold level K α . Although it should be pointed that increment of the influence effect range could be resulted blurring images. So, for simplicity in simulations, we consider K α = 2 × K. Obviously, designers could play with this parameter-dependent application and image contents. To examine the effect of smoothing kernels and parameter α of the diffusion equation procedure on the recovered images, image no.25 was selected from the database TID2013 [26]. This is an artificial image that shows different paths in several directions and widths, areas with various brightness, texts, and all possible textures in the image. The models which were used here have been defined in first six rows of table I and their details in table II that h σ (x) and h β (x) are gaussian and exponential smoother respectively. By using "ssim" evaluation criteria, (Structure Similarity Index Measure) [27] the results obtained from the implementation of different models on the sample image were compared. In ssim criteria, the larger the value of criteria shows the better improvement. The destroyed image was constructed with collective the white noise such as SN R = 16.71 dB and ssim = 0.51. In all simulations, k = 15, k α = 30, σ = 1.5, β = 1 were considered. Since the artificial image has been made according to certain textures, four areas of it were selected and numbered in the original image, 4a. In all the other images, four areas were magnified and below the resulted image displayed in left to right order. The images are shown in Fig. 4 corresponds to (e) EPM method, after 10 iterations.
(h) GAPM method, after 2 iterations. happened. Details of Models and the resulted images are in the followings: • Method PM and its conformity, APM, are the slowest of all. However, the final result has more quality value ssim than other methods, even after 23 iterations, high noise remains like impulsive noise. APM model 4d, especially could not construct zone 4 as before. • The EPM model is much faster than the APM, but the resulting image is somewhat blurred, especially at zones 3 and 4, Fig. 4e. EAPM model processes very faster and without blurring but with remained noise in flat areas, Fig. 4f. By repeating the algorithm, the noise is removed in uniform areas, but the thin edges are blurred at zones 2 and 3 and almost completely removed, and therefore the value of ssim is significantly reduced. • In GPM model, like EPM, the noise remains in the flat areas, and in areas such as zone 1, it also appears as noise spots, Fig. 4g. The model GAPM is more opaque than the EAPM in areas with thin areas and details, and this is due to the essence of smoothing in the gaussian kernel than the exponential kernel, Fig. 4h. • Despite the sharp decrease in the value of ssim in smoothing methods, the resulting shapes are completely cleared of noise, but areas with fine texture are damaged. • Convergence speed is very attractive and remarkable in methods with smoother and adaptive parameter, as in special applications, the user can achieve the processing speed in exchange for a very low quality reduction. To more analyze the effect of coefficient α, Figs. 6 can be considered. In these figures, the gradient images and their histograms in the north direction in models PM and EPM after five repetitions of images Figs. 4 show important notes. The range of values in the figures and the histograms are significant. In the case of without using a smoother, the range is in [−250, 250], and with a smoother in [−200, 200], where the reduction of the gradient value is well evident. As well, in the gradient image using smoothing, the density distribution is more concentrated around zero. Then, we calculated the value of adaptive coefficient α using Eq. 8 based on these two gradient images that lead to Figs. 6e and 6f. As seen in Fig. 6, using the adaptive α caused various values especially in flat areas which α → 0 and lead to more diffusion to more smoothing, specially in 6f. In the areas with too many edges, Figs. 6e and 6f, α increases to allow reduction in the influence amount to save edges.

D. Transform Gradient
It was recommended to use a transform domain gradient along a spatial domain [15]. So, using wavelet transform that offers a simultaneous localization in the time and frequency domain, is suggested. After one level wavelet transform, three gradient image in the vertical (V ), horizontal (H), and diagonal (D) directions which used wavelet functions and one low-pass image are provided. These three subimage H, D, and V play the role of ∇u in diffusion equation.
It is suggested not to use the transform gradient alone. Simultaneous use of weighted spatial and transform gradients can have better results especially in the case of the high-power noisy image. In this way, the sym4 wavelet transform is recommended in one layer to apply the diffusion equation in each of the vertical, horizontal, and diagonal directions. We have observed that using more than one level of wavelet transform not only does not lead to better results but also increases the computational complexity. Therefore, in this step, the designer should decide about the weights of ρ 1 for spatial and ρ 2 for transform gradients that should be ρ 1 +ρ 2 = 1. It is advised to consider ρ 1 = 0.75 and ρ 2 = 0.25.

E. Stop Condition
The experimental results indicate the need to provide a stop condition in the algorithm just optimally state or close to it to overcome the staircase and edge blurring effects. We could stop and evaluate the resulted image to carry on or finalized the restoration procedure. In noisy cases, too many mistake edges exist that when running the algorithm, in consecutive iterations gradually disappear or smooth. Therefore, it is expected that in consecutive iterations, the changes in resulted images that are dependent on the image edges, should be reduced. According to the tips provided, we consider the average absolute value of the resulting image gradient in successive repetitions as a criterion. Specifically, since the first iteration is very productive, the average gradient relative to the first iteration as Eq. 9 is measured and compared with the threshold level SC to stop the iterations.
where SC is named stop threshold condition. Experiments showed existenc dependency between noise power, image contents and the stop condition. The higher the noise power, the smaller the stop condition, and the more details in image, the larger the stop condition should be used to avoiding from blurring and degradation details. In later, we named R as the convergence rate. Note that the convergence curve is available online during processing, so, whenever it seemed sufficient, the algorithm can be finalized.  [7] classic diffusion equation APM [25] classic diffusion equation with adaptive α GPM [11] classic diffusion equation using gaussian smoother GAPM diffusion equation with adaptive α using gaussian smoother filter EPM [12] classic diffusion equation using exponential smoother EAPM diffusion equation with adaptive α using exponential smoother WPM [15] combined diffusion Equation of spatial and wavelet WAPM combined diffusion Equation of spatial and wavelet with adaptive α WGPM combined diffusion Equation of spatial and wavelet using gaussian smoother WGAPM combined diffusion Equation of spatial and wavelet with adaptive α using Gaussian smoother WEPM combined diffusion Equation of spatial and wavelet using exponential smoother WEAPM combined diffusion Equation of spatial and wavelet with adaptive α using exponential smoother

F. Comprehensive diffusion Equation
Due to the points mentioned, the restoration complete integrated formula is presented as follows: where ∇ S u r−1 (x) and ∇ W u r−1 (x) are respectively gradient in space domain and transform wavelet domain. The combined equation makes it possible to use all or part of the available options. The transform gradient can be used along with the smoothness and the adaptive α. Tables I and II consider all possible cases. Chosen models to apply the diffusion algorithm is a dependent procedure on the noise intensity and the essence of the image. Some other modified algorithms were proposed to overcome the edge blurring and staircase effect in the PM diffusion model. For example, in [13] fourth-order PDE for image denoising is designed. Although the fourth-order PDE is free of staircase effect, it often suffers over-smoothing, and the calculation of a high-order PDE is complicated and challenging. In next section, all models were experimented with two ordinary images.

III. EXPERIMENT RESULTS AND COMPARISONS
All experimented models have been summerized with abbreation names in table I and using vital equations in table II. Although we tested a large number of images, the results of two images are only displayed . These images were selected with different contexts from TID2013 dataset [26], which can be seen in Fig. 7. The selected images are in two different categories.
• 1. Image House is complex and has many distinguishable edges in different directions and also some flat areas; • 2. Image Lena consider as a medium complexity image. • 3 Image Parrots has two prominent objects and the background is almost uniform, the noise of which is obvious to the viewer, but its blur is not annoying. To show the performance of methods, we provided noisy case for each imgae as table III which can be seen in Fig. 7. Although all three images are the same in terms of the noise applied to them (quantity SN R is almost equal), quantity ssim is very different, which is the result of the type of image structure and the possibility of seeing noise in them. Reconstructed images from different models were compared using two evaluation criteria. The first one is ssim and the second one was introduced in 2018 [28] as QPSV based on singular value decomposition, the structure of images, and salience of image  • it is clear that the use of a smoother, regardless of its type, will lead to a faster reconstruction and better quality. The figures show the relative impact of the smoother on the other parameters (adaptive α and transform gradient). As mentioned, in images with high noise, the use of gaussian smoothing filters accelerated the achievement of good quality results. Please look at Figs. 8 to 10 in red and green curves that correspond to using gaussian and exponential smoothers, respectively. • In general, using the adaptative α will help accelerate convergence, especially in the absence of a smoother and is more pronounced if we are dealing with high noisy images. • The effect of using a wavelet transform alone was not effective, but when combined with a smoother. The effect of its improvement is more than using the adaptive parameter. This may be because the use of wavelet transform with wave 'sym4' actually increases smoothing. • In the case of smoother methods, which operate faster, the initial rapid convergence caused to receiving stop condition SC earlier than other methods. In continuous repetitions, infiltration changes gradually subside, although they could still provide better image quality. • It is evident and significant that convergence rate, Eq. 9 is only dependent on observed available image and independent from the original image. The convergence rate R is the only tool that we have during the processing and must pay maximum attention to its curve and properties. As shown in Figs. 10, monotonic decreasing for all methods guarantee the stability of methods. We saw that the curve slope in the first iteration is very sharp and after some iterations, the slope of the curve slows down and approaches to zero. After the iteration in which the sloop rate curve starts to slow down, dependent on the application, the algorithm could be stopped. After the stop point, image reconstruction usually tends to blur the image. This feature has less effect on image quality in images with more uniform areas, but reduces the quality of vision in complex images with details as seen in Fig.  8a. • The stop condition depends on both the nature of the image and the noise power. For example, as seen in the curves of Figs. 10, the stop condition for House picture was 0.6, but for the images of Parrots and lena, should be decreased to 0.5 to stop in the best time. However, an the number of iterations are not much conflict. It happens because the stop condition is based on resulted image gradient that is related to image contents. In images with more details, SC should be larger than images that have uniform areas.  • Criterion QPSV shows the convergence of the algorithm better than ssim. As well as, QPSV curves are more in line with convergence rate curves, Figs. 9 and 10. However, QPSV curves are plotted using reference original images, convergence rate curves are plotted without reference and can be used in real applications. The main improvement is in the initial repetition stages that after reaching the condition of stopping, the changes become minor.
• As a final test, we used Lena picture with very little (SN R = 22.5 dB ) and very much noise (SN R = 12.9 dB ), and the results of the quality value ssim and the convergence rate of the simulation with all the models in Table II are shown in  Figs. 14. As it is seen, in the case of high noise, it is possible to increase ssim from about 0.05 to 2.5, and all models eventually reach the same state, but the speed of receiving their optimal state is different. For a low-noise image, it is clear that using a smoother is not helpful (in contrast to the high (c) GPM method (6).
(h) WEPM method (7). noisy case), especially after three-four steps in which a good image is obtained, the resulting images tend to blur, which in models using a smoother, this state is done with more slope. The best image is established by APM and the least blurring is provided by WAPM at the end of 10 repetitions. Figs. 15 show some reconstructed images. Shortly, as mentioned, by playing with different values and choices, appropriate results can be achieved with the application. The comprehensive model, Eq. 10 could be introduced as the simplest and the most efficient in image denoising based diffusion equation.

IV. CONCLUSIONS
In this paper, a comprehensive overview of the various models of image denoising using the diffusion equation was provided. The gaussian and exponential filters as a preprocessing stage of the diffusion process were evaluated and compared. We investigated adaptive parameter α in the diffusion equation and found to be attractive especially in absence of smoothing operator. It must move slowly from zero to two with respect to increasing the gradient value. To improve the infiltration process especially in high power noise, using the smoothing filter was vital in denoising. The gradient calculation in the spatial domain along in the wavelet domain is suggested in the case of high noisy images. The combined algorithm can therefore be used for images with different backgrounds (smooth, semi-smooth, crowded). Considering the stop condition can also stop the iterations close to the optimal state. The convergence rate R is independent from the original image that is the only tool during the processing to controlling in real applications. Because of the existing high correlation between R and QPSV quality measurement, convergence rate could be used as no reference image quality measurement at runtime procedure. Observation results show that smoothing filters and adaptive α can achieve asymptotic quality with the increasing speed of processing.