A Multiple Huygens Surface-Based Ray Tracing Framework With GPU Acceleration

A ray tracing framework based on the utilization of multiple Huygens surfaces is introduced and evaluated. As such, complex propagation environments are divided into smaller subdomains, thereby confining the rays to propagate in a smaller and simpler space. The subdomains are surrounded by the Huygens surfaces and equivalent Huygens sources interconnect the ray-based field representations in neighboring subdomains. Compared to conventional shooting and bouncing rays (SBR)-based ray tracing simulations, which detect relevant ray hits at receivers via small reception spheres, this approach distributes the wave representation over many rays and collects their overall influence via Huygens surface integrations. By a smart choice of the Huygens surfaces, the ray coverage in complex environments can be increased considerably. Concerning diffraction computations, which rely conventionally on the uniform theory of diffraction (UTD), the flexibility of choosing Huygens surfaces allows to distribute diffraction edges over different subdomains, thus, eliminating the need for consecutive UTD evaluations and the corresponding problems in finding correct multiple diffraction paths. Together with smart ray launching strategies and quickly converging integration methods, the presented approach allows many successive diffraction evaluations with reasonable accuracy and efficiency. The implementation is based on graphics processing units (GPUs), enabling massively parallelized simulations.

One of the major challenges in SBR is the low probability for a ray to hit an intended target [3].This challenge stems from the fact that the ray density becomes increasingly sparse in space as the propagation distance increases, whereas the targets of interest are of fixed sizes.Specifically, conventional SBR simulations often launch a large number of rays in random directions and identify the ray-target intersections based on reception spheres with diameters comparable to the wavelength [6].In that case, the probability of a ray hitting a reception sphere is roughly proportional to the square of the quotient of the wavelength and its maximum propagation distance.Therefore, for electrically large scenarios, one must either launch a large number of rays or use larger reception spheres.However, launching a large number of rays leads to a large computational cost, whereas large reception spheres lead to bad accuracy.
Another challenge for SBR is the calculation of diffractions, especially multiple diffractions.The most widely applied method of evaluating diffractions in SBR simulations is the uniform theory of diffraction (UTD) [7], where the fields diffracted by an edge are represented by a set of diffracted rays.Numerous diffracted rays are launched from each ray-edge intersection and propagate in directions spanning a "Keller cone."In principle, these rays can be easily handled by a ray tracer, however, when multiple diffractions are considered, UTD can be problematic.First, diffraction is only considered when a ray hits an edge, but the probability of hitting an edge (a 1-D structure) in a 3-D propagation environment is low.As a result, many diffraction events are ignored due to the high chance of ray misses [3].This limitation becomes even more pronounced when cascaded diffractions are considered because the chance of finding the correct multiple diffraction ray path decreases exponentially with the number of edges unless the sequence of the edges intersected by the path is known in advance [8].This is, however, impractical, especially when the propagation environment contains numerous diffraction edges.Second, UTD is inaccurate when the diffraction edge is in the "transition zone" of another edge unless higher-order diffracted fields are considered, which are generally hard to compute, especially in complicated propagation scenarios [9].
The two aforementioned challenges can be partially resolved by the reciprocity theorem-based bidirectional ray tracing method proposed in [10].In this approach, a large interaction surface is introduced between the sources and the targets, and rays are launched from both sides.Finally, the rays are 0018-926X © 2023 IEEE.Personal use is permitted, but republication/redistribution requires IEEE permission.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
collected on the surface and the transfer functions from each source to each target are obtained by reciprocity integrations over the surface.For the first challenge, this method significantly improves the probability of ray hits in two ways: First, the size of the reciprocity surface can be much larger than a reception sphere.Second, the reciprocity surface divides the propagation environment into two subdomains so that rays propagating in either subdomain need to travel shorter distances.The second challenge is addressed by choosing the reciprocity surface to distribute the diffraction edges over the two subdomains.In this way, the number of consecutive UTD evaluations can be ideally reduced by a factor of two [10], [11].
Although this method has many successful applications, its limitation is also obvious: Only one surface can be introduced, which causes a decrease in effectiveness as the propagation scenario becomes larger and/or more complicated.
In this article, a Huygens principle-based SBR framework is introduced, which works with multiple interaction surfaces.According to the Huygens principle, the original sources can be replaced by a set of equivalent sources on a surface enclosing them, whereas these new sources can again be replaced in the same way, and so on.In this way, the original propagation scenario can be divided into smaller subdomains by a series of Huygens surfaces, so that each ray only needs to propagate until its nearest Huygens surface, and the propagation beyond the surface is initiated by the equivalent sources on it.With n Huygens surfaces, one can, thus, reduce the number of consecutive UTD evaluations and the maximum ray path distance by a factor of n + 1 on average.With the interpretation of the Huygens principle that each point on a wavefront can be considered as a new elementary source [12], the utilization of multiple cascaded Huygens surfaces in ray tracing simulations is an attractive way to go.Still, it has been only recently that corresponding activities have been reported [13], [14].The efficiencies of the demonstrated algorithms are, however, not satisfactory, which is mainly due to two challenges: First, since theoretically each point on the Huygens surface is considered to be a source, this can lead to exponential growth in the number of rays.Second, the Huygens integrals contain oscillatory kernels, for which classical methods of numerical integration converge slowly.For the first challenge, our ray tracing framework reduces the complexity from exponential growth to linear growth with respect to the number of Huygens surfaces by using heuristic algorithms.For the second challenge, we utilize a locally linear phase approximation-based integration algorithm [15], which is simple to implement and shows rapid convergence.Moreover, the ray path tracking and integration over the Huygens surfaces are massively parallel and are, thus, significantly accelerated by a graphics processing unit (GPU).
In Section II, the mathematical concepts of the ray tracing framework are introduced, including a review of the Huygens principle.Then the workflow and implementation concepts are discussed in Section III.In particular, the heuristic ray tracing strategy is discussed in Section III-D.Finally, numerical experiments are carried out for several propagation scenarios, encompassing both ideal and practical cases.The results are compared with reference solutions or measurement data to demonstrate the effectiveness of the presented framework.

A. Ray Interpretation of Huygens Principle
In modern electromagnetics, the Huygens principle is also known as the surface equivalence theorem [16], by which the original sources can be replaced by fictitious current sources on a surface enclosing them.These fictitious surface current sources are known as equivalent sources [16], which are "equivalent" in a sense that they produce the same electromagnetic fields outside the closed surface.Rigorously, the electric fields produced by sources on a Huygens surface can be written as where ḠE J and ḠE M are dyadic Green's functions for electric fields due to electric and magnetic surface currents, respectively.The corresponding expression for the magnetic fields can be obtained by duality [16].The equivalent surface current sources on S are related to the fields of the original sources via where n is the outward-pointing unit surface normal of S at r ′ .Assuming free space and an observation point r, which is far from the source domain S, the surface integral (1) can be approximated in a physical optics (POs) sense [17] resulting into where r = r − r ′ , d = (r − r ′ )/ r − r ′ and β is the wavenumber.
Since the surface integral can be thought of as a sum of the contributions of each infinitesimal surface element, (3) has a ray-based interpretation.This can be shown by rewriting the integrand of (3) in a form of the ray-optics solution from the work of Luneberg and Kline [16] with the reference E-field and Thus, the propagation of the waves radiated from these surface elements can be modeled as rays in a geometrical optics (GO) sense by using a ray tracer.Since (4) only describes the phase shift and attenuation of a ray propagating in free space, each reflection will require an extra spatial divergence factor and a dyadic reflection coefficient [16] to be multiplied to (4).Taking the Fig. 1.Multiple Huygens surface with ray interpretation."Obj1," "Obj2," and "Obj3" illustrate scattering objects.
ray propagation environment into consideration, (3) can be generalized to where 1) D(r ′ , r) is a set-valued function 1 mapping each source and target position pair (r ′ , r) to the set of initial directions of all valid ray paths connecting r ′ and r (when no valid ray path connects r ′ and r, D(r ′ , r) = ∅), 2) W( d, r ′ , r) identifies a set of ray paths that start from the source location r ′ and arrive at an observation point location r, with an initial direction d, which are valid in terms of the rules of GO-UTD, 3) Rw is the path attenuation tensor [16], which equals the product of all spread factors and dyadic reflection and diffraction coefficients along the ray path identified by w, 4) l w denotes the total path length of the ray w.By introducing another surface S ′ that encloses all surface current sources on S, one can obtain a new set of equivalent current sources on S ′ by evaluating (6) for all r ∈ S ′ .Then the EM fields of the observation points beyond S ′ can be obtained by applying (6) again.In general, for scenarios involving a series of Huygens surfaces S 0 , S 1 , . . ., S n (with S i enclosing S i−1 for each i = 1, . . ., n), the electromagnetic fields outside the outermost surface S n can be essentially evaluated by a higher dimensional integral obtained by recursively applying (6).An example is illustrated in Fig. 1, where the received 1 To avoid possibly occurring singularities, D(r ′ , r) is always assumed to be a finite set in this work.
field at the receiver (RX) induced by the transmitter (TX) can be obtained via GO-UTD if its associated ray paths (marked in red) are identified.The same result can also be obtained via the Huygens principle-based approach by introducing Huygens surfaces S 0 , S 1 , S 2 and tracing the rays from equivalent sources on these Huygens surfaces.

B. Ray Tracing Methods
The ray-based Huygens principle formulation (6) directly leads to the following conceptual ray tracing procedure [12]: 1) Initiate rays in all directions from every source (antennas or equivalent sources on Huygens planes) and assign reference fields E 0 d , H 0 d according to antenna pattern or (5).
2) Trace the propagation of each ray according to GO and/or UTD, and update its associated attenuation tensor and phase factor, until it hits a Huygens surface or a receiver or leaves the scene.3) Upon a ray hitting a receiver or Huygens surface, calculate its field contribution and add it to the total field value at that point.4) For each Huygens surface hit by any ray in step 3), create equivalent sources and go to step 1).Since it is impossible to create an infinite number of equivalent sources covering a Huygens surface nor to launch an infinite number of rays covering all possible directions, numerical approximations must be introduced to achieve a feasible implementation.
A typical implementation of ray tracing is known as the Monte-Carlo ray tracing technique [13], [14], [18].The basic idea of this technique is to treat the Huygens integral as the expectation of a random variable following a certain distribution.From this point of view, the ray tracing process is considered as sampling the random variable.A naive approach is to sample the equivalent sources and the ray launching directions uniformly, so that the corresponding random variable can be expressed as where S is the area of the Huygens surface and 1 D(r ′ ,r) ( d) is an indicator function defined as Then the expectation E( Ẽ d,r ′ r) equals to E(r) and it can be approximated by the average where N is the number of chosen equivalent sources.Due to the law of large numbers [19], (9) converges to (6) in probability as N → ∞.
The general workflow of this Monte-Carlo ray tracing technique largely resembles the aforementioned conceptual ray tracing procedure except for choosing equivalent sources and launching rays.Due to the linearity of EM fields, one can also simplify steps 3) and 4) by directly calculating the equivalent current contribution from each single ray and continue propagating it in another randomly chosen direction [14].Although the Monte-Carlo method is generally considered a good solution to high-dimensional integration [20], it can take a very long time to simulate a cascaded diffraction scenario with several Huygens surfaces [14] for an acceptable accuracy.This is because the frequency involved in a typical ray tracing simulation is so large that M S , J S , and exp(−jβl) oscillate very quickly, which results in a large variance of (7), thus, leading to a very slow convergence of (9).
In this work, we launch rays from individual surface partitions instead of from each individual ray hit, which can significantly reduce the number of rays required to be traced.Specifically, the source domain S is subdivided into smaller partitions 1 , 2 , . . ., n , so that one can assume that rays launched from each source point within the same partition can reach the targets in the same way.Therefore, it is sufficient to choose one point per partition as a representative to launch rays and predict the paths for other points.Mathematically, this can be formalized by requiring the restriction of D(r ′ , r) on each partition i to have the form where τ i is a finite index set and each di j is a smooth function with respect to both source position r ′ within domain A and observation position r within domain B (A and B are assumed to be well spearated).Due to the continuity of each di j ∈ D i , (6) degenerates to a summation of PO-like integrations where Ri j and l i j are also smoothly dependent on r ′ within the surface partition i .Note that the choice of the size for partition i depends on the propagation environment and not on the wavelength, so it can be electrically large.For most application scenarios, reasonable results can be usually obtained with very few partitions, so this approach normally requires much fewer ray launches than the pure Monte-Carlo approach.
An example is illustrated in Fig. 2, where equivalent sources are on the surface S 1 and an observation point P of interest is on the surface S 2 .One can verify that rays from each point within the surface partition i reach P in the same way (rays connecting T 1 , T 2 , and T 3 to P are drawn in Fig. 2 as an example).This is equivalent to saying that all points in i are exposed to the phantom of P at P ′ .On the other hand, an outside point T 4 does not have the same type of ray path because launching a ray from T 4 toward P ′ ends up at Q Fig. 2. Rays from T 1 , T 2 , and T 3 can hit P in the same way (in direction of P ′ ), but a ray from T 4 toward P ′ does not hit P, instead, a ray can hit P in a line-of-sight path.instead of P.However, a line-of-sight ray path is valid for T 4 (and also for other points in ′ i ).Therefore, D(r ′ , P) for points in i and ′ i can be written as In this example, if the partitioning of the surface S is known a priori, finding out (12) needs only ray launches from two points, for example, T 2 ∈ i and T 4 ∈ ′ i , which can largely reduce the computational effort compared to the pure Monte-Carlo methods.In a realistic situation, one cannot give a partitioning that guarantees the condition (10) prior to the ray tracing.However, an appropriate partitioning can be generated heuristically at the cost of reasonable accuracy loss, which will be discussed in Section III-D.

C. Ray Tracing Compatible Numerical Integration Techniques
Due to the smoothness condition (10), each vector component (x, y, or z) of the surface integral in (11) can be expressed in the form of where p = x, y, or z specifies the component of the electric field, while i, j, and w retain the same indices as presented in (10).Also, A p i jw (r, r ′ ) and φ p i jw (r, r ′ ) are slowly varying with respect to the spatial variables (r and r ′ ) and ∇φ i jw ∼ O(β).Typical methods of numerically evaluating oscillatory integrals in the form of ( 13) include asymptotic methods [21], Filon-type methods [22] and Levin-type methods [23].However, all these numerical integrations rely on the values of the integrand at a predefined set of sample points, which creates two difficulties for a ray tracer: Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
First, different rays are unlikely to hit the same point, which means that the integrand of ( 13) is not explicitly provided by a ray tracer.Second, when a complex propagation environment is involved, one cannot control where rays intersect with the surface of interest, thus, making it difficult to provide data on the predefined sampling grid.
In order to mitigate these difficulties, a linear phase approximation-based integration method is utilized.The basic idea is to locally approximate the magnitude and the phase of the integrand of ( 13) to the zeroth and the first order, respectively.Specifically, one can further partition j into surface elements ξ 1 , ξ 2 , . . ., ξ n , so that the restriction of the integral (13) on each surface element ξ m induced by a certain wavefront (labeled as k) can be approximated by where r ′ k ∈ ξ m is the intersection point between the ray associated with wavefront k and the surface element ξ m , ∇ ̸ E x/y/z k and ∇ ̸ H x/y/z k are the gradients of the (unwrapped) phases of field components at r ′ 0 , and di j , Ri j are considered constants.Therefore, E i j (r) can be obtained by summing up E m i jk for each m and each k.Since r ′ k can be an arbitrary point within ξ m and the fields involved in (14) do not have to be the total field, this approach avoids the two major problems of other integration methods.In particular, if ξ m is illuminated by a single point source, where the direction di j represents a line-of-sight ray path from r ′ 0 to r, (14) can be written as where din is the direction of the ray illuminating ξ m , which is similar to the well-known far-field approximation.In (14), it is implicitly assumed that each ray hitting the surface element ξ belongs to a different wavefront.However, this is not true in general, since multiple rays from the same wavefront can hit the same surface element.Considering that a surface element ξ m is hit by (14) to each one of these rays one obtains N k different approximations of E m i jk (r).The average of these approximations usually gives a better estimate of E m i jk (r) than each single one, especially when the area of ξ m is relatively large [15].Therefore, an estimate of ( 13) can be summarized as where the E m i jk l (r) are obtained by ( 14) using the lth ray hit of wavefront k, and K m i denotes the set of different wavefronts illuminating ξ m ∈ i .
In a practical implementation, the Huygens surface is modeled as a triangular mesh with the average triangle edge length comparable to the wavelength, so each partition j is a cluster of triangles and each element ξ m can be chosen as an individual triangle within a certain cluster.In this case, the integration in ( 14) has a closed form solution, which implicitly assumes that rays initiated from any source within the same triangle reach any target in the same way.Although this assumption does not always hold, the resulting error is negligible if the ray propagation distances are much larger than the sizes of the triangles.

D. Phase Gradient Estimation
According to (14), phase gradients of the fields are required for numerically evaluating the Huygens integral.Therefore, in addition to (16), a phase estimation is also needed at the same time for a subsequent Huygens integral.This is trivial when the observation point r is illuminated by a single ray from a point source directly.In this case, the fields E(r) and H(r) are given by ( 4) and their phase gradient can be well approximated by where d is the ray incoming direction at r because the phase change of ( 4) is dominated by the exponential term exp(jβl) and ∇l = d by the physical interpretation of a ray [as illustrated in Fig. 3(a)].For reflected and/or refracted rays, (17) is also valid as long as it comes from a point source since the reflection/refraction coefficients are also slowly varying compared to the phase factor.However, for an E-field in the form of ( 16) the wavefront can be irregular [as illustrated in Fig. 3(b)] and the phase gradient may no longer be as simple as in (17).
In a more general form, the partial derivatives of the phase of each field component p with respect to each coordinate q can be written as where p, q = x, y, or z.Without loss of generality, assume that the electric field at r can be written as the superposition of the field contributions from multiple point sources and multiple wavefronts in the form of where i and k identify the source and wavefront, respectively, and E 0 dik is the reference E-field for source i in direction dik .For each term of the summation in (19), its derivative with respect to the spatial variables is also dominated by the exponential term for high frequencies.Therefore, the partial derivatives of the field components with respect to each coordinate can be approximated by where δr is a small perturbation of r and p, q ∈ {x, y, z}.Plugging ( 19) into (18), one obtains the phase derivative for the multiple source case One can extend (21) to a distributed source scenario by converting the summation to an integration.

A. General Work Flow
The implementation of the ray tracer in this work is based on the programmable ray tracing engine NVIDIA OptiX [24].This engine takes care of efficiently determining ray-object intersection points for a given set of rays in a given propagation environment and executing user-defined functions upon each ray hitting an object.Due to the flexibility of the OptiX framework, one has many possibilities to implement these user-defined functions in order to characterize different propagation phenomena.In particular, we define these functions to enforce GO rules for ray bouncing, update field information according to (4) upon each bounce, and to write the final field data to the result buffer.Therefore, the calculation of the electromagnetic field associated with each ray is integrated with the ray tracing engine seamlessly, which forms the essential building block of our ray tracer "WvTracer" (short for "wave tracer.")From the perspective of programming, an instance of "WvTracer" of a given propagation environment is a callable object that accepts arguments specifying a set of rays (i.e., a list of sources and a list of (initial) directions), and writes results to EM_Buffer (as illustrated in Fig. 4).
Therefore, the general workflow of the ray tracer can be described as follows: 1) Create an WvTracer instance using geometry information of the propagation environment.2) Initiate sources and launching directions of each source.
3) Trace the rays specified in 2) using the WvTracer instance created in 1).4) Predict suitable secondary sources and corresponding directions using results of 3) and the geometry information in WvTracer.5) Repeat 2)-4) until desired field information are obtained.Since predicting the secondary sources and directions for the next launch is normally not performance critical and comprises mostly high-level instructions, the predictor has been implemented in the high-level language Julia [25].It inquires data from WvTracer via its application programming interface (API).In this way, one can easily develop and test new strategies without concern about the low-level ray tracing details.

B. Field Calculation
As mentioned in Section III-A, the update of the field information is integrated into the ray tracing process.Specifically, the propagating direction, propagating tensor R, and path length l associated with each ray are updated upon each bounce.When a ray hits a Huygens surface or a receiver, the full EM information is calculated and written to EM_Buffer.In the implementation of this work, two modes of writing EM information are supported: "allocation" and "addition."As illustrated in Fig. 5, the "allocation" mode writes to a newly "allocated" entry for each ray hit [Fig.5(a)], Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.whereas "addition" mode adds EM field information to its corresponding existing entry [Fig.5(b)] when the position matches.Therefore, in "allocation" mode, each valid entry corresponds to exactly one ray, whereas in the "addition" mode, one entry can correspond to from zero to multiple rays (when corresponding to no ray, the entry represents a null field).Each entry of the EM field data to be written into EM_Buffer is defined as complex3 E , H; complex3 sum_d_Ex , sum_d_Ey , sum_d_Ez , sum_d_Hx , sum_d_Hy , sum_d_Hz ; / / M e t a d a t a . . .} where the omitted metadata attributes include position, wavefront identifier, etc., and attributes sum_d_... are the sum of the ray incoming direction multiplied with a certain field component (e.g., sum_d_Ex = i β E x i di ).These summation attributes are essential for the phase gradient estimation especially when writing results in "addition" mode according to (21).With the help of the sum_d_... attributes, one can directly add a new field entry attribute-wise onto an existing entry, while keeping track of the phase derivative information required in ( 14) and (21).If a phase derivative is needed, e.g., ∂ ̸ E x /∂ y, one can just evaluate d _ a n g l e _ E x _ y = ( sum_d_Ex .r e a l * sum_d_Ex .y .r e a l + sum_d_Ex .imag * sum_d_Ex .y .imag ) / pow ( a b s ( E .x ) , 2 ) .
In this case, O(1) memory complexity with respect to the number of rays hitting the same point is achieved, since the size of EM_field is fixed no matter how many rays contribute to it.Since it is unlikely to have multiple rays hitting exactly the same point when launching rays randomly, the "addition" writing mode is designed for strategies that can predict where rays end up, which will be elaborated on in Section III-D.

C. Wavefront Identification
Another implementation detail worth discussing is the identification of different wavefronts.From the perspective of the image method, a wavefront can be defined as the set of rays coming from the same real or image source.However, the image principle is only exact with infinitely large flat reflective surfaces.For curved surfaces, an image of a source or target is not uniquely defined or does not exist [3].
Alternatively, one can apply a pragmatic definition that two ray paths are considered to belong to the same wavefront if and only if they: 1) are initiated from the same source; 2) are reflected/refracted by the same series of surfaces; 3) end up close enough.Note that a "surface" here is a conceptually continuously differentiable piece of a 2-D manifold.For surfaces discretized by triangular meshes, one normally segments them via edge detection algorithms [26] so that each partition can be approximately considered as a "surface."Using this definition, the complete bouncing history of a ray identifies which wavefront it belongs to.A naive approach of implementing this is to save the complete history of each ray, but this brings two difficulties: 1) Each ray must carry a large fixed-size or variable length array, which either wastes memory or involves expensive memory allocations.2) Determining whether two rays belong to the same wavefront involves a list comparison, which makes grouping rays belonging to the same wavefront even more complicated and inefficient.These difficulties can be elegantly resolved by applying a rolling hash function (e.g., Fowler-Noll-Vo hash function [27]) to the bouncing history and only saving the resulting hash value for each ray.A rolling hash function takes an old hash value and new data as input and returns a new hash value (at the beginning, a carefully chosen initial value is fed as the "old hash value.")Specifically, for each ray being traced, only one hash value needs to be carried, which encodes all the history information and it is updated upon each new hit.For example, when using the 32-bit FNV-1a [27] hash algorithm, a ray starting from source with index 0 and reflected by surfaces indexed by 1 and 2 in sequence will have a hash value 0x253381b6.

D. Ray Launching Strategies
As mentioned in Section II-B, if the source domain can be subdivided into a small number of partitions satisfying the condition that rays from the sources within the same partition reach the intended targets in the same way, the number of rays can be largely reduced.However, identifying a partitioning that strictly satisfies this condition requires launching a large number of rays from each source, which defeats the purpose of introducing such partitions.Therefore, a good partitioning strategy does not emphasize its strict correctness in terms of (10), but rather minimizes the error of the final results even if some sources are placed in a wrong partition.Since the Huygens surface is described by a triangular mesh and all points within each triangle are assumed to belong to the same partition, the finest grouping is to have each individual triangle belong to a different group, whereas the coarsest grouping is to have all triangles belong to the same group.Since the sizes of the triangles are comparable to the wavelength, the finest grouping for an electrically large Huygens surface can lead to prohibitively large computational cost.On the other hand, the coarsest partition assumes that all source-to-target ray paths are of the same type, which is increasingly inaccurate as the propagation scenario becomes more complicated.Hence, effective heuristic strategies are required in order to find a satisfactory compromise between these two extreme cases.
Considering the possible inaccuracy of the partitioning, the strategy of launching rays from Huygens sources can be briefly summarized as follows: 1) Heuristically determine a certain number of representative triangles and their corresponding partitions.2) Randomly launch numerous rays in all (forward) directions from these representatives.3) When a ray hits a reception sphere or a Huygens surface, its position and phantom position are recorded.4) Launch rays from each triangle toward the phantom targets hit by its representative.5) When a ray hits the desired target (within a tolerance), its corresponding EM field contribution is atomically added to the result buffer, otherwise it is discarded.
The mentioned phantom targets are obtained as the positions of the targets seen from the source side in the form of r source + l dinit (with l the total ray path length).Since step 4) checks if a ray from a source can actually hit a target in the same way as its representative, errors due to incorrect clustering of the triangles are reduced.In this work, the heuristics for step 1) are realized by selecting a certain number of representative triangles within a Huygens surface mesh randomly, where the probability of selecting a triangle is proportional to the average field magnitude on it.Then the partitioning of the Huygens surface is determined by nearest neighbor searching (NNS), i.e., each triangle belongs to the same partition as its nearest representative.As such, more representatives are selected where the magnitudes of the equivalent sources are larger, and the ray paths from such areas are, thus, tentatively also better predicted.This partitioning strategy is a kind of "importance sampling" which tends to require fewer samples for a given error bound than choosing representatives uniformly.An example of this strategy is shown in Fig. 6, where the rectangular area is a Huygens surface illuminated by a point source aligned to its center.The leftmost plot shows the average E-field magnitude within each triangle, the middle plot shows a random choice of representatives according to the weights shown on the left, and the right most plot shows the partitioning based on the chosen representatives.As can be seen, fewer representatives are chosen in the area of lower average incoming field magnitude.

IV. NUMERICAL RESULTS
A few typical propagation scenarios are simulated by introducing one or more Huygens surfaces and the results are compared with reference data.For the considered line-of-sight (LOS) and the ground reflection scenarios, the transmitting antenna is a vertically polarized isotropic radiator producing 1 V m −1 at 1 m distance.In free space, the EM fields induced by each of these isotropic radiators can be written as where is the distance between the observation point and the transmitter, and êϑ , êr are the local orthogonal unit vectors at r with respect to spherical coordinates with origin at the transmitter.

A. Line-of-Sight Scenarios
As illustrated in Fig. 7, a transmitting antenna (TX) is placed at the origin and a set of receiving antennas (RX 1 , RX 2 , . . ., RX n ) are placed along the x-axis.The operation frequency of the transmitter is 2.45 GHz and the received Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.EM fields can be directly obtained by evaluating (22) Rays are first launched from the TX to S 1 , then sequentially from S i to S i+1 for each i = 1, . . ., − 1, finally from S m to the receivers.The EM fields associated with the ray hits on each of the surfaces are stored and used to create equivalent sources for the next ray launch.In this way, the radiation from the original source TX is sequentially replaced by the equivalent sources on S 1 , . . ., S m .This replacement is of not exact for surfaces S i finite size, but sufficient accuracy can be achieved with the of 30 m × 30 m (corresponding to about 250λ × 250λ).
we choose m = 5 and the corresponding E-field magnitudes dependent on the distance r to the transmitter are shown in Fig. 8 for numbers of triangles used to mesh the Huygens surfaces.The figure does not only show the fields at defined RX locations in the right of each of the subfigures, but also the fields within each of the five Huygens surfaces (indicated by dots with different colors), where the distance r to the TX increases due to the transverse spatial variation of the observation locations within the Huygens surfaces.As seen from Fig. 8(a)-(c), the accuracy of the Huygens integral evaluations improves significantly as the mesh gets finer.Since the E-fields on S i (i > 1) are obtained by (i − 1) Huygens integral evaluations, the plots in Fig. 8 qualitatively shows the change of the accuracy with respect to the number of Huygens integral evaluations.In order to further analyze the accuracy with respect to the number of Huygens integral evaluations, the relative magnitude error distributions and their mean values are shown in Fig. 9.As seen in Fig. 9(b), the relative errors in the surfaces corresponding to smaller numbers of Huygens integral evaluations (e.g., S 2 , S 3 , and S 4 ) can be well approximated by the gamma distribution, whereas the errors corresponding to larger numbers of Huygens integral evaluations (e.g., S 9 and S 10 ) are better approximated by the normal distribution.This is expected since the error accumulation over the number of Huygens integrals obeys the central limit theorem.Although the relative errors accumulate with the number of Huygens integral evaluations, an average relative error of only around 50% at the tenth Huygens surface is quite satisfying and demonstrates the feasibility of utilizing many cascaded Huygens surfaces in a ray tracing engine.

B. PEC Ground Reflection
Wave propagation over a flat, infinitely extending, and perfectly electrically conducting (PEC) ground plane is considered next.As illustrated in Fig. 10, the transmitter and the receivers are at the same height h = 3 m above the ground.Specifically, the transmitter is located at (0 m, 0 m, 3 m) and the receiver moves from (30 m, 0 m, 3 m) to (55 m, 0 m, 3 m).According to the image theory [16], the electric field above the ground can be seen as produced by the original transmitter together with its image TX ′ at (0 m, 0 m, -3 m).Therefore, using ( 22) the E-fields above the ground can be expressed as where r 1 = ∥r − r TX ∥, r 2 = ∥r − r TX ′ ∥, and êϑ 1 , êϑ 2 are spherical unit vectors with respect to the spherical coordinate systems with origin at r TX and r TX ′ , respectively.As for the LOS scenario, m 30 m × 30 m Huygens surfaces lying in planes x = D + i d (i = 1, . . ., m, D = d = 5 m) are introduced so that the received E-fields can also be calculated via repeated Huygens integral evaluations.The E-field magnitudes at the receivers are plotted in Fig. 11(a) for m = 1 to 5 together with the reference calculated by (23).Similar to the LOS scenario, the results generally agree with the reference whereas the error increases as the number of Huygens integrals increases.This can be characterized more clearly by considering the scenario of m = 10, where the relative error distribution histograms of the E-fields on each surface are shown in Fig. 11(b).Also, the change of the mean relative error with respect to the number of Huygens integral evaluations is shown in Fig. 11(c).Compared to the LOS scenario, the relative errors are larger, but the change in the statistics shows a similar trend.As expected, the most significant increase of the relative error is found in regions with destructive interference, where the reference E-field magnitudes are very small.The peak of the relative error is still below 100% even for ten Huygens surfaces, which is quite satisfactory.Moreover, it is noteworthy that the simulation time for ten Huygens planes amounted to approximately 3856.22 s, and it consumed less than 8 GB of memory.This compares strikingly favorably with a similar numerical experiment featuring repeated Huygens integration as reported in [14], which demanded approximately 306 h.This significant speedup, approximately 300 times faster, underscores both the efficiency and practicality of our proposed framework. 2  C. Multiple Knife-Edge Diffraction Fig. 12 illustrates a multiple knife-edge diffraction scenario formed by a series of thin walls with a height of h = 50 m placed parallel to the yz-plane along the x-axis above an 2 The hardware employed for our experiments was an Nvidia GTX 1080 with a VRAM of 8 GB.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.
where ρ is the distance to the y-axis and êρ the corresponding unit vector.The height of the line source is H = 40 m (below the edge) and its x-coordinate is 0. This source produces electromagnetic fields of 300 MHz with y-polarized E-field.Due to the translational symmetry in y-direction, the arrangement is a 2-D problem, as also seen in Fig. 12.The electric field is fully determined by the solution to the 2-D Helmholtz equation [28] ∂ subject to the boundary condition where β is the wavenumber in free space, Q is the line source magnitude, and n(x, z) is the refractive index with the value of +∞ for the area occupied by the wall and 1 elsewhere [28].This problem can be efficiently solved by using the method introduced in [28], whose Python 3 implementation by the authors is freely available via [29].
Alternatively, this scenario can be handled in our ray tracing framework by truncating the walls and the source at ±150 m (≈ ±150λ) in y-direction and setting up a series of 150 m × 150 m squares above each knife-edge as Huygens surfaces (as annotated by S 1 , . . ., S m in Fig. 12).The ray tracing process is then similar to the previous examples, where rays are launched first from the TX to S 1 , then from S i to S i+1 for each i = 1, . . ., m − 1, sequentially.Since the line source TX excites a cylindrical wave, rays from the TX are only initiated in directions perpendicular to the line, and the field attenuation factor along a ray is 1/ √ ρ.In order to evaluate the simulation results, an observation plane parallel to the walls is placed behind the last knife-edge with a horizontal distance of ρ = 200 m, so that rays from equivalent sources on S m hitting the observation plane produce an m-knife-edge diffraction pattern.
The magnitudes of the E-fields in the observation plane, obtained by our ray tracer and the 2-D Helmholtz equation solution along with their relative errors, are shown in Fig. 13 for m = 1 to 4. For our ray tracing results, the E-fields in a narrow vertical strip of (−5 m ≤ y ≤ 5 m) are plotted dependent on z.Theoretically, E-fields at the same height z have of course the same value, but the errors introduced by the truncation of the line source and of the Huygens surfaces cause some variations dependent on y, which are seen in the plots in Fig. 13.The ray tracing results match the reference well above the edges for all four scenarios, while the results in the deep shadow region deviate from the reference after the second knife edge.Table I shows the average relative errors in percentage and dB for the samples above and below the edge, revealing a distinct difference in the relative error behaviors for the two cases.The relative errors above the edge grow mildly with respect to the number of diffractions, while the errors of the samples below the edge increase above 100% after the second diffraction due to the very small field magnitudes.Despite the relatively large deviation in the shadow region, the overall relative errors remain small after several diffractions, which demonstrates the stability of our ray tracing framework.

D. Pathloss Prediction in Urban Environment
To demonstrate the utility of our ray tracing framework for complex wave propagation scenarios, we conducted ray tracing simulations utilizing a 3-D model of a segment of the Munich city center, which covers an area of approximately 8 km 2 and comprises the propagation environment of a widely recognized measurement campaign referred to as "COST231" [30].As depicted in Fig. 14, a transmitter is located at the specified gray point in the map.The transmitter operates at a frequency of 945 MHz and stands 13 m above the ground.A receiver follows a designated road path, systematically collecting path gain measurements at approximately 10 m intervals.These measurement points are depicted as dots in Fig. 14.They are sequentially labeled from 1 to 970, following the direction indicated by the blue arrows in the figure .The ray tracing simulations assume the material of buildings and the ground to be concrete, with a relative permittivity ε r = 4.5.Both the transmitter and the receiver are vertically polarized isotropic antenna.As shown in Fig. 15, two Huygens surfaces are involved: the first (S 1 ) forms a tall fence (up to around 200 m above ground) encircling the transmitter, while the second adopts a long, narrow tunnel shape to envelop the measurement points.The choice of these surfaces serves two primary objectives.First, the equivalent sources on the high fence-like S 1 effectively enhance coverage due to their larger height than the surrounding buildings.Second, the tunnel-shaped S 2 efficiently redirects the waves from above to the receivers, while accounting for the diffractive effects introduced by the presence of nearby buildings, as illustrated in Fig. 16.Notably, the second Huygens surface selectively covers measurement points situated at large distances from the transmitter.This strategic selection is based on the fact  that receiving positions close to the transmitter can already be accurately predicted through direct ray hits or by employing a single Huygens integration.Consequently, this arrangement of Huygens surfaces categorizes receiving positions into three distinct types, as marked in Fig. 14.Those predictable by conventional ray tracing alone are denoted in green, while those found through one and two Huygens surface integrations are indicated in blue and orange, respectively.The reception spheres are positioned along the measurement route but with 20 times denser sampling than the original measurement positions.The resulting sequences of path gain predictions at those reception spheres are then smoothed out by applying a Gaussian filter mitigate the fast-fading effect and downsampled to the number of the measurements before being compared with the references.
The numerical results obtained using the aforementioned setup are illustrated in Fig. 17  minus measured path gain in dB) are plotted in Fig. 17(b).As expected, the path gain results obtained without the need of invoking any Huygens integration (i.e., points can be accessed by direct line-of-sight and/or reflected rays) exhibit the highest accuracy, evident in the red curve in Fig. 17     reach without Huygens integration.These points exhibit a path gain of −∞ dB and consequently do not appear in Fig. 17.Upon introducing a single Huygens surface (S 1 ), nonzero predictions emerge for all measurement points, as depicted by the yellow and orange curves in Fig. 17  Huygens surface leads to an improved fitting of the predictions to the measurements for these points, as illustrated by the purple and yellow curves in Fig. 17 The standard deviation of the path gain for measurement points with indices ranging from 1 to 550 predicted by using two Huygens integrations is 7.55 dB, whereas using only one Huygens integration, the standard deviation is 11.27 dB.This implies a 3.72 dB improvement achieved by introducing a second Huygens surface.As evident from Fig. 17(a), using two Huygens surfaces, the results are also much more stable.The range of the error shrinks from ±40 dB to ±20 dB.This highlights the improved accuracy and stability achieved by utilizing multiple Huygens surfaces.
Another noteworthy aspect of these simulations revolves around selecting representative sources on the Huygens surfaces, with a particular focus on the second Huygens surface (S 2 ) due to its complex geometry.As elaborated in Section III-D, the effectiveness of secondary ray launches relies on the selection of representatives from all equivalent sources distributed in a Huygens surface.The representative selection becomes crucial, especially when dealing with complex-shaped Huygens surfaces or surrounding objects.This perspective prompts an investigation into how the number of representative sources on Huygens surface S 2 influences the simulation results.Our study involves the selection of varying quantities of representative sources on S 2 , followed by an analysis of the results in terms of average errors and standard deviations.These results are depicted in Fig. 18.It is important to note that the average errors and standard deviation calculations are based solely on the samples marked as orange dots in Fig. 14 (indices ranging from 1 to 550).The curves in Fig. 18 reveal a clear trend.The standard deviation undergoes a significant decrease from above 20 dB to below 10 dB and then stabilizes when the number of representative sources exceeds 500.This finding supports the assertions made in Section III-D that in complex environments using more representative sources can substantially improve the accuracy of the path gain predictions.

V. CONCLUSION
An efficient ray tracing framework utilizing multiple Huygens surfaces with corresponding Huygens integral evaluations was introduced and evaluated for several wave propagation scenarios.Our validation spanned various scenarios, from fundamental to complex, showcasing strong alignment with established methods.Expanding our investigation to a practical urban scenario, namely the Munich city center, revealed tangible benefits of the framework.By introducing Huygens surfaces, we observed substantial improvements in ray coverage and reduced error standard deviation.This underscores the promising applicability to real-world situations.From an efficiency perspective, the heuristic algorithm and the specifically designed integration rule contribute to linear time and memory complexities concerning the Huygens integral evaluations.This efficiency is further amplified by utilizing a GPU-accelerated ray tracing engine, capitalizing on inherent parallelism.To further improve the efficiency, more intelligent ray tracing strategies and better Huygens integration methods can be introduced, which can be easily integrated into the presented framework due to its flexible software structure.

Fig. 5 .
Fig. 5. Writing EM fields to the result buffer in two different modes.(a) Write in "allocation" mode.(b) Write in "addition" mode.

Fig. 6 .
Fig. 6.Choosing representative equivalent sources randomly with probability proportional to the E-field norm.(a) Incoming E-field norms.(b) Chosen representatives.(c) "Territories" of representatives.

Fig. 8 .
Fig. 8. E-field magnitudes of line-of-sight scenarios computed by different numbers (1-5) of Huygens integral evaluations and different numbers (2048, 8192, and 32 768) of subdivisions per Huygens surface, plotted together with the reference obtained from (22).The shown field samples on the single Huygens surfaces are obtained by sampling the entire surface, where the distance is obtained as the distance r to TX and not just as the distance along the x-axis.(a) 2048 triangles (average edge length ≈ 7λ).(b) 8192 triangles (average edge length ≈ 4λ).(c) 32 768 triangles (average edge length ≈ 2λ).

Fig. 9 .
Fig. 9. Relative error distributions for the E-field computed by up to ten Huygens integral evaluations for the LOS scenario.The shown field samples on the single Huygens surfaces are obtained by sampling the entire surface, where the distance is given as the distance r to TX and not just as the distance along the x-axis.(a) Results.(b) Error statistics.(c) Mean error trend.
Alternatively, by introducing a sequence of Huygens surfaces S 1 , S 2 , . . ., S m between the transmitter and the receivers, one can obtain the EM fields following the flowchart illustrated in Fig. 4. In the considered example, each surface S i is chosen as a 30 m × 30 m square in the plane x = D + i d (i = 0, 1, . . ., m) centered at (D + i d, 0 m, 0 m) with D = d = 5 m.

Fig. 11 .
Fig. 11.Relative error distributions for the E-field computed by up to five Huygens integral evaluations for the ground reflection scenario.The fields are here only shown in the defined RX locations.(a) Results.(b) Error statistics.(c) Mean error trend.

Fig. 12 .
Fig. 12.Multiple knife-edge diffraction for a cylindrical source over PEC ground.
(a), alongside the corresponding measurement results.Additionally, the dB prediction errors of the path gain (absolute value of predicted path gain in dB

Fig. 14 .
Fig. 14.Top view of the Munich city simulation setup.
(a)  and the blue curve in Fig.17 (b).However, this accuracy is confined to the measurement indices between 850 and 905, primarily encompassing the green dots in Fig.14, as other points are beyond Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

Fig. 15 .
Fig. 15.3-D view of the Munich city ray tracing simulation.

Fig. 16 .
Fig. 16.Intersection view of the two Huygens surface configuration.
(a)  and (b), respectively.Despite the extensive coverage, the deviate from the measurements for points with indices below 550, corresponding to the orange dots in Fig.14.Incorporating a second

Fig. 18 .
Fig. 18.Error statistics of the second Huygens integration versus number of representatives.

TABLE I KNIFE
-EDGE DIFFRACTION RELATIVE ERROR AVERAGES