CrowdMagMap: Crowdsourcing-Based Magnetic Map Construction for Shopping Mall

Indoor positioning is an important part of supporting the Internet of Things and location-based services. Crowdsourcing-based magnetic map construction is a key technology to realize wide-area consumer indoor positioning. However, current crowdsourcing-based magnetic map schemes are not suitable for typical indoor scenarios (e.g., shopping malls). The reason is that they ignore the characteristics of crowdsourced data, including short-term trajectory, various pedestrian motion patterns, large-scale data set, and so on. In this article, we propose a novel crowdsourcing-based magnetic map construction method. First, learning-based inertial odometry is used to recover precise user motion trajectories regardless of changes in motion patterns. Then, a keyframe-efficient association method of magnetic time–frequency features is proposed, which is suitable for short-term trajectories of various shapes. Finally, a two-step global estimation optimization is proposed to further eliminate false associations of keyframes and improve the robustness of the method. The feasibility of the proposed method is verified by using a multiuser data set in a typical shopping mall scenario. The proposed method takes a total of 60.8 s to process a 12-h data set (subtrajectories with a duration of 90 s), and the average position error is 1.48 m (with scale correction) and 2.53 m (without scale correction). Compared with the existing crowdsourcing-based magnetic map scheme, the proposed method has been significantly improved in terms of feasibility, accuracy, and efficiency.


I. INTRODUCTION
Indoor positioning and navigation is key technology in developing the Internet of Things (IoT) and location-based services (LBS) [1].Various techniques, including Bluetooth low-energy (BLE) [2], WIFI [3] [4], magnetometer [5], and Inertial Measurement Unit (IMU) [6] [7] [8] have been adopted to provide accurate indoor positioning services for mass users.IMU-based methods [9] [10] can provide accurate relative trajectory in a short time, but the position error accumulates with the user's walking distance, which is usually used as an auxiliary means.The methods based on WIFI [11] [12], BLE [13], and magnetometer [14] [5] can achieve meterlevel positioning relying on pre-built signal fingerprint maps (including radio frequency signal and magnetic field signal) built in an offline phase.Therefore, an indoor signal fingerprint map is the basic premise to ensure the availability of indoor positioning.
Following the principle of prioritizing signal fingerprint accuracy, traditional fingerprint mapping methods, including point-to-point and walking survey, require professionals to use professional equipment (e.g., total station) to measure geographic coordinates and collect signal fingerprints according to established routes [15].However, traditional methods are labor-intensive, time-consuming, and expensive [16], and it is far from meeting the needs of large-scale indoor signal fingerprint map construction, which is the main limitation for the promotion of wide-area indoor positioning services.
To reduce the cost of signal fingerprint mapping, crowdsourcing-based mapping methods have attracted the attention of many researchers [15] [17].The core idea is to use the pedestrian dead reckoning methods based on the IMU observations built-in smartphone to estimate the trajectory of each user, and use the similarity of signals or sensory landmarks to associate and fuse the trajectories of different users.Walkie-Markie [18] correlates user trajectories using unique landmarks formed at the moment of maximum RSSI, which can avoid the influence of multipath propagation of WiFi signals.Similarly, SoiCP [19] utilizes WiFi signal and building gates as landmarks for trajectory association.WiFi-RITA [16] seeks to provide crowdsourced location based on WiFi, which is adaptable to datasets with noise, diverse motion patterns, and short-period trajectories, and considerably improves efficiency.It proposed a robust iterative trace merging algorithm based on WiFi access points as signal marks.[20] accomplishes a system employing learning-based inertial odometry to offer relative trajectories and a WiFi bundle adjustment to recover trajectories.This approach is very resistant to different motion patterns and suited for data collection by non-experts.However, due to the distribution density of the WiFi or Bluetooth access point (AP) is uncontrollable, and the above methods will become unusable in areas where wireless signals are sparsely distributed.
Infrastructure-free indoor positioning methods (such as the magnetic field map-based) are the primary option for consumer-grade wide-area indoor positioning [14].A crowdsourcing-based magnetic map generation approach independent of wireless sensors is necessary.Compared with the crowdsourcing methods based on wireless signals, which can take advantage of the globally unique characteristics of AP addresses, the crowdsourcing method based on magnetic signals will be more challenging.Detail-wise, two trajectories can only be appropriately connected based on the magnetic field if their respective local sub-trajectories are almost the same.[21] and [22] assume that the trajectory of a single user is composed of straight lines and corners, extract the magnetic field sequences corresponding to the corners as key frames, and use the clustering method to associate them.Dynamic Time Warping (DTW) methods are used to measure the similarity of keyframes in clustering methods.Based on the same assumptions and ideas, [23] proposes a step-based magnetic sequence similarity estimation method to replace DTW, which has significantly improved computational efficiency.However, these methods still have two shortcomings for challenging application scenarios: 1) In complex indoor environments (e.g., shopping mall), user trajectories are irregular curves instead of straight lines and corners.At this time, these methods unable to work properly; 2) The computational efficiency is low, and it is not suitable for large-scale data sets in practical applications.The reason is that they use time-consuming clustering methods [24] [25] to associate user trajectories, including k-means, affinity propagation, and hidden Markov models (HMM).
To achieve consumer-level wide-area indoor positioning, crowdsourcing-based magnetic field mapping schemes for typical indoor scenarios (e.g., shopping mall) still need to be further explored.We explicitly conclude the requirement of crowd-sourcing magnetic map construction methods in shopping mall in Section II-A.Following these requirements, this paper proposed a crowdsourcing-based magnetic map construction system that only uses information from IMU and magnetometers.The study makes the following major contributions: • We proposed a robust crowdsourcing-based magnetic map construction method that can accommodate shortperiod trajectories with irregular shapes, uncalibrated magnetometer observations, and diverse pedestrian motion patterns.• To deal with the huge time consumption of keyframe association, we proposed a fast keyframe association method in which time consumption increases slowly with the size of the dataset.Benefiting from this design, the keyframe association obtained a 12x speed up.• We designed a two-step graph-optimization-based method for merging trajectories resistant to erroneous keyframe association.This approach eliminates the influence of erroneously related keyframe pairs by utilizing magnetometer data, the magnetic field, and the trajectories' geometry.
The remainder of the paper is organized as follows.Section II defines the problem in the crowdsourcing-based map construction and gives an overview of the proposed method.Section III describes the learning-based inertial odometry.Section IV describes the proposed keyframe association method.Section V provides a detailed description of the graphoptimization-based trajectory merge method.Section VI uses a field test to prove the performance of the proposed method.

A. Problem Statement of crowdsourcing magnetic map construction
This section summarizes the challenges for crowdsourcingbased magnetic map construction in challenging scenarios.
Short-period Trajectory: Due to the limited power of smartphones, the data collection application cannot run in the background for an extended time under real-world scenarios.Thus, most of the crowdsourcing data from mass users are short-term (e.g., 1∼3 minutes) trajectories.Moreover, since magnetic signals do not have globally unique characteristics, the reliability of sub-trajectories association methods based on magnetic signal similarity will be seriously damaged.In other words, the relative position across a long trajectory facilitates the identification and elimination of false keyframe association pairs.
Various Pedestrian Motion Patterns: Data collection applications operate in the user-insensitive mode, so the acquired data comprises trajectories with various motion patterns.Conventional PDR methods are ineffective in determining trajectory when pedestrians walk with various motion patterns, such as texting, calling, and swinging etc [26] [27].Although it is feasible to filter out simple motion pattern sensor data to generate a magnetic map, the data utilization rate is extremely low, and the cycle of providing positioning services will be significantly prolonged.Therefore, an accurate trajectory estimation method that can adapt to complex pedestrian motion patterns is an essential property of crowdsourcing-based magnetic map generation methods.
Uncalibrated Sensors: The smartphone's built-in sensors are of low quality, so the sensor measurements usually have large bias errors [28].In particular, the magnetometer bias will change significantly under the influence of the electromagnetic effect caused by the change in the working state of the smartphone.Then, because the magnetometer observations cannot reflect the real environmental magnetic field, the user trajectory association methods based on the assumption of a consistent environmental magnetic field at the same location will become unavailable.Meanwhile, it is impossible to assume that the user will correctly calibrate the magnetometer biases in the user-insensitive mode.Therefore, the proposed method should be able to calibrate the sensor online using the acquired data.
Irregular Trajectory: Restricted by the constraints of the indoor space structure, the shape of the user's motion trajectory is complex and diverse, including straight lines, right-angle turns, irregular arcs, and so on.The existing crowdsourcing-based magnetic map generation methods almost all assume that the user's trajectory consists of straight lines and right-angle turns, which limits its application in complex indoor environments (such as shopping malls and underground garages, etc.).Therefore, the magnetic map construction method based on crowdsourcing should not make too many assumptions about the shape of the user's motion trajectory in order to deal with the realistic user motion habits.
Large-scale Dataset: The necessity to manage massive datasets is a requirement of crowdsourced mapping for the  reasons listed below.First, short trajectories carry less information, so that the probability that any two trajectories can be uniquely associated is low, and the area covered by accurately associated two trajectories is small.Secondly, it is difficult to ensure that the gathered trajectories are spread evenly in space.
To guarantee that certain locations that are infrequently visited by people may also be effectively recreated, we must collect a huge quantity of data to ensure that these regions also have enough data for reconstruction.
In Table I, we summarized related works and their adaptability to these challenges.Following the challenge discussed above, the crowd-sourcing-based magnetic filed map construction method should fulfill such requirements: • The method can construct a map using short-period trajectories.
• The method should work for various motion patterns of pedestrians and an uncalibrated magnetometer.• The method shouldn't use any assumption of trajectory shape.
• The method should be efficient enough to process largescale datasets with acceptable time consumption.

B. System Overview
The proposed method aims to achieve magnetic map construction following the principles discussed in Section II-A, and the algorithm flow is shown in Fig. 1.The crowdsourcingbased magnetic map construction method can be divided into three stages: learning-based inertial odometry (LIO), keyframe association, and global trajectory optimization.
In the learning-based inertial odometry (LIO), the magnetometer-enhanced learning-based inertial odometry method proposed by our previous work is employed to reconstruct each short-period trajectory by using IMU and magnetometer measurements.Compared with the traditional PDR method, LIO can more accurately estimate complex pedestrian motion trajectories.To obtain accurate environmental magnetic observations, the magnetometer bias is modeled and estimated concurrently with the smartphone's pose.
In the keyframe association, we extract keyframes from each trajectory according to predetermined distance intervals, and then use the the similarity of magnetic field feature to associate keyframes.Unlike extracting keyframes based on trajectory shape (such as turning corners and corridors), the proposed method has the benefit of being free from specific trajectory shapes.However, it is very time-consuming to directly calculate the similarity of magnetic field features, especially the crowdsourced dataset size is usually very large.To overcome this drawback, we present a method for efficient keyframe association, which gives the whole system a high level of efficiency under good adaptability.
In the global trajectory optimization, we propose a two-step optimization method to estimate the positions of all keyframes.
In the first step, the optimization problem is defined as a nonlinear least square problem with inequality constraints.The relative poses of keyframes within the same trajectory are fixed, and the optimal parameters are the 2D location and direction of the initial keyframe of each trajectory.Moreover, the heading of the first keyframe is constrained by an inequality equation.This stage establishes a starting location for each keyframe to expedite further optimization and eliminate certain falsely related keyframe pairs.In the second step, a comprehensive optimization of the pose graph is implemented.Lastly, a density-based technique was employed to exclude trajectories that could not be properly merged into the map.

III. LEARNING-BASED INERTIAL ODOMETRY
The proposed method depends heavily on the reconstructed sub-trajectory shapes.The graph optimization enhanced LIO proposed by our previous work [28] [10] can adapt to various motion modes and is an ideal trajectory estimation method.Unlike the setting suggested by [28], this paper adopts global optimization instead of sliding window optimization because the data post-processing mode is available.To reduce the time consumption of the graph-optimization-based method, we reduce the number of iterations by providing optimal initial values for graph optimization.The initial value consists of two parts, the magnetometer biases and the orientation of each instant.
The initial value of magnetometer bias is obtained using the constraint that the magnetic vector is constantly combined with the relative rotation from integrating gyroscope observations.Assuming that the magnetic field vector remains unchanged in a short period of time (e.g., 2 seconds), the relationship between magnetometer observations, magnetometer bias, and relative rotation can be described as follows: where ∆R ij represents rotation from moment j to i calculated by integrating gyroscope measurements, m i and m j are magnetometer measurements at i and j moment respectively, b m is magnetometer bias.To ensure that Eq.( 1) is solvable, we select data of short periods with significant rotation around the magnetometer's x, y, and z-axes, respectively.Then, a typical attitude and heading system (AHRS) approach is used to estimate the initial value of orientation at each instant based on the constraints of the gravity vector and magnetic field vector.To align the headings of all trajectories, a graph-optimization problem is defined.The system state of trajectory k is defined as follows: where R nbi represents rotation from i-th body frame to the navigation frame.b k m is the magnetometer biases of trajectory k.Due to the length of each trajectory being short (no more than 2 minutes), it is feasible to assume that the magnetometer biases are constant during this short period.
The maximum posterior estimation of all system states is obtained by minimizing the following cost function: where m n is the magnetic field vector in the navigation frame.r prior represents the prior constraint to m n .Because we assume the x-axis and y-axis of the navigation frame are orientated to the magnetic north and east, the y-axis of the m n should be close to zero.r gyr is the relative rotation constraint between adjacent instants provided by the gyroscope measurements.r gra is the gravity orientation constraint as described in [28].r mag is the magnetic field constraint which is defined as: where m k i is magnetometer measurements in trajectory k at i-th moment.
Solving Eq. ( 3), we obtained the orientation of each instant and magnetometer biases of each trajectory.Benefiting from these global consistency and accuracy rotations of each moment, we can estimate the shape of each trajectory represented in the navigation frame based on the neural network discussed in [10].This method is robust to various motion patterns and can provide a consistency scale that is superior to conventional PDR methods.These advantages play a key role in the following processing.

IV. KEYFRAME ASSOCIATION
This section introduces the proposed fast keyframe association method.The feature extraction and similarity calculation method are described in Section IV-A.Section IV-B described a two-step keyframe association method, which uses the frequency-domain feature to achieve fast candidate keyframe pair searching and confirm the candidate keyframe pair based on the time-domain feature and geometric consistency.
A. Feature Extraction and Feature Similarity Calculation 1) Preprocess of magnetic field sequence: The availability of magnetic field sequence-based keyframe association identification relies on the fact that the magnetic field vector at a certain point is constant across time but fluctuates with the position.To obtain the characteristics of the magnetic field feature changing with the spatial position, we need to perform data preprocessing on the original observations of the magnetometer, including bias compensation, coordinate transformation and resampling according to the spatial distance.
The raw magnetometer measurements are corrected by estimated magnetometer biases of Section III and then converted from sensor frame to the navigation frame using the following equation: Then, the magnetic field sequence can be converted to a function of moving distance using the estimated trajectory by LIO as discussed in Section III.Benefiting from this conversion, the moving speed no longer influences the magnetic field sequence.Thus, we can avoid using DTW-like methods to compare two magnetic field sequences.This step outputs the magnetic field vector sequence denoted as obtained by equidistant sampling on the i-th keyframe (the magnetic field sequence within a distance of 10 meters is 1 keyframe).m k xi , m k yi , and m k zi are three-axis components of m k i respectively.
2) Frequency-domain Feature: The frequency-domain feature can be obtained by using the Fast Fourier Transform (FFT) for each axis sequence of m k i .Details of the frequency-domain feature for magnetic field sequence can be found in [29].It is defined as: where F F k i is frequency-domain feature of keyframe i in the trajectory k.F F T (•) represents the fast Fourier transform function, and the result is mapping the input sequence in the frequency domain.We only save the first eight components of the F F T (•) to speed up distance calculation.This feature vector definition causes significant information loss [29] but is helpful for fast feature similarity comparison.
The distance of two frequency-domain features is defined as: where ∥ • ∥ 2 is l2-norm.Since the distance of two frequencydomain features is defined as l2-norm, the k-dimensional tree (KD Tree)-based method can be adopted for fast similar feature finding.
3) Time-domain Feature: The time-domain feature is defined as follows: The main advantage of this feature is without additional transform, indicating that this processing is without information loss.The disadvantage of this feature is that the distance metric is hard to define as l2-norm.Because the small displacement of two sub-trajectories may cause large Euclidean distances in feature space which are not expected.
To avoid the disadvantage above, we defined the distance between two raw features as: where S(•) is a function that uses the maximum value of three axes to align two features and estimate distance.Using the maximum value of three sequences (x,y, and z-axis) to align two time-domain features yields three potential shift distances.Furthermore, we should reverse the sequences if users pass the same route with opposing directions.Thus, as illustrated in Figure 2, there are six possible alignment ways.For example, considering the x-axis can obtain two possible alignments way, the 3-axis of T F j and flipped T F j are shifted to align the maximum value of the x-axis with T F i 's.The feature distance of each alignment way is calculated based on the overlap parts.The final feature distance between two features is the minimum one in these six possible pairs of matched sequences.This distance metric considers the minor displacement and opposite movement direction of two trajectories with no loss of information.Using the premise that the moving distance of each trajectory is precise, it achieves more time efficiency than DTW-based sequence distance.This hypothesis is fulfilled by adopting the LIO.In summary, the keyframe association method proposed in this paper achieves a trade-off between efficiency and accuracy.First, a rapid candidate keyframe association is conducted using a technique with low precision and efficiency.Next, methods that are accurate but computationally costly are employed to confirm the associated keyframe pair in this candidate set, which is relatively tiny compared to the entire dataset.

B. Keyframe Pair Determination
In order to achieve better keyframe association efficiency, the k-d tree is adopted to search candidate keyframes.Compare with the brute force search, the search method based on k-d tree can greatly save computational cost.The time complexity of building the k-d tree corresponding to the frequency-domain feature of all keyframes is O(nlog(n)), n is the number of samples in the dataset.Based on the k-d tree data structure, the candidate keyframe pairs of any keyframe are quickly determined by using the criterion that the frequency feature distance is less than a preset threshold T F .
Due to the loss of information in the frequency-domain features, there are many misjudgments in the candidate keyframe pairs association, so the time-domain features and geometric consistency are used to further verify the correctness of the candidate keyframes.Firstly, the candidate keyframe pair whose T D is less than threshold T T is confirmed.The displacement and whether the sequences are reversed are also noted.In particular, we may determine the one-to-one correspondence of each sample point of the magnetic field sequence based on the information on this displacement and whether or not to reverse the sequence.After determining the one-to-one correspondence of the magnetic field sequence, the one-to-one correspondence of the two sub-trajectories may be determined.The one-to-one relationship between the sub-trajectories' points can be utilized for further geometric consistency checks.
According to the correspondence provided before, the metric of geometric consistency can be defined as: p i n and p j n are correspondence 2D points in trajectory i and j, respectively.T ij is a 2D transformation matrix that fulfills the following equation: The associated keyframe pair, which fulfills that D geo less than threshold T geo is pass the geometric consistency confirmation.This step produces a pair of associated keyframes, which shows that corresponding keyframes travel through the same location and have a similar trajectory shape.The result of the keyframe pair association are then applied to merge all trajectories.

V. GLOBAL TRAJECTORY OPTIMIZATION
Global trajectory optimization exploits the relative pose constraints between adjacent keyframes in the same trajectory and the same-position constraints of associated keyframe pairs to merge, optimize, and filter global trajectories.In this paper, we proposed a two-step graph optimization method including the inequality constraint optimization and the pose graph optimization.Section V-A describes inequality constraints for preliminary trajectory merging and higher-level geometric information to verify the correctness of the associated keyframe pair.Section V-B describes the merging and shape optimization of trajectories through global bundle adjustment.

A. Inequality Constraint Optimization
Assuming that the relative poses of all keyframes in the same trajectory are accurate, then the errors of all keyframes are consistent.The optimization parameters of one subtrajectory can be simplified as position p k = [x k , y k ] T and heading θ k .This operation helps for adopting the subtrajectory shape to determine incorrect keyframe association.To avoid the phenomenon of small-area circles in a single subtrajectory destroying the assumption that the magnetic heading disturbance conforms to a zero-mean Gaussian distribution [28], the assumption that the difference in direction between the mean magnetic field vector of sub-trajectories and the global mean magnetic field vector is smaller than a specific threshold is used to construct inequality constraints.This operation helps for adopting the sub-trajectory heading to determine incorrect keyframe association.Due to the robust kernel function being adopted to the distance constraint, the effect of the distance constraints of incorrectly associated keyframe pairs is limited in the optimization problem.Benefiting from properly using the LIO outputs, including the sub-trajectory shape and heading, and the robust kernel function for distance constraints, the inequality constraint optimization is robust to incorrect keyframe association.According to the optimized trajectory poses, the incorrectly associated keyframes can be determined and removed.
The inequality-constrained trajectory merging problem can be defined as: where ∥ • ∥ 2 Σ represents the Mahalanobia norm, ρ(•) represents the Huber norm [30], r dis represents the distance between the corresponding keyframes, [t l k bj ] xy represents x and y components of t l k bj which is position of keyframe j in the coordinate l k , R(θ) is the 2D rotation matrix of θ.Eq. ( 12) defines a nonlinear least-squares problem with inequality constraint.We utilize ceres solver [31] to solve this problem.This problem can converge quickly because the state space is small, 3Kdimensional (K is the number of trajectories).
The optimized pose of each sub-trajectory is used to transform all keyframes in the same coordinate frame (denoted as n-frame).The pose of keyframe i in the trajectory k denoted as {R k nbi , t k nbi }.Based on the estimated poses, we remove the false keyframe pairs that the distance between a keyframe pair is larger than 10 meters.

B. Pose Graph Optimization
This part adopts the global pose optimization to fuse all the constraints obtained in the previous part, so as to obtain the optimal pose of each keyframe.The problem can be defined as: where r odo represents the relative pose residual between two adjacent keyframes, r dis represents the relative distance residual between two keyframe in different sub-trajectories, r gra represents the constraint of gravity orientation, ∆R and ∆t represents the relative rotation and translation provided by Sec III, Log SO(3) (•) represents the logarithm function for SO(3), [•] xy represents x and y components of the 3D vector, g bi and g n represent gravity vectors in the local frame and the navigation frame, respectively, i and j represent the keyframe index, k and q represent the sub-trajectory index.After pose graph optimization, there are still some outlier sub-trajectories that cannot be processed normally.Therefore, we use the following criteria to identify and remove outlier sub-trajectories: select a keyframe as the center, count the number of keyframes within a preset distance radius, and the keyframe is available when the number of frames is greater than the preset threshold.Then, sub-trajectories are judged as outliers when the number of valid keyframes is less than 90%.Finally, the final merged trajectory is obtained.

A. Experiments Setup
We conduct an experiment to evaluate the system performance of the proposed method in a typical indoor shopping mall environment.Figure 3 (a) shows the colorized point cloud of the shopping mall.Figure 3 (b) and (c) show two images captured in the shopping mall.Compared with the typical office building scene, the magnetic field features in the shopping mall are less distinguishable due to the open environment.Moreover, there are almost no strict right-angle turns in the spatial structure of the shopping mall, which makes the traditional crowdsourcing construction method of magnetic field map unusable.
In the experiment, six volunteers (4 males and 2 females) participated in the data collection and used five smartphones (Samsung S10, Samsung S20, Mi 10, iPhone 13 Pro, and iPad Pro).During the data collection process, in order to reflect the real activity habits of the public users in the actual indoor scene, we do not restrict the behavior of the test users, including pedestrian movement patterns, movement speeds, trajectory shapes and mobile phone holding methods.The acquired sensor data includes gyroscope, accelerometer, and magnetometer at 100 Hz.The dataset includes 12 hours of data without ground truth and a subset (about 40 minutes) with ground truth.The ground-truth are calculated by using depth-assisted visual bundle adjustment.
To improve data collection efficiency, the duration of a single test for a single user lasts 10 to 20 minutes.This obviously does not conform to the objective law of the real activity duration of users.Therefore, we must divide the trajectory according to the specified parameters such that the length of each resulting trajectory is smaller than the specified threshold l traj .Notably, because the magnetic field vector of a single point lacks any degree of discrimination, the matching between keyframes can only depend on a magnetic field sequence comparison.Therefore, the magnetic field sequences cannot be connected before and after the split point.This differs from WiFi and visual, which can do keyframe association using all signals observed in one epoch.

B. Evaluation Metric
We divide the test data with reference truth into two parts: the first part (about 12 minutes) is used to evaluate the accuracy of crowdsourcing construction of the magnetic field map, and the second part (about 28 minutes) is used to evaluate the magnetic positioning accuracy using the built magnetic map.
After the magnetic map construction, the estimated positions of the first part of the trajectories are denoted as { t0 , ..., tN } and the corresponding reference positions are denoted as {t 0 , ..., t N }.Three metrics, including E m , E ′ m , and E p , are adopted to evaluate the proposed method.
The aligned horizontal position residuals E m is defined as: where, ∥ • ∥ is l2-norm.R a and t a are 2D rotation and translation, respectively.E m ′ is the horizontal position error while accounting for the effect of scale, defined as: where R a , t a , and s are 2D rotation, translation and scale, respectively.E p is the horizontal position error of localization using the built magnetic map.Before localization, the magnetic map is alignment with the reference framework using (23).The localization method is a brute force sequence-matchingbased method [32].The matching is based on a magnetic field sequence withing a distance of 10 meters of the keyframe.The E p is defined as: where, tl i is localization result based on built magnetic map.t i is correspondence reference position.

C. System Performance
This section evaluates the accuracy of the magnetic field maps generated by the proposed method using a real dataset.To obtain more believable conclusions, we evaluate the system performance under different parameters, including dataset size of 3, 6, 9 and 12 hours, and sub-trajectory time length of 60, 75 and 90 seconds.Next, we analyze and discuss the results with a data size of 12 hours and a sub-trajectory time length of 90 seconds.Figure 4 shows the crowdsourcing trajectories reconstructed by the learning-based inertial odometry as described in Section III.Since there are no presumptions regarding the beginning point of each trajectory, the first frame's coordinates are used as the origin.We can find that the heading of all trajectories is approximately aligned.This benefits from the assumption that the y-axis of the magnetic field for each trajectory is zero.Figure 5 (a) depicts the outcome of the pose graph optimization (Section V-B), whereby the majority of trajectories have been combined.The keyframe density surrounding the few incorrectly merged keyframes is significantly lower than the keyframes for which the location is accurately predicted.As noted before, this information was utilized to determine the successfully merged trajectories, resulting in the final trajectory map displayed in Figure 5 (b).
Figure 6 (a) and Figure 6 (b) show the results of aligning the estimated trajectories with the reference trajectories using two different alignment methods (with or without scale correction), Figure 6 (c) shows the localization results using the reconstructed map that aligned with the reference framework using (23).The semi-transparent gray lines represent the reference trajectory, all the colored lines represent the estimated trajectories, and the blue dots represent the results of sequence-matching-based localization.From Figure 6 (a), we can see that the proposed method can accurately recover the relative spatial relationship between crowdsourced trajectories, but suffers from significant scale error.The reason is that there is a scale error in the user trajectory estimated by LIO, which is the characteristic of almost all traditional PDR algorithms and deep learning based inertial odometry.Nevertheless, the effect of scale bias can be eliminated by global correction, as shown in Figure 6(b).At the same time, based on the scalecorrected magnetic field map, most of the positions estimated based on magnetic field sequence matching have a good degree of overlap with the reference position, as shown in Figure 6(c).The average horizontal error of magnetic map and localization are 1.48m (E m ′ ) and 2.53m (E p ), respectively.In general, the above results can prove the feasibility and effectiveness of the  proposed method to a certain extent.
To further verify the performance of the proposed methods, an ablation test is provided to illustrate the effect of the time length of trajectories and the size of the dataset.As illustrated in Table II, we investigate E m , E m ′ , and E p of different time lengths and dataset sizes.We can conclude that larger dataset sizes and longer trajectory lengths result in more precise map construction and positioning.Longer trajectory length improves the mapping accuracy because a longer trajectory length can have a higher probability of generating correct associated keyframes with other trajectories.Meanwhile, it is helpful to eliminate incorrectly associated frames.The benefit of a larger dataset is that more sub-trajectory shapes are mutually constrained to reduce incorrectly associated keyframe pairs.
Figure 7 shows the cumulative density function (CDF) of the proposed method using different dataset sizes and trajectory lengths.Figure 7 (a), (b), and (c) show the E ′ m CDF of the proposed method using various dataset sizes to build the  2 Keyframe Association (Section IV) 3 Global Trajectory Optimization (Section V) 4 Feature Extraction (Section IV-A) 5 Keyframe Pair Determination (Section IV-B)  7 (d), (e), and (f) show the E p CDF of the proposed method to show the effect of dataset size on the positioning performance of the built magnetic map.The proposed method using a small-size dataset (3 hours) shows significantly lower accuracy in both E m ′ and E p .Thus, the proposed method needs enough data to achieve the magnetic map construction when the trajectory length is limited.In other words, a large dataset is necessary for magnetic map building via crowdsourcing using short trajectories.Nonetheless, the spatial distribution of the data collected in this experiment is uniform, which may not reflect reality.Therefore, the total amount of data required for a region of the same size in a realworld scenario may be greater than that used in experiments.

D. Efficiency of the proposed keyframe association method
Since large-scale data sets are the basic guarantee for the feasibility of crowdsourcing-based solutions, the efficiency of data processing is a very important evaluation indicator.Table III gives the specific time consumption of each algorithm module when using the proposed method to process the 12- hour data set.The computing platform is a laptop, the main parameters include an R7-5800H 8-core CPU and a GTX 3060 GPU.The proposed algorithm takes a total of 60.8 seconds to process the 12-hour data set, and the fast key frame association method takes 10.7 seconds, accounting for less than 20%.We can learn that the proposed fast keyframe association method is efficient and will no longer be the largest time-consuming module of crowdsourcing-based solutions.At the same time, we find that LIO is the most time-consuming algorithm module, taking 43 seconds, accounting for 70%.This is due to the poor performance of the GPU used in this experiment, and using a higher-performance GPU is an effective way to further improve computational efficiency.
Figure 8 shows the time consumption of direct keyframe association (denoted as Direct-Method)and the proposed fast keyframe association method (denoted as Fast-Method) for different sizes of datasets.As the dataset size increases, the time consumption of the Direct-Method increases linearly, while the time consumption of the Fast-Method increases slowly.In particular, when processing a 12-hour data set, the efficiency of the Fast-Method is more than 90% higher than that of Direct-Method, showing that the proposed method is feasible and the efficiency improvement is very obvious.
Tabel IV summarizes the time consumption of keyframe association for state-of-the-art crowdsourcing-based schemes.WiFi-RITA [16] considers many interference factors of realworld environments and is the most advanced WiFi-based crowdsourcing method in the existing literature.The associated keyframe determination of this method is efficiency(takes only 12.3 seconds).But it has a time-consuming pre-processing stage (WiFi mark searching) which costs 4508 seconds.[23], the magnetic field-based crowdsourcing solution which notices to reduce time consumption, takes 37.3 seconds for keyframe association when processing a 15-minute dataset.This solution is inefficient because it uses a hidden Markov model (HMM) to do keyframe association.The fact that the time consumption (37.3 seconds) of [23] processing 15 minutes dataset exceeds the time consumption of the proposed method processing 12 hours dataset indicates the proposed method with significantly better efficiency.Furthermore, it is noticed that [23] relies on detecting corners and only associating corners to reduce time consumption.Thus, [23] is hard to adapt to complex buildings that not only consist of hallways and corners.

E. Effect of the proposed two-step optimization
This section mainly explains that two-step graph optimization can more effectively resist the destructive effects of wrong keyframe associations than one-step graph optimization.Figure 9 shows the comparison of mapping results using the global optimization method with or without inequality constraint optimization.The optimized trajectories without inequality constraints (Figure 9 (b)) show significantly worse relative pose estimation performance compared to the trajectories in Figure 9 (a).E m ′ of graph optimization method with or without inequality constraint optimization are 1.48m and 3.60m, respectively.The reason is incorrectly associated keyframe pairs resulting in incorrect subtrack headings and shapes.In that case, the inequality constraint optimization is immune to the effect of this incorrectly associated keyframe pair.Then, benefiting from the good initial value from the inequality constraint optimization, the pose-graph optimization in the two-step optimization is less affected by incorrect keyframe association pairs.

VII. CONCLUSION AND FUTURE WORK
This paper concludes the challenges of the crowdsourcingbased magnetic map-building method for mass users.The challenges include: 1) short-period trajectories.2) various pedestrians' motion patterns and uncalibrated magnetometers.
3) without the assumption of trajectory shape.4) efficient enough to process large-scale datasets.
We proposed a method that fulfills these requirements.This proposed method consists of three stages, including the leaning-based inertial odometry (LIO), the keyframe association, and the global trajectory optimization.We adopt the LIO to estimate accurate trajectory and sensor biases regardless of motion mode.Benefiting from properly using the information of the time-domain feature and frequency-domain feature, the keyframe association method can achieve efficient keyframe association using short-period trajectories and trajectories with irregular shapes.The global trajectory optimization is proposed to remove incorrectly associated keyframe pairs and improve robustness.
The proposed method's performance is verified by field testing in a shopping mall.It can process a dataset (dataset size is 12 hours and trajectory length is 90 seconds ) in 60.8 seconds and provide merged trajectories that mean position error is 1.48m (with scale correction).The ablation study demonstrated the effect of dataset size and trajectory length and proved that a large-scale dataset is necessary.Furthermore, we verified the efficiency of the proposed keyframe association method and the robustness of the proposed two-step graph optimization method.
The proposed method relies on recovered user trajectories, however LIO is a data-driven approach, and its performance on general public user datasets is still an open issue.Therefore, in future work, we intend to enhance the LIO method and provide reliability metrics for LIO using real datasets from general public users.Moreover, consideration should be given to the method for creating maps in more complex scenarios, such as areas with changing local magnetic fields (e.g., underground garages) and multi-story buildings.

Fig. 1 .
Fig. 1.Data flow of the proposed system.

Fig. 2 .
Fig. 2. Illustration of the T D( T F i , T F j ) calculation.

Fig. 3 .
Fig. 3. Illustration environments of the shopping mall.(a) point cloud of the shopping mall.(b) and (c) images captured in the shopping mall.

Fig. 6 .
Fig. 6.Result of the proposed method.The gray line represents reference positions.(a)estimated trajectories alignment without scale correction.(b) estimated trajectories alignment with scale correction.(c) Positioning result using the constructed magnetic map.

Fig. 7 .
Fig. 7. Performance of the proposed method using various dataset sizes and trajectory length.(a) E m ′ CDF using 60 seconds trajectories.(b) E m ′ CDF using 75 seconds trajectories.(c) E m ′ CDF using 90 seconds trajectories.(d) Ep CDF using 60 seconds trajectories.(e) Ep CDF using 75 seconds trajectories.(f) Ep CDF using 90 seconds trajectories.

Fig. 9 .
Fig. 9. Result of recovered trajectories alignment with scale correction.(a) estimated trajectories with inequality constraint optimization.(b) estimated trajectories without inequality constraint optimization.

TABLE I COMPARISON
OF RELATED WORKS.

TABLE III TIME
CONSUMPTION OF THE PROPOSED METHOD.