Compressed Spectrum Reconstruction Method Based on Coding Feature Vector Enhancement

Compressive spectral imaging (CSI) is a snapshot spectral imaging technique that rapidly captures the spectral information of a target in a single exposure and effectively reconstructs high-dimensional spectral data using reconstruction algorithms. However, due to the presence of a large number of identical pixels in the measured image, which map to different prior spectral information, existing algorithms struggle to establish an accurate pixel separation representation model. To improve the separation effect between pixels and enhance the representation capability of the measured image pixels, we propose a compressed spectral reconstruction method with enhanced encoding feature vectors. By designing encoding information calculation rules based on a combination of linear and nonlinear functions, encoding features are calculated according to the spatial coordinate position information and wavelength information of the pixels, effectively enhancing the separation representation characteristics between channels and neighboring pixels. Furthermore, by utilizing the semantic similarity between the predicted results of the prior model and the prior spectral image, the reconstruction problem is transformed into a total variation (TV) minimization problem between the predicted results of the prior model and the reconstruction results, combined with the alternating direction method of multipliers (ADMM) to achieve accurate pixel reconstruction. The experimental setup utilizes a dual-camera compressed spectral imaging (DCCHI) system, consisting of a dual-dispersion coded aperture compressed spectral imaging (DD-CASSI) system and a grayscale imaging system. Extensive experiments show that the proposed method achieves superior reconstruction quality and algorithmic performance.

Chipeng Cao, Jie Li, Pan Wang, and Chun Qi, Member, IEEE
Index Terms-Compressed spectral imaging, encoding feature reconstruction, prior spectral data, vector enhancement.

I. INTRODUCTION
Chipeng Cao is with the School of Information and Communication Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China, and also with the Xi'an Institute of Optics and Precision Mechanics, University of Chinese Academy of Sciences, Xi'an, Shaanxi 710049, China (e-mail: 1309209924@qq.com).

Pan Wang is with the School of Information and Communication Engineering, Xi'an Jiaotong University, Xi'an, Shaanxi 710049, China (e-mail: panwang@stu.xjtu.edu.cn).

Digital Object Identifier 10.1109/TGRS.2023.3347220

Hyperspectral image data is a multiband imaging acquisition of the target object by hyperspectral imaging equipment [1] and can be expressed as a 3-D spectral data cube composed of 2-D spatial information and 1-D spectral information [2]. It reflects not only the state characteristics of the surface material of the target object but also spatial characteristics of the object such as contour, texture, and geometric scale, and it is widely used in the fields of remote-sensing (RS) detection [3], mineral exploration [4], target detection [5], and medical diagnosis [6]. Hyperspectral imaging systems have various acquisition modes, including dispersive, interferometric, and filter-based modes. Compared to color imaging or multispectral imaging, hyperspectral imaging has a higher spatial-spectral resolution, allowing it to capture more subtle variations and greatly improve the accuracy of quantitative analysis [7]. However, hyperspectral imaging devices have complex structures and long spectral acquisition times, making it difficult to quickly capture spectral image data of dynamic targets [8]. Compressive sensing reconstruction theory enables snapshot spectral imaging, with coded aperture snapshot spectral imaging (CASSI) as a representative example. CASSI achieves spatial modulation through aperture coding and spectral modulation through dispersive elements, performing superimposed imaging on a 2-D detector. This technique allows for the rapid acquisition of spectral compressed measurement information of dynamic targets [9]. According to the compressive sensing reconstruction principle, by utilizing the measured image, the sensing matrix of the CASSI system, and the sparsity of prior spectral data, the original spectral image can be reconstructed, greatly improving the efficiency of acquiring spectral imaging data in dynamic scenes.
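As a toy illustration of the CASSI sampling principle described above (coded aperture, dispersive shift, superposition on a 2-D detector), the following sketch simulates the forward model. The cube size, binary mask, and one-pixel dispersion step are illustrative assumptions, not the system parameters used in this article.

```python
import numpy as np

def cassi_measure(cube, mask, step=1):
    """Simulate single-dispersion CASSI: code each band with the
    aperture mask, shift it by the dispersion step, and sum the
    shifted bands on the 2-D detector. `cube` has shape (H, W, C)."""
    H, W, C = cube.shape
    y = np.zeros((H, W + step * (C - 1)))       # detector widened by dispersion
    for c in range(C):
        coded = cube[:, :, c] * mask            # spatial modulation
        y[:, c * step:c * step + W] += coded    # spectral shift + superposition
    return y

rng = np.random.default_rng(0)
cube = rng.random((8, 8, 4))                    # toy 4-band data cube
mask = rng.integers(0, 2, size=(8, 8))          # binary coded aperture
y = cassi_measure(cube, mask)
print(y.shape)                                  # (8, 11): 8 x (8 + 3)
```

The single snapshot `y` mixes all four bands, which is why reconstruction is an ill-posed inverse problem.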
However, reconstructing a high-dimensional spectral image from measured images is an ill-posed problem of low-to-high-dimensional mapping. Traditional reconstruction methods mainly focus on studying the sparsity [10], smoothness [11], and low-rank prior [12], [13] of the prior spectral data. By adding different constraints, these methods can achieve good reconstruction quality. However, different types of constraints are often adapted only to specific types of spectral data and lack generalization ability for high-dimensional spectral data in various scenarios. Moreover, adding constraints leads to a rise in computational complexity, making it challenging to achieve fast processing of spectral data. Deep-learning algorithms have shown high learning efficiency [14], [15]. By using the loss function of the model as a replacement for the multiple constraints in traditional algorithms, they effectively alleviate the problem of high computational complexity in reconstruction algorithms [16], [17], [18]. For example, Guo et al. [19]
proposed a multiscale fusion network that combines binocular spectral features and global spatial features, demonstrating that multiscale fusion networks can extract more discriminative spectral features. Chen et al. [20] proposed a convolutional sparse coding reconstruction method that establishes a regularization model based on the sparsity of multiscale convolutional features, effectively improving the reconstruction quality by enhancing interspectral information constraints. Yuan et al. [21] introduced a plug-and-play (PnP) deep reconstruction method that utilizes proximal gradient algorithms and image-denoising networks to achieve compressive spectral reconstruction, enhancing the flexibility and adaptability of the reconstruction algorithm. Xiong et al. [22] performed upsampling operations on the measured images using multiscale convolutions, establishing an end-to-end mapping model between the upsampled information and the original high-dimensional spectral image, effectively improving the generalization of the reconstruction model. The essential reason why deep neural networks can be used for compressed spectral image reconstruction is that they can extract deep convolutional features of the compressed measurement image, which enhances the semantic representation of the reconstruction model between the measurement image and the prior spectral image. Existing depth-prior reconstruction algorithms, based on transformers [23] combined with spatial-spectral attention mechanisms [24] or multiscale convolutions [25], effectively enhance the modeling capability of the network model for global contextual information. However, they still cannot achieve the mapping of spatial coordinate information between training data and measurement data, resulting in mapping deviations of fine details in the reconstruction results, making it difficult to obtain satisfactory performance in practical reconstruction tasks. The establishment of split nodes in the random forest
(RF) algorithm [26] can better utilize the small differences between features. This is achieved by calculating measurement metrics such as information gain or the Gini index between different features to finely divide feature subsets, effectively avoiding significant biases in the predicted results. Therefore, to enhance the separation representation characteristics of pixel information and reduce the spatial information difference between the reconstruction results and the prior spectral images, this article proposes a model optimization solution method based on enhanced encoding feature vectors. By calculating encoding features based on the spatial coordinates of pixels and wavelength information, the addition of encoding features enhances the separation representation characteristics between neighboring pixels, thereby reducing mapping biases between pixel information and improving the semantic similarity between the predicted results of the prior model and the prior spectral images. Additionally, an unencoded grayscale imaging channel is added to the dual-dispersion coded aperture compressed spectral imaging (DD-CASSI) system as a supplement to the reconstruction process. By using the grayscale images and their encoding features as training data, an end-to-end prior model is established, and the total variation (TV) difference between the predicted results of the prior model and the reconstructed results serves as a regularization term in the reconstruction process, aiming to improve the smoothness and spectral correlation of the reconstruction results. A smaller TV difference indicates higher image smoothness. Finally, the alternating direction method of multipliers (ADMM) optimization algorithm is used to optimize the minimum-TV-difference regularization objective function, achieving precise reconstruction of pixel information.
The main contributions of the enhanced encoding feature vector reconstruction model can be summarized as follows.
1) We propose an innovative pixel separation representation model that combines wavelength information with pixel spatial location information to design encoding information calculation rules, enhancing the separation representation characteristics between different scene channels and neighboring pixels. This approach helps to reduce mapping biases between pixel information and improve the semantic similarity between the predicted results of the prior model and the prior spectral images.

2) For the first time, we introduce ensemble learning into the compressed spectral reconstruction process. By using the tree-structured prior model's predicted results as a regularization term in the solution process, we can better describe subtle differences between encoding features, obtaining clearer and more accurate predictions. This approach effectively guides the solution of the compressed spectral reconstruction process. Compared to traditional methods such as RGB image interpolation and deep priors for establishing regularization terms, our proposed method better preserves the edge and detail information of images, thus improving the reconstruction quality.

3) Our proposed method achieves higher reconstruction quality with fewer computational parameters and demonstrates good applicability to hyperspectral data from different scenes. This superior algorithmic performance makes the proposed method more practical and meaningful in real-world applications.

II. DUAL-CAMERA COMPRESSED SPECTRAL IMAGING (DCCHI) MODEL
In CASSI, the incident light undergoes spatial modulation and spectral modulation before being compressed onto a 2-D detector [27]. Spatial modulation is achieved through a 2-D coded aperture, while spectral modulation is primarily realized using dispersive elements or diffraction gratings. Subsequently, a reconstruction algorithm is used to reconstruct a 3-D data cube from the measured 2-D spectral data. In the compressed measurement image, each element contains the corresponding spectral measurement value at the pixel position. Different objects or regions exhibit distinct spectral characteristics. However, when the incident light undergoes spatial and spectral modulation for compressed measurement, there is some degree of spectral crosstalk and information loss. This leads to the occurrence of identical spectral measurement values at different pixel positions, which can easily result in mapping biases of spectral measurement information in the reconstruction model.
Reconstructing high-dimensional spectral data from low-dimensional compressed measurements poses an ill-posed inverse problem, placing higher demands on the constraints for accurate data recovery. To improve the reconstruction quality of the CASSI system, Zhu and Zhao [28] introduced an additional unencoded grayscale or RGB imaging channel through a beam splitter, serving as a supplement to the reconstruction process and effectively enhancing the reconstruction quality in the visible light band. The DCCHI system used in this article is illustrated in Fig. 1, where the incident light first encounters a beam splitter. One path is captured by the DD-CASSI, while the other path is captured by the unencoded grayscale imaging system.
According to the reversibility of the optical path, after the incident spectrum undergoes compression in the imaging system, the DD-CASSI imaging system performs a reverse dispersion operation through the second dispersive element to counteract the dispersion introduced by the first prism, causing the images of different spectral bands to realign and project onto the detector. The spatial dimensions of the resulting 3-D data cube match the initial image size of the target scene, thus eliminating the image distortion caused by a single-dispersion system. The sampling data of the DD-CASSI detector, obtained after spatial and spectral modulation of the incident spectrum, is overlaid on a 2-D detector, forming a 2-D matrix in which each element contains the spectral measurement value at the corresponding pixel position. The detection process of DD-CASSI [29] can be written as

y(m, n) = ∫ h(m, n, λ) f(m, n, λ) dλ

where (m, n) represents the pixel coordinate position, λ represents the central wavelength, f(m, n, λ) represents the continuous incident light, h(m, n, λ) is the sensing matrix of the optical system, and y(m, n) is the measurement value of the detector. Expressing the output signal of the detector in discrete form, the intensity of the output signal at pixel coordinate position (m, n) is given by

y(m, n) = Σ_{c=1}^{C} h(m, n, λ_c) f(m, n, λ_c)

where C represents the total number of discrete spectral channels. If the size of the hyperspectral data cube f is W × H × C, the measurement image y(m, n) of DD-CASSI has a size of W × H. By using the grayscale imaging system as a supplement to the reconstruction process, it is possible to effectively characterize the spectral information of the target, thereby improving the smoothness of the reconstruction results and the long-range spectral correlation. The grayscale imaging process can be represented as

y_gray(m, n) = Σ_{c=1}^{C} h_gray(m, n, λ_c) f(m, n, λ_c)

where h_gray(m, n, λ) is the sensing matrix corresponding to the grayscale imaging system.
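The discrete measurement models above are per-pixel weighted sums over the C spectral channels. A minimal numerical sketch, with a random binary pattern standing in for the sensing matrix h and a uniform spectral response assumed for the grayscale channel:

```python
import numpy as np

def dd_cassi_measure(f, h):
    """Discrete DD-CASSI detector output: y(m,n) = sum_c h(m,n,c) * f(m,n,c).
    Double dispersion realigns the bands, so y keeps the W x H size of f."""
    return np.sum(h * f, axis=-1)

def gray_measure(f):
    """Unencoded grayscale channel: uniform response across all bands
    (an assumption standing in for h_gray)."""
    return np.mean(f, axis=-1)

rng = np.random.default_rng(1)
f = rng.random((16, 16, 31))                 # W x H x C hyperspectral cube
h = rng.integers(0, 2, size=f.shape)         # binary sensing pattern (assumption)
y = dd_cassi_measure(f, h)
y_gray = gray_measure(f)
print(y.shape, y_gray.shape)                 # (16, 16) (16, 16)
```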

III. PROPOSED CODED FEATURE VECTOR ENHANCED RECONSTRUCTION METHODS
In this section, we first introduce the proposed scalable encoded feature vector enhancement reconstruction algorithm model. Then, we analyze the semantic similarity between the predicted results of this model and the prior spectral data through experiments. We use TV minimization as the denoising function to improve the smoothness of the reconstruction results. Finally, we implement the reconstruction of compressed spectral images using the ADMM optimization algorithm.

A. Scalable Coded Vector Enhanced Reconstruction Algorithm
In machine-learning tasks, features are typically described as a single value or a vector. When a single feature is not sufficient to represent a specific target, using multidimensional feature combinations helps describe the target more comprehensively. This approach allows algorithmic models to explore deeper semantic information and achieve better prediction results. In the process of compressive spectral reconstruction, grayscale-measured images have a good mapping representation relationship with prior spectral information. However, grayscale-measured images often contain many identical spectral measurement values, which fail to fully express the spatial position relationships between pixels. As a result, there may be mapping deviations between the prior spectral data and the collected grayscale-measured images during the reconstruction prediction process. Currently, state-of-the-art (SOTA) methods for compressive sensing reconstruction only utilize the spectral encoding information contained in individual pixels of the encoded sampling data, without leveraging the spatial information between pixels. This can lead to distorted or blurred reconstruction results. Adding additional spatial coordinate encoding information to each pixel helps transform the sampled scalar data into vector information, improving the separation effect between pixel points and enhancing the information representation capability between channels and neighboring pixels.
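A toy example makes the point concrete: two pixels with identical scalar measurements are indistinguishable until coordinate-derived features are appended. The feature rule used here is purely illustrative, not the coding strategy proposed in this article.

```python
import numpy as np

# Two pixels share the same scalar measurement...
p1 = np.array([0.42])
p2 = np.array([0.42])
print(np.array_equal(p1, p2))        # True: indistinguishable as scalars

# ...but appending spatial-coordinate features separates them.
def enhance(value, row, col, size=256):
    """Turn a scalar measurement into a vector by appending
    normalized pixel coordinates (illustrative rule)."""
    return np.array([value, row / size, col / size])

v1 = enhance(0.42, row=10, col=20)
v2 = enhance(0.42, row=200, col=30)
print(np.array_equal(v1, v2))        # False: the vectors are now separable
```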
The framework of the encoding vector enhanced reconstruction algorithm is shown in Fig. 2. First, linear and nonlinear functions are used to design the coding information calculation rules. The coding vector features are calculated based on the spatial coordinate encoding information of each pixel and its corresponding wavelength information. This conversion transforms the scalar measurement values of grayscale-measured images into vector information representations, enhancing the representation of subtle differences between neighboring pixels. As a result, it improves the separation effect of the prior model on pixel points and enhances the quality of reconstruction. Then, the grayscale-measured images and their encoding information are used as the feature set, while the prior spectral information corresponding to each pixel position is used as the label set. The prior model is trained using the RF algorithm. Finally, the trained prior model takes the measurement images and their corresponding encoding features as input data and outputs the spectral information for each pixel point.
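The first step of this framework, converting each scalar grayscale measurement into a vector of coding features, might be sketched as follows. The exact combination rule for the Sigmoid and linear parts is not reproduced here; the formula inside `encode_pixels` is an assumed stand-in that merely combines pixel index and wavelength in the spirit described below.

```python
import numpy as np

def sigmoid(t):
    return 1.0 / (1.0 + np.exp(-t))

def encode_pixels(gray, wavelengths):
    """Build a feature matrix: column 0 is the scalar measurement,
    remaining columns are per-wavelength coding features computed from
    pixel index and wavelength (illustrative rule, not the paper's
    exact formula)."""
    H, W = gray.shape
    n = H * W
    idx = np.arange(n)
    feats = [gray.reshape(n)]
    for lam in wavelengths:
        # map pixel index into the sigmoid's sensitive range [-5, 5]
        s = sigmoid(10.0 * idx / (n - 1) - 5.0)
        # linear part divides [0, 255] evenly across pixel positions
        lin = 255.0 * idx / (n - 1)
        feats.append(s * lam + lin)          # combine with wavelength info
    return np.stack(feats, axis=1)           # shape (H*W, 1 + C)

gray = np.random.default_rng(2).random((4, 4))
X = encode_pixels(gray, wavelengths=np.linspace(400, 700, 31))
print(X.shape)                               # (16, 32)
```

Even pixels with identical grayscale values now receive distinct feature vectors, which is the separation effect the method relies on.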
To increase the encoding differentiation between channels and neighboring pixels, we have designed a coding strategy that integrates wavelength information and pixel spatial coordinate information using a combination of linear and nonlinear functions, as shown in Fig. 3. The nonlinear component is composed of a Sigmoid function, which exhibits saturation properties when the input approaches extreme values, allowing for better compression and expansion of the feature representation. By analyzing the trend of the function, we calculate the value at pixel position i. When the independent variable ranges from −5 to 5, a small change in the input value leads to a significant change in the function's output. Therefore, within the range [−5, 5], the differential encoding features of adjacent pixel points become more prominent, facilitating a clear separation effect between pixel points. The linear function, on the other hand, divides the range [0, 255] equally and calculates the corresponding encoding feature value at pixel position i. The combined use of the nonlinear and linear functions for
integrating wavelength information and pixel coordinate information enhances the differentiation in pixel information encoding among the various channels, thereby improving the diversity and representational capacity of the image encoding while maintaining a clear separation effect between pixel points. Taking the calculation of the spatial information coding of a pixel point as an example, the coding feature b_i is computed from the pixel position and the wavelength value λ_ij, where λ_ij is the wavelength of the band in which the current pixel is located, i is the pixel position taking values in [1, W × H], j is the band index taking values in [1, C], and W × H is the total number of pixels in a band. The coded spectral feature data D is created by combining the grayscale measurement image acquired by the grayscale imaging system with the computed spatial coding matrix as features and the prior spectral information as labels. Here, L_label represents the characteristic attribute of the prior spectral data, and y_i = [y_(i,1), y_(i,2), ..., y_(i,31)] represents the prior spectral data corresponding to the ith pixel.

The prior model is composed of multiple regression trees. During training, random sampling with replacement is performed on the spatial coded spectral features, and random sampling without replacement is performed on the prior spectral data. Each draw takes i samples, and K draws generate K training sample subsets T = [(x_11, y_1, B_1), (x_22, y_2, B_2), ..., (x_ij, y_i, B_j)], from which K regression tree models are trained. For a single regression tree, each feature attribute in the sample subset and its corresponding coding information x_ij are traversed, and the square error (SE) of the two split subsets R_1 and R_2 divided by this spectral information is calculated. The optimal split feature of a node is selected by minimizing the square error, and each split subset continues to split until all feature divisions are completed. The square error of the spectral information is calculated as

SE = Σ_{x_ij ∈ R_1} (y_i − c_1)² + Σ_{x_ij ∈ R_2} (y_i − c_2)²

where c_1 and c_2 represent the mean values of the spectral information corresponding to the two split subspaces R_1 and R_2, respectively. When the square error of the finally divided spectral information is 0, all samples of the node carry the same reflected spectral information. At this point, the output of the single regression tree model is the spectral information corresponding to the finally divided samples. The decision process of the proposed prediction model can be represented by a decision function D_i(·), where i represents the pixel position coding with value range [1, W × H], and j represents the jth coding feature corresponding to the ith pixel position with value range [1, C]. When D_i(·) equals 0, the node is a leaf node; the nonzero part continues to split, or stops splitting when the current decision tree depth reaches the minimum value. I_i(·) is an indicator function: when D_i is 0 or reaches the minimum value, the spectral reflectance y_i^k corresponding to x_ij is output.
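The node-splitting rule described above is the standard CART criterion: choose the split that minimizes the summed square error of the two subsets around their means. A single-feature sketch:

```python
import numpy as np

def best_split(x, y):
    """Return the threshold on feature x minimizing
    SE = sum_{R1}(y - c1)^2 + sum_{R2}(y - c2)^2,
    where c1, c2 are the subset means (CART split criterion)."""
    best_t, best_se = None, np.inf
    for t in np.unique(x)[:-1]:              # candidate thresholds
        left, right = y[x <= t], y[x > t]
        se = ((left - left.mean()) ** 2).sum() + ((right - right.mean()) ** 2).sum()
        if se < best_se:
            best_t, best_se = t, se
    return best_t, best_se

# two clusters of spectral values: the split should fall between them
x = np.array([1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
y = np.array([0.1, 0.2, 0.1, 5.0, 5.1, 4.9])
t, se = best_split(x, y)
print(t)        # 3.0: the split cleanly separates the two clusters
```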
The aggregated output of the K regression trees is the final reconstruction result, and the training process of the reconstruction model is represented by rf(x), the compressed spectral reconstruction model enhanced by the coding feature vector. Here, ∧ is the logical AND operator, indicating that the model starts training when both the assignment operation and the error calculation are true (nonzero, nonempty); x = x_ij represents the assignment of x, and y_i^pre = y_i^org represents the consistency between the prediction result and the original spectral data, measured by the mean square error (MSE)

MSE = (1/N) Σ_{i=1}^{N} (y_i^pre − y_i^org)².

In summary, the inverse solving process for the 3-D spectral imaging data is expressed through rf(x), where ∧ indicates that the model starts predicting when x = x_ij is true.
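As a rough prototype of the prior model, an off-the-shelf random forest regressor can be fit on per-pixel feature vectors with the prior spectra as multi-output labels. The availability of scikit-learn and the random training data here are assumptions; the tree depth (29) and tree count (100) follow the settings reported later in the experiments.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(3)
n_pixels, n_feats, n_bands = 200, 8, 31

X = rng.random((n_pixels, n_feats))   # grayscale value + coding features (toy data)
Y = rng.random((n_pixels, n_bands))   # prior spectra as multi-output labels

# bootstrap ("with replacement") sampling is the random forest default
prior = RandomForestRegressor(n_estimators=100, max_depth=29, random_state=0)
prior.fit(X, Y)                       # one full spectrum predicted per pixel

pred = prior.predict(X[:5])
print(pred.shape)                     # (5, 31)
```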

B. Semantic Similarity of Predicted Results by Prior Model
In the reconstruction process of CSI systems, due to the ill-posed nature of the problem, stronger constraint conditions are needed to completely recover the original data. The regularization fidelity term can limit the feasible domain of x by utilizing the sparsity, low-rankness, or smoothness of the input image in a certain transform domain, which plays a crucial role in improving the smoothness and spectral correlation of the reconstruction results. To reduce the spatial discrepancy between the reconstructed result and the prior spectral image, this article utilizes the semantic similarity between the prior model prediction and the prior spectral image, transforming the compressed sensing reconstruction problem into a TV minimization problem on the difference between the prior model prediction and the reconstructed spectral data. The objective function for optimization can be expressed as

min_x (1/2)‖y − Φx‖₂² + τ{TV(x) − TV[x_rf(y_gray)]}

where Φ is the sensing matrix consisting of the coded aperture and dispersion elements, and τ is the regularization parameter. The total variation model measures the variation in the spatial structure of the image information, and the total variation difference between the reconstruction results and the prior spectral data should be minimized during the optimization process.
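The TV term used in the objective can be implemented as the sum of absolute first differences (anisotropic TV); a minimal sketch:

```python
import numpy as np

def tv(img):
    """Anisotropic total variation: sum of absolute vertical and
    horizontal first differences. Smaller TV means a smoother image."""
    dv = np.abs(np.diff(img, axis=0)).sum()
    dh = np.abs(np.diff(img, axis=1)).sum()
    return dv + dh

flat = np.ones((8, 8))
noisy = flat + np.random.default_rng(4).normal(0, 0.1, (8, 8))
print(tv(flat), tv(flat) < tv(noisy))    # 0.0 True: smoother image, lower TV
```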
To verify the semantic representation capability of the proposed method and the pixel-separation performance of the coded vector features, the proposed method is compared with the RF algorithm, a convolutional neural network (CNN), the linear regression (LR) algorithm, the polynomial regression (PR) algorithm, and RGB linear interpolation (RGB-L) and nearest neighbor interpolation (RGB-N) based methods, respectively, none of which add the coded features. Using CAVE [30] hyperspectral data as experimental data and peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) [31] as evaluation metrics, Table I shows the comparison of each evaluation metric for the prediction results at different wavelengths.
As can be seen from Table I, there is higher semantic similarity between the prediction results of the RF algorithm and the prior spectral images, and the addition of coded vector features helps to enhance the separation between pixel representations. The prediction results with the coded vector features show an increase in PSNR of 5.32 dB and an increase in SSIM of 3.5% compared to the prediction results without them, and the proposed method is significantly superior to the other traditional algorithms. The prediction results of the different prior models are compared visually by generating pseudo-color images at 460, 520, and 640 nm, respectively. As shown in Fig. 4, the pseudo-color images generated from the prior model prediction results of the proposed method are the most similar to the pseudo-color images of the prior spectral data, which reflects the superiority of the proposed spatially separated characterization model and shows higher spatial and spectral fidelity compared with the other algorithms.

C. ADMM-Based Prior Model Regularized Reconstruction Method (ADMM-PRM)
In this article, the objective function is solved by using the alternating direction method of multipliers. The smaller the TV value, the higher the smoothness of the image. In the reconstruction process, we want the reconstruction result to be smooth while retaining more edge detail information, so the difference between TV(x) and TV[x_rf(y_gray)] should be greater than 0. According to the triangle inequality, the upper bound of the TV difference can be expressed as

TV(x) − TV[x_rf(y_gray)] ≤ TV[x − x_rf(y_gray)].

By replacing the TV difference with its upper bound, the objective value of x will be closer to the ideal value of the prior spectral image. Therefore, the final objective function can be corrected as

min_{x,z} (1/2)‖y − Φx‖₂² + τ TV(z), s.t. z = x − x_rf(y_gray)

where z is the introduced auxiliary variable. The augmented Lagrangian form of the objective function can be represented as

L(x, z, u) = (1/2)‖y − Φx‖₂² + τ TV(z) + (µ/2)‖x − x_rf(y_gray) − z + u‖₂².

Since the beam-splitting element splits the incident light evenly into two beams that are imaged separately, the balance factor of the prior model regularization term takes the value 1/2 to increase the smoothness of the reconstruction results. The algorithm update process is then expressed as

x^(k+1) = argmin_x (1/2)‖y − Φx‖₂² + (µ/2)‖x − x_rf(y_gray) − z^k + u^k‖₂²
z^(k+1) = argmin_z τ TV(z) + (µ/2)‖x^(k+1) − x_rf(y_gray) − z + u^k‖₂²
u^(k+1) = u^k + x^(k+1) − x_rf(y_gray) − z^(k+1)
where k denotes the number of iterations of the solution algorithm and µ is the Lagrange factor.
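The x/z/u update structure above is standard ADMM. The sketch below illustrates that iteration pattern on a simplified problem in which an ℓ1 shrinkage prox stands in for the TV proximal step (which would need its own inner solver), so it demonstrates the update structure rather than reproducing this article's solver.

```python
import numpy as np

def admm(y, A, tau=0.01, mu=1.0, iters=300):
    """ADMM for min_x 0.5||y - Ax||^2 + tau * g(z) with z = x.
    g is the l1 norm here, so the z-update is soft-thresholding;
    a TV regularizer would replace it with a TV prox sub-solver."""
    n = A.shape[1]
    x, z, u = np.zeros(n), np.zeros(n), np.zeros(n)
    AtA, Aty = A.T @ A, A.T @ y
    for _ in range(iters):
        # x-update: ridge-type least-squares solve
        x = np.linalg.solve(AtA + mu * np.eye(n), Aty + mu * (z - u))
        # z-update: proximal step (soft-thresholding for l1)
        v = x + u
        z = np.sign(v) * np.maximum(np.abs(v) - tau / mu, 0.0)
        # multiplier (scaled dual) update
        u = u + x - z
    return x

rng = np.random.default_rng(5)
A = rng.random((30, 10))
x_true = np.zeros(10)
x_true[[2, 7]] = [1.0, -2.0]
y = A @ x_true                 # noiseless toy measurements
x_hat = admm(y, A)
print(np.round(x_hat, 2))
```

With a small τ and noiseless data, the iterates converge close to the true sparse signal.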

IV. EXPERIMENTS

A. Experimental Data Preparation
To validate the reconstruction performance of the proposed algorithm, we conduct experiments using the KAIST [32] and CAVE datasets and five RS datasets with different scales and numbers of channels, to demonstrate the advantages of the proposed method over other SOTA methods for CASSI reconstruction. For the convenience of experimental comparison, the size of the KAIST training and testing images is consistent with that in reference [13]. The CAVE dataset employs 31-channel real hyperspectral data for training and testing. RS data from Pavia University (PaviaU), Pavia Center (PaviaC), Kennedy Space Center (KSC), and Botswana, and ground-truth RS data from Salinas, are used for the simulation experiments. The RS data are cropped according to the size of the mask, and the specifications of the data are shown in Table II.
The corresponding color images of the experimental data are shown in Fig. 5, where the RS data KSC, Botswana, and Salinas are displayed as pseudo-colored images.The computer configuration used in this experiment is an AMD R7 4800H CPU 2.9 GHz, 16 GB RAM, and NVIDIA GTX2060 GPU.The evaluation of the reconstruction results is performed using metrics such as PSNR, SSIM, and spectral angle mapper (SAM) [33].
The experiment simulates the reconstruction process of a DD-CASSI system with a grayscale imaging system. The process of building the coded spectral feature database is shown in Fig. 6: the spatial coded spectral feature database of each pixel point is built by using the grayscale measurement image and the coded matrix of pixel position information at different wavelengths as features, and the prior spectral information as labels, where the spectral wavelength is denoted as R, the number of bands as C, and the image size of each band as W × H.
First, the prior spectral data are normalized and arranged into N 1-D column vectors according to wavelength value from low to high, and the spectral label data Y = [y_1, y_2, ..., y_i] are established, where y_i = [y_(i,1), y_(i,2), ..., y_(i,C)]. Then, the spectral images of each wavelength are multiplied and summed using the shifted random binary code [34]. The collected measurement image is used as the first band coding B_1, the wavelength and pixel position are coded as B_2 to B_(C+1), and the spatial coding feature matrix X = [B_1, B_2, ..., B_(C+1)] is established. Finally, the prior spectral data is mapped to the coding feature matrix as the label spectrum of the coding feature data, completing the coded spectral feature database.

B. Evaluation Metrics

The SSIM between the original image y_org and the reconstructed image y_pre is computed as

SSIM = (2µ_{y_org} µ_{y_pre} + C_1)(2σ_{y_org y_pre} + C_2) / [(µ²_{y_org} + µ²_{y_pre} + C_1)(σ²_{y_org} + σ²_{y_pre} + C_2)]

where σ_{y_org y_pre} represents the covariance of the original image and the reconstructed image. To avoid the denominator being 0, C_1 and C_2 are constants in the calculation process [35]; the default value of C_1 is 6.5025 and that of C_2 is 58.5225. The SAM evaluation index is determined by calculating the cosine of the angle between the reconstructed spectral vector and the original spectral vector; the smaller the calculated value, the higher the spectral similarity. The SAM index is defined as

SAM = arccos[ θᵀ θ_pre / (‖θ‖ ‖θ_pre‖) ]

where θ and θ_pre represent the original and reconstructed spectra of a pixel, respectively, and ‖θ‖ and ‖θ_pre‖ represent the vector moduli of the original and reconstructed spectra, respectively.

C. Simulated Experiments on KAIST
For the KAIST hyperspectral dataset, we chose five scenarios to validate the proposed method and acquired them with a DCCHI system. The dimensions of the KAIST hyperspectral data, the measured images, and the grayscale images are shown in Table II. In the process of model training, the depth of the regression trees plays a decisive role in the computation of the algorithm. To ensure that the algorithm obtains higher reconstruction quality with lower parametric computation, experiments determined that the reconstruction results reach the optimum among all compared algorithms when the depth of the regression trees of the prior model is 29 and the number of regression trees is 100.
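A prior model of this kind can be trained as an ensemble of regression trees. The sketch below uses scikit-learn's `RandomForestRegressor` as a plausible stand-in with the reported settings (100 trees, depth 29) on synthetic data; the paper's own training pipeline and model class are not shown, so treat the specific estimator as an assumption.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(1)

# Stand-in data: N coded feature vectors (measurement value plus
# C encoding features) with C-band spectra as labels.
N, C = 500, 8
X = rng.random((N, C + 1))
Y = rng.random((N, C))

# Ensemble of regression trees as the prior model; 100 trees of
# depth 29 are the settings reported for the KAIST experiments.
prior = RandomForestRegressor(n_estimators=100, max_depth=29,
                              random_state=0, n_jobs=-1)
prior.fit(X, Y)

# The prior model predicts a C-band spectrum per coded feature
# vector; these predictions seed the TV/ADMM reconstruction.
Y_pred = prior.predict(X[:10])
assert Y_pred.shape == (10, C)
```

Multi-output regression (one output per spectral band) is handled natively by the estimator, so a single model predicts the full spectrum per pixel.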
The proposed method is compared with ADMM-Net [36], DGSMP [37], PnP-DIP [38], GAP-Net [39], HD-Net [40], TSA-Net [41], λ-Net [42], MST-L [43], PFusion [44], DFLFME [45], PID-ADMM [46], and other algorithms. The PSNR, SSIM, and SAM of the reconstruction results are shown in Table III, where bold values indicate the optimal reconstruction results. The calculation results show that, compared with the SOTA methods, the proposed algorithm improves the PSNR on the KAIST scenes by 4.31 dB and the SSIM by 2.8%, while the SAM results remain competitive with the SOTA methods. Fig. 7 shows the visual comparison of the reconstruction results for KAIST Scene 01 at different wavelengths. From the reconstruction results at different wavelengths, it can be observed that the proposed algorithm not only exhibits good reconstruction performance in nonedge regions but also preserves the discontinuous characteristics of pixel values in short-wavelength images. The reconstruction results are closer to the spatial information of the original spectral image and effectively restore the contour changes of edge information in the spectral image.
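The three evaluation metrics used throughout these comparisons can be computed as below. This is an illustrative NumPy implementation of the standard definitions, using the constants C_1 = 6.5025 and C_2 = 58.5225 quoted earlier (which correspond to an 8-bit dynamic range); it is not the authors' evaluation code.

```python
import numpy as np

def psnr(ref, rec, peak=1.0):
    # Peak signal-to-noise ratio in dB.
    mse = np.mean((ref - rec) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def ssim(ref, rec, c1=6.5025, c2=58.5225):
    # Global (single-window) SSIM between two band images.
    mu_x, mu_y = ref.mean(), rec.mean()
    var_x, var_y = ref.var(), rec.var()
    cov = ((ref - mu_x) * (rec - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2))

def sam(theta, theta_pre):
    # Spectral angle (radians) between original and reconstructed
    # spectra; smaller means higher spectral similarity.
    cos = np.dot(theta, theta_pre) / (
        np.linalg.norm(theta) * np.linalg.norm(theta_pre))
    return np.arccos(np.clip(cos, -1.0, 1.0))

# Quick sanity check on a synthetic band image.
ref = np.linspace(0.0, 1.0, 64).reshape(8, 8)
rec = ref + 0.01
assert psnr(ref, rec) > 30.0
assert abs(ssim(ref, ref) - 1.0) < 1e-9
assert sam(ref.ravel(), ref.ravel()) < 1e-6
```

In practice PSNR and SSIM are averaged over bands, and SAM is averaged over pixels, which matches how the tables in this section report them.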
Fig. 8 illustrates the visual comparison of spectral consistency at two selected pixel locations, (100, 100) and (125, 160), in the reconstruction results. The comparison shows that the spectral curves reconstructed by the proposed method exhibit high consistency with the original spectral curves. Additionally, the proposed method effectively preserves the spectral correlation among different channels, confirming its superiority over the SOTA methods.
Authorized licensed use limited to the terms of the applicable license agreement with IEEE.Restrictions apply.

D. Simulated Experiments on CAVE
To validate the reconstruction performance of the proposed method at a spatial resolution of 512 × 512, we conducted comparative experiments on five scenes from the CAVE hyperspectral dataset. The dimensions of the CAVE hyperspectral data, measurement images, and grayscale images are provided in Table II. The reconstructed results were compared with four algorithms based on the CASSI imaging system, namely GAP-TV [47], DeSCI [48], PnP-DIP, and PnP-HSI [49], and five algorithms based on the DCCHI imaging system, including TwIST-TV [50], NSR [51], LRMA [52], DLTR [53], and PID-ADMM, as shown in Table IV. The comparison reveals that the proposed algorithm outperforms existing SOTA methods in reconstructing hyperspectral data at a resolution of 512 × 512, with a 4.57 dB increase in PSNR, a 0.1% improvement in SSIM, and a 0.029 decrease in SAM. Fig. 9 provides a visual comparison of the reconstruction results for Scenes 01 and 03 among different algorithms, demonstrating that the proposed method achieves higher reconstruction quality for images in different spectral bands and preserves more texture details, thereby proving its superiority.
Comparing the spectral data of the sampling points (200, 320) and (20, 250) from the reconstructed results of Scene 03, as shown in Fig. 10, it can be observed that the spectral reconstruction results of the proposed method are closest to the prior spectral data. This further confirms the advantage of the proposed method in terms of long-range spectral correlation.

E. Experiments on the Real Dataset
To further validate the reconstruction performance of the proposed method, we conducted tests on real scene measurement data [54] with a spatial resolution of 256 × 256. During model training, we followed the same experimental settings as in [37], [41], and [43] and trained the proposed method's prior model as well as the SOTA methods on the 28-channel KAIST and CAVE hyperspectral datasets. The trained prior model was then used to reconstruct hyperspectral data in the 450-650 nm range, and the reconstruction results were visualized and compared, as shown in Fig. 11. The reconstruction results of the proposed method exhibit clearer texture information, indicating better reconstruction quality. The real-scene hyperspectral data were collected within a wavelength range of 400-700 nm. We selected the spectral data of 15 channels within the range of 448.25-651.74 nm as reference data and qualitatively compared them with the 28-channel spectral reconstruction results, as shown in Fig. 12. The amplitude and variation trend of the reflectance in the reconstruction results of the proposed method are closer to those of the real spectral data, demonstrating the superiority of the proposed method.

F. Simulated Experiments on RS
To verify the reconstruction performance of the proposed method on RS hyperspectral data, this article conducted experiments using five types of RS data. The dimensions of the DD-CASSI measurement image and the grayscale measurement image are shown in Table II.
Table V shows the comparison of reconstruction results for different scale RS data. From the table, it can be observed that the proposed method achieves good reconstruction results for the different sets of RS images, with high PSNR and SSIM as well as good spectral consistency. Compared to the PID-ADMM reconstruction method, the proposed method improves PSNR by 10.2 dB, increases SSIM by 13.6%, and reduces SAM by 11.2%. This further validates the effectiveness of the proposed encoded vector-enhanced reconstruction strategy on multiscale, multichannel RS data. Taking Botswana as an example, pseudo-color images were generated for bands 55, 80, and 90 and compared visually with the prior RS data. The spectral responses of the RS pixels at positions (150, 10) and (290, 15) were also compared, as shown in Fig. 13. Through comparison, it can be observed that the pseudo-color images generated by the proposed method are closest to the prior RS data, preserving more detailed texture information. From the comparison of the spectral reflectance curves of the sampling points, it can be seen that the proposed method achieves better reconstruction quality.

TABLE V COMPARISON OF RECONSTRUCTION RESULTS FOR DIFFERENT SCALE RS DATA

G. Ablation Study
1) Separation Representation Characteristics: To verify the impact of the proposed encoding vector feature-based pixel separation representation method on the reconstruction results, five groups of RS data without added encoding vector features were used to train the prior model under the same algorithm parameters. As shown in Table VI, adding the encoding vector features improved PSNR by 7 dB, increased SSIM by 5.8%, and decreased SAM by 8%. The experiments indicate that adding encoding vector features enhances the separation representation of pixel information and yields superior algorithm performance on multiband RS data, thereby validating the contribution of the encoding vector information to enhancing the separation representation characteristics between pixels.
2) Parameter Settings: To analyze the influence of the number of regression trees and the maximum depth parameter on the algorithm's reconstruction performance, we set the number of regression trees to 100 and conducted experiments on the maximum depth parameter. For testing, we used the Scene 01 data and performed separate tests for each experiment. Fig. 14 shows the training time of the prior model and the variation of PSNR in the prediction results: (a) the PSNR variation of the prediction results for different numbers of regression trees; (b) the PSNR variation for different depths; (c) the training time of the model for different numbers of regression trees; and (d) the training time for different depths. Comparing different numbers of regression trees, we observe that as the number of trees increases, the reconstruction effect tends to stabilize, meaning that adding more regression trees may not significantly improve the prediction quality. At the same time, the training time is linearly correlated with the number of regression trees: as the number of trees increases, the training time increases accordingly. Additionally, both the prediction quality and the training time of the prior model tend to stabilize as the depth of the regression trees increases. When the number of regression trees is 100 and the depth is 29, a good prediction result can be achieved within a relatively short training time. This indicates that selecting an appropriate number of regression trees and depth can reduce training time while ensuring reconstruction quality.
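The parameter sweep described above can be sketched as follows. The estimator, the synthetic stand-in data, and the swept tree counts are illustrative assumptions; the point is only the measurement protocol of Fig. 14: fit the prior model at each setting, and record training time and prediction error.

```python
import time
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(2)
X = rng.random((400, 9))  # stand-in coded feature vectors
Y = rng.random((400, 8))  # stand-in spectral labels

results = []
for n_trees in (10, 50, 100):
    t0 = time.perf_counter()
    model = RandomForestRegressor(n_estimators=n_trees, max_depth=29,
                                  random_state=0)
    model.fit(X, Y)
    elapsed = time.perf_counter() - t0
    # Training-set MSE as a proxy for prediction quality.
    mse = float(np.mean((model.predict(X) - Y) ** 2))
    results.append((n_trees, elapsed, mse))

# Expected trend (matching Fig. 14): training time grows roughly
# linearly with the number of trees, while error quickly saturates.
```

An analogous loop over `max_depth` with `n_estimators` fixed at 100 reproduces the depth sweep in panels (b) and (d).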

A. Generalization Performance
To evaluate the generalization performance of the proposed algorithm, we conducted experiments using the PaviaU RS data. The data has a spatial resolution of 610 × 340; a 256 × 256 image patch was extracted as the test data, while the remaining portion was used for training. Fig. 15 shows the pseudo-color images generated from bands 45, 55, and 65 of the prior RS data, along with the corresponding pseudo-color images of the reconstruction results on those bands. Additionally, the reconstruction results for different bands are displayed, and the spectral reflectance at pixel positions (150, 160) and (80, 180) is compared. According to the test results, the proposed method achieved a PSNR of 31.33 dB, an SSIM of 0.903, and a SAM of 0.239 when reconstructing the unknown scene. This further confirms the good generalization performance of the proposed method on RS data of unknown scenes and the satisfactory reconstruction results it obtains.
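The held-out patch split described above can be sketched with simple array slicing. The patch corner and band count below are illustrative assumptions (the paper does not state where the 256 × 256 test patch was taken from).

```python
import numpy as np

# Hypothetical PaviaU-like cube: 610 x 340 spatial, B bands.
B = 16
cube = np.random.default_rng(3).random((610, 340, B))

# Hold out a 256 x 256 spatial patch as the unseen test scene;
# the top-left corner (r0, c0) is an illustrative choice.
r0, c0 = 100, 40
test_patch = cube[r0:r0 + 256, c0:c0 + 256, :]

# Training pixels are everything outside the held-out patch.
mask = np.ones(cube.shape[:2], dtype=bool)
mask[r0:r0 + 256, c0:c0 + 256] = False
train_pixels = cube[mask]  # (610*340 - 256*256, B)

assert test_patch.shape == (256, 256, B)
assert train_pixels.shape[0] == 610 * 340 - 256 * 256
```

Because the prior model operates per pixel on coded feature vectors, the spatially disjoint split suffices to keep the test scene unseen during training.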

B. Computational Complexity and Reconstruction Time
The experimental data used in this study are consistent with the KAIST hyperspectral data used in [42]. The proposed method was compared with SOTA methods in terms of reconstruction quality and computational parameters, and the time consumed for model training and reconstruction was recorded to evaluate the efficiency of the proposed algorithm, as shown in Table VII. Through comparison, it can be observed that the proposed method has a lower number of parameters. Because the ADMM optimization involves multiple iterations during the solving process, the reconstruction time increases; improved hardware resources and parallel processing can help reduce the solution time of the algorithm. The proposed method consumes 54.73 s in the reconstruction process, which is much lower than GAP-Net and ADMM-Net. Compared to the well-performing MST-L algorithm, the proposed algorithm has a lower computational parameter count and achieves higher reconstruction quality.
The reconstructed results show an improvement of 5.58 dB in PSNR and 4% in SSIM, further demonstrating the superior algorithmic performance of the proposed method.

C. Applicability
To validate the applicability of the proposed method to different imaging systems, experiments comparing single-disperser CASSI (SD-CASSI) and DD-CASSI were conducted using PaviaU RS data. The reconstruction results are shown in Table VIII. The proposed method is applicable to SD-CASSI and achieves superior reconstruction quality compared to the SOTA methods: for SD-CASSI, PSNR is improved by 4.58 dB, SSIM by 7.6%, and SAM is reduced by 1.5%. This article leverages the advantage of DD-CASSI in eliminating aberrations introduced by the dispersing element, resulting in higher optical imaging quality than SD-CASSI and thereby improving the reconstruction performance of the algorithm model for hyperspectral data. Compared to the reconstruction results of the SD-CASSI system, the proposed method improves PSNR by 1.47 dB and SSIM by 1.4%, and reduces SAM by 1.3%. These results confirm the applicability of the proposed method to different imaging systems.

VI. CONCLUSION
This article proposes a compressed spectral reconstruction method based on enhanced encoding feature vectors. The experiments demonstrate that transforming scalar features of grayscale measurement images into vector features effectively enhances the pixel separation between neighboring pixels and channels, thereby improving the semantic similarity between the predicted results of the prior model and the prior spectral image. The PSNR of the predicted results of the prior model increases by 5.32 dB, and the SSIM increases by 3.5%. A regularization term is established using the TV difference between the predicted results of the prior model and the reconstructed results in the compressed sensing solution process; combined with the ADMMs, precise pixel information reconstruction can be achieved. The reconstruction results on various hyperspectral datasets indicate that the proposed method improves the PSNR of the KAIST hyperspectral data by 4.31 dB and the SSIM by 2.8%, with SAM results competitive with SOTA methods. For the CAVE hyperspectral data, the PSNR improves by 4.57 dB, the SSIM improves by 0.1%, and the SAM decreases by 2.9%. For hyperspectral RS data, the PSNR improves by 10.83 dB, the SSIM improves by 14.3%, and the SAM decreases by 16.3%. The proposed method exhibits good generalization performance across different scenes and achieves higher reconstruction quality with lower computational parameters than SOTA methods.

Manuscript received 23 October 2023; revised 8 December 2023; accepted 21 December 2023. Date of publication 25 December 2023; date of current version 9 January 2024. This work was supported in part by the National Natural Science Foundation of China under Grant 62275211 and Grant 61675161, and in part by the Innovation Capability Support Program of Shaanxi under Grant 2021TD-08. (Corresponding author: Jie Li.)

Fig. 1. Structure composition of the DCCHI system and the data structure of DD-CASSI detector sampling.

Fig. 4. Comparison of pseudo-color images of different prior model predictions.

The SSIM evaluation index is defined as

SSIM(y_i^org, y_i^pre) = [(2 µ_(y_i^org) µ_(y_i^pre) + C_1)(2 δ_(y_i^org y_i^pre) + C_2)] / [(µ_(y_i^org)^2 + µ_(y_i^pre)^2 + C_1)(δ_(y_i^org)^2 + δ_(y_i^pre)^2 + C_2)]

where µ_(y_i^org) and µ_(y_i^pre) represent the mean of the original spectral image and the reconstructed image of the ith band, respectively, and δ_(y_i^org y_i^pre) represents their covariance.

Fig. 7. Visual comparison of reconstruction results of KAIST Scene 01 from different methods.
RS data: PaviaU, PaviaC, KSC, Botswana, and Salinas. PaviaU and PaviaC are RS data from the University of Pavia and Pavia Centre, acquired by the ROSIS sensor. The spatial resolutions of PaviaU and PaviaC are 610 × 340 and 1096 × 1096, respectively. After removing low signal-to-noise ratio bands, the effective band numbers are 103 and 102, respectively. The KSC data is RS data from the Kennedy Space Center, acquired by the NASA AVIRIS instrument. It has a spatial resolution of 512 × 614 and an effective band number of 176. Botswana is RS data from the Okavango Delta, acquired by the NASA EO-1 satellite's Hyperion sensor. It has a spatial resolution of 512 × 217 and an effective band number of 204. To match the size of the mask, the RS data were cropped accordingly. The dual-camera CSI system was used to detect each scene separately.

Fig. 9. Comparison of reconstruction results visualized by different methods for CAVE scenes. (a) Comparison of Scene 03 hyperspectral reconstruction results. (b) Comparison of Scene 05 hyperspectral reconstruction results.

Fig. 11. Comparison of reconstructed results of the real scene dataset.

Fig. 13. Visualization comparison of the reconstruction results for KSC hyperspectral RS data.

Fig. 14. Changes in PSNR and training time of the algorithm for different numbers of regression trees and maximum depth conditions. (a) PSNR variation of prediction results for different numbers of regression trees. (b) PSNR variation of prediction results for different depths. (c) Training time of the model for different numbers of regression trees. (d) Training time of the model for different depths.

Fig. 15. Reconstruction results of the proposed method on PaviaU hyperspectral RS data.

TABLE I COMPARISON OF SEMANTIC SIMILARITY IN DIFFERENT PRIOR MODELS

TABLE II SPECIFICATIONS OF THE DATASET USED IN THE EXPERIMENT

TABLE III COMPARISON OF RECONSTRUCTION RESULTS FROM DIFFERENT METHODS IN KAIST SCENES

TABLE IV COMPARISON OF RECONSTRUCTION RESULTS FROM DIFFERENT METHODS IN THE CAVE SCENE

TABLE VI COMPARISON OF RECONSTRUCTION RESULTS BETWEEN ADDING AND NOT ADDING ENCODING VECTOR FEATURES IN RS DATA