Exploring the Potential of SMAP Soil Moisture for Improving Real-time Streamflow Prediction in the U.S. Corn Belt

L-band microwave satellite missions provide soil moisture information potentially useful for streamflow and hence flood predictions. However, these observations are also sensitive to the presence of vegetation that makes satellite soil moisture estimations prone to errors. In this study, the authors evaluate satellite soil moisture estimations from SMAP (Soil Moisture Active Passive) and SMOS (Soil Moisture Ocean Salinity), and two distributed hydrologic models with measurements from in situ sensors in the Corn Belt state of Iowa, a region dominated by annual row crops of corn and soybean. First, the authors compare model and satellite soil moisture products across Iowa using in situ data for more than 30 stations. Then, they compare satellite soil moisture products with state-wide model-based fields to identify regions of low and high agreement. Finally, the authors analyze and explain the resulting spatial patterns with MODIS (Moderate Resolution Imaging Spectroradiometer) vegetation indices and SMAP vegetation optical depth. The results indicate that satellite soil moisture estimations are drier than those provided by the hydrologic model and the spatial bias depends on the intensity of row-crop agriculture. The work highlights the importance of developing a revised SMAP algorithm for regions of intensive row-crop agriculture to increase SMAP utility in the real-time streamflow predictions.


I. INTRODUCTION
In this study, we address the temporal and spatial variability of uncertainties of satellite-based soil moisture maps. Our motivation is to explore the potential of soil moisture estimates for improving real-time streamflow prediction, and thus flood prediction, over a large domain of an agricultural region. Soil moisture controls the partitioning of rainfall into runoff and infiltration, and thus accurate knowledge of its state in time and space seems crucial for skillful streamflow prediction.
The Iowa Flood Center [1] operates a high-resolution rainfall-runoff model over the state of Iowa. The model partitions the landscape into hillslopes and channel links that form a river drainage network [2]. The model keeps track of the water content in the top soil column at the hillslopes, which is the basis for runoff generation and delivery to the drainage network. Satellite-based soil moisture estimates present an opportunity to "correct" the model-based soil moisture states in space and time. To fully realize this opportunity, we need to understand the uncertainties in the satellite-based estimates and how they vary in space and over time. The temporal variability is especially acute in Iowa, where there is a pronounced cycle of vegetation growth (row crops of corn and soybean).
We have explored some aspects of this uncertainty/variability in several previous studies over Iowa. For example, we demonstrated that there is information in the satellite soil moisture about runoff generation [3], spatial variability [4], and the potential for useful correction of streamflow predictions [5]. We have also demonstrated that there is seasonal bias in satellite-based soil moisture retrieval [6]. Here we build on those earlier studies.
Satellite soil moisture estimations, model soil moisture predictions, and in situ sensor measurements constitute three different sources of soil moisture information. Unlike in situ sensor observations, satellite and model estimates have a larger spatial extent that is more suitable for understanding soil moisture dynamics and its spatial variability (e.g., [4], [24]). Babaeian et al. [25] provide a review of different soil moisture sensing techniques and their applications. Previous studies have evaluated satellite soil moisture estimations with in situ sensor measurements over the globe (e.g., [15], [26]-[30]). Some studies have compared modeled predictions with satellite soil moisture (e.g., [31]-[33]). More recently, Beck et al. [34] conducted an evaluation of 18 satellite and model soil moisture products with in situ sensors over the United States (U.S.) and Europe.
Spaceborne L-band microwave radiometers observe brightness temperatures at vertical and horizontal polarizations near 1.4 GHz, which is sensitive to water content in soil and vegetation. The brightness temperature is used with other land surface variables to retrieve soil moisture (e.g., [35]). However, in dense vegetation conditions, where vegetation water column density is high, brightness temperature sensitivity to soil moisture decreases. Still, the sensitivity is significant up to and including levels of vegetation as high as a corn crop (e.g., [36]). Walker et al. [37] conducted a five-year evaluation of SMOS satellite soil moisture in the U.S. Corn Belt, a region of intensive corn and soybean row-crop agriculture in the Midwest U.S. They discussed the potential sources of dry bias in SMOS Level 2 retrievals, such as auxiliary modeled temperature and soil surface roughness.
A few studies have characterized time-variant errors of satellite soil moisture. For example, [6] conducted a seasonal evaluation of SMAP satellite soil moisture in the Corn Belt. Zwieback et al. [38] investigated the time-variable biases introduced by vegetation misrepresentations in SMAP soil moisture retrievals. They used a Bayesian extension of triple collocation analysis (TCA) to study errors in soil moisture retrievals over croplands in the U.S. More recently, [39] used TCA to study time-variant errors of SMAP and ASCAT (Advanced Scatterometer) soil moisture products globally. TCA was introduced by [40] for estimating uncertainties in wind data. It has been one of the common methods for estimating error variances of the different soil moisture products, some of which we have discussed in this introduction. However, [41] showed that major TCA assumptions do not generally hold for surface soil moisture products.
Overall, previous evaluations of satellite soil moisture estimates have provided useful insights on errors and their potential sources over the globe. However, only a few studies focus on dominantly agricultural regions where satellite soil moisture products are more prone to errors. For example, [37] and [6] showed that satellite soil moisture estimations are drier than in situ observations in the U.S. Corn Belt. However, because in situ sensors are sparse in space, it is challenging to investigate satellite estimations and relevant errors over a large domain. Alternatively, hydrologic model predictions driven with radar-based rainfall could be used to gain better insights on potential errors of satellite soil moisture over a larger spatial extent.
In this study, we evaluate two satellite soil moisture products with two model predictions and in situ sensor measurements over the state of Iowa in the U.S. Corn Belt for five years (2015-2019). First, we compare soil moisture predictions from the two spatially-distributed hydrologic models to determine whether the model better suited for comparison with satellite soil moisture estimations also has the best agreement with in situ soil moisture observations at the soil surface. Second, we compare satellite and model soil moisture to identify regions of Iowa with strong or weak agreement. Finally, we compare vegetation information from a multi-spectral satellite with vegetation information from SMAP satellite soil moisture.
In the next section, we describe the study region and data. Then, we provide details of the state-wide hydrologic models and evaluation metrics used in this study. Afterward, we present our findings from data analysis with interpretations followed by a discussion of results and their relevance to previous studies. Finally, we summarize our findings and conclude with their implications for the real-time streamflow predictions.

II. STUDY REGION & DATA
The study domain mainly covers the state of Iowa in the U.S. Corn Belt (Figure 1). More than 70% of Iowa's surface area is covered by cropland (mainly row crops of corn and soybean) [42]. Based on historical data, Iowa climate is characterized as warm and humid (e.g., [43]) with an annual average precipitation of 870 mm (1981-2010). Annual maximum daily rainfall occurs in June and July [44].
We conduct our study for five years from 2015 to 2019. We use data for each year from April 1 to October 31 to exclude periods of frozen soil or snow-covered surface conditions. The data products used in this study are described as follows.
We use a Stage IV radar-based gauge-corrected product for rainfall forcing [45] posted on a grid with approximately 4 km resolution [46]. For evapotranspiration (ET) forcing, we use the climatologic average from the North American Land Data Assimilation System (NLDAS) [22].
We use in situ sensor soil moisture at 5 cm depth as the reference soil moisture observations. Figure 1 shows a map of the in situ soil moisture sensors, and number of available sensors and data availability percentage for each year. USDA Agricultural Research Service (ARS) and Iowa Flood Center (IFC) in situ sensors are shown in green and blue, respectively. IFC soil moisture sensors are denser in Turkey River basin located in northeast Iowa. ARS sensors are only available at the South Fork watershed located in the north central part of the state. This watershed is one of the core validation sites for the SMAP satellite mission.
We use soil moisture and vegetation optical depth (VOD) from SMAP and SMOS satellites. VOD is a parameter that indicates the degree of microwave radiation extinction (attenuation and scattering) due to presence of vegetation canopy. We use the Enhanced SMAP Level 3 Version 3 (L3 P E) soil moisture provided on EASE-Grid version 2 [47] with a 9-km grid spacing and 33 km resolution [48]. We evaluate the Single Channel Algorithm soil moisture product [49] retrieved using vertically-polarized brightness temperature (SCA-VPOL).
SMOS soil moisture [12] is posted on the ISEA (Icosahedral Snyder Equal Area) grid [50] with an approximate resolution of 43 km in space. We use SMOS Level 2 Soil Moisture Output User Data Product (MIR SMUDP).
SMAP and SMOS overpass intervals range from 12 to 36 hr for a given point, with overpasses at approximately 6 AM and 6 PM. Both satellites carry L-band microwave radiometers; the SMAP sensor is a real aperture radiometer, while SMOS uses a synthetic aperture radiometer. Also, SMAP brightness temperature observations have less RFI contamination than SMOS because of better RFI filtering hardware and software [51].
We use POLARIS soil hydraulic properties provided on a grid with 30 m spatial resolution [52]. POLARIS is a set of probabilistic soil property maps for the contiguous U.S. We use the median estimates of the soil properties from this database. The soil parameters used in this study consist of residual soil moisture (θ_res), porosity (θ_sat, or soil moisture at saturation), hydraulic conductivity at saturation (K_sat), matric suction head at saturation (ψ_sat), and the Brooks-Corey water retention curve fitting parameter (λ).
We obtain FPAR (fraction of absorbed photosynthetically active radiation) and LAI (leaf area index) from the MODIS satellite. The data are provided with a four-day interval and a 500-m spatial resolution [53]. Details of the MODIS dataset are provided in the dataset documentation [54]. We upscaled FPAR and LAI estimations from MODIS to the Enhanced SMAP satellite's grid scale (∼9 km).

III. METHODS
This section provides details of the two hydrologic models followed by definitions of the evaluation metrics used.

A. Hillslope-Link Model (HLM)
The HLM model was developed at the Iowa Flood Center and is used as the state-wide operational hydrologic flood forecasting model [1]. The model decomposes the landscape into hillslopes and links that describe water movement at the hillslope and river channels, respectively [55]. Previous studies (e.g., [56], [57]) have demonstrated successful applications of HLM. The state-wide hydrologic domain consists of more than 400,000 hillslopes with a median area of 0.3 km². Figure 2 shows a schematic diagram of the HLM structure at each hillslope. Hillslope processes are driven by two hydrometeorologic forcings, rainfall (P) and evapotranspiration (ET). Three layers represent the ponded surface (S_p), top-layer soil storage (S_t), and groundwater storage (S_s). During rainfall events, the ponding layer (S_p) receives water and exchanges water with the top layer and channel link through infiltration (q_pt) and overland flow (q_pl), respectively. Evapotranspiration flux is extracted from the soil layers and the ponded surface. Groundwater discharge contribution to the channel is provided by the groundwater discharge flux (q_sl). Quintero et al. [55] provide more details of the model formulation, and the documentation can be found at https://asynch.readthedocs.io/.
The HLM model provides predictions of top layer soil storage in units of meters for the top 20 cm of the soil.
For comparison with other soil moisture products, we convert HLM's storage values to the volumetric water content (cm³·cm⁻³) using the corresponding residual soil moisture and porosity.
The current structure of the HLM model does not have information on soil hydraulic properties. In the next section, we present an implementation of the Richards equation in the HLM model's top layer to account for physically-based soil parameters and to better match the sampling depth of SMAP observations.

B. Richards Equation implementation in HLM (HLMr)
In order to better match HLM with satellite soil moisture, and potentially improve its performance, we created a revised version of HLM that we call HLMr. In HLMr we substituted HLM's top-layer formulation (S_t) with a solution of the Richards equation that describes soil water movement in a vertical soil column [58]

$$\frac{\partial \theta}{\partial t} = -\frac{\partial q}{\partial z} \qquad (1)$$

where soil moisture (θ) varies with depth (z) and time (t). Here, q is the flux per unit area crossing a horizontal surface. This flux (q) is given by the Darcy-Buckingham law (Eq. 2) in terms of K(θ), the soil hydraulic conductivity, and ψ(θ), the pressure head, which can be calculated from the van Genuchten-Mualem [59], [60] or Brooks-Corey [61] soil-water retention curves. In this study, we use the Brooks-Corey model to estimate the unsaturated hydraulic conductivity (Eq. 3) and the matric suction head (Eq. 4), where λ is the Brooks-Corey fit parameter [61], K_sat = K(θ_sat), and ψ_sat = ψ(θ_sat). Normalized water content or effective saturation (φ) is defined as

$$\varphi = \frac{\theta - \theta_{res}}{\theta_{sat} - \theta_{res}} \qquad (5)$$

where θ_sat is porosity and θ_res is the residual soil moisture. These two variables represent the upper and lower bounds for the soil moisture dynamic range, respectively.
Figure 3 shows the soil layer discretization and soil moisture sensors at each soil moisture observation site. The flux between two layers (Eq. 6) is defined in terms of the layer index i, the layer thickness L_i of a given layer, and K̂, the average soil hydraulic conductivity of the two layers (Eq. 7).
Assuming an initial zero ponding (s_p = 0), if the rainfall rate is less than the infiltration rate from the 1-cm saturated layer, then the rainfall will only infiltrate, and the rate of change of the top-layer soil moisture is given by Eq. (8). If the rainfall rate (q_rain) is larger than the infiltration rate (q_inf) for the 1-cm saturated top soil layer, then the rate of change of the ponding depth is given by Eq. (9), where q_pl is the rate of surface runoff to the hillslope channel. After ponding (s_p > 0), the rate of soil moisture change will also depend on the surface ponding depth given in Eq. (6).
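As an illustration of the Brooks-Corey relations referred to above, the following minimal sketch evaluates effective saturation, unsaturated hydraulic conductivity, and matric suction head. It assumes the commonly used Brooks-Corey forms K = K_sat φ^(3 + 2/λ) and ψ = ψ_sat φ^(-1/λ); the exponents, the function names, and the parameter values are assumptions chosen for illustration and are not taken from the HLMr source code.

```python
import math


def effective_saturation(theta, theta_res, theta_sat):
    """Normalized water content, clipped to the physical range [0, 1]."""
    phi = (theta - theta_res) / (theta_sat - theta_res)
    return min(max(phi, 0.0), 1.0)


def brooks_corey_conductivity(theta, theta_res, theta_sat, k_sat, lam):
    """Unsaturated conductivity, assuming the exponent 3 + 2/lambda."""
    phi = effective_saturation(theta, theta_res, theta_sat)
    return k_sat * phi ** (3.0 + 2.0 / lam)


def brooks_corey_suction(theta, theta_res, theta_sat, psi_sat, lam):
    """Matric suction head, assuming psi = psi_sat * phi**(-1/lambda)."""
    phi = effective_saturation(theta, theta_res, theta_sat)
    if phi == 0.0:
        return math.inf  # suction grows without bound at residual moisture
    return psi_sat * phi ** (-1.0 / lam)


# Hypothetical soil parameters, for illustration only.
params = dict(theta_res=0.05, theta_sat=0.45)
k_half = brooks_corey_conductivity(0.25, k_sat=1.0e-5, lam=0.5, **params)
psi_half = brooks_corey_suction(0.25, psi_sat=0.2, lam=0.5, **params)
print(k_half, psi_half)
```

Both functions recover K_sat and ψ_sat at saturation (φ = 1), which matches the definitions K_sat = K(θ_sat) and ψ_sat = ψ(θ_sat) given in the text.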
ET is a combination of soil evaporation and transpiration from vegetation and is either energy or moisture limited. We incorporate these two limits in the ET flux formulation as follows. The total ET (ET_tot) is extracted recursively starting from the top soil layer in the column, with the flux from each layer scaled by a coefficient C_i that depends on the soil moisture availability limit S_lim. Equations (10)-(14) summarize the ET formulation in the proposed model. Figure 4 shows an example of the three-stage ET coefficient (C_i) for a given soil layer and residual soil moisture. As shown in this figure, the ET flux from a layer depends on its available soil moisture. During the energy-limited stage (θ greater than about 0.2 in this example), the ET flux from a soil layer is highest. In the transition stage, the matric suction increases until there is limited moisture available for ET. At the soil-moisture-limited stage, the actual ET flux from the soil layer is lowest.
We up-scale POLARIS soil properties to the hillslope scale by taking the median value of the available 30-meter pixels within each hillslope. Figure 5 shows an example up-scaled soil property map from the POLARIS database [52] for residual soil moisture (θ_res) over the model domain. We also show the estimated probability density function (in percentage) using a Gaussian kernel to provide insight on the modality of the distribution in space. As shown in Figure 5, the POLARIS soil properties dataset captures the geologic features.
For comparisons with satellite estimations and in situ sensor averages, we up-scale the model soil moisture to the SMAP grid by calculating areal average soil moisture for the hillslopes corresponding to SMAP pixels.
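The up-scaling step described above can be sketched as an area-weighted average over the hillslopes that fall within each SMAP pixel. The function name, the input layout, and the rule that each hillslope belongs to exactly one pixel are assumptions made for this illustration; the text does not specify how hillslopes are assigned to pixels or whether the average is weighted by area.

```python
from collections import defaultdict


def upscale_to_pixels(hillslopes):
    """Area-weighted mean soil moisture per pixel.

    hillslopes: iterable of (pixel_id, area_km2, theta) tuples, where
    pixel_id is the (hypothetical) SMAP grid cell assigned to a hillslope.
    Returns a dict mapping pixel_id to the areal average soil moisture.
    """
    weighted_sum = defaultdict(float)
    total_area = defaultdict(float)
    for pixel_id, area_km2, theta in hillslopes:
        weighted_sum[pixel_id] += area_km2 * theta
        total_area[pixel_id] += area_km2
    return {p: weighted_sum[p] / total_area[p] for p in total_area}


# Hypothetical hillslopes: two in pixel "a", one in pixel "b".
example = [("a", 0.2, 0.30), ("a", 0.6, 0.20), ("b", 0.3, 0.35)]
print(upscale_to_pixels(example))
```

Weighting by area keeps large hillslopes from being under-represented relative to small ones; a simple unweighted mean would be the alternative if hillslope areas were nearly uniform.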

C. Evaluation metrics
We evaluate and compare soil moisture data products using different evaluation metrics described as follows.
Let o and s denote the reference and simulation data vectors of length n, and let µ and σ be the mean and standard deviation. The Pearson correlation coefficient, r, is defined as

$$r = \frac{\frac{1}{n}\sum_{i=1}^{n} (o_i - \mu_o)(s_i - \mu_s)}{\sigma_o \, \sigma_s} \qquad (15)$$

Root mean squared error (RMSE) is

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (s_i - o_i)^2} \qquad (16)$$

Relative RMSE is defined in Eq. (17). Mean absolute error (MAE) is

$$\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n} \left| s_i - o_i \right| \qquad (18)$$

The Kling-Gupta efficiency or KGE [62] is defined as

$$\mathrm{KGE} = 1 - \sqrt{(r - 1)^2 + \left(\frac{\sigma_s}{\sigma_o} - 1\right)^2 + \left(\frac{\mu_s}{\mu_o} - 1\right)^2} \qquad (19)$$

where r is the Pearson correlation coefficient of reference and simulated soil moisture given in Eq. (15). Finally, bias is defined following [63] (Eq. 20).

IV. RESULTS
In this section, we first present the evaluations with reference in situ soil moisture measurements over the state of Iowa. This analysis establishes benchmarks for satellite and model soil moisture products. In other words, we aim to identify the soil moisture product that has the best agreement with in situ soil moisture measurements. Then, we compare satellite estimates with soil moisture time-series from the best candidate model. Finally, we provide additional analysis of the vegetation and its signature on the soil moisture evaluations over our study domain.
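For readers who want to reproduce the benchmark statistics used in this section, the metrics of Section III-C could be computed as in the following sketch. The function names are hypothetical, population (1/n) statistics are assumed, and the bias is taken here as the mean of simulation minus reference, which is an assumption because the definition following [63] is not reproduced in this text. Relative RMSE is omitted for the same reason.

```python
import math


def _mean(x):
    return sum(x) / len(x)


def _std(x):
    """Population standard deviation (1/n normalization)."""
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def pearson_r(o, s):
    mo, ms = _mean(o), _mean(s)
    cov = sum((a - mo) * (b - ms) for a, b in zip(o, s)) / len(o)
    return cov / (_std(o) * _std(s))


def rmse(o, s):
    return math.sqrt(sum((b - a) ** 2 for a, b in zip(o, s)) / len(o))


def mae(o, s):
    return sum(abs(b - a) for a, b in zip(o, s)) / len(o)


def kge(o, s):
    """Kling-Gupta efficiency from correlation, variability ratio, mean ratio."""
    r = pearson_r(o, s)
    alpha = _std(s) / _std(o)
    beta = _mean(s) / _mean(o)
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def mean_bias(o, s):
    """Assumed definition: negative values indicate a dry simulation."""
    return _mean(s) - _mean(o)


# Hypothetical reference (in situ) and simulated soil moisture, cm^3/cm^3.
obs = [0.20, 0.30, 0.40]
sim = [0.10, 0.20, 0.30]
print(pearson_r(obs, sim), rmse(obs, sim), mae(obs, sim), kge(obs, sim), mean_bias(obs, sim))
```

In this example the simulated series tracks the reference perfectly in timing and variability but is uniformly drier, so the correlation is perfect while KGE is penalized only through the mean ratio.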
A. Benchmark evaluation of soil moisture products
Figure 6 shows the estimated probability density functions of evaluation metrics for the HLM and HLMr models, and SMAP and SMOS, with in situ sensor observations as the reference for years 2015 to 2019. This figure provides a performance summary of each soil moisture product with reference to the average of in situ sensor soil moisture at 5 cm depth across the state of Iowa. HLMr and HLM soil moisture show higher KGE and correlation coefficients compared to SMAP and SMOS satellite-based estimations. SMAP has a lower median bias than the original HLM model's soil moisture but a higher bias compared to HLMr. The largest median bias corresponds to the SMOS satellite-based soil moisture product. Figure 6 illustrates that HLMr soil moisture shows the best performance as compared to HLM with respect to in situ sensor soil moisture observations. Therefore, HLMr is selected as the candidate for the comparisons with satellite-based soil moisture estimations from SMAP and SMOS.

[...] respectively. SMAP soil moisture exhibits a lower mean than HLMr soil moisture in the north and central parts of the domain, while the two products show strong agreement in the south and northeast. Bias values show similar spatial patterns for SMAP and SMOS with reference to HLMr soil moisture. However, SMOS has drier mean soil moisture than HLMr and SMAP across the domain.

C. Potential effect of vegetation on satellite-based soil moisture accuracy
We compare vegetation patterns from MODIS satellite observations with SMAP vegetation optical depth (VOD). Figure 9 shows time-series of up-scaled MODIS satellite-based LAI for all SMAP pixels over the study domain from 2015 to 2019. The MODIS data contain anomalies during 2015, but these do not affect the analysis in our study. LAI exhibits consistent and strong seasonality for the study domain. Figure 9 indicates that LAI peaks in June to July after which crops (e.g., corn and soybean) are fully developed. The variability of LAI values at a given time is due to differences in local climate. FPAR has similar seasonality during the study period. In Figure 10, we show the maps of yearly maximum FPAR over the study domain. FPAR ranges from 0.7 to 0.8 for the south, while northern parts have higher FPAR values ranging from 0.8 to 0.9. Parts of the southern region have lower FPAR during 2017, which was a dry year. There is a consistent pattern in space for FPAR yearly maximum values for all years. Yearly maximum LAI values are higher in the west and north-central parts of the domain compared to other regions. Figure 11 illustrates maps of annual maximum vegetation optical depth from SMAP climatological VOD data for the study domain. SMAP VOD shows a spatial pattern similar to the FPAR annual maximum maps from the MODIS satellite. The SMAP VOD maximum map also shows a pattern similar to the maps of the bias evaluation metric for comparisons of SMAP with HLMr soil moisture. Furthermore, the comparisons between SMAP and HLMr soil moisture data with KGE values show a higher agreement for the southern region of the domain, where SMAP has lower VOD values compared to other regions.

V. DISCUSSION
Our results for SMAP show a median RMSE of 0.085 cm³/cm³ and a median dry bias of 0.04 cm³/cm³. Colliander et al. [26] found an RMSE of 0.083 cm³/cm³ and a dry bias of 0.064 cm³/cm³ for SMAP in the South Fork watershed from April 1, 2015 to February 29, 2016. Results from [6] for SMAP SCA-VPOL indicate a dry bias of 0.018 cm³/cm³ and an RMSE of 0.051 cm³/cm³ for the South Fork watershed during 2015-2018. Our evaluation of the SMOS satellite-based product with in situ sensor soil moisture observations shows a median dry bias of 0.065 cm³/cm³ and an RMSE of 0.1 cm³/cm³. Walker et al. [37] found a bias of -0.039 cm³/cm³ and an RMSE of 0.062 cm³/cm³ for SMOS over the South Fork watershed. Overall, SMAP and SMOS show a dry bias over our study domain, and SMOS is drier than SMAP.
Compared to satellite-based soil moisture estimations, model-based predictions showed better agreement with the average of in situ sensor soil moisture observations across the study domain. The benchmark evaluations of soil moisture products indicate that HLMr model-based soil moisture has the best performance compared to HLM, SMAP and SMOS soil moisture products for all evaluation metrics. This result emphasizes the role of radar-based rainfall observations as a good indicator for soil moisture conditions [20]. Previous studies on evaluation of satellite-based and model-based soil moisture found similar results where model-based open loop soil moisture predictions showed better performance than satellite-based estimations (e.g., [34]).
Jadidoleslam et al. [5] found that assimilation of SMAP and SMOS satellite estimations in the original HLM model improves streamflow predictions. Furthermore, SMAP provided a higher degree of improvement for streamflow prediction than SMOS. We have determined in the present study that HLMr agrees better with in situ soil moisture measurements than the original HLM used in [5]. Moreover, HLMr has a top-layer depth that is more representative of the L-band microwave sampling depth. Therefore, we hypothesize that data assimilation of SMAP soil moisture in HLMr could further increase streamflow prediction performance. A follow-up study is needed to test this hypothesis.
We found consistent spatial patterns in KGE and bias for SMAP soil moisture compared to HLMr soil moisture, specifically in south-central Iowa and to a lesser extent northeast Iowa. These regions with higher KGE and lower bias for SMAP and HLMr have strong spatial similarity with lower yearly maximum FPAR and lower SMAP VOD used in SMAP's SCA-VPOL soil moisture algorithm. Landcover from USDA CropScape [64] is shown in Figure 12 for the 2017 growing season. Note that the south-central and northeast regions of Iowa have a lower percentage of land area in row crop (corn and soybean) production than all other regions of Iowa. This is true not only for 2017 but for all other years considered here. Higher yearly maximum FPAR values are consistent with areas of higher-intensity row crop production since row crops accumulate approximately 6 kg·m⁻² of fresh biomass and more than 3 kg·m⁻² of dry biomass annually [65]. Consequently, we would expect lower maximum FPAR and thus lower SMAP VOD in the south-central and northeast regions. This spatial pattern suggests that SMAP soil moisture is more accurate in regions with less row-crop agriculture (and less annual change in VOD) and less accurate in regions with more row-crop agriculture.
Previous studies have investigated the effect of vegetation on soil moisture. For example, [39] found that LAI shows higher correlation with SMAP satellite soil moisture errors. Zwieback et al. [38] studied the errors in croplands over the U.S. They attributed the time-variable biases to misspecification of vegetation optical depth in the SMAP retrieval, which exhibits a seasonal error structure in the evaluations. Our results highlight the impact of vegetation in agricultural regions, confirm the findings of previous studies, and underscore the need for a more robust retrieval algorithm that better accounts for changes in vegetation in agricultural regions.
Alternatively, it may be possible to use a spatial error structure within a data assimilation scheme to account for poor SMAP performance in regions of intensive agriculture.
Finally, we note inherent limitations in our study similar to other evaluation studies of satellite and modeled soil moisture with in situ sensor measurements. First, the number of in situ soil moisture sensors is limited in space, and the sensors only provide information on the local conditions of soil moisture. However, our results show agreement with findings of previous studies that evaluated satellite-based soil moisture in our study domain (e.g., [6], [37]). Second, the timestamps of satellite-based and model-based products are not exactly the same and can differ by up to an hour. However, the timestamps of model soil moisture and in situ sensors are matched.

VI. SUMMARY & CONCLUSIONS
In this study, we evaluated satellite-based and model-based soil moisture products with in situ sensor observations in a dominantly agricultural region from 2015 to 2019. We used in situ sensor observation averages to evaluate the soil moisture estimations from SMAP and SMOS satellites, the Hillslope-Link Model (HLM), and an implementation of the Richards equation in the original HLM model (HLMr). Then, we compared the satellite-based soil moisture estimations with the best performing model-based predictions over the state of Iowa with a dominantly row-crop agricultural landcover. Finally, we assessed our evaluation results with respect to vegetation dynamics in our study domain. The following conclusions can be drawn from our results:
• HLMr model-based soil moisture provides better predictions than HLM and the satellite-based soil moisture products.
• SMAP satellite-based soil moisture shows more consistent performance with HLMr model predictions than SMOS in a dominantly agricultural region with strong vegetation seasonality.
• Spatial patterns of bias values between SMAP and SMOS with HLMr soil moisture show strong similarity with the map of MODIS FPAR and SMAP vegetation optical depth.
• These spatial patterns of bias and KGE suggest that SMAP soil moisture is more accurate in regions with less row-crop agriculture and less accurate in regions with more row-crop agriculture.
Efficiency of satellite-based soil moisture data assimilation in hydrologic predictions (e.g., flood and drought forecasts) depends on the correct introduction and handling of error components (e.g., bias, RMSE). Our study highlights the importance of understanding and accounting for space-variant errors in satellite-based soil moisture and the need for an improved SMAP retrieval algorithm. We hypothesize that introducing space-variant errors in a data assimilation scheme could further improve the utility of the SMAP satellite in real-time flood predictions and overall soil moisture predictions in dominantly agricultural regions.

Dr. Krajewski's current research focuses on understanding the genesis and evolution of floods through field data and modeling, and the quantification of uncertainty in hydrologic prediction at a range of temporal and spatial scales. He is a Fellow of the American Geophysical Union and the American Meteorological Society, and in 2021 was elected to the National Academy of Engineering.