Geospatial Assessment of Soil Organic Matter Variability at Sidi Bennour District in Doukkala Plain in Morocco

Understanding the spatial variability of soil organic matter (SOM) is critical for studying and assessing soil fertility and quality. This knowledge is important for soil management, which results in high crop yields at a reduced cost of crop production and helps to protect the environment. The benefits of an accurate interpolation of SOM spatial distribution are well known at the agricultural, economic, and ecological levels. It has been conducted this study for comparing and analyze different spatial interpolation methods to evaluate the spatial distribution of SOM in Sidi Bennour District, which is a semi-arid area in the irrigated scheme of the Doukkala Plain, Morocco. For conding this study, it was collected 368 representative soil samples at a depth of 0–30 cm. A portable global positioning system was used to obtain the location coordinates of soil sampling sites. The SOM spatial distribution was performed using four interpolation methods: inverse distance weighted and local polynomial interpolation as deterministic methods, and ordinary kriging and empirical Bayesian kriging as geostatistical methods. High SOM levels were concentrated in vertisols, and low SOM levels were observed in immature soils. The average SOM value was 1.346%, with moderate to high variability (CV = 35.720%). A low SOM concentration indicates a continuous possibility of soil degradation in the future. Ordinary kriging yielded better results than the other interpolation methods (RMSE = 0.395) with a semivariogram fitted by an exponential model, followed by inverse distance weighted (RMSE = 0.397), empirical Bayesian kriging (RMSE = 0.400), and local polynomial interpolation (RMSE = 0.412). Soil genetics and the strong influence of human activity are the major sources of SOM spatial dependence, which is moderate to low. Low SOM content levels (< 1.15%) were present in the southwestern and eastern parts of the digital map. This situation calls for the implementation of specific soil recovery measures.


INTRODUCTION
Soil is an invaluable source of food production. Agricultural land is under tremendous pressure from an increasing population, alongside intense climate change (Van Kernebeek et al. 2016). Besides, intensive farming contributes to soil depletion and reduces soil potential for crop production. Soil organic matter (SOM) content plays a prominent role in soil quality and soil fertility improvement ; Lehmann and Kleber 2015). SOM has several advantages that can be subdivided into physical, chemical, and biological categories. In addition, SOM plays a crucial role in sequestering carbon in the soil, thus mitigating the impact of climate change, as it represents the most important carbon reservoir in the terrestrial ecosystem (Johnston et al. 2009). However, the main issue of soil protection, including the continued degradation of SOM in agroecosystems, is unsustainable farming practices and climate conditions (Dahan et al. 2001;Badraoui 2006;Dhaliwal et al. 2019).
Accurate information on SOM spatial variability is a crucial indicator not only of soil quality but also of the carbon stock in the land ecosystem and is a good indicator for the agricultural system, natural resource management, climate modeling and environmental science (Liu et al. 2006; Robinson and Metternicht 2006;Bhunia et al. 2018). Thus, adequate information on the status behavior of SOM in space and time is required. However, precise spatial measurements of SOM could be expensive, time-consuming and labor-intensive to sample the soil (Bhunia et al. 2018). Therefore, there is often a relatively sparse number of soil samples available in a given region that does not represent a potential real degree of variation. Thus, for better planning and management, accurate SOM interpolation at unsampled sites is required. However, today's spatial data analysis methods and tools allow the monitoring of spatiotemporal changes in almost all soil attributes at various levels (Mabit and Bernard 2010; Dai et al. 2014; Bhunia et al. 2018). There is still not enough exploration in Morocco of procedures that allow the analysis by statistical interpolation methods with the precise prediction of SOM.
In the past, many conventional statistical techniques have been used to measure soil variability. These approaches do not define the spatial allocation of soil attributes in non-sampled locations. Deterministic approaches, which do not consider the spatial autocorrelation of data, are the most widely used methods for SOM spatial analysis. The most widely used methods are inverse distance weighted (IDW), splines and local polynomial interpolation (LPI) (Robinson and Metternicht 2006). In addition, there are stochastic geostatistical methods, such as ordinary kriging (OK), which consider every variable's evaluation at each location (Pang et al. 2011;Marchetti et al. 2012). In recent years, empirical Bayesian kriging (EBK) has become an essential alternative to traditional kriging techniques for mapping soil properties. It functions differently than conventional approaches; non-sampled locations are automatically calculated and measured in an image representing a particular region using a distribution of semivariogram models rather than a single model such as OK. When kriging parameters are used, their reliability is enhanced by multiple simulated variograms (Yang et al. 2014;Farina et al. 2017;Gribov and Krivoruchko 2020).
Given the significance of the SOM content mentioned above, this study explored four commonly used spatial interpolation methods (LPI, IDW, OK, and EBK) to (1) evaluate and compare the interpolation techniques for SOM spatial prediction, and (2) choose the most appropriate interpolation technique for creating a digital map to detect areas affected by SOM degradation.

Study area
We have conducted this study in the Sidi Bennour district, which is located in western Morocco in the Doukkala Plain (Fig.1). It covers an irrigated area of 436 km 2 . Soil and climatic conditions are favorable for agricultural production. It is situated in a semi-arid climate. It is slightly temperate in winter and hot and dry in summer (Bouasria et al. 2020). The annual temperature ranges from 12.1°C (min) to 26.3°C (max) with an average is around 19.2 °C. Annual rainfall was The irrigated region of Doukkala is one of the largest and most densely populated areas in Morocco. This region is remarkable for its scale and strategic significance in national development. It contributes to national production, with 38% for sugar beets and 20% for commercialized milk. The study region's most important crops are wheat, maize, sugar beet, and alfalfa (ORMVAD 2020).

Sampling, SOM laboratory analysis, and pedological information
In this study, we have used a restricted random sampling system. The study area was divided into a 1 km grid. Within each grid segment, a single sampling unit (a single location) was randomly selected. This technique allowed for coverage of the locations throughout the entire study area. We used a GPS to locate the geographical coordinates of the 368 samples (Fig. 2 ). The samples were collected at a depth of 30 cm in September 2013. They were then dried, crushed and sieved. The Walkley and Black method was then used to analyze the soil organic matter content (Walkley and Black 1934).
To describe the pedological classes in the study area, we used four pedological soil maps at a scale of 1:50,000 (Geoffroy 1978). We georeferenced these maps according to the national coordinate reference system (Merchich / North Maroc, EPSG: 26191). We have then digitized the study area's pedological classes by using the open-source QGIS software package. We then obtained digital vector maps with 28 series according to the French CPCS 1967 soil classification (CPCS 1967).

Statistical data analysis
The SOM descriptive statistics were determined assuming that the data were spatially independent. To draw the correct interpretation and better understand spatial interpolation, data normality is required, as it may have some consequences for geostatistical data analysis. In this regard, we checked the normality of the SOM dataset using the Kolmogorov-Smirnov test before spatial data analysis (Kerry and Oliver 2007). Therefore, the SOM data were subject to transformations when it did not fit the normal distribution at a suitable significance level (p < 0.05) (Webster and Oliver 2008). If, after logarithmic transformation, the dataset failed to achieve normality, then the normalized skewness values, ranging from -0.5 to 0.5, was the criterion to choose another method such as the Box-Cox transformation (Bogunovic et al. 2017).

Interpolation methods
In this study, we used two methods: deterministic and geostatistical. The first approach creates surfaces from the measured points. In contrast, the second method utilizes the measured points' statistical parameters to achieve the spatial interpolation. For deterministic interpolation techniques, we selected two methods: local polynomial interpolation (LPI) and inverse distance weighted interpolation (IDW). For geostatistical methods, we used ordinary kriging (OK) as well as empirical Bayesian kriging (EBK). These four methods were used to produce SOM spatial patterns in the study area.

Inverse Distance Weighted method
Inverse distance weighted (IDW) is a deterministic interpolation technique that is the most commonly used method in soil science for spatial interpolation (Bhunia et al. 2018). It is a nonlinear interpolation technique that calculates the prediction of non-sampled locations based on the average known values of adjacent points (Đurđević et al. 2019) (Eq. 1).
where: Z(s 0 ) is the estimated value at a given location s 0 , N is the total number of sampling points, Z(s i ) is the observed SOM at location s i p is the weight assigned to the distance separating the calculated point from the measured point d ij . The value of p was fixed at two (2) in this study.

Local polynomial interpolation (LPI) method
A local polynomial fits a series of small plans that overlap at sampling points, which further predicts each location in the region using the center of each plan. Local polynomial interpolation produces a map surface by combining the results of numerous polynomial formulas, each optimized for a given neighborhood. The maximum and minimum number of points, the shape of the neighborhood and the area configuration can be defined, and the sampling points of the neighborhood can be weighted according to the distance from the area of the prediction area (Schaum 2008; Xie et al. 2011).

Ordinary kriging method
To estimate the non-sampled position, ordinary kriging (OK) utilizes a weighted average of the calculated neighboring data values, which are reliant on the distance separating these points, as well as their groups and their values. OK estimates the z value of the random function for at least one non-sampled location. (Goovaerts 1999;Webster and Oliver 2008). Kriging estimates z * (s 0 ) and the error estimation variance 2 ( 0 ) at any location s 0 were, respectively, computed as follows: (2) where: ω i is the weight attributed to the known value of Z(s i ), µ is the Lagrange constant, γ(s 0 ) is the value of the semivariogram corresponding to the interval between s 0 and s i .
The kriging method uses semivariogram parameters that are adjusted to a given region to express the soil attributes spatial continuity. The semivariogram considers the distance to measure the strength of the statistical correlation (Oliver and Webster 2015). The parameters of the semivariogram are the range, which is the distance where the spatial association disappears, and the sill that corresponds to the highest variability where the spatial dependence disappears. The semivariogram is expressed as follows (Goovaerts 1999; Webster and Oliver 2008): where: γ(h) is the semivariogram function, h is the lag distance separating a pair of locations N(h), Z is the SOM property parameter, Z(s i ) is the value of Z at position s i and Z(s i + h) is the value of Z at position s i + h.
To calculate the geostatistical parameters, empirical semivariograms were fitted using theoretical semivariogram models. The calculated geostatistical parameters were nugget (τ 2 ), partial sill (σ 2 ), sill (τ 2 + σ 2 ), and range parameter (Φ). These parameters were used to assess the spatial dependency of SOM content (the ratio of nugget to sill variances, τ 2 /(τ 2 + σ 2 ), which is expressed as a percentage). In general, if the ratio is less than 25%, the spatial dependency is strong and if the ratio ranges from 25% to 75%, the spatial dependence is moderate; otherwise, the spatial dependence is weak (Cambardella et al. 1994).

Empirical Bayesian kriging method
Compared to traditional kriging methods, the empirical Bayesian kriging (EBK) approach automatically applies several different semivariogram models instead of a single fitted model for the whole area as OK. It assesses the nonsampled locations by accounting for the error in estimating the underlying semivariogram through repeated simulations (Krivoruchko and Gribov 2019; Gribov and Krivoruchko 2020). The following three main steps are involved in this method: (1) a semivariogram model is estimated from the available data. (2) New values were subsequently simulated at each location of the input data using the resulting semivariogram model. (3) Simulated data were used to estimate a new semivariogram model. Bayes' rule is used to measure the weight for this semivariogram, which indicates how likely the measured data can be produced from the semivariogram. Each of the two previous rules (2 and 3) is repeated for the simulation of a new values at the input locations using the estimated semivariogram in step 1. A newly developed semivariogram model and, using the simulated data, its weights were then calculated. This automated process allows the elaboration of a wide range of semivariograms that have been used in SOM spatial analysis in our study area. (Krivoruchko and Gribov 2019).

Accuracy assessment
We used the leave-one-out cross-validation technique to assess the performance of the spatial interpolation methods. For this purpose, we have used the following formulas to calculate the mean error (ME) and root mean square error (RMSE): where: N is the total of the samples points number, Z(s i ) is the observed value, Z*(s i ) is the estimated value. The method with the lowest RMSE was selected as the most accurate method.

Descriptive statistics
The SOM content in the study area ranged from 0.35% to 3.72%, with an average of 1.346% and a standard deviation of 0.481% (Table.1 Moderate SOM variability is a product of climate, soil type, and land use. The study area has different soil pedological classes that influence its diversity, whereas intensive agricultural practices increase SOM spatial heterogeneity. The SOM dataset revealed a positive skewness (0.885), indicating that the dataset had a non-normal distribution (Fig. 3a). A logarithmic transformation shaped the soil parameters (skewness = -0.354) (Table 1, Fig. 3b), and the transformed datasets passed the Kolmogorov-Smirnov test (p = 0.000). We used logarithmically transformed SOM data to meet the modeling requirements.

Pedological maps of soils types
The soils in the study area are very heterogeneous (Fig. 4). According to the French CPCS classification of 1967, the main types of soils encountered belong to the following six classes: (i) The immature soils cover 13,009.98 ha or 30% of the total area of the study zone. They generally correspond to two types: modal alluvial immature soils and wind-blow immature soils; (ii) Vertisols cover an area of 1,413.56 ha (3%). These soils are located at the edges of the study area. They dominate 52% of irrigated areas; (ii) Calcimagnesian soils represent only 67.41 ha (less than 1%). These are essentially modal limestone browns; (iv) Isohumic soils cover 22,992.41 ha (53%). They are the most dominant in the study area. They are ancient and formed in a semi-arid Mediterranean climate; (v) Sesquioxide-rich (fersiallitic) soils cover a total area of 4,548.94 ha (3%); (vi) Hydromorphic soils occupy only 1,584.17 ha (4%). These soils are mostly located in depressions. They are often associated with hydromorphic vertisols (CPCS 1967;Geoffroy 1978;Badraoui et al. 1993)

SOM spatial distributions by deterministic interpolation methods
In the study area, we analyzed SOM spatial distributions using IDW and LPI as deterministic methods. The results obtained using the cross-validation method showed that IDW provided better results than LPI (Table 3). Therefore, IDW is more accurate than LPI with a lower ME (0.0041, 0.0050) and smaller RMSE (0.397, 0.412), respectively. Many studies have demonstrated that the most significant aspect in obtaining accurate simulations is the model parameters

Spatial dependence of soils by intrinsic and extrinsic factors analysis
The SOM experimental variogram best fitted the exponential model. This finding is in agreement with those of previous studies (Bhunia et al. 2018;Đurđević et al. 2019). Most of the data were fitted to an exponential model (Fig. 5). The spatial correlation range for SOM was high (4207.74 m) ( Table 4). This range shows the correct sampling number for SOM mapping in this study. Hence, the sampling design interval should be shorter than half of the range (Kerry and Oliver 2007). The results showed that the sampling design was appropriately designed to quantify the spatial variability distribution of SOM. However, the spatial dependence was moderate to low according to the nugget/sill ratio (74%). The spatial dependence of soil is influenced by both intrinsic and extrinsic factors (Cambardella et al. 1994). In our study area, soil properties were mainly affected by human activities that are strongly connected to agricultural practices ).
Ordinary kriging (OK) and empirical Bayesian kriging (EBK) were used to spatially interpolate the SOM variability. The summary of the obtained results revealed that OK yielded better results than EBK (Table 3). OK showed the lowest RMSE and ME values of 0.395 and 0.0017, respectively, while EBK showed the highest RMSE and ME values of 0.400 and 0.0018, respectively.

Comparison of deterministic and geostatistical methods for ensuring cross-validation error
Among the tested interpolation techniques for SOM mapping, the most accurate was ordinary kriking, whereas LPI was the least accurate. The estimation of non-sampled locations using kriging depends significantly on the efficiency of variogram modeling. The optimal sampling design can determine the performance of a geostatistical method. To have a minimum cross-validation error, the distance, number and distribution of samples are the primary requirements (Mirzaee et al. 2016). The lowest RMSE was reported for OK. Therefore, the low values (very close to 0) of the mean error (ME) prove that the estimate is relatively unbiased or has a small bias. However, the LPI technique has a larger bias, expressed by a significantly higher ME (Đurđević et al. 2019). Thus, the low RMSE and ME values indicate a good match between the observed and estimated SOM content. The OK method often provides the best spatial interpolation for predicting values at non-measured locations Venteris et al. 2014;Tripathi et al. 2015;Bhunia et al. 2018). This interpolation method incorporates spatial autocorrelation and statistically optimizes weights.

Soil organic matter (SOM) dynamics and scenarios of Doukkala Plain
It is observed that the immature and fersialitic soils, which dominate the southwestern and eastern parts of the Sidi Bennour region, had SOM values below 1.15% (Fig. 6). Moreover, a large majority of soils in the study area generally have low SOM values (Badraoui et al. 2000;Soudi et al. 2000), which can be explained by the human influence of intensive agricultural activities that do not consider the incorporation of crop residues into the soil (Dahan et al. 2001;Mrabet et al. 2004). Besides, a significant proportion of crop residues, instead of returning it to the soil, is used for animal feed, which is a good condition for losing large SOM quantities. It is therefore easy to assume that the SOM situation in the study area is very concerning, and it is easy to speculate that this situation will worsen in the future if appropriate decisions are not taken. Hence, it is essential to emphasize preserving soil fertility and sustainability by increasing SOM content through agricultural practices Badraoui 2006).
In this regard, the efforts of policymakers and farmers must be jointly deployed to restore degraded soils. Possible scenarios can be focused on (1) encouraging the burial of residues, (2) crop rotation, and (3) the practice of direct-seeding. Indeed, studies in this area have shown that burying residues considerably improve soil richness in organic matter (Naman et al. 2015). Crop rotation, which is at least a triennial, plays a crucial role in the variation of the origins of organic inputs. The alternation of sugar beet, cereal, and forage/legume crops, which account for approximately 20%, 50%, and 10%, respectively, is a crucial factor in the variation of organic inputs (ORMVAD 2020). Sugar beets and leguminous forages (soybeans, berseem, and alfalfa) provide more organic matter than cereals (corn, wheat, and sorghum) (Naman et al. 2015). Also, according to trials of no-till in different contexts, this practice has the potential to improve almost the majority of soil properties, including organic matter (Ibno The combination of these solutions could improve the soil recovery in the study area.

CONCLUSION
A good understanding of the spatial distribution of SOM is crucial for agricultural and environmental management. Several methods have been used, and various algorithms have been tested to determine the most reliable approach to SOM spatial distribution. We applied four different spatial interpolation methods (IDW, LPI, OK, and EBK). These methods were evaluated using cross-validation technique. The study indicates that the OK interpolation method yields better results than the other approaches. LPI has the worst performance, resulting in a higher RMSE and ME than the other techniques.
From this study, we could draw the following conclusions: (1) the low SOM concentration in the studied soils indicates the possibility of soil degradation in the future. (2) The most suitable model for SOM is the exponential model. (3) The results show that the range is high and the spatial dependence of the SOM is moderate to low, which shows that the SOM content varies considerably depending on the soil genetics, which is combined with the strong influence of human activity, which is expressed by intensive soil management practices. (4) Ordinary kriging was the most accurate interpolation method, whereas local polynomial interpolation was the least accurate. (5) The results also show that in both the southwestern and eastern parts of the study area, low SOM content levels, which are less than 1.15%, were present in the SOM digital map. The heterogeneous spatial distribution of SOM involves the implementation of specific strategies for handling each soil case. (6) Policymakers and researchers can apply these approaches to different agricultural and agroecological issues, which can greatly facilitate or accelerate decision-making and influence the viability and sustainability of agricultural production.