Characterization and Estimation of Dates Palm Trees in an Urban Area Using GIS-Based Least-Squares Model and Minimum Noise Fraction Images

Date palm is a major food source and plays an important role in the economy, the environment, and society. The crop has been degraded by financial pressures and numerous military conflicts. Because monitoring and managing date palm through field measurements is expensive, and because few studies have used satellite images for this purpose, the authors propose a method to estimate and map date palm using Landsat-8 satellite images. Least-squares multiple regression and GIS techniques were applied to find suitable predictors from a set of variables comprising the original Landsat-8 bands, the Minimum Noise Fraction (MNF) transformation, the tasseled cap transformation components, and spectral indices. To validate the proposed method, field measurement data were used to assess the date palm estimated from the Landsat-8 images. A linear combination of MNF Landsat-8 band 4 (red, 0.636–0.673 μm), the Normalized Difference Moisture Index (NDMI), and the Enhanced Vegetation Index (EVI) was the best date palm predictor (R²adj = 0.988, root-mean-squared error (RMSE) = 0.013). The results demonstrate that using MNF Landsat-8 images in least-squares regression improves date palm estimation and mapping, with high accuracy, for practical use in the study area.


INTRODUCTION
Date palm belongs to the Arecaceae family and is a sign of life in the desert (Khierallah et al. 2015). It is considered a native desert plant that is cultivated in severe local environments. The date is an ancient crop that has been grown in many regions and countries of the Middle East and North Africa for over 5000 years (Johnson 2010). Iraq could be considered the nation of the date palm, which can be found in different regions of the country (Chao and Krueger 2007). The number of date palm trees is reported to be around 22 million, covering more than 120 000 hectares (Khierallah et al. 2015). The date is also a significant food source for the local population in the Middle East and North Africa, and it has an essential role in the environment and society (Chao and Krueger 2007). Date trees provide protection against high-speed wind and prevent soil desertification and degradation (Khierallah et al. 2015). In contrast, by the year 2000 the number of trees in Iraq had gradually declined to 12 million, covering about 50 000 hectares. This reduction in the tree count of such an important crop was due to the fighting that has occurred in date palm regions since 1980 (Al-Khayri et al. 2015).
Additionally, the planting of date palm is affected by biotic and abiotic factors (Al Shidi et al. 2018). Date palm diseases, together with increasing soil salinity, a high water table, the spread of borers, and the aging of palms, play an essential part in degrading the palms (Chao and Krueger 2007, El-Juhany 2010). The Arecaceae family includes well-known species such as date palm, coconut (Cocos nucifera), and oil palm (Elaeis guineensis), which have a high profile compared with other species (Sayan 2001). Remote sensing techniques have been applied widely to study various types of vegetation, but only to a limited extent to date palm (Alhammadi and Glenn 2008). Most studies using optical and microwave remote sensing data have focused on plants in general, above-ground biomass, and oil palm rather than date palm (Chong et al. 2017). Remote sensing applications to palms have covered surface analysis, mapping and palm boundary extraction, tree counting, change monitoring to distinguish variations in resources, palm age and blight estimation, infection and disease detection, and carbon estimation (Santoso et al. 2016, Lazecky et al. 2018). Tan et al. (Li et al. 2016) studied the age of oil palm trees and showed that texture features and shadow fraction are helpful in estimating it. The estimation and evaluation of oil palm plantations and above-ground biomass have also been examined using multispectral Landsat images and microwave data (Morel et al. 2012).
Moreover, machine learning-based methods have been applied to palm tree detection using the scale-invariant feature transform (SIFT) and unmanned aerial vehicle (UAV) images (Malek et al. 2014). The circular autocorrelation of the polar shape matrix and a linear support vector machine were used to reduce the feature dimensions for detecting date palm (Manandhar et al. 2016). Accordingly, remote sensing is a valuable tool for the early detection and continuous monitoring of many problems (Chong et al. 2017). Several remote sensing studies have concentrated on environmental factors and on the multispectral or radar properties of surface elements to extract multiple parameters from satellite images or data (Shi et al. 1997, Shareef 2019). Various models have been applied to study different parameters of the Earth's surface. Empirical models are considered the most straightforward and directly applicable; hence, they have been used to analyze uncertain determinants or to investigate the relationships between different features (Shareef et al. 2014a). Nevertheless, remote sensing of palms has been confined to a limited number of species and has not focused directly on date palm trees (Hansen et al. 2015). The objectives of this study were to (1) develop an alternative approach for identifying date palm trees in urban areas using Landsat-8 images and a statistical model, and (2) assess the use of tasseled cap components and spectral indices to extract date palm.

DATA AND STUDY REGION
The Landsat-8 satellite officially started service in February 2013, being the eighth satellite of the Landsat series (Oishi et al. 2018). The Landsat-8 data used herein were acquired on 09/11/2017 and pre-processed radiometrically and atmospherically using the ENVI 5 software. The study was performed in the Baghdad region, the capital of Iraq, extending to Al Dora city, situated in the south of the area, and to Arabs Ejbur, over a distance of 12 km. Two hundred sixty-one samples of palm trees were collected in October 2017 and split into 211 training samples and 50 reference samples. These samples were employed to build and verify the produced models. The study area is shown in Figure 1.

Spectral models
Numerous spectral models have been applied to estimate or analyze surface properties. The Normalized Difference Vegetation Index (NDVI) is a commonly used model for detecting vegetation (Lykhovyd 2020), and it was introduced by Rouse in 1973. The NDVI relies on the near-infrared and red bands, which are the most sensitive to plants (Rouse Jr et al. 1974). The Enhanced Vegetation Index (EVI), on the other hand, was employed to enhance the response in high-biomass regions and to obtain the best signal from crops or vegetation. It is intended to (1) improve the signal transmitted from the canopy and (2) lessen atmospheric influences (Liao et al. 2015). Moreover, the Normalized Difference Moisture Index (NDMI) was employed to estimate moisture using the near-infrared (NIR) and shortwave infrared (SWIR) bands, which are the fifth and sixth bands in the Landsat-8 data (Hardisky et al. 1983).
Figure 2 shows the general process followed in this paper to assess and determine date palm using the MNF Landsat-8 data. The methodology comprises the stages of correction, transformation, model production, and date palm mapping. A statistical regression analysis between the extracted parameters and the fieldwork data was conducted to create the date palm models.
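The three indices described above can be sketched in a few lines of Python. This is a minimal illustration rather than the processing chain used in the study: it assumes surface reflectance arrays for the Landsat-8 blue (band 2), red (band 4), NIR (band 5), and SWIR1 (band 6) bands, and it uses the commonly published EVI coefficients (2.5, 6, 7.5, 1), which the text does not state. The function names and the toy reflectance values are hypothetical.

```python
import numpy as np

def normalized_difference(a, b):
    """Return (a - b) / (a + b), set to 0 where the denominator is 0."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    denom = a + b
    out = np.zeros_like(denom)
    np.divide(a - b, denom, out=out, where=denom != 0)
    return out

def ndvi(nir, red):
    """Normalized Difference Vegetation Index from NIR and red reflectance."""
    return normalized_difference(nir, red)

def ndmi(nir, swir1):
    """Normalized Difference Moisture Index from NIR and SWIR1 reflectance."""
    return normalized_difference(nir, swir1)

def evi(nir, red, blue, gain=2.5, c1=6.0, c2=7.5, soil=1.0):
    """Enhanced Vegetation Index; coefficients are the commonly used
    defaults and are an assumption, not values taken from the paper."""
    nir, red, blue = (np.asarray(x, dtype=float) for x in (nir, red, blue))
    return gain * (nir - red) / (nir + c1 * red - c2 * blue + soil)

# Toy reflectance values (hypothetical): a vegetated and a bare pixel.
blue = np.array([0.04, 0.06])
red = np.array([0.05, 0.20])
nir = np.array([0.45, 0.25])
swir1 = np.array([0.20, 0.30])

print("NDVI:", ndvi(nir, red))
print("NDMI:", ndmi(nir, swir1))
print("EVI: ", evi(nir, red, blue))
```

In this toy case the vegetated pixel has the higher value for all three indices, which is the behaviour the indices are designed to show.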

Least squares method
The least squares method is a statistical method that estimates a regression line by minimizing the deviations (errors) of the observed points from that line. Specifically, it minimizes the sum of the squared differences between the actual and calculated values. The general model of this method is given in Eq. (1):
$y = X\beta + \varepsilon$ (1)
where y is the response vector, X is the matrix of variables, β is the vector of parameters, and ε is the error vector (Shareef et al. 2014b).
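As an illustration of how such a model can be fitted, the following sketch solves the least-squares problem with NumPy and reports the adjusted R² and RMSE. It is not the authors' implementation: the data are synthetic, the function name is hypothetical, and the RMSE is assumed to be the square root of the mean squared residual, since the paper does not give its exact definition.

```python
import numpy as np

def fit_least_squares(X, y):
    """Fit y = b0 + X @ b by ordinary least squares.

    Returns the coefficients (intercept first), the adjusted R^2,
    and the RMSE of the fitted values.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y, dtype=float)
    if X.ndim == 1:
        X = X[:, None]
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])          # add intercept column
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)  # minimizes ||y - A beta||^2
    resid = y - A @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(np.sum((y - y.mean()) ** 2))
    r2 = 1.0 - ss_res / ss_tot
    r2_adj = 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)
    rmse = float(np.sqrt(ss_res / n))
    return beta, r2_adj, rmse

# Synthetic example (hypothetical data): 211 samples, three predictors.
rng = np.random.default_rng(0)
X = rng.uniform(0.0, 1.0, size=(211, 3))
true_beta = np.array([0.2, 1.0, -2.0, 0.5])       # intercept + 3 slopes
y = true_beta[0] + X @ true_beta[1:] + rng.normal(0.0, 0.01, size=211)

beta, r2_adj, rmse = fit_least_squares(X, y)
print("coefficients:", np.round(beta, 3))
print("adjusted R^2:", round(r2_adj, 4), "RMSE:", round(rmse, 4))
```

The adjusted R² penalizes the number of predictors p, which is why it is the more appropriate figure when models with one, two, and three variables are compared.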

Transformation of data
Atmospheric correction was applied to the Landsat-8 data. The thermal infrared images were not used in this study. To determine the inherent dimensionality of the pre-processed Landsat-8 image data, a Minimum Noise Fraction (MNF) transformation was applied to the corrected images. The MNF was performed to isolate the noise in the Landsat-8 data, to reduce the computational requirements of subsequent processing, and to reduce the multicollinearity among the various features (Joseph 1994).
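The idea behind the MNF transformation can be sketched as noise whitening followed by a principal component rotation. The code below is a simplified illustration under stated assumptions, not the ENVI procedure used in the study: the noise covariance is estimated from differences between horizontally adjacent pixels, and the image cube is synthetic. The function name and all parameter values are hypothetical.

```python
import numpy as np

def mnf_transform(cube):
    """Minimal MNF sketch for an image cube of shape (rows, cols, bands).

    Returns the MNF components (same shape as the input), ordered by
    decreasing signal-to-noise eigenvalue, and the eigenvalues.
    """
    cube = np.asarray(cube, dtype=float)
    rows, cols, bands = cube.shape
    X = cube.reshape(-1, bands)
    X = X - X.mean(axis=0)

    # Assumption: noise is estimated from neighbouring-pixel differences.
    diff = (cube[:, 1:, :] - cube[:, :-1, :]).reshape(-1, bands)
    noise_cov = np.cov(diff, rowvar=False) / 2.0
    data_cov = np.cov(X, rowvar=False)

    # Step 1: whiten the noise so its covariance becomes the identity.
    n_vals, n_vecs = np.linalg.eigh(noise_cov)
    n_vals = np.clip(n_vals, 1e-12, None)          # guard against zeros
    whitener = n_vecs / np.sqrt(n_vals)            # scales each column

    # Step 2: principal components of the noise-whitened data.
    vals, vecs = np.linalg.eigh(whitener.T @ data_cov @ whitener)
    order = np.argsort(vals)[::-1]                 # highest SNR first
    vals, vecs = vals[order], vecs[:, order]

    components = X @ whitener @ vecs
    return components.reshape(rows, cols, bands), vals

# Synthetic cube (hypothetical): one smooth signal seen in four bands
# with different strengths, plus white noise.
rng = np.random.default_rng(0)
n_rows, n_cols = 40, 50
yy, xx = np.mgrid[0:n_rows, 0:n_cols]
signal = np.sin(xx / 8.0) + yy / n_rows
cube = np.stack([signal * w for w in (1.0, 0.8, 0.5, 0.2)], axis=-1)
cube = cube + rng.normal(0.0, 0.05, size=cube.shape)

mnf, eigenvalues = mnf_transform(cube)
print("MNF eigenvalues:", np.round(eigenvalues, 2))
```

The resulting components are mutually uncorrelated, which is the property that reduces multicollinearity when the MNF bands are later used as regression predictors.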

RESULTS AND DISCUSSION
The Landsat-8 data used in this study contain information on the reflectance (spectra) of the different features in the region of interest. The MNF transformation was used to reduce the image noise in the initial data, and the resulting data were used to extract the tasseled cap transformation (TCT) components. The spectral models represented plants well, at a high brightness level; here, these plants correspond to the date palm trees. The wetness component shows the wetland zones with high brightness, but date palm was mixed with other land covers in this component. In the intensity and fourth TCT components, vegetation and water areas showed interference or mixed pixels. Nevertheless, the fifth and sixth components present the plants, including the date palm trees, in black (Figure 3). An autocorrelation analysis was used to find correlations within the database; the p-value was below 0.05, which indicated a tendency towards dispersion. The significance of F indicates the goodness of the predicted model. Table 1 and Table 2 give the correlation values between the date palm derived from NDVI and the various Landsat-8 bands, indices, and TCT components. The date palm was strongly correlated with the fourth TCT component (0.976), while EVI was weakly correlated with it (0.190). On the other hand, the multispectral MNF bands showed a strong correlation with date palm of 0.94 in band 7, which corresponds to the SWIR band. The fifth and sixth components were inversely correlated with date palm, at -0.93 and -0.97, respectively. Brightness had a higher correlation with date palm than Greenness.
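A correlation screening of this kind can be sketched as follows. The example computes the Pearson correlation between each candidate predictor and the target; the data and the predictor names are synthetic and hypothetical, so the printed values are not those of Table 1 or Table 2.

```python
import numpy as np

def correlations_with_target(predictors, target):
    """Pearson correlation of each named predictor with the target."""
    target = np.asarray(target, dtype=float)
    return {
        name: float(np.corrcoef(np.asarray(values, dtype=float), target)[0, 1])
        for name, values in predictors.items()
    }

# Synthetic samples (hypothetical): one positively related predictor,
# one inversely related predictor, and one unrelated predictor.
rng = np.random.default_rng(1)
target = rng.uniform(0.0, 1.0, size=211)
predictors = {
    "positive": target + rng.normal(0.0, 0.05, size=211),
    "inverse": -target + rng.normal(0.0, 0.05, size=211),
    "unrelated": rng.normal(0.0, 1.0, size=211),
}

corr = correlations_with_target(predictors, target)
for name, r in corr.items():
    print(f"{name:10s} r = {r:+.3f}")
```

A strongly negative coefficient, as reported for the fifth and sixth components, is as useful for prediction as a strongly positive one; only the sign of the regression coefficient changes.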
In this study, three analyses were applied to produce the date palm model. The first analysis correlated a single variable with date palm (Table 3), while the second used two variables to produce the model (Table 4). The third analysis used the least-squares regression model with three input variables related to date palm trees, as illustrated in Table 5.
The minimum correlation value using the MNF data was found for B5. For the tasseled cap components, the lowest R²adj value (0.216) was found for the Wetness component and the highest (0.952) for the Fourth component. Moreover, the spectral index NDMI had a low correlation with date palm, while EVI had a moderate R²adj value (0.032 and 0.647, respectively).
Next, pairs of independent variables were examined to produce the models using multiple least-squares regression. Table 4 shows the results. The combination of the two spectral indices (NDMI and EVI) slightly increased the R²adj value (0.649) compared with the same model using just one spectral index as the independent variable. Likewise, the combination of the tasseled cap components wetness and brightness increased the R²adj value (0.873) compared with the models using a single component. Thus, combining the MNF Landsat-8 images improved the predicted models, yielding high R²adj values.
The final step used three independent variables (drawn from the MNF Landsat-8 images, the spectral indices, and the TCT components) in the regression model (Table 5). This step classified the combinations into six groups according to the parameters used. The first group combines spectral indices with a tasseled cap component; in this group, R²adj ranges between 0.903 (NDMI, EVI, and TC-Greenness) and 0.983 (NDMI, EVI, and TC-Sixth, which is the best in this group). The second group combines spectral indices with an MNF Landsat-8 band and reaches a high R²adj (0.988) when NDMI, EVI, and B4 are used. The third group combines a spectral index, a tasseled cap component, and an MNF spectral band; the best R²adj for this combination is 0.97. The only mixture of this kind obtained in the optimization was the combination represented by EVI, TC-W, and B7.
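The comparison of one-, two-, and three-variable models can be illustrated by an exhaustive search that fits every subset of a given size and ranks the fits by adjusted R². This is a sketch under assumptions: the paper does not state how the candidate combinations were enumerated, and the predictor names and data below are synthetic and hypothetical.

```python
from itertools import combinations

import numpy as np

def adjusted_r2(X, y):
    """Adjusted R^2 of an ordinary least-squares fit with intercept."""
    n, p = X.shape
    A = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(A, y, rcond=None)
    ss_res = np.sum((y - A @ beta) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return float(1.0 - (ss_res / ss_tot) * (n - 1) / (n - p - 1))

def rank_subsets(predictors, y, size):
    """Fit every combination of `size` predictors; best adjusted R^2 first."""
    scores = []
    for combo in combinations(sorted(predictors), size):
        X = np.column_stack([predictors[name] for name in combo])
        scores.append((adjusted_r2(X, y), combo))
    return sorted(scores, reverse=True)

# Synthetic data (hypothetical): the target depends on v1, v2, v3 only.
rng = np.random.default_rng(2)
n_samples = 211
predictors = {f"v{i}": rng.normal(size=n_samples) for i in range(1, 5)}
y = (0.3 + 0.5 * predictors["v1"] - 0.8 * predictors["v2"]
     + 0.2 * predictors["v3"] + rng.normal(0.0, 0.05, size=n_samples))

ranked = rank_subsets(predictors, y, size=3)
for score, combo in ranked:
    print(f"{', '.join(combo):12s} adjusted R^2 = {score:.3f}")
```

With only a few dozen candidate variables an exhaustive search over triples remains cheap; for much larger candidate sets a stepwise or regularized selection would be the usual alternative.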
The combinations of tasseled cap components give a good range of R²adj, between 0.928 (using TC-B, TC-G, and TC-FI) and 0.978 (TC-W, TC-B, and TC-S). Combining a tasseled cap component with the MNF Landsat-8 bands gave a slight improvement compared with using single tasseled cap component images. The next combination, which represents the final group, uses only the MNF Landsat-8 images to generate a date palm model. The best date palm estimate in this group was obtained using two variables, as follows:
$DP_s = 0.244 + 1.66\,B_5 - 2.429\,B_6$ (2)
where DP refers to the date palm, the subscript s refers to the MNF-based spectral bands, and B_5 and B_6 are the fifth and sixth bands of Landsat-8.
The model combining two spectral indices (NDMI and EVI) with an MNF Landsat-8 band (B4) gave the highest correlation among the produced models and was therefore also used to estimate the amount of date palm. This estimate is denoted DP_ns, where ns refers to the combination of indices and MNF band.
A validation was applied to the produced models to measure their goodness. It used a linear (first-order polynomial) fit between the estimated and field-measured values to estimate the degree of confidence of the models, as shown in Figure 4. Validating the models gives legitimacy to using them within a specified accuracy. The resulting accuracy was 0.95 and 0.988 for the first (DP_s) and the second (DP_ns) model, respectively. Applying the produced models to the input data makes it possible to map the spatial pattern of date palm. Figure 5 was produced on the basis of the least squares method. The produced models rely on parameters extracted from the original Landsat-8 image. The final image provided by each model was a single image in which date palm appears with high brightness. The maps were classified into two classes, non-date palm and date palm. Thus, the representation of date palm depends on the accuracy of the model used.
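The mapping and validation steps can be sketched as follows. The coefficients 0.244, 1.66, and -2.429 are those printed for the two-band model (DP_s); everything else is an assumption made for illustration: the classification threshold is hypothetical because the paper does not state one, the validation statistics are a generic linear fit with R² and RMSE, and the sample values are synthetic.

```python
import numpy as np

def date_palm_index(b5, b6):
    """Two-band model DP_s = 0.244 + 1.66*B5 - 2.429*B6 on MNF bands."""
    b5 = np.asarray(b5, dtype=float)
    b6 = np.asarray(b6, dtype=float)
    return 0.244 + 1.66 * b5 - 2.429 * b6

def classify(dp, threshold):
    """Two-class map: True = date palm, False = non-date palm.
    The threshold is a hypothetical, user-chosen value."""
    return np.asarray(dp) >= threshold

def validation_stats(estimated, measured):
    """Linear fit of estimated against measured values, plus R^2 and RMSE."""
    est = np.asarray(estimated, dtype=float)
    meas = np.asarray(measured, dtype=float)
    slope, intercept = np.polyfit(meas, est, 1)   # first-order polynomial
    r2 = float(np.corrcoef(meas, est)[0, 1] ** 2)
    rmse = float(np.sqrt(np.mean((est - meas) ** 2)))
    return {"slope": float(slope), "intercept": float(intercept),
            "r2": r2, "rmse": rmse}

# Toy MNF band values for a 2 x 2 image (hypothetical).
b5 = np.array([[0.10, 0.30], [0.05, 0.40]])
b6 = np.array([[0.05, 0.02], [0.20, 0.01]])
dp_map = date_palm_index(b5, b6)
palm_mask = classify(dp_map, threshold=0.5)
print(dp_map)
print(palm_mask)

# Toy validation: estimates that overshoot the field values by 0.01.
measured = np.array([0.10, 0.20, 0.30, 0.40])
estimated = measured + 0.01
stats = validation_stats(estimated, measured)
print(stats)
```

A slope near 1 and an intercept near 0 in the fit indicate that the model neither stretches nor offsets the estimates relative to the field measurements, while the RMSE summarizes the remaining scatter.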

CONCLUSIONS
This study used different types of data extracted from Landsat-8 images to estimate and map date palm using least-squares models in a GIS. The relationships between the parameters used were examined to understand the influence of each variable relative to the other parameters and to date palm.
Some parameters exhibited a low correlation with the vegetation index needed to generate the date palm estimation models, whereas other parameters presented high correlation values with it. Overall, the results demonstrate a strong effect of the TCT components, spectral indices, and MNF Landsat-8 bands on estimating and studying date palm trees, with useful findings on the use of single and multiple variables. From the many experiments in this study, we identified the strongest relationships between the variables that can identify date palm with high accuracy. The results make it possible to generate different types of estimation models depending on the data used and the accuracy required. Moreover, the spatial map of the estimated date palm distribution may help to separate the densities of different vegetation types and so support future decisions to protect and exploit date palm trees.