Forecasting Oil Crops Yields on the Regional Scale Using Normalized Difference Vegetation Index

Early prediction of crop yields over large cropland areas is of great importance for operational planning in the agrarian sector of the economy and for ensuring food security. Large-scale forecasts became possible owing to the introduction of remote sensing technologies into precision agriculture systems, which provide information on crop conditions both for a single field and for large croplands. This study on forecasting the yields of major oil crops, namely sunflower (Helianthus annuus L.), winter rape (Brassica napus) and soybean (Glycine max), at the regional level in Kherson oblast of Ukraine was conducted using historical yield data and monthly MODIS Terrain NDVI smoothed time series imagery with 250 m resolution for the period from 2012 to 2019. The statistical data on crop yields were linked to the corresponding monthly NDVI values to determine the type of relationship and to derive regression models for oil crop yield prediction based on the remotely sensed vegetation index. The highest correlation between the yields of the oil crops and NDVI, together with the best prediction accuracy, was obtained using the index values for April for winter rape, July for sunflower, and August for soybean. The developed regression models have reasonable accuracy, with mean absolute percentage errors of prediction of 25.23 percent for sunflower, 18.28 percent for winter rape, and 13.24 percent for soybean. The models are easy to use and can be recommended for introduction into the theory and practice of precision agriculture.


INTRODUCTION
Early large-scale yield forecasting is of great importance for modern agriculture: it supports rational agricultural policy and import-export strategy and helps guarantee food security. Recent advances in remote sensing, the science and technique of obtaining information about objects on the ground from satellite imagery without direct contact [Sabins, 1987], provide extensive opportunities for monitoring and forecasting crop yields over territories of various sizes, for example, the field arrays of a whole farm, districts, regions, and states. Remote sensing advances are widely implemented in precision agriculture [Seelan et al., 2003]. The most promising and convenient instrument for yield prediction from remote sensing data is the use of vegetation indices, calculated from satellite imagery in dedicated geographic information system (GIS) software. The normalized difference vegetation index (NDVI) seems best suited to the needs of yield forecasting because of its accessibility and ease of use. Some services provide ready-to-use imagery with precalculated index values, so the user does not need to go into the calculations and raster analysis.
The normalized difference vegetation index, proposed by Rouse et al. (1974), is the ratio of the difference between the near-infrared and red reflectance of the surface to their sum. The enhanced vegetation index (EVI), introduced later by Huete et al. (2002), has the additional benefit of accounting for distortions caused by aerosols and soil surface reflectance. Both indices have been successfully used by scientists for the estimation of crop yields. For example, both NDVI and EVI were applied in rice yield predictions by Son et al., and Lykhovyd (2020) applied NDVI-based regression models to predict the yields of spring row crops on the field scale. Thus, there are several studies, conducted in different regions of the Earth, devoted to the problem of estimating grain crop yields from remotely sensed data. However, there is a lack of such studies for the major oil crops, namely sunflower, rape, and soybean [Lühs & Friedt, 1994; Sharma et al., 2012]. Accordingly, the goal of the study was to develop and test a model for estimating the yields of major oil crops at the regional level for Kherson oblast of Ukraine based on the values of MODIS NDVI smoothed time series data.
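The definition of NDVI given above can be illustrated with a minimal Python sketch. It assumes that the near-infrared and red reflectance values are available as NumPy arrays; the function name `ndvi` and the sample values are illustrative only and are not taken from the study, which relied on ready-made NDVI imagery.

```python
import numpy as np

def ndvi(nir, red):
    """Element-wise NDVI = (NIR - Red) / (NIR + Red)."""
    nir = np.asarray(nir, dtype=float)
    red = np.asarray(red, dtype=float)
    total = nir + red
    # The ratio is undefined where both bands are zero; keep NaN there.
    out = np.full(total.shape, np.nan)
    np.divide(nir - red, total, out=out, where=total != 0)
    return out

# Hypothetical reflectance values (not data from the study).
example = ndvi([0.50, 0.30, 0.0], [0.10, 0.30, 0.0])
print(example)
```

Dense green vegetation reflects strongly in the near-infrared and weakly in the red band, so the index approaches 1, whereas bare surfaces give values close to 0.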

MATERIALS AND METHODS
The study on the estimation of major oil crop yields was carried out in Kherson oblast, located in the south of Ukraine (Figure 1), using MODIS Terrain NDVI 16-Day smoothed time series data (with a resolution of 250 m) for the period from 2012 to 2019. The imagery was obtained through the services of the University of Natural Resources and Life Sciences in Vienna and was imported into QGIS 3.10 software for further processing and raster analysis. All the images were clipped with the vegetation cover mask of Kherson oblast, obtained from the NEXTGIS DATA service. This step was taken to avoid possible distortions and errors caused by including areas that were free of vegetation at the time of the study. The 16-Day time series was then converted into a monthly series.
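The text does not specify how the 16-Day series was converted into a monthly one. One plausible approach, sketched below purely as an assumption, is to assign each composite to the calendar month of its start date and to average the regional mean NDVI values within each month. The function name `monthly_means` and the sample values are hypothetical.

```python
from collections import defaultdict
from datetime import date

def monthly_means(composites):
    """Average composite values by (year, month) of the composite start date.

    composites: iterable of (start_date, mean_ndvi) pairs.
    """
    buckets = defaultdict(list)
    for start, value in composites:
        buckets[(start.year, start.month)].append(value)
    return {key: sum(vals) / len(vals) for key, vals in sorted(buckets.items())}

# Hypothetical regional mean NDVI values for three 16-day composites.
series = [
    (date(2019, 4, 7), 0.40),
    (date(2019, 4, 23), 0.50),
    (date(2019, 5, 9), 0.60),
]
result = monthly_means(series)
print(result)
```

Other conventions, such as weighting each composite by the number of its days that fall within the month, are equally possible and would give slightly different monthly values.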
To develop the regression models for oil crop yield prediction, the data on the crop yields in the studied region were obtained from the State Statistical Service of Ukraine. A regression analysis was performed at p<0.05 using the BioStat v7 software and involved the computation of the Pearson correlation coefficient (R), the coefficient of determination (R²) (raw, adjusted, predicted), the slope, and the mean absolute percentage error for the developed models.

RESULTS AND DISCUSSION
Using the results of the NDVI computation for the studied region, the mean values of the index for each month of the studied crops' growing season were obtained and are summarized in Table 1. The yields of the studied oil crops in Kherson oblast for the period from 2012 to 2019 are summarized in Table 2.
The preliminary assessment of the relationships between the yields of the studied oil crops and the monthly NDVI values was performed by calculating the linear Pearson correlation coefficient. It revealed that the highest correlation, and therefore the best basis for regression modeling, is recorded in April for winter rape, in July for sunflower, and in August for soybean (Table 3). Thus, the yields of winter rape, sunflower and soybean can be estimated 30-50 days ahead of the harvesting period, which in the region commonly starts at the end of June for rape, in September for sunflower, and from the end of September to October for soybean.
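The month selection described above can be sketched as follows, assuming that the monthly regional NDVI means and the annual yields are held in plain Python containers. The function name `best_month` and all the numbers are illustrative placeholders and do not reproduce Tables 1-3.

```python
import numpy as np

def best_month(ndvi_by_month, yields):
    """Return (month, r) for the month whose NDVI has the highest Pearson r with yield."""
    scores = {
        month: float(np.corrcoef(values, yields)[0, 1])
        for month, values in ndvi_by_month.items()
    }
    month = max(scores, key=scores.get)
    return month, scores[month]

# Hypothetical yields (t/ha) and monthly NDVI means for four seasons.
yields = [1.0, 2.0, 3.0, 4.0]
ndvi_by_month = {
    "June": [0.30, 0.50, 0.40, 0.45],
    "July": [0.20, 0.30, 0.40, 0.50],
}
month, r = best_month(ndvi_by_month, yields)
print(month, round(r, 3))
```

With only eight seasons of data, as in the study period, such correlation coefficients are sensitive to individual years, so the selected month is best treated as an empirical choice for the given region.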
Regression analysis, performed at p<0.05, yielded the slope values for the oil crop yield models. The constant (intercept) of the models was set to 0. The regression statistics for the winter rape, sunflower and soybean yield models are given in Table 4.
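For a linear model with the constant fixed at zero, the least-squares slope is the sum of the products of NDVI and yield divided by the sum of squared NDVI values. The sketch below illustrates this together with MAPE. It is not the BioStat procedure itself: the function name, the sample values, and the use of the uncentered total sum of squares for R² (a common convention for models without a constant) are assumptions.

```python
import numpy as np

def fit_through_origin(x, y):
    """Fit y = slope * x by least squares; return (slope, uncentered R^2, MAPE in %)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    slope = float(np.dot(x, y) / np.dot(x, x))
    pred = slope * x
    sse = float(np.sum((y - pred) ** 2))
    # Assumed convention: R^2 relative to the uncentered sum of squares of y.
    r2_uncentered = 1.0 - sse / float(np.dot(y, y))
    mape = float(np.mean(np.abs((y - pred) / y)) * 100.0)
    return slope, r2_uncentered, mape

# Hypothetical NDVI values and yields (not data from the study).
slope, r2, mape = fit_through_origin([0.2, 0.4, 0.5], [1.0, 2.0, 2.5])
print(slope, r2, mape)
```

An R² computed against the uncentered sum of squares is usually higher than the conventional centered R², which is one reason why a model without a constant can combine a high R² with a noticeable MAPE.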
The regression statistics show that the obtained forecasting models are quite reliable, with high Predicted R² values exceeding 0.85 and mean absolute percentage error (MAPE) within 13-25 percent. Such predictions can be considered good and reasonable for winter rape and soybean, but the forecast accuracy for sunflower cannot be considered reliable enough [Caraka et al., 2019]. However, other classifications hold that forecasts with MAPE from 10 to 20 percent are good, and those from 20 to 50 percent are reasonable [Moreno et al., 2013]. Thus, it is still possible to estimate sunflower yields on the regional scale in advance by means of the developed regression model. All in all, the best predictive model, with the highest R² and the lowest MAPE, was developed for soybean. The models for yield estimation for each studied crop are presented in Table 5. A graphical approximation of the models for each studied crop is shown in Figure 2.
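The MAPE-based grading cited above from Moreno et al. (2013) can be expressed as a small helper. In this sketch, only the "good" (10 to 20 percent) and "reasonable" (20 to 50 percent) bands come from the text; the labels for values below 10 and above 50 percent, as well as the handling of the exact boundaries, are assumptions. The MAPE values are those reported for the three models.

```python
def classify_mape(mape_percent):
    """Grade a forecast by its MAPE (in percent)."""
    if mape_percent < 10:
        return "highly accurate"  # assumed band, not stated in the text
    if mape_percent < 20:
        return "good"
    if mape_percent <= 50:
        return "reasonable"
    return "inaccurate"  # assumed band, not stated in the text

# MAPE values reported for the developed models.
reported = {"sunflower": 25.23, "winter rape": 18.28, "soybean": 13.24}
grades = {crop: classify_mape(value) for crop, value in reported.items()}
print(grades)
```

Under this grading, the winter rape and soybean models fall into the "good" band and the sunflower model into the "reasonable" band, in line with the discussion above.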
The studies previously conducted by Bu et al. (2017) proved the possibility of implementing satellite imagery in sunflower yield prediction

Figure 2. Approximation of the NDVI-based regression models of major oil crops yields in Kherson

CONCLUSIONS
Remote sensing provides an opportunity for fast, convenient, and precise early estimation of the yields of major oil crops, namely winter rape, sunflower and soybean, over large areas. The regression models developed for Kherson oblast, Southern Ukraine, support the soundness of this approach to yield prediction and crop growth modeling, providing forecasts of the crops' yields with R²>0.88 at p<0.05.