A Comparative Evaluation of the Use of Artificial Neural Networks for Modeling the Rainfall–Runoff Relationship in Water Resources Management

Recently, Artificial Neural Network (ANN) methods, which have been applied successfully in many fields, have been considered in a large number of studies on reliable streamflow estimation and modeling for the design and project planning of hydraulic structures. The present study aimed to model the rainfall–runoff relationship using different ANN methods. The Nergizlik Dam, located in the Seyhan sub-basin, one of the important basins in Turkey, was chosen as the study area. Analyses were based on streamflow estimation using precipitation and runoff data observed at certain time intervals. Feed Forward Backpropagation Neural Network (FFBPNN) and Generalized Regression Neural Network (GRNN) methods were adopted, and the results were compared with those of the Multiple Linear Regression (MLR) method, which is accepted as the traditional method. In addition, the models were run with three different transfer functions to find the optimum ANN configuration. The study showed that the ANN methods gave statistically good results in rainfall–runoff modeling and that the developed models can be applied successfully to the estimation of average monthly flows.


INTRODUCTION
Rapid population growth and adverse environmental conditions make good planning necessary for the protection and use of water, which is a limited resource (Aktürk and Yıldız, 2018; Tayyab et al., 2019; Obasi et al., 2020; Ali and Shahbaz, 2020). Globally, problems have increased with the effects of climate change, and it is predicted that the decrease in usable water resources will pose a more serious risk every year (Selek et al., 2019). One of the most effective ways to reduce this risk is accurate planning, which requires sufficient flow data. Creating prediction models from flow data in basins where water resources are gathered is considered very valuable for hydrology. Therefore, it is thought that an ANN, which is a black-box model, can greatly facilitate solutions as a prediction model that uses observed rainfall and runoff data (Kişi, 2008; Okkan et al., 2018).
In the literature, there are various studies in which ANN methods have been used in rainfall-runoff modeling. In general, these studies have used observed rainfall and runoff data as input values and have forecast the output value from the flow data (Gümüş et al.). The literature also includes runoff prediction studies in which the inputs were selected from various hydrological data such as rainfall, temperature and past flow rates.
The present study investigated the modeling of the rainfall-runoff relationship, which is considered to be an important factor in the development of hydraulic structures. The Nergizlik Dam area in the Seyhan sub-basin, which includes fertile agricultural lands, was chosen as the study area. Observed rainfall and streamflow data were used as input values, and streamflow was the output. Two ANN methods, Feed Forward Back Propagation Neural Networks (FFBPNN) and Generalized Regression Neural Networks (GRNN), were applied, and the results were compared with those of Multiple Linear Regression (MLR) as a conventional method. In addition, to find the optimum ANN configuration, three transfer functions (logarithmic sigmoid, hyperbolic tangent sigmoid and purelin, i.e. linear) were investigated, and different numbers of hidden neurons and different spread values were examined to observe their effects on the models. The aim of this study is therefore to create an effective prediction model for a complex problem such as rainfall-runoff modeling. The applicability of artificial intelligence methods was investigated in order to support preventive measures and the proper management of water resources in drainage basins, such as those with dams, which are among the most affected by global climate change. By analyzing the sensitivity of each ANN method to different variables, the results are expected to contribute to the literature on the choice of an optimum ANN model.

Study area
The Nergizlik Dam, which is located in the southern part of Turkey and extends northwards from the Çukurova region, was built in 1995 on the Üçürge Stream, approximately 50 km from Adana (Fig. 1). The dam meets the agricultural irrigation needs of a 2300-hectare area and can also be used for flood prevention. The volume of this earth-fill dam is approximately 14.74×10⁶ m³; its height from the riverbed is 70 m, the lake volume at normal water elevation is 21.80 hm³ and the lake area at normal water level is 1.08 km² (Turkish State Hydraulic Works (known locally as DSI), 2011). In the Seyhan sub-basin, where the Nergizlik Dam is located, the Mediterranean and continental climates are dominant. Owing to these climate characteristics, winters are generally rainy where the Mediterranean climate prevails, whereas snowfall is observed where the continental climate prevails.
The Seyhan Basin is one of the basins that will be significantly affected by drought due to climate change. Some studies have projected that monthly average temperatures in the Seyhan Basin will increase up to 30°C and that the annual precipitation amount will decrease by 25%. Depending on this decrease in precipitation, a 14% increase in potential evapotranspiration and a 17% decrease in actual evapotranspiration have been predicted. A significant reduction of up to 30% is expected in surface water resources, snow storage and groundwater potential. It is therefore predicted that climate change will cause a decrease in the water resources of the Seyhan Basin (Özfidaner et al., 2018; Gümüş, 2019).
In the study, the rainfall data from Karaisalı and Çatalan, which are the two rainfall observation stations (ROS) closest to the Nergizlik Dam site, were evaluated for the period between 1992 and 2011. The data for this period were obtained from the DSI and the Turkish State Meteorology Works (known locally as DMI). The average monthly flow values obtained from Flow Observation Stations (FOS) and the average monthly rainfall values from the ROSs were used. As can be seen in Table 1, the data from FOS nos 1820 and 1828 (Körkün Suyu-Hacılıköprüsü and Çakıt Suyu-Salbaş) and ROS nos 17351 and 17936 (Karaisalı and Çatalan) were utilized; the locations of the stations can be seen in Figure 2.
Regression-correlation analysis can be performed to observe the relationship between two or more variables and to determine its strength (Gümüş et al., 2013; Bakış and Göncü, 2015; Akçakoca and Apaydın, 2020). The hydrological relationship between the specified FOSs and ROSs was found to be quite robust, since the correlation coefficient computed with MS-Excel was close to 1. The observed monthly average flow values are denoted $Q_t$ and the rainfall values $P_t$; the streamflow values one time step before and after t are denoted $Q_{t-1}$ and $Q_{t+1}$, and the rainfall values one time step before and after t are denoted $P_{t-1}$ and $P_{t+1}$, respectively. Various artificial neural network architectures were used to model rainfall and runoff data at specific times, and the resulting streamflow estimates were compared.
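The lagged notation and the correlation check described above can be sketched in a few lines of Python. This is only an illustration: the function names `pearson_r` and `lagged_samples`, the particular input combination, and the sample values are assumptions and are not taken from the study.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def lagged_samples(P, Q):
    """Pair the inputs (P_{t-1}, P_t, Q_{t-1}) with the target Q_t.

    This input combination is one hypothetical example; the study
    tested several combinations.
    """
    inputs, targets = [], []
    for t in range(1, len(Q)):
        inputs.append((P[t - 1], P[t], Q[t - 1]))
        targets.append(Q[t])
    return inputs, targets

# Hypothetical monthly values for illustration only (not the study's data).
P = [120.0, 95.0, 60.0, 30.0, 10.0, 5.0]  # rainfall
Q = [14.0, 11.5, 8.0, 4.5, 2.0, 1.2]      # streamflow

r = pearson_r(P, Q)
X, y = lagged_samples(P, Q)
print(round(r, 3), X[0], y[0])
```

A coefficient close to 1 for a station pair would correspond to the robust relationship reported in the text.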

Artificial Neural Networks (ANN)
ANN models are computer applications that perform learning, association, classification, generalization, and optimization processes based on available data (Sattari et al., 2007; Okkan and Mollamahmutoğlu, 2010). To create these models, many methods such as FFBPNN, GRNN and RBFN can be used (Gümüş and Kavşut, 2013; Gümüş et al., 2018). The general characteristic of all methods is that they contain input, hidden and output layers (Ustaoğlu et al., 2008).
There is no limit to the number of hidden (intermediate) layers, and the outputs of the neurons in one layer are presented to the next layer as weighted input values. The information presented to the input layer as the input vector is multiplied by weight coefficients and transmitted to the neurons in the hidden layer (Gümüş et al., 2018; Üneş et al., 2019). The output of the network is then obtained by applying further processing to the information in the hidden and output layers (Fig. 3). This gives feed-forward network models a non-linear architecture. Backpropagation is a learning algorithm, not a programming language: it adjusts the weights so that the network can form a continuous function. Determining the weight values of the neural network is therefore essential for the FFBPNN. The net function, found as the sum of the weighted input values, expresses the effect of the input data on a neuron. These inputs are transferred to the output layer with the help of functions called transfer or activation functions (Yıldıran and Kandemir, 2018). In this study, three different transfer functions were examined. The GRNN differs from the FFBPNN in that it does not require an iterative training process (Seçkin et al., 2013). The general schematic structures of the FFBPNN and GRNN are shown in Figure 4 and Figure 5.
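To make the roles of the net function and the transfer function concrete, the following is a minimal sketch of a single forward pass through a network with one hidden layer. The weights are arbitrary placeholders, the linear output neuron is an assumption, and the training step (backpropagation) is deliberately left out; the names `logsig`, `tansig` and `purelin` follow the common naming of these three functions.

```python
import math

def logsig(n):
    """Logarithmic sigmoid: maps the net input to (0, 1)."""
    return 1.0 / (1.0 + math.exp(-n))

def tansig(n):
    """Hyperbolic tangent sigmoid: maps the net input to (-1, 1)."""
    return math.tanh(n)

def purelin(n):
    """Linear transfer function: returns the net input unchanged."""
    return n

def forward(x, hidden_w, hidden_b, out_w, out_b, transfer):
    """One forward pass: inputs -> hidden layer -> single linear output."""
    hidden = []
    for w_row, b in zip(hidden_w, hidden_b):
        net = sum(w * xi for w, xi in zip(w_row, x)) + b  # weighted sum
        hidden.append(transfer(net))
    # Assumption: the output neuron is linear.
    return sum(w * h for w, h in zip(out_w, hidden)) + out_b

# Hypothetical (5, 1, 1) network: 5 inputs, 1 hidden neuron, 1 output.
x = [0.2, 0.4, 0.1, 0.5, 0.3]
hidden_w = [[0.1, -0.2, 0.3, 0.05, 0.4]]
hidden_b = [0.0]
out_w = [1.5]
out_b = 0.1

for f in (logsig, tansig, purelin):
    print(f.__name__, forward(x, hidden_w, hidden_b, out_w, out_b, f))
```

Only the hidden-layer transfer function changes between the three runs, which is the kind of comparison the sensitivity analysis in this study performs.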
The Levenberg-Marquardt algorithm, which is accepted as an advanced type of network training algorithm, was used in this study; it works with an approximation of the Hessian matrix rather than computing the full matrix. The models were also tested with different numbers of hidden-layer neurons. The number of outputs was taken to be 1. Model performance was evaluated with the mean absolute relative error (MARE) and the coefficient of determination (R²), which are defined in Eq. 1 and Eq. 2, respectively.
In Eqs. 1 and 2, $Q_{measured}$ is the observed streamflow, $Q_{calculated}$ is the streamflow obtained from the model and N is the total number of data (Riad et al.).
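As a sketch of how the two criteria might be computed, the code below assumes commonly used forms: MARE as the mean of |Q_measured − Q_calculated| / Q_measured expressed as a percentage, and R² as one minus the ratio of the residual sum of squares to the total sum of squares. These exact forms are an assumption and may differ in detail from Eqs. 1 and 2.

```python
def mare(measured, calculated):
    """Mean absolute relative error in percent (assumed form).

    Measured values must be non-zero, otherwise the ratio is undefined.
    """
    n = len(measured)
    total = sum(abs(m - c) / abs(m) for m, c in zip(measured, calculated))
    return 100.0 * total / n

def r_squared(measured, calculated):
    """Coefficient of determination, 1 - SS_res / SS_tot (assumed form)."""
    mean_m = sum(measured) / len(measured)
    ss_res = sum((m - c) ** 2 for m, c in zip(measured, calculated))
    ss_tot = sum((m - mean_m) ** 2 for m in measured)
    return 1.0 - ss_res / ss_tot

# Hypothetical flows for illustration only.
q_measured = [10.0, 20.0, 40.0]
q_calculated = [11.0, 18.0, 40.0]
print(mare(q_measured, q_calculated), r_squared(q_measured, q_calculated))
```

With these forms, a perfect prediction gives a MARE of 0 and an R² of 1, matching the interpretation used when the models are compared.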

Multiple Linear Regression (MLR)
The purpose of multiple linear regression is to predict the value of the dependent variable from the independent variables and to find its relationship with them (Ramana, 2014; Shoaib et al., 2018). When the dependent variable y is represented by the arguments $x_1, x_2, \dots, x_r$, the relationship between them can be written as Eq. 3:

$$y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \dots + \beta_r x_r \qquad (3)$$

where the unknowns $\beta_0, \beta_1, \beta_2, \dots, \beta_k, \dots, \beta_r$ are called regression coefficients.
If observed data are missing at a meteorological station, the data of nearby stations can be used to fill the gaps (Bayazıt, 2003; Turhan, 2012). The unknown rainfall at a station can be estimated from annual average rainfall data with Eq. 4. In Eq. 4, $N_A$, $N_B$ and $N_C$ are the annual average rainfall values at the three nearest stations, $P_A$, $P_B$ and $P_C$ are the rainfall values recorded at those stations for the missing period, $N_X$ is the annual average rainfall at the station with missing data and $P_X$ is the unknown rainfall at that station (Turhan, 2012):

$$P_X = \frac{1}{3}\left(\frac{N_X}{N_A}P_A + \frac{N_X}{N_B}P_B + \frac{N_X}{N_C}P_C\right) \qquad (4)$$

Missing data from the Karaisalı and Çatalan ROSs were completed using Eq. 4 and made suitable for modeling. Different ANN architectures were then formed from several combinations of the resulting rainfall and streamflow values. Of the available data, 60% was used at the training stage and 40% at the testing stage. In other words, the first 60% of the data set was used for training as a block and the remaining part was used for testing. These proportions can vary depending on the scope of the study (Nacar et al., 2017; Akçakoca and Apaydın, 2020).
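The missing-data step can be sketched as follows, assuming that Eq. 4 is the standard normal-ratio estimate in which each neighbouring station's rainfall is rescaled by the ratio of annual averages and the results are averaged. The function name `normal_ratio` and the numbers are hypothetical.

```python
def normal_ratio(n_x, neighbours):
    """Estimate missing rainfall at station X by the normal-ratio method.

    n_x        -- annual average rainfall at the station with missing data
    neighbours -- list of (annual average N_i, rainfall P_i for the missing
                  period) for the nearby stations
    Assumed form: P_X = (1 / k) * sum(N_X / N_i * P_i), with k neighbours.
    """
    return sum(n_x / n_i * p_i for n_i, p_i in neighbours) / len(neighbours)

# Hypothetical values for three neighbouring stations A, B and C.
stations = [(1000.0, 50.0), (800.0, 40.0), (400.0, 30.0)]
p_x = normal_ratio(800.0, stations)
print(p_x)
```

A station that is normally wetter than its neighbour receives a proportionally larger estimate, which is the reason for weighting by the annual averages instead of taking a plain mean.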

RESULTS AND DISCUSSIONS
Rainfall-runoff models were created with different combinations of the monthly average flow values from FOS nos 1820 and 1828 at t, t−1 and t+1 and the monthly average rainfall values from the Karaisalı and Çatalan ROSs (nos 17936 and 17351) at t, t−1 and t+1. A flow chart of the developed ANN model can be seen in Fig. 6.
The time index t runs from 1992 to 2011, and the rainfall data were scaled to the range between 0.10 and 0.90. Of a total of 236 data points, 142 were used at the training stage and 94 at the testing stage. The evaluated data are given in Table 2. To show how closely the FFBPNN and GRNN results converge to those of the MLR method, the models were run and the results are presented.
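The preprocessing described above can be sketched as follows. The text gives only the target range and the sample counts, so the min-max formula in `scale` and the rounding rule in `chronological_split` are assumptions, as are the function names.

```python
def scale(values, lo=0.10, hi=0.90):
    """Linearly rescale values to [lo, hi] (assumed min-max form)."""
    v_min, v_max = min(values), max(values)
    if v_max == v_min:
        raise ValueError("cannot scale a constant series")
    return [lo + (hi - lo) * (v - v_min) / (v_max - v_min) for v in values]

def chronological_split(samples, train_fraction=0.60):
    """Use the first block for training and the remainder for testing."""
    n_train = round(len(samples) * train_fraction)
    return samples[:n_train], samples[n_train:]

# Placeholder series with the same length as the study's data set.
series = list(range(236))
train, test = chronological_split(series)
print(len(train), len(test))

scaled = scale([0.0, 5.0, 10.0])
print(scaled)
```

Splitting by position instead of at random keeps the time order of the monthly records, so the test block always lies after the training block.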
Prediction results were compared according to the MARE and R² criteria. The closer the MARE value is to 0 and the R² value is to 1, the more closely the prediction converges to the observed value. In general, the ANN methods approximated the MLR method very closely, and some models yielded slightly better results. The ANN architecture that gave the best results had 5 inputs, and the optimum results were obtained with 1 hidden-layer neuron (Fig. 7).
The best results obtained at the training and testing stages for the FFBPNN, GRNN, and MLR models can be seen in Table 3. The R² values indicate agreement with the observed data. A value between 85% and 100% means that the model performance is suitable.
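For reference, the MLR baseline used in the comparison can be sketched as ordinary least squares solved through the normal equations. This is a generic illustration with hypothetical data and helper names (`solve`, `fit_mlr`, `predict_mlr`); it is not the software used in the study.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [rhs] for row, rhs in zip(A, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        tail = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - tail) / M[i][i]
    return x

def fit_mlr(X, y):
    """Least-squares coefficients [b0, b1, ..., br] from (X^T X) b = X^T y."""
    rows = [[1.0] + list(x) for x in X]  # leading 1 gives the intercept b0
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)

def predict_mlr(beta, x):
    """Evaluate y = b0 + b1*x1 + ... + br*xr."""
    return beta[0] + sum(b * xi for b, xi in zip(beta[1:], x))

# Hypothetical data generated from y = 1 + 2*x1 + 0.5*x2.
X = [(1.0, 2.0), (2.0, 1.0), (3.0, 4.0), (4.0, 3.0), (5.0, 5.0)]
y = [1.0 + 2.0 * a + 0.5 * b for a, b in X]
beta = fit_mlr(X, y)
print(beta, predict_mlr(beta, (6.0, 2.0)))
```

Because the model is linear in its coefficients, the fit has a closed-form solution and needs no iterative training, which is what makes it a convenient baseline for the ANN models.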
Since the MARE indicates the average error, a value closer to zero indicates a lower error rate (Sattari et al., 2012; Yaseen et al., 2016; Vidyarthi et al., 2020; Song et al., 2020). In Table 3, it is noteworthy that the FFBPNN method in particular produces results very similar to those of the MLR. Although the MARE values of all methods are close to each other, a decrease was observed in some models. Although the MLR yielded slightly better results, the FFBPNN generally provided the best approach among the ANN methods. This result is consistent with previous studies in the literature (Gümüş and ...).
To test the sensitivity of the models, simulations with three transfer functions (Logsig, Tansig, and Purelin) were carried out. The results are shown in Table 4.
Table 4 shows that the results for the three transfer functions are similar. The results are also very close to those of the MLR method, which is a conventional and widely accepted method. Furthermore, to test the consistency of the GRNN and FFBPNN methods, the effect of varying the number of hidden neurons and the spread value was investigated for Model-9. The related graphs can be seen in Fig. 8.
When the graphs are examined, the spread value tends to increase with a few exceptions.
Fig. 7. The most suitable ANN architecture (5, 1, 1)
The most suitable relationship network with regard to the R² and MARE values was obtained with the FFBPNN in Model-9 for ROS no 17936 and FOS nos 1820 and 1828, and it was consistent with the results of the other methods. Even though the GRNN achieved a very high R² value during the training phase, it converged less well in testing than the other methods. Logsig, which had a lower error rate than the other transfer functions, produced the optimal values.
In the GRNN models, the spread value of 0.03 yielded the lowest error rate. In the models run with different numbers of hidden neurons, the value of 1 produced the highest R² value and the lowest error rate. The error rates were observed to increase at the turning points of the curves; however, compared with the GRNN, the FFBPNN results were closer to the y=x (exact) line, and the MLR results were concentrated around certain points. This may be because the MARE values of the GRNN results are further from zero even though the R² value is high (Figure 10).
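The role of the spread value can be illustrated with a minimal GRNN sketch, in which a prediction is a kernel-weighted average of the training targets. The Gaussian kernel form used here is an assumption (software packages scale the spread differently), and the training points are hypothetical.

```python
import math

def grnn_predict(x, train_x, train_y, spread):
    """Predict as a Gaussian-weighted average of the training targets.

    A small spread makes the nearest training points dominate; a large
    spread pulls the prediction towards the mean of all targets.
    """
    weights = []
    for xi in train_x:
        d2 = sum((a - b) ** 2 for a, b in zip(x, xi))  # squared distance
        weights.append(math.exp(-d2 / (2.0 * spread ** 2)))
    total = sum(weights)
    if total == 0.0:
        # All kernels underflowed: x is too far from every training point.
        raise ValueError("spread too small for this input")
    return sum(w * t for w, t in zip(weights, train_y)) / total

# Hypothetical scaled training data (one input, one target).
train_x = [(0.1,), (0.5,), (0.9,)]
train_y = [0.2, 0.6, 0.8]

narrow = grnn_predict((0.5,), train_x, train_y, spread=0.03)
wide = grnn_predict((0.5,), train_x, train_y, spread=10.0)
print(narrow, wide)
```

No iterative weight update is involved: the training data are stored and the spread is the only tuning parameter. This is consistent with a GRNN fitting the training set very closely at a small spread while generalizing less well in testing.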

CONCLUSIONS
In the present study, the relationship between monthly average streamflow data from FOS nos 1820 and 1828 and monthly total precipitation (rainfall) data from ROS nos 17351 and 17936 was investigated using the FFBPNN and GRNN for one of the irrigation dams in the Seyhan sub-basin. The results were compared with those of the MLR method. Of the observed data, 60% was used in the training process, while the remaining 40% was used only in the testing phase. After many modeling runs carried out to find the best approach, rainfall and streamflow values were taken as inputs and the flow value was estimated with these ANN architectures.
To evaluate their performance, three different transfer functions, Logarithmic Sigmoid (Logsig), Hyperbolic Tangent Sigmoid (Tansig) and Purelin (Linear), were used in modeling these network structures. To test the sensitivity of the FFBPNN and GRNN methods, the effects of the number of hidden neurons and of variations in the spread value were also examined. Consequently, (5,1,1) was found to be the most suitable ANN architecture for this study.
It can be concluded that streamflow estimation models built with ANNs can successfully model the non-linear rainfall-runoff relationship of river basins. Based on the application of the ANN methods to the Nergizlik Dam area, the FFBPNN method is thought to be a good alternative for investigating this relationship. Using more hydrological data can strengthen the contribution of autocorrelation to the streamflow estimate and in this way further improve the performance of the models. The results already show a high correlation and low error rates. Thus, it was concluded that the ANN input data used in this study were necessary and sufficient to model the Nergizlik Dam inflows. ANN methods can also provide significant advantages, especially in supporting decision-making processes, when water resources need to be planned and managed appropriately and sustainably.