Flow Rate Determination as a Function of Rainfall for the Ungauged Suhareka River

For ungauged rivers, where there are no hydrological measurements and no data on perennial flow rates, the flow rates have to be determined from other hydrological data. The Suhareka River catchment is such a case. Since there are no data on the Suhareka's flow rates, the authors of this study aimed to determine the flow rate from rainfall measurements. From the available data on annual precipitation (monthly sums) provided by the Kosovo Hydrometeorological Institute for the Suhareka hydrometric station, the observed monthly rainfall data for 30 years were analysed. Gaps in the record were initially filled by connecting the hydrometric station in Suhareka with those of Prishtina, Prizren and Ferizaj, and as a result a fairly good fit was ensured. Moreover, the intensity-duration-frequency curves were formed using the expression of Sokolovsky as a mathematical model of the dependence I (T, P). To transform rainfall into flow, the American SCS method was used. As a result, the equation for the Suhareka River basin was derived, which enabled the determination of maximum inflows for different return periods. The results obtained in this paper indicate that even for ungauged river basins the peak flows can be determined from available rainfall data.


INTRODUCTION
For the water management process, data on river flow rates are of utmost importance. Therefore, the determination of river flow rates is a focus of every hydrological research. Runoff plots are a very important means of monitoring runoff and soil loss (Baoyan et al., 2017). Unfortunately, many catchments are ungauged, and thus there are limitations for flood calculation using rainfall-runoff models (Nam & Shin, 2018). Hydrological modeling is instrumental both for scientific application and for providing public services (Kolbjorn & Alfredsen, 2020).
When data on river flow rates are lacking and many parameters of catchment properties are missing, another approach must be considered. The measurement model states, in the form of equations, the relationship between the measurements and the true value of interest (McMillan et al., 2018; Nearing et al., 2016). Where flow rate measurements are lacking, the relationship between rainfall and flow can help to overcome this deficiency. The flow forecasting investigated in this paper can rely solely on the available rainfall data.
The Suhareka municipality is in the southern part of Kosovo and is characterised by a continental climate with a Mediterranean influence. The Suhareka River, flowing through this region, is a small river with large flow oscillations throughout the year, meaning that the Q max /Q min flow ratio is quite high. The Suhareka River is also known as a dry river, but since 2015 it has flooded the downstream areas about three times, damaging the agricultural and industrial activity in its vicinity. Therefore, integrated flood management is a priority in this area, and it requires knowledge of the probabilistic flow duration curves. Depending on the availability of hydrologic data, there are several methods that can be used in this regard. The dependence of rain intensity (i) or rain height (H) on duration time (tk) and return period (P) can be determined at a pluviograph station that has a long observation period, as a final product of statistical analyses of rain (KHMI, 1984). Pluvial floods are dynamic processes influenced by rain intensity and its distribution, catchment area, river flow density, catchment land use, soil geologic structure, soil moisture content, soil infiltration rate, etc. For identifying the rainfall-runoff relationship, the most common method is the SCS method, which is based on the dynamic interaction between rain intensity, soil infiltration and surface runoff.
Laura Kusari 1, Lavdim Osmanaj 1*, Hana Shehu 1, Samir Bungu 1
1 Faculty of Civil Engineering, University of Pristina, Kosovo

FLOW TREND DYNAMICS AND DISTRIBUTION FUNCTIONS FOR EXTREME RAINFALL
The available rainfall data for the 30-year period have some gaps; therefore, the statistical population parameters will be evaluated on the basis of the representative group. As is known, the larger the representative group, the smaller the errors in the estimation of the population parameters, and vice versa. From the population, we have the yearly rainfall series for 80 years, including those years when the data are missing. The representative group chosen is a 30-year series (Table 1).
In order to evaluate the arithmetic mean of the population from the representative group, the standard deviation of the population (SDP) (Bektesh, 2005) has to be found first from equation (1), where: σ - standard deviation of the representative group; n - number of cases in the group.
On the basis of equation (1), the standard deviation of the population is obtained, and from it the standard deviation of the arithmetic mean (SDXavg). These results give the boundary interval of the arithmetic mean of the population. For a probability coefficient of 95%, it was found that the arithmetic mean of the population lies between 759.47 mm and 830.19 mm.
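The interval estimate described above can be illustrated with a minimal Python sketch. It assumes the textbook formulas (sample standard deviation with the n - 1 denominator, standard error σ/√n, and the factor 1.96 for 95% probability), which may differ in detail from the expressions used in the paper; the rainfall values are hypothetical and are not the Suhareka series.

```python
import math
import statistics


def mean_confidence_interval(sample, z=1.96):
    """Return (mean, standard error, lower bound, upper bound).

    Assumes a normal approximation; z = 1.96 corresponds to 95%.
    """
    n = len(sample)
    mean = statistics.fmean(sample)
    # statistics.stdev uses the n - 1 denominator (population estimate)
    s = statistics.stdev(sample)
    std_error = s / math.sqrt(n)
    return mean, std_error, mean - z * std_error, mean + z * std_error


# Hypothetical annual rainfall totals in mm (illustration only)
annual_mm = [712.0, 845.5, 798.2, 690.4, 910.3, 765.8, 802.1, 733.6]
mean, se, lower, upper = mean_confidence_interval(annual_mm)
print(f"mean = {mean:.2f} mm, 95% interval = [{lower:.2f}, {upper:.2f}] mm")
```

The width of the interval shrinks with √n, which is why a larger representative group gives a tighter estimate of the population mean.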
The standard error of the standard deviation (SEoSD) (Bektesh, 2005) is also determined. In theory and scientific research, trends are calculated as a regression, with the following formula:

$$y = a + b x$$

Parameter (a) shows the mean value of the time series, parameter (b) shows the mean change of the phenomenon under study, and (x) are the standard values assigned to each year (Figure 1).
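A linear trend of this kind can be sketched in a few lines of Python. The sketch assumes that the "standard values" of x are centred time codes (so that their sum is zero), in which case the intercept a equals the mean of the series; this coding is an assumption, and the data are hypothetical.

```python
def linear_trend(values):
    """Fit y = a + b*x with centred time codes x (sum of x is zero).

    Returns (a, b, x). With centred codes, a is the series mean and
    b is the mean change per time step.
    """
    n = len(values)
    x = [i - (n - 1) / 2 for i in range(n)]  # e.g. -2, -1, 0, 1, 2
    a = sum(values) / n
    b = sum(xi * yi for xi, yi in zip(x, values)) / sum(xi * xi for xi in x)
    return a, b, x


# Hypothetical annual rainfall totals in mm (illustration only)
series = [780.0, 742.5, 801.3, 760.9, 815.4, 790.2, 770.8]
a, b, x = linear_trend(series)
print(f"a = {a:.2f} mm, b = {b:.2f} mm per year")
```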

Regression and correlation analyses
In scientific research, regression analysis is a very useful method of identifying the correlation between two or more variables or phenomena. The correlation between two variables is a bivariate correlation, while the correlation between three or more is known as a complex or multivariate correlation.
When there is no linear function with the required correlation coefficient for (y) as the dependent variable from a single independent variable (x), a possible correlation can be tested through multiple linear regression. In double linear regression, the result can be stated as a line $y = b_0 + b_1 x$, where $b_0$ depends on (z) (Maniak, 2010). Thus, a relationship $y = y(x_i; z_i)$ is obtained:

$$y = b_0 + b_1 x + b_2 z$$

To determine the coefficients $b_0$, $b_1$ and $b_2$, the sum of squared deviations S should be kept at a minimum:

$$S = \sum_{i=1}^{n} (y_i - b_0 - b_1 x_i - b_2 z_i)^2 \rightarrow \min$$

By partial differentiation of this equation with respect to $b_0$, $b_1$ and $b_2$, the 3x3 equation system is obtained:

$$\begin{aligned} n\,b_0 + b_1 \sum x_i + b_2 \sum z_i &= \sum y_i \\ b_0 \sum x_i + b_1 \sum x_i^2 + b_2 \sum x_i z_i &= \sum x_i y_i \\ b_0 \sum z_i + b_1 \sum x_i z_i + b_2 \sum z_i^2 &= \sum z_i y_i \end{aligned}$$

Using the corresponding expressions for $\Delta x_i$ and $\Delta z_i$ and following some transformations, the needed coefficients are obtained (equation 12). The degree of determination is $B = r_{yxz}^2$. In the considered case, a triple regression was obtained, connecting the hydrometric station of Suhareka (y) with those of Prizren (x), Ferizaj (z) and Prishtina (t) (Maniak, 2010). From these analyses of variance, an important, but not sufficient, correlation was obtained (Table 2, Figure 2).
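The least-squares step can be illustrated with a short pure-Python sketch for two predictors. It builds the 3x3 system of normal equations from the sums of the data and solves it by Gaussian elimination; the function names and the data are hypothetical, and the degree of determination is computed here as 1 - SS_res/SS_tot, which is an assumption about the form used in the paper.

```python
def solve_linear_system(matrix, rhs):
    """Solve a small dense system by Gaussian elimination with pivoting.

    No check for singular systems is made (collinear predictors fail).
    """
    n = len(rhs)
    m = [row[:] + [r] for row, r in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    solution = [0.0] * n
    for i in reversed(range(n)):
        known = sum(m[i][j] * solution[j] for j in range(i + 1, n))
        solution[i] = (m[i][n] - known) / m[i][i]
    return solution


def fit_two_predictors(x, z, y):
    """Fit y = b0 + b1*x + b2*z; return (b0, b1, b2, B)."""
    n = len(y)
    sx, sz, sy = sum(x), sum(z), sum(y)
    sxx = sum(v * v for v in x)
    szz = sum(v * v for v in z)
    sxz = sum(a * b for a, b in zip(x, z))
    sxy = sum(a * b for a, b in zip(x, y))
    szy = sum(a * b for a, b in zip(z, y))
    normal = [[n, sx, sz], [sx, sxx, sxz], [sz, sxz, szz]]
    b0, b1, b2 = solve_linear_system(normal, [sy, sxy, szy])
    y_mean = sy / n
    ss_res = sum((yi - (b0 + b1 * xi + b2 * zi)) ** 2
                 for xi, zi, yi in zip(x, z, y))
    ss_tot = sum((yi - y_mean) ** 2 for yi in y)
    return b0, b1, b2, 1.0 - ss_res / ss_tot


# Hypothetical annual rainfall (mm) at two neighbouring stations and a target
x_station = [640.0, 712.0, 580.0, 805.0, 690.0, 745.0]
z_station = [598.0, 655.0, 610.0, 720.0, 640.0, 701.0]
y_station = [701.0, 788.0, 665.0, 860.0, 742.0, 810.0]
b0, b1, b2, B = fit_two_predictors(x_station, z_station, y_station)
print(f"y = {b0:.2f} + {b1:.3f} x + {b2:.3f} z, B = {B:.3f}")
```

A third predictor, as in the triple regression of the paper, would extend the system to 4x4 in the same way.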
This analysis yields the linear regression equation for the Suhareka station. The inadequacy of this relationship is also indicated by the high values of the t-test and the low values of the F-test, as well as by the P-value, which should be below the 5% probability level (<0.05). However, the cause of the failure of a linear regression is usually the nonlinearity between the variables.

Nonlinear regression and transformations
The cause of failure of a linear regression is mainly nonlinearity between the variables (Maniak, 2010; Husno, 2007). When using a nonlinear regression, the curve function is often unknown (Karakuş, 2020). As in the case of linear regression, the Suhareka hydrometric station was connected with all three other stations, this time through multiple nonlinear regression. The nonlinear triple regression equation has the form

$$y = a_0 \, x_1^{a_1} \, x_2^{a_2} \, x_3^{a_3}$$

or

$$\log y = \log a_0 + a_1 \log x_1 + a_2 \log x_2 + a_3 \log x_3$$

where, after determining the coefficients $a_0$, $a_1$, $a_2$ and $a_3$, the same takes the form:

$$\log y = \log 2.02 + 0.384 \log x_1 + 0.358 \log x_2 + 0.172 \log x_3$$

or after anti-logarithm:

$$y = 2.02 \, x_1^{0.384} \, x_2^{0.358} \, x_3^{0.172}$$

The results of these calculations are shown in Figure 3.

Distribution functions for extreme rainfall
At first glance, the individual cases that produce mass phenomena appear irregular and chaotic, but when they are studied scientifically, genuine regularities, principles and laws emerge. These rules, principles and laws are best revealed by the law of large numbers of Laplace and Gauss.
The normal distribution is a symmetric two-parameter distribution with density function:

$$f(x) = \frac{1}{\sigma \sqrt{2\pi}} \exp\left[-\frac{(x - x_m)^2}{2\sigma^2}\right]$$

For standardization, take the standard variable $k = (x - x_m)/\sigma$; for $x_m = 0$, $\sigma = 1$ we have:

$$f(k) = \frac{1}{\sqrt{2\pi}} \, e^{-k^2/2}$$

The normal distribution function for calculating the non-exceedance probability is:

$$P_u(x) = \int_{-\infty}^{x} f(x)\,dx$$

The normal distribution is represented by the area under the density function, which can be formed by the range $x_m \pm k\sigma$ arranged symmetrically about the centre. The range $x_m \pm \sigma$ contains 68.26% of all cases.
The symmetry of the normal distribution is used to represent the distribution as a straight line in a suitable probability diagram. The straight line is easily determined by points, i.e. the average x m at P u = 50% and the values x m ± σ, which correspond to 84.13% and 15.87% (or about 1/6 of all values). With a linear division of the axes, the distribution function P (x) plots as an S-shaped curve rather than a straight line. From here the distribution parameters follow, while due to symmetry Cs = 0. The normal distribution function takes the following values: for P u = 50% → x = x m = 764.47 mm; for P u = 84.13% → x = x m + σ = 895 mm; and for P u = 15.87% → x = x m - σ = 633.9 mm. Table 3 presents these calculations for different probability factors (Figure 4).
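The quantiles quoted above can be checked with Python's standard library. The sketch takes x m = 764.47 mm from the text and infers σ from the stated value x m + σ = 895 mm; this inferred σ (about 130.5 mm) is an assumption, not a value reported by the authors.

```python
from statistics import NormalDist

x_m = 764.47            # mean annual rainfall from the text, mm
sigma = 895.0 - x_m     # inferred from x_m + sigma = 895 mm (assumption)
rain = NormalDist(mu=x_m, sigma=sigma)

# Rainfall heights for selected non-exceedance probabilities P_u
for p_u in (0.1587, 0.50, 0.8413):
    print(f"P_u = {p_u:.2%}: x = {rain.inv_cdf(p_u):.1f} mm")

# Share of cases inside x_m +/- sigma and x_m +/- 3*sigma
share_1s = rain.cdf(x_m + sigma) - rain.cdf(x_m - sigma)
share_3s = rain.cdf(x_m + 3 * sigma) - rain.cdf(x_m - 3 * sigma)
print(f"within 1 sigma: {share_1s:.4f}, within 3 sigma: {share_3s:.4f}")
```

The same object gives the rainfall height for any other probability factor through `inv_cdf`, which is how a table such as Table 3 could be reproduced under the normal assumption.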
Within the range ± 3σ, which includes 99.73% of all cases (0.9973 * 38 = 37.897), almost all values are contained. Although only 0.27% of all values lie outside the bounds x = x m ± 3σ, it is still a shortcoming of the normal distribution that its smallest value -∞ is not physically meaningful. To obtain only the positive range of cases, instead of x i we take y i = log x i or y i = ln x i (log-normal, Galton or Fechner distribution). In this case the function is defined in the interval (0, ∞).

Autoregressive models for simulating monthly precipitation
The time series models used are often based on the equation known as the first-order equation of Markov models.
For the rainfall of a season or a year with average μ x and autocorrelation coefficient ρ with time shift 1, a first-order recursion relation applies. The recursion link for the generation of synthetic time series is the Fiering model. The application of the Fiering model to a normally distributed group with mean μ, standard deviation σ and autocorrelation coefficient r i is given by the recursion expression of Maniak. To establish the recursion equation, the average monthly rainfall is calculated first, followed by the variance of the individual (monthly) time intervals t and the autocorrelation coefficient. The general form of the equation suitable for use is:

$$q(i, j) = \mu_j + r_j \, \frac{\sigma_j}{\sigma_{j-1}} \, \left[ q(i, j-1) - \mu_{j-1} \right] + t_i \, \sigma_j \sqrt{1 - r_j^2}$$

where: q (i, j) - the precipitation generated in series (i) in the j-th time interval, e.g. for t = 1 month we have j = 1, 2, …, 12; μ j - monthly precipitation average (j) (j ≤ 12); σ j - monthly standard deviation (j); r j - the correlation coefficient between q j and q j-1 ; t i - random numbers normally distributed with μ = 0 and σ = 1.
Figure 5. Normal distribution density function
On the basis of what was said above, the 30-year series of monthly rainfall was taken and the statistical parameters were set as in Table 5.
The model is based on equation (24), from which 12 consecutive equations are written, one for each month. The starting point is the average rainfall in October, 61.33 mm. To generate the synthetic time series, a sequence of random numbers z 1 , z 2 , … was produced. These numbers z i can be taken from a table of random numbers, such as -0.313, -0.951, 0.590 and so on.
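The monthly recursion can be sketched in Python as follows. The sketch assumes the common Thomas-Fiering form of the recursion (mean, lag-one correlation term and a random term scaled by σ j √(1 - r j ²)); the monthly statistics are hypothetical except for the October mean of 61.33 mm taken from the text, and truncating negative values at zero is an added assumption.

```python
import math
import random


def thomas_fiering(mu, sigma, r, n_years, start, noise=None, seed=0):
    """Generate n_years of 12 synthetic monthly values.

    mu, sigma : monthly means and standard deviations (length 12)
    r         : r[j] is the correlation between month j and the month before
    start     : value of the month preceding the first generated month
    noise     : optional callable returning standard normal numbers
    """
    rng = random.Random(seed)
    draw = noise if noise is not None else (lambda: rng.gauss(0.0, 1.0))
    series, previous = [], start
    for _ in range(n_years):
        year = []
        for j in range(12):
            jp = (j - 1) % 12  # index of the preceding month
            q = (mu[j]
                 + r[j] * sigma[j] / sigma[jp] * (previous - mu[jp])
                 + draw() * sigma[j] * math.sqrt(1.0 - r[j] ** 2))
            q = max(q, 0.0)  # assumption: rainfall cannot be negative
            year.append(q)
            previous = q
        series.append(year)
    return series


# Hypothetical statistics ordered November..October; only the October
# mean (last entry, 61.33 mm) comes from the text.
mu = [75.0, 70.0, 55.0, 50.0, 58.0, 62.0, 72.0, 60.0, 48.0, 42.0, 52.0, 61.33]
sigma = [35.0, 33.0, 28.0, 27.0, 30.0, 29.0, 34.0, 31.0, 30.0, 28.0, 32.0, 36.0]
r = [0.15] * 12

synthetic = thomas_fiering(mu, sigma, r, n_years=3, start=61.33, seed=42)
for year in synthetic:
    print(" ".join(f"{v:6.1f}" for v in year))
```

Passing a `noise` callable makes it possible to feed in numbers from a random-number table, as described in the text, instead of the built-in generator.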
However, according to the model, the numbers z i should be converted to gamma-distributed random numbers according to equation (25), where: t i - random numbers normally distributed (0; 1); t g - gamma-distributed random numbers (0; 1; C g ); C sj - coefficient of asymmetry for the individual months.
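One widely used conversion of this kind is the Wilson-Hilferty transformation, sketched below. Whether equation (25) of the paper has exactly this form is an assumption; the function name and the sample values are hypothetical.

```python
def wilson_hilferty(t, cs):
    """Convert a standard normal number t into an approximately
    gamma-distributed number with zero mean, unit variance and skew cs.

    Uses the Wilson-Hilferty approximation; for cs close to zero the
    number is returned unchanged.
    """
    if abs(cs) < 1e-9:
        return t
    return (2.0 / cs) * ((1.0 + cs * t / 6.0 - cs ** 2 / 36.0) ** 3 - 1.0)


# Numbers from a random-number table, as quoted in the text
z_values = [-0.313, -0.951, 0.590]
cs_month = 0.8  # hypothetical coefficient of asymmetry for one month
for z in z_values:
    print(f"t = {z:+.3f} -> t_g = {wilson_hilferty(z, cs_month):+.3f}")
```

The approximation is generally considered adequate for moderate skewness; for strongly skewed months a different transformation may be preferable.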
After converting these numbers, new numbers are obtained according to equation (25); the 12 equations were then laid out for each month and the values were simulated, starting with the first year.
Precipitation from the IDF curves is increased by 8% due to the climatic effect and the non-uniform distribution of precipitation, and the effective precipitation is determined according to equation (33).
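The effective-precipitation step can be illustrated with the standard SCS curve-number relation. It is an assumption that equation (33) has this form; the curve number, the design rainfall and the initial abstraction ratio of 0.2 are hypothetical illustration values, and only the 8% increase comes from the text.

```python
def scs_effective_rainfall(p_mm, cn, ia_ratio=0.2):
    """Effective rainfall (mm) from the SCS curve-number method.

    p_mm     : gross rainfall depth in mm
    cn       : curve number (0 < cn <= 100)
    ia_ratio : initial abstraction as a fraction of retention S
    """
    s = 25400.0 / cn - 254.0      # potential maximum retention, mm
    ia = ia_ratio * s             # initial abstraction, mm
    if p_mm <= ia:
        return 0.0
    return (p_mm - ia) ** 2 / (p_mm - ia + s)


p_idf = 60.0               # hypothetical rainfall depth from an IDF curve, mm
p_design = 1.08 * p_idf    # 8% increase, as stated in the text
p_eff = scs_effective_rainfall(p_design, cn=75.0)  # hypothetical CN
print(f"design rainfall = {p_design:.1f} mm, effective = {p_eff:.1f} mm")
```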
For the Suhareka river basin the following characteristics can be used: L = 15 km; Lc = 7.5 km; i ur ≅ 5% and F = 80.2 km 2 , from which, according to equations (32) and (31), it results that T o = 3.761 h and t p = 3.761 + a * t k . For F = 80.2 km 2 the regression coefficient is a = 0.44. The setting time of the synthetic hydrograph is then defined. Due to the small surface area of the basin, it can be assumed that in the synthetic hydrograph Tp = Tr (rise time = fall time). Finally, from this condition and by adjusting the units in the last equation, the maximum inflows for different return periods are obtained, which are presented in Figure 9.
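The unit adjustment mentioned above can be illustrated with a small Python sketch. It assumes a triangular synthetic hydrograph whose volume equals the effective rainfall over the basin, which may not be the exact condition used in the paper; the values T o = 3.761 h, a = 0.44 and F = 80.2 km² come from the text, while the rainfall duration and effective rainfall depth are hypothetical.

```python
def time_to_peak_h(tk_h, t0_h=3.761, a=0.44):
    """Time to peak t_p = T_o + a * t_k (hours), with values from the text."""
    return t0_h + a * tk_h


def peak_flow_m3s(pe_mm, area_km2, tp_h, tr_h=None):
    """Peak flow of a triangular hydrograph in m^3/s.

    Volume balance: pe * F = 0.5 * Q_max * (Tp + Tr).
    If tr_h is omitted, Tp = Tr is assumed (rise time = fall time).
    """
    if tr_h is None:
        tr_h = tp_h
    volume_m3 = pe_mm * 1e-3 * area_km2 * 1e6   # mm -> m, km^2 -> m^2
    base_s = (tp_h + tr_h) * 3600.0             # hours -> seconds
    return 2.0 * volume_m3 / base_s


tk = 2.0        # hypothetical rainfall duration, h
pe = 20.0       # hypothetical effective rainfall, mm
tp = time_to_peak_h(tk)
q_max = peak_flow_m3s(pe, area_km2=80.2, tp_h=tp)
print(f"t_p = {tp:.3f} h, Q_max = {q_max:.1f} m^3/s")
```

With Tp = Tr the volume balance reduces to Q max = Pe F / (3.6 Tp) when Pe is in mm, F in km² and Tp in hours.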
Finally, the dependence of the flow factor (X) of the area for different durations (h) and different return periods (years) is given. In this case, the effective rainfall is expressed according to a formula in which t is the time in hours (h).

CONCLUSIONS
The aim of this research was to estimate the flow of the ungauged Suhareka River based on the available rainfall data. In the considered case, a triple regression was obtained, connecting the hydrometric station of Suhareka (y) with those of Prizren (x), Ferizaj (z) and Prishtina (t). From these analyses of variance an important, but not sufficient, correlation was obtained. The inadequacy of this relationship is indicated by the high values of the t-test and the low values of the F-test, as well as by the P-value. However, the cause of the insufficient correlation is the nonlinearity between the variables, so after a transformation the Suhareka hydrometric station was connected with the three other stations using nonlinear triple regression analysis. The correlation coefficient in this case is higher than that of the linear regression: r yx1x2x3 ≅ 0.62, which represents a very important correlation between these stations. The next step was the calculation of the distribution functions for extreme rainfall, with the use of the law of large numbers of Laplace and Gauss. For a 30-year series of monthly rainfall data for the hydrometric station in Suhareka, the normal and log-normal distributions as well as the normal distribution density function were calculated. Assuming that the rainfall series follows the Pearson III distribution, and based on the data on the maximum daily rainfall for the Suhareka region, the probability distribution of heavy 24 h precipitation (according to the Pearson-III and log-Pearson-III distributions) for the combined series of maximum rainfall was determined.
According to the Pearson type-III distribution, the maximum daily rainfall heights for different return periods at the Suhareka hydrometric station were calculated.
Statistical testing was performed (according to the χ 2 test) to assess the agreement of the theoretical distribution with the empirical one. The results showed the acceptance of the hypothesis at the α = 5% significance level, i.e. with a statistical certainty of about 95%.
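The idea of the χ 2 goodness-of-fit test can be sketched with Python's standard library. For simplicity the sketch fits a normal distribution instead of the Pearson type-III distribution used in the paper, since the standard library has no Pearson III; the class count, the sample values and the function name are hypothetical, and the critical values are the usual tabulated 5% values.

```python
from statistics import NormalDist, fmean, stdev

# Tabulated chi-square critical values at the 5% significance level
CHI2_CRIT_5PCT = {1: 3.841, 2: 5.991, 3: 7.815, 4: 9.488, 5: 11.070}


def chi_square_normal_fit(sample, n_classes=5):
    """Chi-square test of a fitted normal distribution.

    Uses classes of equal theoretical probability. Returns
    (chi2, degrees of freedom, accepted at 5%). Only class counts
    giving 1 to 5 degrees of freedom are supported.
    """
    fitted = NormalDist(fmean(sample), stdev(sample))
    edges = [fitted.inv_cdf(k / n_classes) for k in range(1, n_classes)]
    observed = [0] * n_classes
    for value in sample:
        observed[sum(value > e for e in edges)] += 1
    expected = len(sample) / n_classes
    chi2 = sum((o - expected) ** 2 / expected for o in observed)
    dof = n_classes - 1 - 2   # two parameters estimated from the sample
    return chi2, dof, chi2 <= CHI2_CRIT_5PCT[dof]


# Hypothetical annual maximum daily rainfall heights in mm
daily_max_mm = [38.2, 45.1, 52.7, 41.3, 60.4, 35.9, 48.8, 55.2, 43.6, 50.1,
                39.7, 47.3, 58.9, 44.4, 36.8, 53.5, 49.6, 42.2, 46.0, 51.4,
                40.5, 57.1, 37.4, 54.3, 45.8, 48.1, 43.0, 50.9, 46.7, 41.9]
chi2, dof, accepted = chi_square_normal_fit(daily_max_mm)
print(f"chi2 = {chi2:.3f}, dof = {dof}, accepted at 5%: {accepted}")
```

The hypothesis is accepted when the computed χ 2 does not exceed the critical value for the given degrees of freedom.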
As a result, the precipitation intensity curves were given for different return periods. Since no flow measurements have been performed in the Suhareka region, the SCS method was used, by which the high waters were determined indirectly through the transformation of rainfall into flow. An equation for the characteristics of the Suhareka Basin was derived. The results obtained in this paper indicate that even for ungauged river basins the peak flows can be determined from the available rainfall data. This will be of great help to water engineers who face many data deficiencies while managing water resources.