Climate Change and its Effect on the Energy Production from Renewable Sources – A Case Study in Mediterranean Region

In terms of climate forecasting, the Mediterranean region is among the most difficult. Its climate is associated with the five major subtropical high-pressure belts of the oceans and is characterized by hot, dry summers and cold, rainy winters. Due to its location in the area, Albania is particularly susceptible to climatic changes. The greatest temperature increases have been observed in summer. More intense, longer-lasting and more frequent heat waves are anticipated in the eastern Mediterranean. The seasonal patterns of precipitation have not changed, but rainfall has become more intense. The effects of climate change have drawn attention to various renewable energy sources, including solar and wind power. In this study, the changes and prospects in average temperature, rainfall, humidity and CO2 emissions, and their impact on energy production, were investigated. Several models were built for the study and prediction of each variable: the Auto Regressive Integrated Moving Average method, the Prophet algorithm, the Elastic-Net Regularized Generalized Linear Model, Random Forest Regression models and the Prophet Boost algorithm. The most appropriate models were used to forecast the indicators over a period of four years. The prediction shows an increase in CO2 emissions, which leads to a decrease in energy production by hydropower. These findings suggest the use of other renewable sources for energy production in the country and the Mediterranean region.


INTRODUCTION
Climate change has had an impact not only on life processes but also on different areas of the economy. Research studies have shown that the world's surface air temperature increased by an average of 0.6°C during the last century. This process has been accelerated by the burning of fossil fuels, associated with industrial development and population growth, which releases carbon dioxide and other greenhouse gases into the atmosphere. Climatic factors such as average temperature, precipitation and CO2 emissions are linked with economic and financial developments in different climatic regions. The most significant hazards associated with climate change are extreme temperatures, windstorms, droughts and floods. Energy is one of the most vulnerable sectors: these hazards are directly linked to the production of electricity from renewable sources, especially hydropower plants (HPP).
Since energy production by HPP depends mainly on water inflow from rivers, snowmelt and rainfall, the average temperature and CO2 emissions may directly affect the amount of energy produced by hydropower. Hydropower is one of the most favourable sources of energy, with advantages such as low cost and the absence of greenhouse gas generation.
The seasonal nature of some of the climatic factors that affect the amount of energy produced by HPP requires predictive models that can capture both visible and hidden seasonality. Short-term and long-term prediction scenarios of climate change and its effects have been studied by many researchers over the last decade.
This study highlights the challenges and opportunities concerning the impact of climatic factors on the amount of energy produced by hydropower. The results are based on data for one Mediterranean country, Albania, but the models may be useful to other countries in the region. The nation is prone to a wide range of environmental dangers, including weather-related risks, geological risks such as earthquakes, and other catastrophic phenomena. Severe storms, floods, heat waves and wildfires have become increasingly frequent and unpredictable. The occurrence of heat waves has increased recently; 74% of all incidents have taken place since 1996. Because the nation's rivers and streams carry large volumes of water during rainstorms, the country is extremely vulnerable to floods.
Since the 1970s, the average annual temperature in Albania has risen by 1.5°C. However, very limited information on climate effects on power generation, transmission and distribution is currently available for the region, and especially for Albania. In terms of monthly average temperature, the summer and autumn months (May to October) show stable average temperatures over the years, while the spring and winter months have had more frequent extreme temperatures, see Figure 1. In recent years, a significant increase in the average temperature for the winter months has been observed. The same behaviour is noted for the average rainfall.
The summer months have the least rain on record, see Figure 2. In turn, the autumn and winter months have the highest amounts of precipitation. Over the years, the amount of rainfall in autumn has decreased. Extreme values are present especially in the first and last months of the year. Seasonal patterns are clear in both temperature and rainfall.
The area has developed hydropower facilities as a source of electricity because of its wet winters and vast river network. Albania is one of the nations whose primary source of electricity is hydropower plants (almost 80% of the country's production). A severe drought that lasted for 30 years drastically decreased the amount of water in the river cascade. The high rainfall during the wet months also increases inflows to the river cascade, which is the primary source of energy production. As a result, the amount of energy produced during this season increases significantly and meets the needs of the local market and, in recent years, also those of the region.
Figure 1. Temperature distribution by month in Albania
Many classical prediction models have been applied to climate change data in the literature, but more advanced models, such as machine learning models, are also needed. Franco et al. (2008) estimated the impact of regional climate change on electricity demand in California using historical relationships between peak demand and temperature. Another view of the short- and long-term impacts of climate change on electricity demand is presented by Emodi et al. (2018), who used an autoregressive distributed lag (ARDL) model with monthly data from Australia. Their results showed that electricity consumption will increase slightly in winter as a result of the significant decrease in temperatures in this period, and that peak consumption will occur in summer, when temperatures are higher. Wadsack and Acker (2019) conducted production cost modelling to examine climate change and future power systems, studying the importance of energy storage in reduced-hydropower systems in the American Southwest. The need for hybrid models that improve the quality of predictions has increased considerably. Gjika et al. (2019) presented ensemble models for energy prediction: hybrid models based on the LSSVM method, and improved hybrid models with SARIMA and ETS forecasting models. The LSSVM hybrid model gave more accurate predictions. Hale and Long (2021) used the time series models ARIMA and exponential smoothing to predict Missouri's annual electricity generation. The study showed that ARIMA exhibited superior performance.
The model presented provides a univariate time series prediction of annual electricity generation using publicly available data. Guo et al. (2021) studied the effects of climate change on hydroelectric power generation, using an artificial neural network (ANN) optimized by the Improved Electromagnetic Field Optimization (IEFO) algorithm to predict energy demand.
Eysenbach et al. (2021) analyzed important factors for the generation and load of Texas' power grid. They constructed different statistical and machine learning models with data that included weather, population, location and time, alongside autoregressive components of historical load data. Gjika et al. (2021) analyzed the seasonal pattern of the climatic factors affecting energy production: precipitation, average temperature and water inflow. They considered different statistical learning methods for energy prediction (ARIMA, ETS, NN, TBATS, STLM, etc.) and, comparing model performance, concluded that for their data neural networks provide good forecasts of the monthly energy produced by hydropower. Gjika and Basha (2022) considered classical time series and deep learning models for energy load prediction. They evaluated the impact of climatic factors on energy load using hourly and daily time series for a period of three years. Depending on the frequency of the data, the models considered showed competitive performance for short- and medium-term energy load forecasts. Lahouar and Ben Hadj Slama (2015) proposed a day-ahead load prediction model with a resolution of one hour, using regression random forests. The main variables used are season, temperature, type of day and hourly load. The proposed models offered accurate and effective long term prediction. Meng and Song (2020) created a forecasting model based on the random forest algorithm for three categories of winter days in North China. To assess its performance, the daily power generation at the Zhonghe PV station, situated in the middle of North China, was forecast using the proposed model as well as three additional approaches. Serras et al. (2019) combined random forests with physics-based models to forecast the electricity output of the Mutriku wave farm on the Bay of Biscay.
In this article, a case study is described that examines average temperature, rainfall and CO2 emissions and their impact on energy production from renewable sources.

METHODOLOGY
Machine learning and deep learning techniques are frequently presented as the main solution to all predictive modelling problems. Both machine learning techniques and traditional time series forecasting techniques have been used in this study. The data have the form $(x_1, y_1), \ldots, (x_n, y_n)$, where $y_k \in \mathbb{R}$ is the response variable and $x_k = (x_{k,1}, x_{k,2}, \ldots)$ is the corresponding vector of predictors.

Auto regressive integrated moving average
By modelling the correlations in the data, the ARIMA methodology is a statistical technique for building and assessing a forecasting model that accurately describes a time series (Box and Jenkins, 1976). ARIMA models require only the past data of a time series to generalize the forecast and improve prediction accuracy while keeping the model parsimonious. The full model can be written as:
$$y'_t = c + \phi_1 y'_{t-1} + \cdots + \phi_p y'_{t-p} + \theta_1 \varpi_{t-1} + \cdots + \theta_q \varpi_{t-q} + \varpi_t \qquad (1)$$
where: $y'_t$ is the series after differencing $d$ times; $c$ is a constant; $\phi_i$ and $\theta_j$ are the autoregressive and moving average coefficients; $p$ is the order of the autoregressive part; $d$ is the degree of first differencing involved; $q$ is the order of the moving average part; and $\varpi_t$ is white noise.
Past data are typically used as predictor variables in time series modelling. However, other factors that affect the target variable are sometimes required, and regression modelling is needed to include them. To capture more complex patterns, dynamic regression is employed: in contrast to conventional regression models, it uses ARIMA to describe the residuals. This model therefore has two error terms, the regression model error and the ARIMA model error. The errors of the ARIMA model are assumed to be white noise (Harris and Sollis, 2003).
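As a rough illustration of the ARIMA idea, and not the models fitted in this study, the sketch below differences a series once and fits a single autoregressive term by least squares, which corresponds to an ARIMA(1,1,0) with no moving average part. The function names and the toy series are assumptions made for the example.

```python
def difference(series):
    """First differences: y'_t = y_t - y_{t-1}."""
    return [b - a for a, b in zip(series, series[1:])]


def fit_ar1(diffed):
    """Least-squares estimate of c and phi in y'_t = c + phi * y'_{t-1} + noise."""
    x = diffed[:-1]
    y = diffed[1:]
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    # A constant differenced series carries no information about phi.
    phi = sxy / sxx if sxx > 0 else 0.0
    c = mean_y - phi * mean_x
    return c, phi


def forecast_arima_110(series, steps):
    """Iterate the AR(1) recursion on the differences, then re-integrate."""
    diffed = difference(series)
    c, phi = fit_ar1(diffed)
    last_level = series[-1]
    last_diff = diffed[-1]
    forecasts = []
    for _ in range(steps):
        last_diff = c + phi * last_diff
        last_level = last_level + last_diff
        forecasts.append(last_level)
    return forecasts


# Hypothetical toy series with a linear trend (not data from the study).
toy_series = [10.0, 12.0, 14.0, 16.0, 18.0, 20.0]
print(forecast_arima_110(toy_series, 3))
```

A full ARIMA fit would also estimate the moving average terms, usually by maximum likelihood, which this sketch leaves out.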

Prophet algorithm
Prophet is a well-known local Bayesian structural time series model used for prediction. It is based on an additive model in which non-linear trends are combined with yearly, weekly and daily seasonality, as well as holiday effects. It is best suited to strongly seasonal time series with several seasons of historical data (Taylor and Letham, 2018). Prophet typically handles outliers well and is robust to missing data and changes in the trend, which matters here because outliers are a main concern in these time series.
The procedure is fundamentally an additive regression model with four major parts, of the form:
$$y(t) = l(t) + p(t) + h(t) + \varpi_t \qquad (2)$$
where $l(t)$ denotes a logistic growth curve or piecewise-linear trend.
The model automatically detects changes in trend by selecting change points from the data. The seasonal component $p(t)$, which describes the distinct seasonal patterns, is made up of Fourier terms for the relevant periods. Holiday effects are included as simple dummy variables and are described by $h(t)$, and $\varpi_t$ is white noise.
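The additive structure described above can be sketched in a few lines. This is a minimal illustration that assumes a piecewise-linear trend and a single yearly harmonic on monthly data. It is not the Prophet library itself, and every parameter value below is made up for the example.

```python
import math


def piecewise_linear_trend(t, base_level, base_slope, changepoints, slope_deltas):
    """Continuous trend l(t): the slope changes by delta after each changepoint."""
    value = base_level + base_slope * t
    for cp, delta in zip(changepoints, slope_deltas):
        if t > cp:
            value += delta * (t - cp)
    return value


def fourier_seasonality(t, period, coefficients):
    """Seasonal component p(t) as a truncated Fourier series.

    coefficients is a list of (a_n, b_n) pairs for harmonics n = 1, 2, ...
    """
    total = 0.0
    for n, (a_n, b_n) in enumerate(coefficients, start=1):
        angle = 2.0 * math.pi * n * t / period
        total += a_n * math.cos(angle) + b_n * math.sin(angle)
    return total


def additive_model(t, holiday_effect=0.0):
    """y(t) = l(t) + p(t) + h(t) with the noise term omitted.

    t is a month index; all numeric values are hypothetical.
    """
    trend = piecewise_linear_trend(t, 12.0, 0.002, [240], [0.003])
    season = fourier_seasonality(t, 12.0, [(-8.0, 0.0)])
    return trend + season + holiday_effect


# Month 0 (mid-winter in this toy setup) versus month 6 (mid-summer).
print(additive_model(0), additive_model(6))
```

In the actual algorithm the changepoints, slope changes and Fourier coefficients are estimated from the data rather than fixed by hand.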

Prophet boost algorithm
XGBoost is a popular and efficient open-source implementation of the gradient boosted trees algorithm. Gradient boosting is a supervised learning algorithm that attempts to predict a target variable accurately by combining the estimates of a set of simpler, weaker models. When gradient boosting is used for regression, the weak learners are regression trees, each of which maps an input data point to a leaf that holds a continuous score. XGBoost minimizes a regularized objective function that combines a convex loss function and a penalty term for model complexity. The technique is called gradient boosting because it uses gradient descent to minimize the loss when new models are added. The Prophet Boost algorithm combines Prophet with XGBoost to get the best of both worlds, Prophet automation and machine learning. The algorithm first models the univariate series with Prophet. The Prophet residuals are then regressed with the XGBoost model, using the regressors supplied by the pre-processing recipe.
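The two-stage idea, a base forecast followed by boosting on its residuals, can be sketched as follows. This is a simplified stand-in: plain squared-error gradient boosting with one-split trees (stumps) on a single regressor replaces XGBoost, and a constant base forecast replaces Prophet. All names and numbers are hypothetical.

```python
def fit_stump(x, residuals):
    """Best single-split regression stump on one feature under squared error."""
    best = None
    for threshold in sorted(set(x))[:-1]:
        left = [r for xi, r in zip(x, residuals) if xi <= threshold]
        right = [r for xi, r in zip(x, residuals) if xi > threshold]
        left_mean = sum(left) / len(left)
        right_mean = sum(right) / len(right)
        sse = (sum((r - left_mean) ** 2 for r in left)
               + sum((r - right_mean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, left_mean, right_mean)
    if best is None:  # constant feature: predict the mean everywhere
        mean = sum(residuals) / len(residuals)
        return (float("inf"), mean, mean)
    return best[1:]


def predict_stump(stump, xi):
    threshold, left_mean, right_mean = stump
    return left_mean if xi <= threshold else right_mean


def boost_residuals(x, base_forecast, y, rounds=20, learning_rate=0.5):
    """Repeatedly fit stumps to what the current fit leaves unexplained."""
    current = list(base_forecast)
    stumps = []
    for _ in range(rounds):
        # For squared error, the negative gradient is the residual itself.
        residuals = [yi - ci for yi, ci in zip(y, current)]
        stump = fit_stump(x, residuals)
        stumps.append(stump)
        current = [ci + learning_rate * predict_stump(stump, xi)
                   for ci, xi in zip(current, x)]
    return stumps, current


# Hypothetical regressor, base forecast and observations.
regressor = [1, 2, 3, 4]
base = [10.0, 10.0, 10.0, 10.0]
observed = [10.0, 10.0, 12.0, 12.0]
_, fitted = boost_residuals(regressor, base, observed)
print(fitted)
```

XGBoost additionally uses second-order gradient information and an explicit complexity penalty, which this sketch omits.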

Elastic-net regularized generalized linear model
Generalized linear models (GLMs) are among the most popular methods for inferential modelling. Because they are straightforward and easy to understand, they are particularly useful for communicating causal inference to stakeholders. Elastic net regularization is a popular technique that removes irrelevant and highly correlated information, which can be detrimental to accuracy and inference. Together, these two techniques are a valuable addition to any data scientist's toolset. Tay et al. (2018) published research that uses cyclic coordinate descent to compute a model's coefficients efficiently for any link function, not only the simple ones.
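According to the ordinary least squares model, the response can be modelled as a linear combination of the covariates, and the parameters are estimated by minimizing the residual sum of squares. In the last two decades, regularization techniques have been the subject of extensive research. Of particular interest is the elastic net, which minimizes the sum of the residual sum of squares and a regularization term that combines $\ell_1$ and $\ell_2$ penalties:
$$\min_{\beta_0, \beta} \; \frac{1}{2n} \sum_{k=1}^{n} \left( y_k - \beta_0 - x_k^{T} \beta \right)^2 + \gamma \left[ \frac{1-\alpha}{2} \|\beta\|_2^2 + \alpha \|\beta\|_1 \right] \qquad (3)$$
where: $\beta_0$ and $\beta$ are the intercept and the coefficient vector; $\gamma$ and $\alpha$ are the tuning parameter and a higher-level hyperparameter, respectively.
Three components make up a generalized linear model: a linear predictor, a link function, and a variance function that depends on the mean. To apply the elastic net to GLMs, the residual sum of squares term in Equation 3 is replaced by a negative log-likelihood term:
$$\min_{\beta_0, \beta} \; -\frac{1}{n} \sum_{k=1}^{n} \ell\left( y_k, \beta_0 + x_k^{T} \beta \right) + \gamma \left[ \frac{1-\alpha}{2} \|\beta\|_2^2 + \alpha \|\beta\|_1 \right] \qquad (4)$$
where: $\ell(y_k, \beta_0 + x_k^{T} \beta)$ is the log-likelihood term connected to observation $k$.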
The approach used to minimize this objective function is the same as that used for GLMs. The main distinction is that a penalized weighted least squares problem is addressed in place of a weighted least squares problem in each iteration (Friedman et al., 2010).
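For the Gaussian case, cyclic coordinate descent on the elastic net objective reduces to a soft-thresholding update per coefficient. The sketch below assumes the objective is scaled by 1/(2n), with tuning parameter gamma and mixing parameter alpha as named in the text. The function names and the toy data are hypothetical, and this is not the reference implementation.

```python
def soft_threshold(z, t):
    """S(z, t) = sign(z) * max(|z| - t, 0)."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def elastic_net(X, y, gamma, alpha, sweeps=200):
    """Cyclic coordinate descent for
    (1/2n) * sum((y_k - b0 - x_k.b)^2) + gamma * ((1-alpha)/2 * |b|_2^2 + alpha * |b|_1).

    X is a list of rows; the intercept b0 is not penalized.
    """
    n, p = len(X), len(X[0])
    beta = [0.0] * p
    b0 = sum(y) / n
    for _ in range(sweeps):
        for j in range(p):
            rho = 0.0   # covariance of feature j with the partial residual
            norm = 0.0  # mean square of feature j
            for row, yk in zip(X, y):
                fit_without_j = b0 + sum(row[m] * beta[m] for m in range(p) if m != j)
                rho += row[j] * (yk - fit_without_j)
                norm += row[j] ** 2
            rho /= n
            norm /= n
            denom = norm + gamma * (1.0 - alpha)
            beta[j] = soft_threshold(rho, gamma * alpha) / denom if denom > 0 else 0.0
        # Refit the unpenalized intercept given the current coefficients.
        b0 = sum(yk - sum(r[m] * beta[m] for m in range(p)) for r, yk in zip(X, y)) / n
    return b0, beta


# Hypothetical one-feature data with true slope 2.
X_toy = [[-1.0], [0.0], [1.0]]
y_toy = [-2.0, 0.0, 2.0]
print(elastic_net(X_toy, y_toy, gamma=1.0, alpha=1.0))  # lasso shrinks the slope
print(elastic_net(X_toy, y_toy, gamma=1.0, alpha=0.0))  # ridge shrinks it less
```

With alpha = 1 the penalty is pure lasso and can set coefficients exactly to zero; with alpha = 0 it is pure ridge and only shrinks them.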

Random forest regression models
Random Forest (Breiman, 2001) is one of the best-known machine learning algorithms and can perform both classification and regression. It can handle large amounts of data accurately. The method takes its name from the fact that it combines decision trees. Each tree in the "forest" depends on the values of a random vector sampled independently and with the same distribution for all trees. Each tree is grown as fully as possible. The Random Forest principle states that although individual trees may be "weak learners", together they can make up a single "strong learner". Random selection of input variables during tree growth makes this possible. After $T$ such trees $\{K(x; \Theta_t)\}_{t=1}^{T}$ are grown, the random forest (regression) predictor is:
$$\hat{f}(x) = \frac{1}{T} \sum_{t=1}^{T} K(x; \Theta_t) \qquad (5)$$
where: $\Theta_t$ characterizes the $t$-th random forest tree in terms of split variables, cut points at each node, and terminal-node values.
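The averaging step of the random forest regression predictor can be illustrated with a deliberately small sketch. It assumes depth-one trees (stumps), a bootstrap sample per tree, and one randomly chosen input variable per tree. A real random forest grows deep trees and draws a random subset of variables at every split, so this shows the principle only. All names and data are hypothetical.

```python
import random


def fit_stump(rows, targets, feature):
    """Best single split on one feature (a depth-1 regression tree)."""
    values = sorted(set(r[feature] for r in rows))
    overall = sum(targets) / len(targets)
    # Fallback when no split is possible: predict the overall mean.
    best = (float("inf"), float("inf"), overall, overall)
    for threshold in values[:-1]:
        left = [t for r, t in zip(rows, targets) if r[feature] <= threshold]
        right = [t for r, t in zip(rows, targets) if r[feature] > threshold]
        lm, rm = sum(left) / len(left), sum(right) / len(right)
        sse = sum((t - lm) ** 2 for t in left) + sum((t - rm) ** 2 for t in right)
        if sse < best[0]:
            best = (sse, threshold, lm, rm)
    return (feature,) + best[1:]


def grow_forest(rows, targets, n_trees, seed=0):
    rng = random.Random(seed)
    n, p = len(rows), len(rows[0])
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feature = rng.randrange(p)                  # random input variable
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feature))
    return forest


def predict_forest(forest, x):
    """Average of the individual tree predictions."""
    total = 0.0
    for feature, threshold, left_mean, right_mean in forest:
        total += left_mean if x[feature] <= threshold else right_mean
    return total / len(forest)


# Hypothetical rows of two predictors each, with made-up targets.
rows = [[1.0, 5.0], [2.0, 6.0], [3.0, 7.0], [4.0, 8.0]]
targets = [1.0, 2.0, 3.0, 4.0]
forest = grow_forest(rows, targets, n_trees=50)
print(predict_forest(forest, [2.5, 6.5]))
```

Because every tree predicts a mean of observed targets, the averaged forecast always stays within the range of the training targets.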

Evaluation metrics
The performance of the models presented in this study was evaluated with a number of metrics. The most accurate model is selected by analysing and comparing error measurements and information criteria for the proposed models, as well as by applying judgment about the advantages each model offers given the nature of the data. Many arguments about the appropriate accuracy measures for a model are presented in the literature. On the basis of these discussions and the nature and complexity of the data, and following Hyndman and Koehler (2006), the mean absolute scaled error (MASE) offers a straightforward indication of model performance relative to the naïve benchmark. It is a scale-independent measure: a value below one indicates that the model performs better than the naïve benchmark on average, whereas a value above one indicates the opposite. This threshold alone should not settle the assessment of a model; further analysis is recommended.
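As a concrete reading of MASE, the sketch below scales the forecast's mean absolute error by the in-sample mean absolute error of the naïve forecast, following the definition of Hyndman and Koehler (2006). The function name, the `season` argument and the numbers are assumptions made for illustration.

```python
def mase(actual, forecast, training, season=1):
    """Mean absolute scaled error.

    The forecast MAE is divided by the in-sample MAE of the naive forecast
    on the training data; season=1 is the plain naive benchmark, season=12
    would be the seasonal naive benchmark for monthly data.
    """
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    naive_errors = [abs(training[i] - training[i - season])
                    for i in range(season, len(training))]
    scale = sum(naive_errors) / len(naive_errors)
    return mae / scale


# Hypothetical values: the naive in-sample error is 1.0 and the forecast MAE is 0.5.
print(mase(actual=[6.0, 7.0], forecast=[6.5, 7.5], training=[1.0, 2.0, 3.0, 4.0, 5.0]))
```

A result of 0.5 here means the forecast errors are, on average, half those of the naïve benchmark.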

RESULTS AND DISCUSSIONS
The objective of this paper was to study the changes in average temperature, rainfall and CO2 emissions and their impact on energy production. The data used in this work were obtained from official publications, the World Bank and Databank. First, the average monthly temperature and rainfall and the annual CO2 emissions in Albania were studied for the period 1970-2021. Statistical analysis was performed for each of the indicators. Then, several models were built for study and prediction: the classical Auto Regressive Integrated Moving Average method, followed by machine learning algorithms, namely the Prophet algorithm (a local Bayesian structural time series model), the elastic-net regularized generalized linear model, random forest regression models and the Prophet Boost algorithm. The accuracy of each model was assessed using performance metrics. After the optimal model had been chosen for each explanatory variable, it was used to forecast the indicator over a period of four years.
In the following phase, the effects of temperature, precipitation and CO2 emissions on the annual energy produced by hydropower were examined for the years 1990 through 2021. The predicted values of each variable were then fed into the best model to obtain the annual forecast of energy production by hydropower for 2022 to 2025, see Figure 4.

Temperature, rainfall and CO2 emissions
Mountains, hills and coastline make up the majority of Albania's topography, and the nation's geology and climate contribute to an enormous network of rivers and lakes. Since the 1970s, the average annual temperature in Albania has risen by 1.5°C. The intensity, length and frequency of heat waves are predicted to rise in the eastern Mediterranean, possibly up to six to eight times per year. Summer temperatures are comparable in all coastal locations of Albania; however, the northern half of the coastal zone normally experiences lower winter temperatures than the central and southern zones. Albania's average annual temperature from 1970 to 2021 is 12.43°C; the lowest annual temperature, 10.6°C, was recorded in 1977 and the highest, 13°C, in 2018. The greatest temperature increases have been observed in the summer months of June, July, August and September. April and November are also becoming warmer. In the past 20 years, September and November have shown greater temperature variability, while March has shown substantially less.
First, the time series was explored by examining its components and seasonal structure, and a correlation analysis was carried out to identify the main characteristics of the series. The data display a clear rising trend and a clear seasonal pattern, and the random component appears to be randomly dispersed. Then, the data were divided into two parts, the training set (80%) and the test set (20%), see Figure 5, which also presents the time series of monthly temperature from 1970 to 2021.
Several models were built for the study and prediction of temperature. After the models were constructed, each one was tested by comparing performance metrics and plots of actual versus predicted values. An accuracy table was created to evaluate and compare the models' performance, see Table 1. From the accuracy table, it can be concluded that the Prophet algorithm is the best model for the temperature data. Figure 6 gives the temperature predictions for the 48 months from 2022 to 2025, with mean annual temperatures of 13.18, 13.22, 13.28 and 13.31°C, respectively.
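For time series, an 80/20 split is normally made chronologically, so that the test set lies entirely after the training set. The sketch below assumes 624 monthly observations (52 years, 1970 to 2021) and a cut by simple truncation. The function name is hypothetical, and the exact cut point used in the study is not stated.

```python
def chronological_split(series, train_fraction=0.8):
    """Split a time-ordered series without shuffling: earliest part for training."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


# Stand-in for 52 years of monthly values, indexed 0..623.
months = list(range(52 * 12))
train, test = chronological_split(months)
print(len(train), len(test))
```

Shuffling before the split would leak future observations into the training set and overstate forecast accuracy.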
Most of Albania's precipitation falls in its western regions, notably in the northwest. Although there is considerable intra-seasonal fluctuation in Albania's precipitation, there has been a very modest (but statistically insignificant) decline in mean annual precipitation since the 1970s. The number of rainy days each year has increased. Although seasonal precipitation patterns as a whole show little change, rainfall has become more intense. The increased intensity of rainfall affects both the amount of infrastructure that must be maintained and its readiness for managing flood waters.
The regional variation of the measured average monthly rainfall for the period 1970-2021 is shown in Figure 7, together with the models' performance; the figure includes plots of actual versus forecast values for each model. The year 2011 recorded the least rainfall, with an annual total of 856.01 mm, while 2010 recorded the most, with 1470.31 mm. The average annual rainfall in Albania during the past 50 years has been 1136.03 mm.
The fundamentals of machine learning models vary greatly. Therefore, before selecting the optimal model, it is crucial to examine the predictive ability of at least two models. An accuracy table was therefore built for the rainfall models, see Table 2. It showed that, for the given dataset, the Prophet algorithm performs better than the other models.
This model was used to predict rainfall for the 48 months from 2022 to 2025, with annual rainfall of 1189.7 mm, 1190.1 mm, 1190.7 mm and 1191 mm, respectively, see Figure 8.
In Albania, CO2 emissions in 2020 decreased by 0.473 megatons, or 8.48%, from those in 2019. According to Figure 7, Albania's per-capita CO2 emissions in 2020 were 1.73 tons, a drop from the previous year. Since 2010, both the […] Figure 9. For CO2 emissions, the same models that were created for temperature and precipitation were used, with training and testing sets of data. After the models were built, they were assessed and the best model was selected. The forecasts were generated using each model and evaluated against the baseline.
According to the accuracy table, Table 3, the Prophet Boost algorithm is the best model for the considered data. This model was used to predict CO2 emissions for the years 2022 to 2025, see Figure 10.
Machine learning random forest models to investigate the importance of temperature, rainfall and CO2 emissions in energy production by hydropower
Albania is blessed with a wide range of energy resources, including coal, oil and gas, hydropower, biomass from natural forests, and other renewable energy sources. Albania is already susceptible to changes in precipitation, as seen in the severe energy shortages caused by the 2007 drought. In previous years, landslides and floods in the plains and lowlands also severely damaged infrastructure, in addition to harming coastal infrastructure. The potential for hydropower in Albania is likely to decline as a result of rising temperatures, which are also expected to change the seasonal demand for heating, cooling and refrigeration. As the nation develops, demand for energy from individuals, businesses and the overall economy will grow and could exceed current energy production and transmission capacity. To accommodate this demand, Albania has already committed to building at least three new solar photovoltaic power facilities starting in 2020.
The next stage of the analysis examines the impact of temperature, precipitation and CO2 emissions on the annual energy generated by hydropower from 1990 to 2021. The regression […] Table 1 gives the performance metrics for each of the models. From the results in Table 4, it can be seen that the model that best suits the data is the random forest model. To fine-tune the parameters, a custom random forest model was created. By comparing the results of several parameter combinations, 500 trees and a maximum of 80 nodes were found to be the optimal parameters for the adopted model. The effectiveness of the predictors is measured by the degree to which they reduce node impurity during split selection.
The relative importance of the variables is depicted in Figure 11. The multi-way importance plot displays the relationship between three importance metrics and identifies the top variables for each of them. The metrics are based on the tree structure of the forest: the number of trees whose root is split on the variable; the mean depth of the first split on the variable; and the total number of nodes in the forest that split on the variable. As shown, the behaviour of hydropower energy generation is clearly influenced by all of the variables. The most important factor is rainfall, followed by CO2 emissions and temperature. The distribution of minimal depth among the trees of the forest is depicted in Figure 12. The scale of the X axis ranges from zero to the maximum number of trees in which any variable was used for splitting, in this case 500, and the mean of the distribution is indicated by a vertical bar with a value label. The minimal depth of a variable in a tree is the depth of the node that splits on that variable and is closest to the tree's root. If it is low, the variable is used to partition a large number of observations into groups; in the present study this variable is temperature.
The fitted random forest model is used to forecast the yearly hydropower energy output from 2022 to 2025 using the predicted values of temperature, rainfall and CO2 emissions. The results are plotted in Figure 13, with values of 6239200, 6260434, 5452307 and 5870179 MWh, respectively. The results indicate that electricity production from hydropower in Albania is expected to decrease over the next four years.
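The structural importance measures can be made concrete with a small sketch. It uses the usual definition of minimal depth, the depth of the node closest to the root that splits on the variable, with the root at depth zero. The nested-dictionary tree format and the toy forest are assumptions made for the example and do not reproduce the fitted model.

```python
def minimal_depth(node, variable, depth=0):
    """Depth of the node closest to the root that splits on `variable`.

    Returns None when the tree never splits on the variable.
    """
    if node is None or "var" not in node:
        return None  # leaf or missing child
    if node["var"] == variable:
        return depth
    found = [
        d for d in (
            minimal_depth(node.get("left"), variable, depth + 1),
            minimal_depth(node.get("right"), variable, depth + 1),
        ) if d is not None
    ]
    return min(found) if found else None


def times_a_root(forest, variable):
    """Number of trees whose root node splits on `variable`."""
    return sum(1 for tree in forest if tree.get("var") == variable)


# Hypothetical tree: the root splits on Temperature, its children on other variables.
toy_tree = {
    "var": "Temperature",
    "left": {"var": "Rainfall", "left": {"value": 1.0}, "right": {"value": 2.0}},
    "right": {
        "var": "CO2",
        "left": {"var": "Rainfall", "left": {"value": 3.0}, "right": {"value": 4.0}},
        "right": {"value": 5.0},
    },
}
toy_forest = [toy_tree, toy_tree["left"]]

for name in ("Temperature", "Rainfall", "CO2"):
    print(name, minimal_depth(toy_tree, name), times_a_root(toy_forest, name))
```

Averaging the minimal depth of a variable over all trees gives the mean depth of its first split, one of the three metrics in the multi-way importance plot.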

CONCLUSIONS
This study explored climate change in Albania and its impact on energy production. Albania has experienced an increase in mean annual temperature of 1.5°C since the 1970s. The intensity, length and frequency of heat waves are predicted to rise in the eastern Mediterranean, possibly up to six to eight times per year. Despite the considerable intra-seasonal variability of Albania's precipitation, a modest decline in mean annual precipitation has been noted since the 1970s. Rainfall in February, September, October and December has decreased during the past ten years.
Several models were built for the study and prediction of temperature, rainfall and CO2 emissions. The results showed that the Prophet algorithm performs better than the other models for temperature and rainfall, while the Prophet Boost algorithm is the best model for CO2 emission prediction. The average annual temperature and CO2 emissions are predicted to continue increasing over the four years 2022-2025. The predicted values of each variable were fed into the best model to obtain the annual forecast of energy production by hydropower for 2022 to 2025. Compared to 2021, energy production from hydropower is predicted to decrease significantly in 2022 and in the following years.
The accuracy of these types of forecasts is not high, as extreme weather can affect the factors that influence energy production by hydropower. Moreover, if governments consider other renewable sources, hydropower can continue to operate at normal capacity. One such source is photovoltaic energy.