Analysis of Spatial-Temporal Variations of Surface Water Quality in the Southern Province of Vietnamese Mekong Delta Using Multivariate Statistical Analysis

The study aimed to assess the variation in surface water quality in the Tien Giang province, Vietnam, and to identify the main sources of water pollution. Surface water samples were collected at 34 locations (NM01-NM34) in canals and rivers of the Tien Giang province in March, June, September and November and were analyzed for 17 surface water quality indicators. Multivariate statistical methods, including principal component analysis (PCA), cluster analysis (CA) and discriminant analysis (DA), were used to analyze the variability of surface water quality and the key indicators affecting it. The results show that surface water in the study area is contaminated with organic matter (low DO, high BOD and COD), nutrients (NH₄⁺-N, NO₂⁻-N, PO₄³⁻-P and TP) and salinity (high Cl⁻). The PCA results showed that 14/17 surface water parameters need to be monitored, including pH, temperature, TSS, BOD, COD, NH₄⁺-N, NO₂⁻-N, PO₄³⁻-P, TP, SO₄²⁻, Cl⁻, coliform and Fe. The PCA also showed that PC1-PC4 accounted for 79.70% of the variation in surface water quality in the study area. Potential sources of surface water pollution include the hydrological regime, domestic waste, agricultural production and industrial production activities. The CA results showed that the 34 monitoring locations can be reduced to 27, with a frequency of 4 times/year, while still ensuring that the monitoring is representative of surface water quality. The DA indicated that EC, SO₄²⁻ and Cl⁻ accounted for the difference in surface water quality between the wet and dry seasons. The results provide important information on the current state of water quality for different uses and contribute to the improvement of the surface water quality monitoring system in the Tien Giang province.


INTRODUCTION
Surface water pollution by toxic chemicals and eutrophication of rivers and lakes caused by sudden increases in nutrients are environmental problems that have received worldwide attention (Iscen et al., 2008). Natural factors, such as discharge, rainfall, soil erosion and the biochemical characteristics of watersheds, and human factors, including urbanization and industrial and agricultural activities, can all affect surface water quality (Hajigholizadeh & Melesse, 2017). Agricultural, industrial and urban activities are considered the main sources of pollution in aquatic ecosystems, with potential impacts on the ecological environment, human health and economic development (Mustapha & Abdu, 2012; Zeinalzadeh & Rezaei, 2017; Yang et al., 2020). Pollutants in water can cause acute or chronic poisoning in humans through the use of unsafe drinking and domestic water sources (Si et al., 2015). Respiratory diseases, gastritis, diarrhea, vomiting, neurological and cardiovascular disorders, as well as skin and kidney problems, are all associated with the use of contaminated water (Haseena et al., 2017). In addition, nitrogenous chemicals have been linked to cancer and blue baby syndrome (Currie et al., 2013), and a large number of bacteria harmful to human health have been found in contaminated water (Haseena et al., 2017). Along with the growing attention to water quality, the methods of assessing it have become increasingly diverse. Currently, assessment methods such as the index method (Son et [...]) are in use, and principal component analysis (PCA), cluster analysis (CA) and discriminant analysis (DA) are widely used multivariate statistical methods. PCA is used to identify the potential factors or sources of pollution affecting water quality in a study area, and it helps optimize future monitoring programs by reducing the number of monitored parameters (Tanriverdi et al., 2010).
CA aims to find groups of samples or sites with similar physicochemical and biological properties, which helps optimize future monitoring programs by reducing the monitoring frequency and the number of monitored locations. DA is used to differentiate between two or more groups and to identify the variables that drive the differences between water quality groups (Koklu et al., 2010; Schaefer & Einax, 2010). All three methods ultimately serve to optimize the water quality monitoring network, saving cost and time.
The Tien Giang province is a major center of rice production, aquaculture and seafood processing, and it contributes substantially to the country's agricultural exports as it works toward becoming a province with a dynamic economy.
As industrialization and modernization in the province accelerate, surface water pollution becomes difficult to avoid. A comprehensive water quality assessment that clarifies the pollution status and identifies the main pollution sources is therefore urgently needed to protect water resources and control water pollution (Yang et al., 2020). For this reason, this study applied multivariate statistics to assess surface water quality in the Tien Giang province, using 17 basic water quality indicators measured in water samples from the province. The results can be used to establish a new, more suitable monitoring network and to support water quality management, the control of pollution sources and the protection of water resources in the study area.

Data analysis
Surface water quality was assessed against the National Technical Regulation on Surface Water Quality (QCVN 08-MT:2015/BTNMT, column A1). The seasonal and spatial variations in surface water quality were assessed using the Independent Samples T-Test and discriminant analysis (DA) in IBM SPSS 20.0 for Windows (IBM, USA), and cluster analysis (CA) in Primer 5.2 for Windows (PRIMER-E Ltd, Plymouth, UK). The key parameters affecting surface water quality were identified by principal component analysis (PCA), also in Primer 5.2 for Windows.
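As an illustration of the independent-samples comparison mentioned above, the following is a minimal sketch of the pooled-variance t statistic in plain Python. It is not the SPSS procedure used in the study: the function name `independent_t`, the equal-variance assumption and the sample readings are all hypothetical and are included only to show the arithmetic.

```python
import math
import statistics


def independent_t(a, b):
    """Student's independent-samples t statistic with pooled variance.

    Returns (t, degrees_of_freedom). Assumes equal variances in both groups.
    """
    n1, n2 = len(a), len(b)
    m1, m2 = statistics.mean(a), statistics.mean(b)
    v1, v2 = statistics.variance(a), statistics.variance(b)  # sample variances
    df = n1 + n2 - 2
    # Pooled variance weights each group's variance by its degrees of freedom.
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    se = math.sqrt(pooled * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df


# Hypothetical readings (not from the study) for two monitoring campaigns.
march = [1.0, 2.0, 3.0, 4.0, 5.0]
september = [2.0, 3.0, 4.0, 5.0, 6.0]
t_stat, dof = independent_t(march, september)
print(t_stat, dof)
```

The p-value would then be read from a t distribution with `dof` degrees of freedom; that step is left out here to keep the sketch free of external dependencies.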

Summary of surface water quality in the study area
Descriptive statistics (mean, standard deviation, minimum and maximum) of the 17 parameters, computed from the data collected at 34 monitoring locations with a frequency of 4 times/year, are presented in Table 1. Table 1 shows that pH, [...] PO₄³⁻-P and Cl⁻ had relatively large differences between their minimum and maximum values, and almost all exceeded the allowable thresholds of QCVN 08-MT:2015/BTNMT (column A1). DO in water was low (3.92±1.4 mg/L), consistent with the high BOD, COD and TSS in surface water. The results showed that surface water in the study area is contaminated with organic matter, nutrients and salinity. The high concentrations of NH₄⁺-N, NO₂⁻-N and PO₄³⁻-P in water were accompanied by relatively high concentrations of TN and TP (Table 1).
Table 2 presents the variation of surface water quality by month. pH, EC, NO₃⁻-N and SO₄²⁻ tended to decrease over time, with higher values in March and June than in September and November. Meanwhile, the Fe concentration fluctuated slightly over the months of observation and peaked in November. The remaining indicators fluctuated in a more complex way, with differences between monitoring periods during the year. However, the concentrations of DO, NH₄⁺-N, [...] PO₄³⁻-P, TP and Fe did not differ significantly among the four monitoring periods. The pH and COD values in March were significantly different from those in the other months (June, September, November) (p<0.05). Temperature did not differ significantly between June and September (p>0.05), but these months differed significantly from March and November (p<0.05), while the temperature differences between March, September and November were not statistically significant (p>0.05).
The values of EC and Cl⁻ in March and June were significantly different from those in September and November (p<0.05). The TSS concentration in the group of months (March, June, November) differed significantly from that in the group (June, September, November) (p<0.05). Regarding the BOD concentration, two pairs of statistically significant differences [...]. SO₄²⁻ differed significantly between the months (March, June, September and November) at the 5% significance level. In general, most of the surface water quality parameters fluctuated over time.

Determining key parameters influencing the surface water quality
The results of the principal component analysis of surface water quality in the study area are presented in Table 3. The first component (PC1), which explained 31.20% of the total variance, comprised TSS, BOD, COD, NH₄⁺-N, NO₂⁻-N and coliform, which were positively correlated with each other. These parameters reflect organic matter, nutrients and microorganisms associated with the decomposition of organic compounds from domestic, aquaculture and industrial activities, as well as human and animal feces that contaminate water sources. In addition, TSS in water usually originates from alluvium carried by flows and from solids eroded from riverbanks and canals, and it is influenced by water movement, water transport and aquatic organisms (Nhan, 2013). The second component (PC2) accounted for 23.70% of the variability of the original data set and comprised EC, SO₄²⁻ and Cl⁻, which were positively correlated with each other and negatively correlated with PO₄³⁻-P and TP. PO₄³⁻-P in water can originate from wastewater containing detergents and from domestic wastewater; the higher the PO₄³⁻-P concentration, the greater the TP in the water. The Cl⁻ concentration appeared to represent a saline water source, together with inputs of large amounts of domestic and industrial wastewater. PC3, in addition to the parameters appearing in PC2, also included pH and Fe, and explained 16.30% of the variation in surface water quality. Finally, PC4 explained 8.40% of the total variance, with parameters such as pH, temperature and Fe. As a result, pH, temperature, EC, TSS, BOD, COD, NH₄⁺-N, NO₂⁻-N, SO₄²⁻, PO₄³⁻-P, TP, Cl⁻, coliform and Fe were the key parameters influencing surface water quality in the study area. The sources affecting surface water quality could stem from the hydrological regime, domestic waste, agricultural production and industrial production activities.
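The share of variance attributed to each principal component can be illustrated with a short NumPy sketch. This is not the Primer 5.2 routine used in the study: it assumes a PCA on the correlation matrix of standardized parameters, the function name `pca_explained_variance` is hypothetical, and the data are synthetic. Only the four percentages in `reported_shares` come from the text.

```python
import numpy as np


def pca_explained_variance(X):
    """PCA on standardized data; returns eigenvalues, variance ratios, loadings."""
    X = np.asarray(X, dtype=float)
    n = X.shape[0]
    # Standardize each parameter so that measurement units do not dominate.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    # Covariance of z-scores equals the correlation matrix of the parameters.
    R = (Z.T @ Z) / (n - 1)
    # eigh returns eigenvalues of a symmetric matrix in ascending order.
    eigvals, eigvecs = np.linalg.eigh(R)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]
    ratios = eigvals / eigvals.sum()
    return eigvals, ratios, eigvecs


rng = np.random.default_rng(0)
# Synthetic data: 136 samples (34 sites x 4 campaigns) of 5 made-up parameters,
# built from two latent factors plus one independent noise column.
base = rng.normal(size=(136, 2))
X = np.column_stack([
    base[:, 0] + 0.1 * rng.normal(size=136),
    base[:, 0] + 0.1 * rng.normal(size=136),
    base[:, 1] + 0.1 * rng.normal(size=136),
    base[:, 1] + 0.1 * rng.normal(size=136),
    rng.normal(size=136),
])
eigvals, ratios, loadings = pca_explained_variance(X)
cumulative_pct = np.cumsum(ratios) * 100
print(np.round(cumulative_pct, 2))

# Variance shares of PC1-PC4 as reported in the text (percent).
reported_shares = [31.20, 23.70, 16.30, 8.40]
reported_total = round(sum(reported_shares), 2)
print(reported_total)
```

Summing the four reported shares gives 79.60%, which agrees with the reported cumulative value of 79.70% up to what is presumably rounding of the individual components.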

Spatial analysis of surface water quality
The results of the cluster analysis showed that the surface water quality at the 34 initial monitoring locations formed 6 clusters (Figure 2). Clusters I and III each consisted of a single location, NM08 and NM04 respectively, and both should be kept for monitoring. Cluster I (NM08) was the least polluted location, with most water quality indicators lower than those in the other clusters (Table 4). Cluster III (NM04) was the only location where Fe appeared in water and exceeded the allowable limit of QCVN 08-MT:2015/BTNMT, column A1 (0.5 mg/L). It was also the cluster with the highest concentrations of PO₄³⁻-P and TP among the six clusters, because this area is affected by seafood processing wastewater and detergents are used during tool washing and factory cleaning. Clusters II and V each contained four monitoring sites with nearly identical water quality. Cluster II included NM29, NM32, NM33 and NM34, which had the highest concentrations of SO₄²⁻ and Cl⁻ in water; these locations border the sea and are affected by seawater intrusion, which causes high salinity. All the sites in this cluster are located in different areas, so they should be kept for future monitoring. Cluster V comprised NM13, NM14, NM15 and NM16, areas affected by residential activities that polluted the surface water with organic matter, nutrients and microorganisms (Table 4). This was the cluster with the heaviest organic and microbiological pollution. Cluster IV gathered 10 sites with similar surface water characteristics, namely NM03, NM05, NM06, NM07, NM22, NM23, NM25, NM27, NM30 and NM31, with relatively high EC and total N. In Cluster IV, one of the two locations NM05 and NM07 can be removed, because both are located on the Tien River and are subject to the same impact source. Finally, Cluster VI gathered many sites with similar physicochemical and biological properties, including NM01, NM02, NM09, NM10, NM11, NM12, NM17, NM18, NM19, NM20, NM21, NM24, NM26 and NM28, which are influenced by many different sources such as residential areas, contiguous locations between provinces, and livestock activities. In Cluster VI, NM01 and NM02 both belong to the Tien River area and are affected by residential areas, so one of these two locations can be eliminated. Based on the CA results, the 34 initially selected monitoring locations can be reduced to 27, saving about 20.59% of monitoring costs while still ensuring the representativeness of the area and of the impact sources.
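The cluster membership and the stated saving can be cross-checked with a few lines of Python. The site lists below are copied from the text; equating the cost saving with the share of sites removed assumes that monitoring cost is proportional to the number of sites, which is an assumption of this sketch rather than something the text states.

```python
# Cluster membership as listed in the text (site codes NM01-NM34).
clusters = {
    "I": ["NM08"],
    "II": ["NM29", "NM32", "NM33", "NM34"],
    "III": ["NM04"],
    "IV": ["NM03", "NM05", "NM06", "NM07", "NM22",
           "NM23", "NM25", "NM27", "NM30", "NM31"],
    "V": ["NM13", "NM14", "NM15", "NM16"],
    "VI": ["NM01", "NM02", "NM09", "NM10", "NM11", "NM12", "NM17",
           "NM18", "NM19", "NM20", "NM21", "NM24", "NM26", "NM28"],
}

all_sites = [site for members in clusters.values() for site in members]
expected = {f"NM{i:02d}" for i in range(1, 35)}

# Every site should appear exactly once across the six clusters.
covers_all_sites = len(all_sites) == 34 and set(all_sites) == expected

# Share of sites removed when going from 34 to 27 locations.
initial_sites, retained_sites = 34, 27
reduction_pct = round((initial_sites - retained_sites) / initial_sites * 100, 2)
print(covers_all_sites, reduction_pct)
```

The six clusters cover all 34 sites without overlap, and removing 7 of 34 sites corresponds to the 20.59% quoted above.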
Figure 2. Spatial variation of surface water quality
The analysis results in Figure 3 show that the cluster analysis of the 4 monitoring periods formed two clusters of surface water quality. Cluster I comprised the samples collected in March, at the end of the dry season, and in June, in the middle of the rainy season. Cluster II comprised the rainy months of September and November, although November is the end of the rainy season and the transition into the dry season. The difference between the two clusters was dominated by EC, SO₄²⁻ and Cl⁻, whose concentrations were higher in cluster I than in cluster II (Table 4). This was also confirmed by the discriminant analysis. Therefore, surface water monitoring should be carried out 4 times per year to clearly represent the temporal variations between the dry and rainy seasons.
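How a discriminant analysis separates two seasonal groups on EC, SO₄²⁻ and Cl⁻ can be sketched with Fisher's two-group linear discriminant in NumPy. This is a stand-in for illustration, not the SPSS procedure used in the study; the function name `fisher_lda`, the group sizes and all values are hypothetical synthetic data.

```python
import numpy as np


def fisher_lda(X1, X2):
    """Fisher's linear discriminant for two groups.

    Returns the weight vector w and the midpoint threshold on the
    projected scores; scores above the threshold are assigned to group 1.
    """
    m1, m2 = X1.mean(axis=0), X2.mean(axis=0)
    # Pooled within-group scatter matrix.
    Sw = (X1 - m1).T @ (X1 - m1) + (X2 - m2).T @ (X2 - m2)
    w = np.linalg.solve(Sw, m1 - m2)
    threshold = w @ (m1 + m2) / 2
    return w, threshold


rng = np.random.default_rng(1)
# Columns: EC, SO4, Cl as standardized, made-up values (not study data).
# Group 1 mimics the higher-salinity campaigns, group 2 the lower-salinity ones.
group1 = rng.normal(loc=[2.0, 2.0, 2.0], scale=0.5, size=(68, 3))
group2 = rng.normal(loc=[0.0, 0.0, 0.0], scale=0.5, size=(68, 3))

w, thr = fisher_lda(group1, group2)
pred1 = group1 @ w > thr   # True means "assigned to group 1"
pred2 = group2 @ w > thr
accuracy = (pred1.sum() + (~pred2).sum()) / (len(group1) + len(group2))
print(w, accuracy)
```

The relative magnitudes of the entries of `w` indicate how strongly each parameter contributes to the separation, which is the sense in which a few parameters can be said to discriminate between seasons.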

CONCLUSIONS
The results show that surface water in the Tien Giang province in 2020 was polluted by organic matter, nutrients and salinity, since TSS, BOD, COD, NH₄⁺-N, NO₂⁻-N, PO₄³⁻-P and Cl⁻ exceeded the limits of QCVN 08-MT:2015/BTNMT (column A1). Four PCs explained 79.70% of the variation in surface water quality in the study area. The sources affecting surface water quality could be the hydrological regime, domestic waste, agricultural production and industrial production activities. The PCA results indicated that 14 water quality indicators, including pH, temperature, EC, BOD, COD, NH₄⁺-N, NO₂⁻-N, SO₄²⁻, PO₄³⁻-P, TP, Cl⁻, coliform and Fe, should be monitored. The CA results showed that the number of monitoring locations can be reduced from 34 to 27 while still ensuring the representativeness of the surface water quality monitoring. However, the monitoring frequency should be maintained at 4 times per year to clearly show the variation of surface water quality in the study area over time. EC, SO₄²⁻ and Cl⁻ played a significant role in discriminating surface water quality between seasons. The results provide useful information for water use planning and surface water quality monitoring in the study area.