Real-time Monitoring and Assessment of River Basin Conditions for Drinking Water Intake Protection

Albert Mpé Guilikeng
WATERNET EN1016
Rue du Président Wilson, 42 BP56, 78230 Le Pecq sur Seine, FRANCE
Email: ampe@citi.suez-lyonnaise-eaux.fr

OBJECTIVES

This paper presents the EU-funded Telematics Application Programme project "WaterNet" (EN1016): Distributed Water Quality Monitoring using Sensor Networks. During the project, two applications were developed to demonstrate online river monitoring. Both are built around existing monitoring networks. The major objectives of the two applications are:

While these may appear to be common objectives for river monitoring, they are met here not through the usual procedure of sampling and subsequent analysis, which is time consuming and far from real-time, but through a geographically distributed network of automatic monitoring stations equipped with sensors and analysers.

The difficulty facing the users of existing monitoring systems is the rate and volume of data being collected, which prevents users from effectively overseeing the river state in real-time. Hence, their effective use demands real-time data handling capability in order to turn this monitoring data into useful information for water quality managers. This in turn enables them to make rapid and reliable decisions regarding the state of the river water for use in the production of drinking water.

USER REQUIREMENTS

The basic needs expressed above led to specific user requirements for an online, real-time, data handling/processing system that includes the following features:

In order to meet these requirements, different data fusion/data handling methods were developed and implemented. These are summarised in Table 1.

| Level | Type of Fusion | Function | Space |
|---|---|---|---|
| 3 | Decision Fusion | Actions on environment, treatment plants, sensors, ... | |
| 2 | Feature Fusion | Detection of changes in state, classification, explanation, ... | Multi-parameter, spatial |
| 1 | Data Fusion | Cross validation; geographical validation; space-time validation | Multi-parameter, spatial |
| 0 | Single Data Validation | Low level validation | One dimensional |

Table 1: Data handling methods

The methods are general in the sense that they can also be applied to other systems where assessment of water state and classification of quality are required. The WaterNet applications were developed to build on the existing SCADA systems, using them as front-ends to the real world and extracting data from their existing databases. The application runs on an NT platform. The extracted data is converted into useful information for presentation to different users. The overall function of WaterNet is therefore to concentrate and handle geographically distributed data and to distribute the resulting information to users in different geographical locations. The data handling process that delivers this function is structured as shown in Figure 1.

Figure 1: Fusion Hierarchy

Although data handling in WaterNet is based on the fusion of data collected from automatic monitoring stations, the higher levels of the fusion hierarchy (the data collection and analysis procedure) also require other inputs, such as manual sample data, descriptions of pollution events, and physical/geographical river data. If these data are not held in the existing SCADA systems, it is necessary either to extend those systems or to extract the data from other sources.

THE APPLICATIONS

The existing automatic monitoring networks used in the project are located on the River Seine (Paris area) and on the River Llobregat (Barcelona area). Both networks rely on SCADA systems to collect data from automatic monitoring stations located on the river banks. Water is pumped into the station houses, pre-treated (for most of the necessary measurements), and pumped to the different sensors and automated analysers. All activities at each station are controlled by a local computer, which in turn can be controlled from the SCADA system's main computer. Data is both stored locally and transmitted to the SCADA system's main computer, located in a central control room.

The Paris-based application consists of the automated monitoring network, "APES," which is located on the River Seine and relies on five monitoring stations in the western part of the Paris region. The Barcelona-based application is built around the "SIAM" automated monitoring network, which is located on the River Llobregat. It relies on ten monitoring stations distributed along the river and its main tributaries. These stations are operated by Grupo Aguas de Barcelona. Tables 2 and 3 define the stations and the data collected for each application site.

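| Monitored element | Issy-les-Moulineaux | Suresnes | Villeneuve-la-Garenne | Chatou | Rueil |
|---|---|---|---|---|---|
| Temperature | x | x | x | x | x |
| Dissolved O2 | x | x | x | x | x |
| Ammonia | x | x | x | x | x |
| Redox-pot. | x | x | x | x | x |
| Turbidity | x | x | x | x | x |
| Conductivity | x | x | x | x | x |
| pH | x | x | x | x | x |
| Phosphate | x | x | x | x | x |
| Phenol | | x | | x | |
| Hydrocarbons | | x | | x | |
| UV-Abs. | | x | | | |
| Visible-Abs. | | x | | | |
| Heavy Metals | | x | | x | |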

Table 2: Monitoring stations and measured elements in the "APES" monitoring network (France).

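| | Balsa-reny | Castell-gail Ca | Castell-gail Li | Abrera | Martorell | Pressa Sedo | El Papiol | Rubi | S. Joan Despi |
|---|---|---|---|---|---|---|---|---|---|
| Temperature | x | x | x | x | x | x | x | x | |
| Dissolved O2 | x | x | x | x | x | x | x | x | x |
| Ammonia | | x | x | x | x | x | x | x | x |
| Suspended Solids | | | | | | | x | | |
| Turbidity | x | x | x | x | x | x | x | x | x |
| Conductivity | x | x | x | x | x | x | x | x | x |
| pH | x | x | x | x | x | x | x | x | x |
| Phosphate | | x | x | | x | x | x | x | x |
| TOC | x | x | x | x | x | x | x | x | x |
| Cyanides | | | | | x | | x | x | |
| Hydrocarbons | | x | | | x | | x | x | |
| UV-Abs. | | x | x | x | x | | | x | |
| Total Chromium | | x | x | | x | | | | |
| Heavy Metals | | | | | | | x | | |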

Table 3: Monitoring stations and measured elements in the "SIAM" monitoring network (Spain).

THE VALIDATION PROCESS

According to the data-fusion hierarchy, the first analysis (the collection procedure) applied to the incoming data is a stage of single data validation. However, as shown in Figure 2, the term "single data validation" covers a series of analytical methods applied to the data. The aim is to establish a synchronised time series in which each measurement of each data element is given a "confidence value." The result is a full input vector at each time interval for the higher level fusion methods. Slope and range compliance checks are applied in the standard way.
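The range and slope compliance checks mentioned above might be sketched as follows. This is a minimal illustration rather than the WaterNet implementation: the function name `range_slope_confidence`, the limit values, and the all-or-nothing confidence score are assumptions made for the example.

```python
def range_slope_confidence(value, previous, dt_hours, lo, hi, max_slope):
    """Return 1.0 if a reading passes both checks, else 0.0 (assumed scoring)."""
    # Range compliance: the reading must lie within the allowed limits.
    if not (lo <= value <= hi):
        return 0.0
    # Slope compliance: the rate of change since the previous reading
    # must not exceed the allowed rate (units per hour).
    if previous is not None and dt_hours > 0:
        slope = abs(value - previous) / dt_hours
        if slope > max_slope:
            return 0.0
    return 1.0


# Hypothetical pH limits: range 0 to 14, at most 1 pH unit per hour.
print(range_slope_confidence(7.2, 7.0, 1.0, 0.0, 14.0, 1.0))
print(range_slope_confidence(9.5, 7.0, 1.0, 0.0, 14.0, 1.0))
```

A graded score (for example, a confidence that falls off near the limits) would fit the same interface.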

Data synchronisation and gap filling follow a simple concept. If no current measurement of a data element exists within a specified time frame around the acquisition time, the nearest valid measurement is used; in online mode, this is the last valid reading. Within the river environment, different data elements have different dynamic characteristics: some change faster than others. To take this behaviour into account, data elements that change slowly are allocated a "high persistence" value, and those subject to rapid changes a "low persistence" value. The persistence value is defined, by convention, as the time taken for the confidence in a measurement to fall to 50 percent. As a measurement ages, the confidence in its value decreases, and the longer it continues to age, the faster the confidence decreases.
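The persistence concept can be illustrated with a small sketch. The paper does not give the decay law, so the quadratic form below is an assumption chosen only because it satisfies the two stated properties: confidence is 50 percent when the age equals the persistence, and it falls progressively faster with age. The names `aged_confidence` and `fill_gap` are hypothetical, and the acquisition time frame is omitted for brevity.

```python
def aged_confidence(age, persistence, initial=1.0):
    """Confidence of a reading that is `age` time units old.

    Assumed decay law: quadratic in age, so that confidence is halved when
    age == persistence and drops faster and faster afterwards.
    """
    if age < 0 or persistence <= 0:
        raise ValueError("age must be >= 0 and persistence must be > 0")
    return max(0.0, initial * (1.0 - 0.5 * (age / persistence) ** 2))


def fill_gap(readings, t, persistence):
    """Online gap filling: reuse the last valid reading at or before time t.

    `readings` is a list of (timestamp, value) pairs sorted by timestamp.
    Returns (value, confidence), or (None, 0.0) if nothing is available.
    """
    past = [(ts, v) for ts, v in readings if ts <= t]
    if not past:
        return None, 0.0
    ts, value = past[-1]
    return value, aged_confidence(t - ts, persistence)


# A reading taken at minute 30, reused at minute 90, persistence 60 minutes.
print(fill_gap([(0, 7.1), (30, 7.3)], 90, 60))
```

A slowly changing element would simply be given a larger `persistence`, so that the same age costs it less confidence.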

THE "SITUATION DESCRIPTION"

Defining the situation description involves three methods of multi-parameter validation, which use information from other sensors to provide additional evidence on the confidence of the measured data. The first method is geographical validation, which compares time-series measurements of the same data element at different locations. The comparison concerns daily variations, which are expected to be the same at the different locations. The method yields a high confidence value when the two measurements exhibit similar trends, and a lower confidence value when the trends are dissimilar.
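One way to turn "similar trends" into a number is to correlate the step-to-step changes of the two series. The sketch below does this with a Pearson correlation mapped onto the range 0 to 1. The function name `trend_confidence`, the mapping, and the neutral value returned for flat series are assumptions; the paper does not specify the measure actually used.

```python
import math


def trend_confidence(series_a, series_b):
    """Map the correlation of first differences of two series onto [0, 1]."""
    if len(series_a) != len(series_b) or len(series_a) < 3:
        raise ValueError("series must have equal length of at least 3")
    # Trends are represented by the changes between consecutive samples.
    da = [y - x for x, y in zip(series_a, series_a[1:])]
    db = [y - x for x, y in zip(series_b, series_b[1:])]
    mean_a = sum(da) / len(da)
    mean_b = sum(db) / len(db)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(da, db))
    var_a = sum((x - mean_a) ** 2 for x in da)
    var_b = sum((y - mean_b) ** 2 for y in db)
    if var_a == 0 or var_b == 0:
        return 0.5  # no trend information: neutral confidence (assumed)
    r = cov / math.sqrt(var_a * var_b)
    return (r + 1.0) / 2.0


# Hypothetical temperature series at two stations with the same daily pattern.
station_a = [10, 11, 13, 12, 10]
station_b = [20, 21, 23, 22, 20]
print(trend_confidence(station_a, station_b))
```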

Cross validation uses information from the time series of two different but related data elements at the same location. The cross validation algorithm calculates a confidence value for a measurement from the rates of change of the two data elements. The rate of change of each data element is divided into three zones, delimited by the "slope compliance" limits (the ascending and descending slope rate limits), and each zone is allocated a confidence value. As in geographical validation, the basis for this confidence is the correlation between the variations of the two time series.
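The three-zone idea can be sketched as below. The zone labels, the scoring table, and the `expected_sign` parameter (which states whether the two elements are expected to move together or in opposite directions) are all assumptions made for illustration; the paper does not give the confidence values allocated to each zone.

```python
def slope_zone(slope, descending_limit, ascending_limit):
    """Classify a rate of change: -1 (falling), 0 (steady) or +1 (rising)."""
    if slope < descending_limit:
        return -1
    if slope > ascending_limit:
        return 1
    return 0


def cross_confidence(zone_a, zone_b, expected_sign=1):
    """Assumed scoring of the joint behaviour of two related elements."""
    if zone_a == 0 and zone_b == 0:
        return 1.0  # both steady: consistent
    if zone_a == 0 or zone_b == 0:
        return 0.5  # only one element is moving: inconclusive
    # Both moving: consistent only if the directions relate as expected.
    return 1.0 if zone_a * zone_b == expected_sign else 0.0


# Hypothetical example: two elements expected to move in opposite directions.
zone_first = slope_zone(0.8, -0.5, 0.5)     # rising
zone_second = slope_zone(-0.4, -0.2, 0.2)   # falling
print(cross_confidence(zone_first, zone_second, expected_sign=-1))
```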

"Space time" validation, which is part of the geographical validation process, compares two time series of the same data element at different monitoring stations, but also takes into account the transport time between the stations. For this purpose, the transport time between two stations is represented simply by a table containing the normal observed transport time for each week of the year. These times are used to apply a time shift between the upstream and downstream time series before their trends are compared. The actual transport delays are, however, expected to deviate from those in the table. A deviation is therefore allowed in the time shift, and the best confidence figure found within the range of time shifts compared is used.
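The search over time shifts can be sketched as follows. For simplicity the series are assumed to be equally spaced, the shift is expressed in samples, and the trend comparison is a plain count of steps on which both series move in the same direction. The names `direction_agreement` and `space_time_confidence` are hypothetical, and `nominal_shift` stands for the value that would be read from the weekly transport time table.

```python
def direction_agreement(a, b):
    """Fraction of time steps on which both series move in the same direction."""
    def signs(series):
        return [(y > x) - (y < x) for x, y in zip(series, series[1:])]
    sa, sb = signs(a), signs(b)
    return sum(x == y for x, y in zip(sa, sb)) / len(sa)


def space_time_confidence(upstream, downstream, nominal_shift, deviation):
    """Best agreement over shifts in [nominal - deviation, nominal + deviation]."""
    best = 0.0
    first = max(0, nominal_shift - deviation)
    for shift in range(first, nominal_shift + deviation + 1):
        # Overlapping part of the two series once the downstream is shifted.
        n = min(len(upstream), len(downstream) - shift)
        if n < 2:
            continue
        score = direction_agreement(upstream[:n], downstream[shift:shift + n])
        best = max(best, score)
    return best


# The downstream station sees the upstream pattern three samples later.
upstream = [1, 2, 4, 3, 1, 1, 2]
downstream = [5, 5, 5, 1, 2, 4, 3, 1, 1, 2]
print(space_time_confidence(upstream, downstream, nominal_shift=2, deviation=1))
```

Taking the maximum over the allowed shifts reflects the statement that the best confidence figure in the compared range is used.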

SITUATION ASSESSMENT

Figure 2: Low level validation train

As mentioned earlier, one of the major objectives of the project was to protect drinking water intakes. This requires rapid assessment of the quality of the river water that is abstracted. The task is therefore to provide the operator with an online classification of the water quality into two or more classes, and to present the results in an easy-to-understand user interface that can serve as a decision support system, for example in deciding whether to open or close the water intake, or in choosing the proper mode of operation for the drinking water treatment plant. Three different methods have been used, and the results from each are available simultaneously to the operator, presented online as colour-coded classes.

The approach is generally the same for each of the methods. In practical terms, each defines a software sensor that gives a one- or two-dimensional output, which can then be compared with a table of threshold values to give the class of the water quality.

Classification

The simplest method of situation assessment is to compute a water quality index by comparing the input data with a threshold table of the type presented in Table 4. This is, in fact, the normal method used for the classification of rivers. The resulting water quality is then classed as follows: Excellent (0-25); Very good (25-50); Good (50-75); Bad (75-90); Very bad (90 and above).

Three classification methods are configured within the Paris application. The first two are based on the French river and canal water classification method developed by Duport and Margat [1], known as the "Multipurpose Scale" method and in use in France since 1971. They provide the user with:

The third classification method is based on the newly proposed classification method SEQ-eau, which divides classification into different "use-types." The most important of these for potable water production is the AEP (Alimentation Eau Potable). This system is more complex but can be represented in a conservative manner using the generic classification method used in the tool.

In practice, it is not possible to collect online all of the data elements included in the classification schemes discussed. Hence, within the actual application, only those data elements that are available are used to carry out the assessments. However, where additional laboratory analyses can be made, these may be used in the same way as online data to compute water class information.

Parameter   Class 1 (Blue)   Class 2 (Green)   Class 3 (Yellow)   Class 4 (Orange)   Class 5 (Red)
Dissolved oxygen 99999 7 7 5 3 -99999
NH4+ -99999 0.5 0.5 1.5 4 99999
NO3- -99999 25 50 50 50 99999
Turbidity -99999 2 35 1500 3750 99999
Transparency 99999 2 1 0.1 0.05 -99999
Conductivity -99999 2000 2500 3000 4000 99999
pH-min 99999 6.5 6.5 6.5 5.5 -99999
pH-Max -99999 9 9 9 9.5 99999
Cadmium -99999 5 5 5 5 99999
Chrome-total -99999 50 50 50 50 99999
Nickel -99999 50 55 100 400 99999
Lead -99999 50 50 50 50 99999
Copper -99999 1000 1000 1000 1000 99999
Zinc -99999 5000 5000 5000 5000 99999

Table 4: Classification thresholds.
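A threshold lookup of this kind might be sketched as follows. The reading of Table 4 used here is an assumption: the six values in each row are taken as bounds delimiting the five classes, with 99999 and -99999 acting as open-ended limits, and the order of the bounds indicating whether high or low values are better. Which side of a bound is inclusive, and the rule that the overall class is the worst class of any parameter, are also assumptions. Only three rows of Table 4 are copied into the example.

```python
CLASSES = ["Blue", "Green", "Yellow", "Orange", "Red"]

# Rows copied from Table 4 (six bounds each, assumed to delimit five classes).
THRESHOLDS = {
    "Dissolved oxygen": [99999, 7, 7, 5, 3, -99999],
    "NH4+": [-99999, 0.5, 0.5, 1.5, 4, 99999],
    "Turbidity": [-99999, 2, 35, 1500, 3750, 99999],
}


def classify(parameter, value):
    """Return the 0-based class index (0 = Blue ... 4 = Red) of one reading."""
    bounds = THRESHOLDS[parameter]
    descending = bounds[0] > bounds[-1]  # True when high values are better
    for i in range(5):
        limit = bounds[i + 1]
        if (value >= limit) if descending else (value <= limit):
            return i
    return 4  # beyond the open-ended limit: treat as the worst class


def overall_class(sample):
    """Worst class over all parameters present in the sample (assumed rule)."""
    return CLASSES[max(classify(p, v) for p, v in sample.items())]


print(overall_class({"Dissolved oxygen": 8, "NH4+": 0.3, "Turbidity": 40}))
```

Because only the parameters present in `sample` are examined, the sketch also reflects the point that the assessment uses whichever data elements are actually available.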

Variation Detection

The second method for situation assessment provides the operator with a tool for the online detection of significant variations in input data, and thereby enables the identification of changes in the water quality. Normally these changes are identified using a simple criterion based on the trend of sensor measurements. Most SCADA systems generate an alarm when the trend of a measurement rises above or falls below a threshold value. However, there are situations where the variation at each individual sensor cannot be considered critical, but the variations taken together still indicate an overall change in water quality.

The method can be regarded as a "constant window" on quality, from which it is possible to construct an online software sensor called "Detect" (whose values describe the variance in the data set) and to compare it with another software sensor called "Threshold." The construction of "Detect" is based on the covariance, since the covariance is considered to represent the variation in the data. The sensitivity of the detection method varies automatically over time according to the level of variance in the input data.
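The relationship between "Detect" and "Threshold" can be illustrated with a simplified sketch. Here "Detect" is taken to be the total variance of the inputs over a fixed-width window (the trace of the covariance matrix), and "Threshold" is derived from the recent history of "Detect" so that sensitivity follows the prevailing level of variance. Both choices, along with the names `detect_value` and `variation_alarms` and the factor `k`, are assumptions; the paper does not give the exact formulas.

```python
from statistics import mean, pstdev, pvariance


def detect_value(window):
    """Total variance of a window; `window` is a list of samples (lists)."""
    columns = zip(*window)  # one column per monitored parameter
    return sum(pvariance(col) for col in columns)


def variation_alarms(samples, width, history, k=3.0):
    """Return the Detect series and an alarm flag for each later window."""
    detect = [detect_value(samples[i - width:i])
              for i in range(width, len(samples) + 1)]
    alarms = []
    for j in range(history, len(detect)):
        past = detect[j - history:j]
        # Threshold adapts to the recent level and spread of Detect.
        threshold = mean(past) + k * pstdev(past)
        alarms.append(detect[j] > threshold)
    return detect, alarms


# Twelve quiet samples of two parameters, then a sudden joint change.
quiet = [[7 + (i % 2), 300 + (i % 2)] for i in range(12)]
detect, alarms = variation_alarms(quiet + [[12, 400]], width=4, history=5)
print(alarms)
```

In practice the inputs would need to be scaled to comparable units before their variances are summed; that step is left out here.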

Feature Maps (Kohonen maps)

The third and most complex method is based on the use of "Self Organising Feature Maps" (SOFM) or Kohonen maps. These maps deal with high-dimensional state space (multiple time-series data) by:

Figure 3: The propagation process of SOFM - finding the best matching unit to input data (sample).
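The propagation step named in the caption of Figure 3, finding the best matching unit for an input sample, can be sketched in a few lines. The codebook below is a made-up, already trained 2 x 2 map with two inputs per unit, and the use of Euclidean distance on normalised inputs is the usual choice for Kohonen maps rather than a detail confirmed by the paper.

```python
import math


def best_matching_unit(codebook, sample):
    """Return (grid position, distance) of the unit closest to the sample."""
    best_pos, best_dist = None, math.inf
    for pos, weights in codebook.items():
        dist = math.dist(weights, sample)  # Euclidean distance
        if dist < best_dist:
            best_pos, best_dist = pos, dist
    return best_pos, best_dist


# Hypothetical codebook: grid position -> weight vector (normalised inputs).
codebook = {
    (0, 0): [0.1, 0.2],
    (0, 1): [0.8, 0.9],
    (1, 0): [0.2, 0.8],
    (1, 1): [0.9, 0.1],
}
position, distance = best_matching_unit(codebook, [0.75, 0.85])
print(position)
```

Once each map unit has been labelled with a water quality class, the position of the best matching unit gives the class of the current sample.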

Pollution Tracking

All three of the previous methods can be used to detect a change in the monitored river state. A sudden change could well arise from an industrial spill upstream. Since monitoring networks tend to cover quite large areas, one station or another may detect this change, and it is then possible to estimate when the pollution will arrive at the water intake and when it will subside.

Therefore the project has included a simple pollution tracking method which is able to simulate the time of arrival and the concentration of a pollutant, and compare measured values at different stations. The method makes use of a table that gives rough estimates of dilution. The dilution is estimated from the known flows (weekly figures) of the various tributaries expressed as a percentage of the contribution to the main river at the most downstream station. The arrival time is computed using a transport timetable (weekly figures) as in the case of space time validation.
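The table-driven estimate described above might look like the following sketch. The two weekly tables hold invented figures for a single week, the function name `track_pollution` is hypothetical, and the dilution rule (concentration multiplied by the tributary's share of the flow at the most downstream station) assumes simple conservative mixing with otherwise unpolluted water.

```python
from datetime import datetime, timedelta

# Hypothetical weekly tables, keyed by ISO week number.
FLOW_SHARE_BY_WEEK = {10: 0.20}       # tributary share of downstream flow
TRANSPORT_HOURS_BY_WEEK = {10: 18.0}  # travel time to the water intake


def track_pollution(detected_at, concentration):
    """Estimate arrival time and diluted concentration at the intake."""
    week = detected_at.isocalendar()[1]
    arrival = detected_at + timedelta(hours=TRANSPORT_HOURS_BY_WEEK[week])
    diluted = concentration * FLOW_SHARE_BY_WEEK[week]
    return arrival, diluted


# A pollutant measured at 2.0 (arbitrary units) on a tributary station.
arrival, diluted = track_pollution(datetime(2024, 3, 6, 8, 0), 2.0)
print(arrival, diluted)
```

The predicted values can then be compared with what the downstream stations actually measure as the pollution passes.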

TRANSFERABILITY ASPECTS

The difficulties of dealing with large quantities of process data in real-time are common to many large process monitoring applications. For this reason, not only the methods and techniques implemented in the tool but also the tool itself are largely applicable to other large-scale process monitoring problems. Recent discussions with staff from water production plants in England have indicated that the application is well suited to joint river and treatment plant monitoring.

The application of the tool to new sites is constrained only by the existing data collection infrastructure. However, where a suitable infrastructure does exist, the problems involved with implementing a new application are confined to existing database(s) and configurations.

Cost Benefit Considerations

Though generally considered to be a stable process, the production of drinking water is nevertheless a critical application in which human health is potentially at risk. Improvements in instrumentation technology and telematic systems provide the means to improve the security of operations such as drinking water production, petrochemical processing and other large-scale process industries. However, economic pressure also results either in reduced staffing levels or in an increased monitoring burden on existing staff. This often leads to a single operator being responsible for overseeing a large quantity of real-time data, possibly from several distributed applications. This places a huge burden on the operator, who at best can effectively oversee only between 10 and 20 time-series data sets. Centralisation of monitoring operations can therefore lead to increased risk, because the operator's capacity to absorb and analyse information is exceeded. The application of data fusion and classification techniques allows a single operator to oversee much larger data sets effectively. The WaterNet application thus provides a tool that allows useful information to be derived from the investment in sensors and telematics.

Payback periods, calculated on the basis that the application allows one person to oversee data that would otherwise require more staff, are short, usually about six months, especially since the application is only an add-on designed to handle data from existing systems. The investment required for hardware purchase, installation and configuration is expected to be about 20 KECU.

