USGS - science for a changing world

South Florida Information Access (SOFIA)



Project Work Plan

U.S. Geological Survey, Greater Everglades Priority Ecosystems Science (GE PES)

Fiscal Year 2006 Study Work Plan

Study Title: Hydrology Monitoring Network: Data Mining and Modeling to Separate Human and Natural Hydrologic Dynamics
Study Start Date: 10/01/2004
Study End Date: 9/30/2007
Web Sites:
Location (Subregions, Counties, Park or Refuge): Total System
Funding Source: USGS Greater Everglades Priority Ecosystems Science (GE PES)
Other Complementary Funding Source(s): none

Funding History: FY05 was the first year of funding for this project; FY06
Principal Investigator(s): Paul Conrads
Study Personnel: Paul Conrads, Ed Roehl, Ruby Daamen, Mark Lowery, Toby Feaster
Supporting Organizations: USGS-South Carolina Water Science Center
Associated / Linked Studies: South Florida Surface Water Hydrologic Network for Support of MAP Projects (Higer, Telis, PIs); Water Quality Monitoring and Modeling for the A.R.M. Loxahatchee National Wildlife Refuge (Brandt, Harwell, Waldon, PIs); Estimation of Critical Parameters in Conjunction with Monitoring of the Florida Snail Kite Population (Wiley Kitchens, PI); Freshwater Inflows to Northeastern Florida Bay (Hittle, PI); Southern Inland and Coastal Systems (SICS) Model Development (Eric Swain, PI)

Overview & Objective(s): New technologies in environmental monitoring have made it cost effective to acquire tremendous amounts of hydrologic and water-quality data. Although these data are a valuable resource for understanding environmental systems, they are often underutilized and underinterpreted. The monitoring networks supported by the Comprehensive Everglades Restoration Plan (CERP) record tremendous amounts of data each day, and the database incorporates millions of data points describing the environmental response of the system to changing conditions. To enhance the evaluation of the CERP database, there is an immediate need to apply new methodologies to systematically analyze the data set to address critical issues such as water depths at ungaged locations, water-depth and water-quality responses to controlled flow releases, and the relative impacts of controlled freshwater releases, tidal dynamics, and meteorological forcing on streamflow, water level, and salinity. There also is a need to integrate longer-term hydrologic data with shorter-term hydrologic data collected for biological resource studies. This study will be undertaken as a series of pilot studies to demonstrate the efficacy of data mining techniques, including artificial neural network (ANN) models, to evaluate CERP data and address hydrologic issues important to DOI's efforts in South Florida.

The objectives of the study for FY06 include: (1) develop water-depth prediction models for ungaged locations in the Everglades Depth Estimation Network (EDEN); (2) compile data and develop preliminary hydrologic response models for the A.R.M. Loxahatchee National Wildlife Refuge; (3) complete the development and documentation of the Snail Kite hydrology decision support system (DSS); (4) document the analysis of the salinity response for five tributaries to Florida Bay.

Specific Relevance to Major Unanswered Questions and Information Needs Identified: (Page numbers below refer to DOI Science Plan.)

An important part of the USGS mission is to provide scientific information to manage the water resources of the Nation, including the other Agencies of the Department of the Interior (DOI). The objectives for this study address science needs to support DOI managers in fulfilling their stewardship responsibility as identified in The Science Plan in Support of Ecosystem Restoration, Preservation, and Protection in South Florida (U.S. Department of the Interior, 2004). This is consistent with primary USGS activities that include providing knowledge and expertise to assist various levels of government in understanding and solving critical water-resources problems.

The study objective to develop prediction models for water depths at ungaged locations is part of the overall objective of the EDEN project to support the South Florida Hydrology Monitoring Network and the Monitoring and Assessment Plan (MAP). The MAP was developed as the primary tool to assess the system-wide performance of the CERP by the REstoration, COordination and VERification (RECOVER) program (p. 17, DOI Science Plan). The MAP describes and outlines the monitoring and supporting enhancement of scientific information and technology needed to measure the responses of the South Florida ecosystem to CERP projects.

The study objective to develop hydrologic and water-quality response models for the Arthur R. Marshall Loxahatchee NWR, including Internal Canal Structures and stormwater treatment areas (STAs), meets a stated need in the Science Plan for the “synthesis and integration of data about historic hydrologic and ecological conditions on the refuge” and “research to understand the ecological effects of hydrology and water quality on refuge resources…” (p. 37 and 40, DOI Science Plan). The study objective will benefit the DOI and other Federal and State Agencies in South Florida by providing data analysis needed by water-resource managers to make decisions concerning the quantity and quality of inflows to the Refuge.

The development and documentation of the Snail Kite Hydrology DSS supports the Water Conservation Area 3 Decompartmentalization and Sheetflow Enhancement Project (DECOMP) by addressing the science needed for “...additional research to understand the effects of different hydrologic regimes and ecological processes on restoring and maintaining ecosystem function…” (p. 64, DOI Science Plan) and supports ecological studies of impacts of hydrologic change on Everglade snail kite habitat. The study also supports the Combined Structural and Operational Plan project (CSOP) by addressing the needed science for “…refinement of hydrologic targets and operating protocols” (p. 63, DOI Science Plan).

Status: There were three objectives for the first year (FY05) of the Data Mining Study:

  1. Integration of Long-term Hydrologic Data with Snail Kite Study,
  2. Analysis of Water Level, Streamflow, and Salinity Signals,
  3. Assessment of Hydrologic Data Networks for Further Analysis and Integration.

The first objective has been completed, and elements of the study (historical database, ANN models, model controls, and model output) will be integrated into a DSS during Year 2 (FY06) of the study. The second objective has only been partially met. Preliminary models of the salinity response at five tributaries to Florida Bay have been completed, but the final models and analysis will be completed in the fall of 2005 and presented at the 2005 Florida Bay Conference. Work on the analysis of the Florida Bay tributaries was suspended to evaluate the feasibility of applying ANN models to the EDEN network to predict water depths at ungaged locations. The evaluation was very encouraging. The approach for using ANN models to predict water depths at ungaged locations will be expanded and will be a primary focus of the study for 2006 and 2007. The third objective has been met by identifying two databases and working with the resource managers to identify the critical information to be extracted from the databases. The two databases are (1) 5 years of data (1999-2004) from the EDEN network and (2) the A.R.M. Loxahatchee NWR hydrologic and water-quality databases (1950s to the present).

Recent Products: Products from Year One of the study included (1) ANN models and database used to hindcast long-term water level response at 16 sites in WCA 3a; and (2) a summary document describing the application of ANN models for estimating water depth at ungaged sites.

Planned Products: Major products include (1) ANN models for predicting water depths in the EDEN network; (2) hydrologic response models for A.R.M. Loxahatchee NWR; (3) Snail Kite Hydrology DSS and documentation; and (4) a poster session at the 2005 Florida Bay Science Conference describing the use of ANN models and three-dimensional response surfaces to analyze the salinity response for five tributaries to Florida Bay.

WORK PLAN

Title of Task 1: Estimating Water Depths and Water Levels at Ungaged Locations in the EDEN Network
Task Funding: USGS Greater Everglades Priority Ecosystems Science (GE PES)
Task Leaders: Paul Conrads
Phone: (803) 750-6140
FAX: (803) 750-6181
Task Status: Active (first year)
Task priority: high
Time Frame for Task 1: (2006FY) 2006-2007
Task Personnel: Paul Conrads, Ed Roehl, and Mark Lowery

Task Summary and Objectives: The Everglades Depth Estimation Network (EDEN) was established to support the South Florida Hydrology Monitoring Network module of the Comprehensive Everglades Restoration Plan (CERP) and the Monitoring and Assessment Plan (MAP) and Restoration Coordination and Verification Team (RECOVER). The goals of EDEN are to help guide large-scale field operations, integrate hydrologic and biologic responses, and to support the MAP assessments by scientists and principal investigators across disciplines. One objective of EDEN is to relate water-level data at real-time stage gages to ungaged areas using ground elevation data, so that water depths throughout the greater Everglades can be estimated (Telis, 2005: http://sofia.usgs.gov/projects/eden/).

Accurately predicting the hydrologic responses at ungaged locations can be challenging due to the limited number of reference gaging stations and a limited understanding of complex topography and vegetation interactions. Techniques that are often used to estimate hydrologic responses at ungaged locations include combinations of linear regression and interpolation, but often the dynamics between hydrology, topography, and vegetation are nonlinear. The preliminary results from the FY05 pilot study, which applied ANN models to estimate water depths at ungaged locations, are very encouraging. The spatial domain of the model is 370 square kilometers, or about 2300 cells in the EDEN grid network. The average root mean square error for the prediction model at validation gages is approximately a tenth of a foot, or 4 percent.
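The root mean square error quoted above can be computed for any validation gage with a short calculation. The sketch below is illustrative only: the function name `rmse` and the sample depths are hypothetical and are not values from the pilot study.

```python
import math


def rmse(measured, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(measured) != len(predicted) or not measured:
        raise ValueError("inputs must be non-empty and the same length")
    squared = [(m - p) ** 2 for m, p in zip(measured, predicted)]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical water depths (ft) at one validation gage.
measured = [2.0, 2.5, 3.0, 2.8]
predicted = [2.1, 2.4, 3.1, 2.7]
error_ft = rmse(measured, predicted)  # every prediction is off by 0.1 ft
```

Averaging `rmse` over all validation gages gives a single network-wide figure comparable in form to the one reported for the pilot study.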

Work to be undertaken during the proposal year and a description of the methods and procedures:

The approach taken in FY05 will be expanded from a small sub-domain of WCA3a to the domain of the Everglades. For many spatial modeling problems, it is necessary to subdivide a larger study area and create separate models for regions rather than create a single model for an entire study area. The domain of EDEN varies broadly with respect to climate, topography, hydrology, and ecology. To subdivide the EDEN water-depth data, dynamic clustering analysis will be done to group water-depth time series into homogeneous groups based on similarity of dynamic response. In addition to determining which specific sites fall into which groups, clustering analysis will be used to determine an optimal number of groups. A higher number of groups will create more distinct homogeneous groups. However, these groups will contain a smaller number of sites, which may be insufficient for creating robust ANN models. The clustering of sites into the groups will be evaluated by analyzing the distribution of the groups and their sites across the Everglades. The physical properties of each group will be identified, and sites that do not appear to share similar properties will be re-evaluated. If necessary, the number of groups will be recomputed to ensure that robust, homogeneous groups are determined. A quality-assured data set of hourly data for the EDEN network for the period 1999 to 2004 will be used for the cluster analysis.
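The grouping of gages by similarity of dynamic response can be illustrated with a small sketch. The work plan does not name the clustering algorithm, so plain k-means on standardized time series is used here purely as a stand-in; the function names and the four short depth series are hypothetical.

```python
import math


def standardize(series):
    """Remove the mean and scale to unit variance so that clustering
    compares the shape of the response, not the absolute depth."""
    mean = sum(series) / len(series)
    var = sum((x - mean) ** 2 for x in series) / len(series)
    std = math.sqrt(var) or 1.0  # guard against a constant series
    return [(x - mean) / std for x in series]


def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans(series_list, k, iterations=20):
    """Plain k-means; the first k series seed the centroids."""
    data = [standardize(s) for s in series_list]
    centroids = [list(d) for d in data[:k]]
    labels = [0] * len(data)
    for _ in range(iterations):
        # Assign each series to the nearest centroid.
        labels = [
            min(range(k), key=lambda j: squared_distance(d, centroids[j]))
            for d in data
        ]
        # Move each centroid to the mean of its members.
        for j in range(k):
            members = [d for d, lab in zip(data, labels) if lab == j]
            if members:
                centroids[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels


# Hypothetical depths (ft) at four gages: two rising, two falling.
gages = [
    [1.0, 1.2, 1.4, 1.6],
    [0.5, 0.4, 0.3, 0.2],
    [2.0, 2.3, 2.6, 2.9],
    [1.5, 1.3, 1.1, 0.9],
]
labels = kmeans(gages, k=2)  # rising gages share one label, falling the other
```

Repeating the run for several values of `k` and comparing the within-group scatter is one common way to weigh distinctness of groups against the number of sites left in each group.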

A three-step modeling approach will be used to predict water depths. The first step will be to develop a group assignment model. The model will use static variables of an ungaged site as input variables to determine the group (from the clustering analysis) to which the site should be assigned. The second model will predict the water depth using only the static variables of location and vegetation type. This model (also called the “static” model) is not able to predict the dynamic variability of the water depth, but it is able to discriminate general differences in water depth based on differences in location and vegetation. The static model is used to calculate the residual error (difference between the predicted and measured water depth), which is then modeled by the third model. The third model (also called the “dynamic” model) will use time series of water depths and static variables to predict the variability in water depth at each site as characterized by the residual of the static model. The final prediction of water depth at each site is the sum of the water-depth prediction from the static model and the prediction of the water-depth residual from the dynamic model.
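The static-plus-dynamic structure described above can be sketched as follows. This is a minimal illustration under stated assumptions: ordinary least-squares lines stand in for the ANN models, the group assignment step is omitted, the residual is taken as measured minus static prediction (so that the two predictions add), and all variable names and data are hypothetical.

```python
def fit_line(x, y):
    """Ordinary least squares for y = a + b*x (a stand-in for an ANN)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    b = sxy / sxx
    return my - b * mx, b


# Hypothetical training data for the sites in one group.
site_x = [4.0, 4.0, 4.0, 5.0, 5.0, 5.0]      # static input: location coordinate
ref_depth = [6.0, 6.5, 7.0, 6.0, 6.5, 7.0]   # dynamic input: depth at a gaged site
depth = [2.0, 2.5, 3.0, 1.0, 1.5, 2.0]       # measured water depth (ft)

# Static model: water depth from static variables only.
a_s, b_s = fit_line(site_x, depth)
static_pred = [a_s + b_s * x for x in site_x]

# Residual left unexplained by the static model (measured minus predicted).
residual = [d - s for d, s in zip(depth, static_pred)]

# Dynamic model: the residual from a water-depth time series.
a_d, b_d = fit_line(ref_depth, residual)


def predict_depth(x, ref):
    """Final prediction = static prediction + predicted residual."""
    return (a_s + b_s * x) + (a_d + b_d * ref)


estimate = predict_depth(4.5, 6.8)  # an ungaged site between the two locations
```

Because the dynamic model is trained only on what the static model leaves unexplained, the two predictions can simply be added at an ungaged site.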

Specific Task Product(s):

  1. Cluster analysis of 5-year data base (January 2006)
  2. ANN model for ungaged site for sub-domain of EDEN (June 2006)
  3. Manuscript summarizing development and application of ANN models for predicting water depths at ungaged sites (September 2006).

Title of Task 2: A Synthesis of Hydrology and Water-Quality Data of A.R.M. Loxahatchee NWR
Task Funding: USGS Greater Everglades Priority Ecosystems Science (GE PES)
Task Leaders: Paul Conrads
Phone: (803) 750-6140
FAX: (803) 750-6181
Task Status: Active (first year)
Task priority: High
Time Frame for Task 2: (2006FY) 2006-2007
Task Personnel: Paul Conrads, Ed Roehl, and Toby Feaster

Task Summary and Objectives: The Arthur R. Marshall Loxahatchee National Wildlife Refuge is the last of the soft-water ecological systems in the Everglades. Historically, the ecosystem was driven by precipitation inputs to the system that were low in conductance and nutrients. With controlled releases into the canals that surround the Refuge, the transport of water with higher conductance and nutrient concentrations could potentially alter critical ecosystem functions. With potential alteration of flow patterns to accommodate the restoration of the Everglades, the Refuge could be affected not only by changes in the timing and frequency of hydroperiods but also by the quality of the water that inundates the Refuge.

There is a long history of collecting hydrologic and water-quality data in the Refuge. Data characterizing the hydrology of the system (inflows, outflows, precipitation, and water levels) have been collected since the 1950s. Data characterizing the water quality of the system, including conductance and phosphorus, have been collected since the late 1970s. To enhance the understanding of the hydrology and water quality of the Refuge, there is an immediate need to apply new methodologies to systematically synthesize and analyze the data set to answer critical questions such as the relative impacts of controlled releases, precipitation, groundwater interaction, and meteorological forcing on water level, conductance, and phosphorus. There also is a need to integrate longer-term hydrologic data with shorter-term hydrologic data collected for biological and ecological resource studies.

Work to be undertaken during the proposal year and a description of the methods and procedures:

To understand the relationships between canal inflows/outflows and water level, conductance, and phosphorus, a data mining-based model will be developed to predict water level, conductance, and phosphorus at various locations of interest. The steps to be taken are described below.

Step 1. Data Compilation and Merging.

Historic hydrologic and meteorological data from the various Federal and State databases will be merged and time synchronized. Parameters of interest include inflows, outflows, rainfall, wind direction and speed, groundwater levels, water levels, conductance, and phosphorus.

Step 2. Data Preparation

Methods will be used to maximize the information content in the raw data, while diminishing the influence of poor or missing measurements. Signal (time series) processing methods include clustering, filtering, spectral decomposition, estimation of data characteristics and time delays, and synthesizing missing data. Signal processing transforms the “raw” data into “pre-processed” data for analysis and modeling. The data collected from the agencies have different sampling frequencies, ranging from every 15 minutes to once per month. The variables must be “time-merged” by either interpolating between less frequent measurements, or by averaging frequent samples to obtain fewer values.
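The two time-merging options mentioned above, averaging frequent samples and interpolating between infrequent ones, can be sketched as follows. The function names and sample values are hypothetical, linear interpolation is an assumption (the work plan does not specify the method), and measurement times are assumed to be strictly increasing.

```python
def block_average(values, block):
    """Average consecutive groups of `block` samples, for example four
    15-minute values into one hourly value. A trailing partial block
    is dropped."""
    n_blocks = len(values) // block
    return [
        sum(values[i * block:(i + 1) * block]) / block
        for i in range(n_blocks)
    ]


def interpolate(times, values, query_times):
    """Linear interpolation between sparse measurements. Queries outside
    the measured range are held at the nearest end value."""
    out = []
    for t in query_times:
        if t <= times[0]:
            out.append(values[0])
        elif t >= times[-1]:
            out.append(values[-1])
        else:
            for i in range(len(times) - 1):
                t0, t1 = times[i], times[i + 1]
                if t0 <= t <= t1:
                    frac = (t - t0) / (t1 - t0)
                    out.append(values[i] + frac * (values[i + 1] - values[i]))
                    break
    return out


# Hypothetical 15-minute water levels (ft) reduced to hourly means.
hourly = block_average([1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0], 4)

# Hypothetical samples on day 0 and day 30, filled in at day 15.
filled = interpolate([0, 30], [10.0, 40.0], [0, 15, 30])
```

Averaging discards short-term variability, while interpolation creates values that were never measured, so the choice of a common time step affects what the later models can resolve.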

Another signal processing task is “signal decomposition”. The complex behaviors of the variables of a natural system result from interactions between multiple physical forces. Signal decomposition involves digital filtering to split a signal into sub-signals, called “components”, that are independently attributable to different physical forces. Components can be periodic, chaotic, or random, or a combination. Digital filtering can also diminish the effect of noise in a signal to improve the amount of useful information that it contains. Working from filtered signals makes the modeling process more efficient, precise, and accurate.
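As a minimal sketch of signal decomposition, a centered moving average can act as a simple low-pass digital filter that splits a signal into a slowly varying component and a remainder. The filter type, window length, and function names are assumptions for illustration; the work plan does not specify which filters will be used.

```python
def moving_average(signal, window):
    """Centered moving average (a simple low-pass digital filter).
    The window shrinks near the ends so the output keeps the input length."""
    half = window // 2
    out = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        out.append(sum(signal[lo:hi]) / (hi - lo))
    return out


def decompose(signal, window):
    """Split a signal into low-frequency and high-frequency components
    whose sum reproduces the original signal."""
    low = moving_average(signal, window)
    high = [s - l for s, l in zip(signal, low)]
    return low, high


# Hypothetical water-level signal: a slow rise plus an alternating ripple.
signal = [1.0, 1.4, 1.2, 1.6, 1.4, 1.8, 1.6]
low, high = decompose(signal, window=3)
```

Applying `decompose` again to the low-frequency component with a longer window yields further components, each of which can be examined against a different candidate forcing.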

Step 3. Correlation Analysis and Sensitivity Estimation

Correlation analysis quantifies the relationships between many variables and provides deeper understanding of the data. The computer systematically correlates parameters of interest, such as water level, conductance, and phosphorus, to combinations of controlled and uncontrolled variables, such as inflows, outflows, and rainfall. Correlation methods based on statistics and machine learning are applied in combination. Promising results found by the computer are validated by comparing them to known patterns of behavior. Correlation analysis identifies:

  1. Relative impact - For example, “What variables impact the increased conductance and phosphorus? And to what degree?”
  2. Relationships between controlled variables (inflows and outflows) and uncontrolled variables (meteorological forcing).
  3. Quantifiable answers to complex questions - For example, “What are the critical temporal and spatial relationships between the controlled releases and the water level, conductance, and phosphorus response in the interior of the Refuge? Which has more effect on these responses - large releases over a short period of time or weekly flow volumes? What are the relative impacts of the inflow/outflow locations on these responses?”
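One simple way to explore the temporal relationships listed above is to correlate a response with a driver at several time delays and keep the delay with the strongest correlation. The sketch below uses the Pearson coefficient as an illustrative statistic only; the function names and both series are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def best_lag(driver, response, max_lag):
    """Return (lag, r) with the largest |r|, where response[t] is
    compared with driver[t - lag]."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        x = driver[:len(driver) - lag]
        y = response[lag:]
        r = pearson(x, y)
        if abs(r) > abs(best[1]):
            best = (lag, r)
    return best


# Hypothetical canal inflow and an interior response that follows it
# two time steps later.
inflow = [0.0, 0.0, 5.0, 1.0, 0.0, 3.0, 0.0, 0.0, 2.0, 0.0]
response = [0.0, 0.0, 0.0, 0.0, 5.0, 1.0, 0.0, 3.0, 0.0, 0.0]
lag, r = best_lag(inflow, response, max_lag=4)
```

A linear statistic such as this one only screens for candidate delays; nonlinear relationships are left to the machine-learning methods that the work plan applies in combination with it.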

Step 4. Predictive Modeling

Using machine learning, predictive models are developed directly from the data and the correlations determined in Steps 2 and 3. To maximize accuracy, the model is constructed from sub-models, which independently correlate periodic and chaotic components. Their outputs are combined to obtain an overall prediction that reflects all of the forcing functions represented by the input variables. The models of the Refuge will predict water level, conductance, and phosphorus at multiple locations from inputs such as inflow, outflow, rainfall, and wind direction and speed.

The predictive modeling will be limited to three selected water-level sites (1-7, 1-8c, 1-9) and three water-quality sites (LOX4, LOX5, LOX13). The water-level sites are critical sites for the operation of the regulation schedule, and the water-quality sites are critical sites for the water-quality compliance consent decree. The anticipated results for FY06 are the compiled, time-synchronized database and the predictive ANN models of the selected water-level and water-quality stations. The models will provide powerful analysis tools for understanding the dynamics of the system. In particular, 3-dimensional response surfaces showing the interaction of two explanatory variables (such as canal inflow, outflow, canal water level, and rainfall) on a response variable (interior water level, conductance, or phosphorus) will be generated.

Specific Task Product(s):

  1. Database of water levels, flow, salinity, and controlled release time series and derived variables used for analysis (January 2006).
  2. Preliminary hydrologic response ANN models (July 2006).
  3. Preliminary water-quality response ANN models (September 2006).

Title of Task 3: Integration of Long-term Hydrologic Data with Snail Kite Study
Task Funding: USGS Greater Everglades Priority Ecosystems Science (GE PES)
Task Leaders: Paul Conrads
Phone: (803) 750-6140
FAX: (803) 750-6181
Task Status: Active
Task priority: low
Time Frame for Task 3: (2006FY)
Task Personnel: Paul Conrads, Ruby Daamen, and Ed Roehl

Task Summary and Objectives: One of the objectives for FY2005 was to integrate short- and long-term hydrologic and ecological data for the study of Snail Kites in Water Conservation Area 3a (WCA3a). Ecologists from the USGS Florida Coop Unit are studying the Snail Kite and its habitat quality as it relates to vegetative community structure. The vegetative structure of these sites is an expression of both recent past and current hydrological conditions. It is critically important to determine how the species associations within these communities respond differentially to changes in hydrology through time and space. The monitoring network for the snail kite study has established 17 continuous water-depth monitors to understand differences in hydrology in the study area. In addition to the 17 water-depth gages (short-term, < 2 years of record), there are 3 long-term gages (> 13 years of record) in the study area. To maximize the information content, ANN models were developed to predict the water depths at the 17 monitoring stations. These models are used to extend the period of record of the short-term monitoring stations to be concurrent with the three long-term stations.

The DSS will allow users to interrogate the historical database and model simulations to better understand the water-depth dynamics of the system. The DSS will also allow users to evaluate alternative water management scenarios and their impacts on the hydrology of the Snail Kite habitat. The DSS will read and write files for the various run-time options that can be selected by the user through the system's graphical user interface (GUI). The historical database contains thirteen years of hydrodynamic data that will be used to generate water-level simulations using the 17 ANN models. Using GUI controls, the user can evaluate alternative flow and water-level scenarios. The outputs generated by the ANN models will be written to files for post-processing in MS Excel™. The DSS will provide streaming graphics during model simulations, visually representing historical and predicted behaviors side by side.

Work to be undertaken during the proposal year and a description of the methods and procedures:

The following steps will be taken for the development of the Snail Kite Hydrology DSS.

  • Build DSS Shell to Include
    • GUI for setting all simulation run parameters
      • Start and stop time of simulation run
      • Source input for simulation run (historical data, percentage of actual, user-defined, external file)
    • Streaming graphics to display model outputs
    • Hydrologic parameter/statistic
      • Calculate statistic
      • Provide graphical display
      • Tabular output values
    • Output model predictions and simulation parameters
  • Integrate 17 ANN models into DSS Shell
    • Define models and inputs within the DSS
    • Calculate model inputs using user selected data source
    • Run models using iQuest Runtime module
    • Write and display output within DSS application
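The source-input options in the outline above could be handled by a small helper that prepares the series passed to the models. This is a hypothetical sketch, not the DSS implementation: the function name, the option strings, and the reading of "user-defined" as a single constant value are all assumptions, and the external-file option is omitted.

```python
def build_input(historical, source="historical", percent=100.0, constant=None):
    """Return the input series for a simulation run.

    source: "historical"   -> the recorded values, unchanged
            "percentage"   -> recorded values scaled by `percent`
            "user_defined" -> `constant` repeated for every time step
                              (an assumed reading of "user-defined")
    """
    if source == "historical":
        return list(historical)
    if source == "percentage":
        return [v * percent / 100.0 for v in historical]
    if source == "user_defined":
        if constant is None:
            raise ValueError("user_defined source needs a constant value")
        return [constant] * len(historical)
    raise ValueError("unknown source: " + source)


# Hypothetical recorded flows and a scenario at half of the actual flow.
recorded_flow = [100.0, 200.0, 150.0]
scenario_flow = build_input(recorded_flow, source="percentage", percent=50.0)
```

Keeping the scenario logic separate from the models means the same model inputs can be produced from any source and compared side by side with the historical run.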

Specific Task Product(s):

  1. Prototype Snail Kite Hydrology DSS (November 2005)
  2. Final Snail Kite Hydrology DSS (February 2006)
  3. Manuscript describing the development of the DSS (August 2006)

Title of Task 4: Analysis of Water Level, Streamflow, and Salinity Signals
Task Funding: USGS Greater Everglades Priority Ecosystems Science (GE PES)
Task Leaders: Paul Conrads
Phone: (803) 750-6140
FAX: (803) 750-6181
Task Status: Active (second year)
Task priority: low
Time Frame for Task 4: (2006FY)
Task Personnel: Paul Conrads and Ed Roehl

Task Summary and Objectives: This task completes Task 2 of the FY2005 Statement of Work for the study. Tributaries into Florida Bay and the Everglades National Park are constantly integrating various changing conditions such as low-gradient streamflows, tides, and meteorological forcing. Only a portion of these forces is controlled by operational practices. Using data mining techniques, including systematic decomposition of the time series and decorrelation of variables, high-fidelity empirical models were developed and used to analyze the relative contribution of the major forces on the streamflow and salinity dynamics of the system. Data mining techniques, including ANN models, are being applied to the USGS data of five tributary creeks to Florida Bay (McCormick, Mud, Taylor, Trout, and West Highway Creeks) to answer critical questions such as the relative impacts of controlled freshwater releases, tidal dynamics, and meteorological forcing on streamflow, water level, and salinity.

Figure 1. Three-dimensional scatter plot (A) of Canal 111 flow and Trout Creek gage height and salinity and three-dimensional response surface (B) generated by ANN model of the system.

The ANN models of the tributaries are used to examine the impact of controlled releases, water levels, precipitation, and wind speed and direction on tributary salinity dynamics. Three-dimensional (3D) surfaces generated by an ANN model are a powerful way to discover the model's representation of a process's variable interactions and physics. For example, Figure 1A shows a 3D scatter plot of C-111 Canal flows and Trout Creek gage height and salinity. Figure 1B shows a 3D response surface generated from an ANN model of the system by plotting two explanatory variables (gage height and controlled releases) with an output response variable (Trout Creek salinity). The data for the surface are computed by incrementing the “shown” (displayed) ANN model inputs across their historical ranges, while the “unshown” inputs (the ANN model has more than two input variables) are set to constant values, such as the historical mid-range. The response surface is a representation of the dynamic history of the system. The response surface shows significant salinity response for all gage heights when the C-111 Canal flows are below 500 ft³/s. For higher gage heights (> 0.5 ft), significant salinity response occurs with flows up to 1,000 ft³/s.
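The response-surface procedure described above can be sketched as follows: the two shown inputs are stepped across their historical ranges while every other input is held at its historical mid-range. The model here is a made-up linear function standing in for a trained ANN, and the input names, ranges, and coefficients are hypothetical rather than fitted to any data.

```python
def linspace(lo, hi, n):
    """n evenly spaced values from lo to hi, inclusive."""
    step = (hi - lo) / (n - 1)
    return [lo + i * step for i in range(n)]


def response_surface(model, history, shown, n=5):
    """Evaluate `model` on an n-by-n grid of the two `shown` inputs.

    history: dict mapping each input name to its historical values.
    Unshown inputs are fixed at their historical mid-range.
    """
    a, b = shown
    fixed = {
        name: (min(vals) + max(vals)) / 2
        for name, vals in history.items()
        if name not in shown
    }
    a_vals = linspace(min(history[a]), max(history[a]), n)
    b_vals = linspace(min(history[b]), max(history[b]), n)
    surface = [
        [model({**fixed, a: av, b: bv}) for bv in b_vals]
        for av in a_vals
    ]
    return a_vals, b_vals, surface


def toy_salinity(inputs):
    """Hypothetical stand-in for a trained model (not a fitted relation)."""
    return (30.0 - 0.02 * inputs["flow"]
            - 5.0 * inputs["gage_height"] + 0.1 * inputs["wind"])


# Hypothetical historical ranges for three model inputs.
history = {
    "flow": [0.0, 400.0, 1000.0],
    "gage_height": [0.0, 0.5, 1.0],
    "wind": [0.0, 20.0],
}
flows, heights, surface = response_surface(
    toy_salinity, history, shown=("flow", "gage_height")
)
```

Each row of `surface` corresponds to one flow value and each column to one gage height, so the grid can be passed directly to a 3D plotting routine.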

Work to be undertaken during the proposal year and a description of the methods and procedures:

The salinity dynamics of the five tributary creeks are currently being analyzed. Response surfaces for the five tributary creeks and for various combinations of explanatory variables are used to evaluate system behavior at the five sites. Similarities and differences between tributaries in the process physics, as manifested by the response surfaces for each tributary, will be documented.

Specific Task Product(s):

  1. Poster session at the 2005 Florida Bay Science Conference describing development and application of ANN models and the use of response surfaces for analyzing salinity response for five tributaries to Florida Bay (December 2005).




U.S. Department of the Interior, U.S. Geological Survey
This page is: http://sflwww.er.usgs.gov/projects/workplans06/hydro_mon.html
Last updated: 04 September, 2013 @ 02:09 PM(KP)