Tab-delimited output-format description

Overview:
  The principal machine-readable data format supported by the USGS Water Data for the Nation site is a variant of a tab-delimited ASCII file structure called "rdb". An rdb file begins with a header section containing zero or more comment lines. The header carries important information such as disclaimers, sites, and parameter and location names. The header is followed by exactly one tab-delimited column-name row, then exactly one column-definition row, and then a data section consisting of any number of rows of tab-delimited data fields. Each header comment line starts with a sharp sign (#) followed by a space character and then any text desired. The fields in the column-name row contain the name of each column. The fields in the column-definition row contain the data definition and optional column documentation for each column. Data rows must have exactly the same number of tab-delimited columns as both the column-name and column-definition rows. Null data values are allowed. Example rdb file:
   # -------------------------------------------
   # Documentation lines. These describe and 
   # identify the rdb file contents. 
   # -------------------------------------------
   NAME   COUNT  TYP  AMT   OTHER   RIGHT
   6s     5n     3s   5n    8s      8s
   Bill   44     A    133   Another This 
   John   44          23    One     Is 
   Gary   77          77    Here    On 
   Mar    77     B    244   And     The 
   Greg   77     D    1111  So      Right
   
When reading (parsing) these rdb files, it is important to first parse the column-name row (the first non-comment row in the file) to determine the column position of each data value, because different sites may return data columns in a different order. Information detailing the column-name syntax of the data file is contained in the header comments of the file for each data type.
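The parsing order described above can be sketched in Python. This is a minimal illustration, not an official reader: the function name `parse_rdb` and the `SAMPLE` text are hypothetical, and the sketch assumes that any comment line starts a new header section, so the column names are read again for each site.

```python
def parse_rdb(lines):
    """Yield one dict per data row, keyed by column name.

    Assumption: a comment line ('#') always begins a new header section,
    so the column-name and column-definition rows are re-read after it.
    """
    names = None
    have_definitions = False
    for raw in lines:
        # Strip only line endings; trailing tabs mark null trailing fields.
        line = raw.rstrip("\r\n")
        if line.startswith("#"):
            names = None
            have_definitions = False
            continue
        if line == "":
            continue
        fields = line.split("\t")
        if names is None:
            # First non-comment row: the column names.
            names = fields
        elif not have_definitions:
            # Second non-comment row: the column definitions (e.g. '6s', '5n').
            if len(fields) != len(names):
                raise ValueError("column-definition row has wrong field count")
            have_definitions = True
        else:
            if len(fields) != len(names):
                raise ValueError("data row has wrong field count: %r" % line)
            # Null values arrive as empty fields; represent them as None.
            yield {n: (v if v != "" else None) for n, v in zip(names, fields)}


# Hypothetical sample modeled on the example rdb file above.
SAMPLE = (
    "# Documentation lines.\n"
    "NAME\tCOUNT\tTYP\tAMT\n"
    "6s\t5n\t3s\t5n\n"
    "Bill\t44\tA\t133\n"
    "John\t44\t\t23\n"
)

rows = list(parse_rdb(SAMPLE.splitlines()))
```

Because each row is keyed by column name rather than by position, code built on this sketch does not depend on the order in which a given site returns its columns.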

Water Data for the Nation output-file format:
  Tab-delimited data files output by the Water Data for the Nation site consist of tab-separated columns of data for one or more sites. Each site is separated from the next by a header section of comments and new column definitions. The following three column definitions are always included for each site:
   Column        Definition
   ----------    -----------------------------------------
   agency_cd     Agency collecting data or maintaining the site 
   site_no       USGS site-identification number 
   datetime      Date (and time for real-time data) in ISO format
The remaining pairs of columns vary for each site depending on whether real-time or daily data are being output and on which data parameters were selected.

For real-time data the data-column pairs use the format   'nn_nnnnn'   'nn_nnnnn_cd'   where the first two-digit sequence in each column name uniquely defines the sensor (the 'data descriptor') used to collect the data and the following five-digit sequence defines the 'parameter_cd', which describes the type of data shown in the column. The second   'nn_nnnnn_cd'   column in the pair contains data-value qualification codes pertaining specifically to the preceding column.

For daily data the data-column pairs use a similar format of   'nn_nnnnn_nnnnn'   'nn_nnnnn_nnnnn_cd'   where the first two-digit sequence in each column name uniquely defines the sensor (the 'data descriptor') used to collect the data, the next five-digit sequence defines the 'parameter_cd', which describes the type of data shown in the column, and the final five-digit sequence describes the type of daily statistic used to calculate the daily data value. The second   'nn_nnnnn_nnnnn_cd'   column of the pair contains data-value qualification codes pertaining specifically to the preceding column.
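The two column-name formats can be decoded with a single pattern. The following Python sketch is illustrative only: the names `COLUMN_RE` and `describe_column` are hypothetical, and it assumes every 'n' in the formats above stands for one decimal digit.

```python
import re

# dd (2 digits) _ parameter_cd (5 digits) [_ statistic_cd (5 digits)] [_cd]
COLUMN_RE = re.compile(
    r"^(?P<dd>\d{2})_(?P<parameter_cd>\d{5})"
    r"(?:_(?P<statistic_cd>\d{5}))?"
    r"(?P<cd>_cd)?$"
)


def describe_column(name):
    """Split a data-column name into its parts.

    Returns None for columns that do not follow the pair format,
    such as agency_cd, site_no, and datetime.
    """
    match = COLUMN_RE.match(name)
    if match is None:
        return None
    return {
        "dd": match.group("dd"),
        "parameter_cd": match.group("parameter_cd"),
        # None for real-time columns, which carry no statistic code.
        "statistic_cd": match.group("statistic_cd"),
        # True for the second column of each pair (qualification codes).
        "is_qualifier": match.group("cd") is not None,
    }


realtime = describe_column("02_00065")
daily_cd = describe_column("02_00065_00003_cd")
```

Here `realtime["statistic_cd"]` is None because the name has no statistic part, while `daily_cd["is_qualifier"]` is True because the name ends in `_cd`.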

A list of specific parameter codes, statistic codes (if daily data), and data-value qualification codes (if present) is included in the header for each site. For example:
   
   # Data for the following station(s) are contained in this file
   # -------------------------------------------------------------
   #  USGS 06041000 Madison River bl Ennis Lake nr McAllister MT
   #
   # Available data at this site--lines with asterisk '*' are included in this output.
   #    DD parameter statistic - Description
   #    --   -----     -----     ------------------------------------
   #   *02   00065     00003   - Gage height, feet (Mean)
   #   *05   00010     00001   - Temperature, water, degrees Celsius (Maximum)
   #   *05   00010     00002   - Temperature, water, degrees Celsius (Minimum)
   #   *05   00010     00003   - Temperature, water, degrees Celsius (Mean)
   #   *06   00060     00003   - Discharge, cubic feet per second (Mean)
   #
   # Data-value qualification codes included in this output: 
   #     A  Approved for publication -- Processing and review completed.  
   #     P  Provisional data subject to revision.  
   #     e  Value has been estimated.
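The "Available data" listing in a header like the one above can be read programmatically. The Python sketch below is an assumption-laden illustration: the names `AVAILABLE_RE` and `parse_available_data` are hypothetical, and the pattern is inferred only from the sample header shown here (it treats the statistic column as optional, which is a guess for real-time output).

```python
import re

# Matches lines such as:  #   *02   00065     00003   - Gage height, feet (Mean)
AVAILABLE_RE = re.compile(
    r"^#\s+(?P<star>\*?)(?P<dd>\d{2})\s+(?P<parameter_cd>\d{5})"
    r"(?:\s+(?P<statistic_cd>\d{5}))?\s+-\s+(?P<description>.*)$"
)


def parse_available_data(header_lines):
    """Return one dict per 'DD parameter statistic' line in the header."""
    entries = []
    for raw in header_lines:
        match = AVAILABLE_RE.match(raw.strip())
        if match is None:
            continue  # some other comment line
        entries.append({
            # An asterisk marks data included in this output.
            "included": match.group("star") == "*",
            "dd": match.group("dd"),
            "parameter_cd": match.group("parameter_cd"),
            "statistic_cd": match.group("statistic_cd"),
            "description": match.group("description").strip(),
        })
    return entries


# Lines copied from the sample header above.
HEADER = [
    "#  USGS 06041000 Madison River bl Ennis Lake nr McAllister MT",
    "#    DD parameter statistic - Description",
    "#    --   -----     -----     ------------------------------------",
    "#   *02   00065     00003   - Gage height, feet (Mean)",
    "#   *06   00060     00003   - Discharge, cubic feet per second (Mean)",
]

entries = parse_available_data(HEADER)
```

Joining `dd`, `parameter_cd`, and `statistic_cd` with underscores gives the matching daily data-column name, for example `02_00065_00003` for the gage-height entry.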