pygeohydro.waterdata#

Accessing data from the supported databases through their APIs.

Module Contents#

class pygeohydro.waterdata.NWIS#

Access the NWIS web service.

get_info(queries, expanded=False, fix_names=True)#

Send multiple queries to USGS Site Web Service.

Parameters
  • queries (dict or list of dict) – A single valid query or a list of valid queries.

  • expanded (bool, optional) – Whether to get expanded site information, for example drainage area. Defaults to False.

  • fix_names (bool, optional) – If True, reformat station names and fix some small annoyances. Defaults to True.

Returns

geopandas.GeoDataFrame – A correctly typed GeoDataFrame containing site(s) information.
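As a rough illustration of what queries might look like, the sketch below builds two query dictionaries from argument names documented for the USGS Site Web Service (sites, bBox, siteStatus, hasDataTypeCd). The helpers sites_query, bbox_query, and fetch_info are hypothetical and not part of pygeohydro, and the station IDs are only example values. The network call is kept inside fetch_info, which is not invoked here.

```python
def sites_query(station_ids):
    """Build a query for specific stations (hypothetical helper)."""
    return {"sites": ",".join(station_ids)}


def bbox_query(bbox, data_types="dv"):
    """Build a query for active stations inside a (west, south, east, north) box.

    Hypothetical helper; the keys are assumed from the USGS Site Web Service.
    """
    west, south, east, north = bbox
    return {
        "bBox": f"{west:.6f},{south:.6f},{east:.6f},{north:.6f}",
        "siteStatus": "active",
        "hasDataTypeCd": data_types,
    }


def fetch_info(queries, expanded=False):
    """Send the queries with NWIS.get_info (needs pygeohydro and network access)."""
    from pygeohydro import NWIS

    return NWIS().get_info(queries, expanded=expanded)


# A list of dicts; a single dict is also accepted according to the docs above.
queries = [
    sites_query(["01031500", "01031450"]),
    bbox_query((-69.77, 45.07, -69.31, 45.45)),
]
```

Passing queries to fetch_info should then return one GeoDataFrame covering the matched sites, assuming the service accepts these arguments as written.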

get_parameter_codes(keyword)#

Search for parameter codes by name or number.

Notes

NWIS guideline for keywords is as follows:

By default, an exact search is made. To make a partial search, prefix and suffix the term with a % sign. The % sign matches zero or more characters at its location. For example, to find all parameters containing “discharge”, enter %discharge% in the field.
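The wildcard rule can be illustrated locally without calling the service. The sketch below translates a keyword into a regular expression in which every % matches zero or more characters. The function nwis_like is hypothetical, the sample names are for illustration only, and case-insensitive matching is an assumption rather than documented NWIS behavior.

```python
import re


def nwis_like(pattern, text):
    """Return True if text matches pattern, where % matches zero or more characters."""
    # Escape each literal piece and join the pieces with ".*" in place of each %.
    regex = ".*".join(re.escape(part) for part in pattern.split("%"))
    # Case-insensitive matching is an assumption made for this illustration.
    return re.fullmatch(regex, text, flags=re.IGNORECASE | re.DOTALL) is not None


# Sample parameter names used only to demonstrate the matching rule.
names = ["Discharge, cubic feet per second", "Temperature, water, degrees Celsius"]

# Without % the whole name must equal the keyword, so nothing matches here.
exact = [n for n in names if nwis_like("discharge", n)]

# With % on both sides, any name containing the keyword matches.
partial = [n for n in names if nwis_like("%discharge%", n)]
```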

Parameters

keyword (str) – Keyword to search for parameters by name or number.

Returns

pandas.DataFrame – Matched parameter codes as a dataframe with their description.

Examples

>>> from pygeohydro import NWIS
>>> nwis = NWIS()
>>> codes = nwis.get_parameter_codes("%discharge%")
>>> codes.loc[codes.parameter_cd == "00060", "parm_nm"][0]
'Discharge, cubic feet per second'

get_streamflow(station_ids, dates, freq='dv', mmd=False, to_xarray=False)#

Get streamflow observations from USGS, either mean daily or instantaneous values.

Parameters
  • station_ids (str or list) – The gage ID(s) of the USGS station(s).

  • dates (tuple) – Start and end dates as a tuple (start, end).

  • freq (str, optional) – The frequency of the streamflow data, defaults to dv (daily values). Valid frequencies are dv (daily values) and iv (instantaneous values). Note that for iv, the time zone of the input dates is assumed to be UTC.

  • mmd (bool, optional) – Convert cms to mm/day based on the contributing drainage area of the stations. Defaults to False.

  • to_xarray (bool, optional) – Whether to return an xarray.Dataset. Defaults to False.

Returns

pandas.DataFrame or xarray.Dataset – Streamflow observations in cubic meters per second (cms). Stations that do not provide the requested discharge data in the target period are dropped. Note that when freq is set to iv, the time zone is converted to UTC.
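The unit conversion behind the mmd option can be worked through by hand. The sketch below shows the standard arithmetic for turning discharge in cms into a runoff depth in mm/day over a drainage area. It is not the library's own implementation: the function cms_to_mmd is hypothetical, and it assumes the area is already expressed in square kilometers.

```python
SECONDS_PER_DAY = 86_400.0


def cms_to_mmd(flow_cms, area_sqkm):
    """Convert discharge in m^3/s to runoff depth in mm/day over an area in km^2."""
    if area_sqkm <= 0:
        raise ValueError("Drainage area must be positive.")
    volume_m3_per_day = flow_cms * SECONDS_PER_DAY  # m^3/s -> m^3/day
    area_m2 = area_sqkm * 1.0e6  # km^2 -> m^2
    return volume_m3_per_day / area_m2 * 1000.0  # m/day -> mm/day


# 10 m^3/s over 864 km^2: 864,000 m^3/day spread over 864,000,000 m^2.
depth = cms_to_mmd(10.0, 864.0)
```

In this example the daily volume divided by the area is 0.001 m/day, which corresponds to a depth of 1 mm/day.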

static retrieve_rdb(url, payloads)#

Retrieve and process requests with RDB format.

Parameters
  • url (str) – Name of the USGS REST service. Valid values are site, dv, iv, gwlevels, and stat. Consult the USGS documentation for more information.

  • payloads (list of dict) – List of target payloads.

Returns

pandas.DataFrame – Requested features as a pandas.DataFrame.
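For readers unfamiliar with RDB, it is a tab-delimited text format in which comment lines start with #, the first remaining line holds column names, and the second holds column format codes. The sketch below parses such text with the standard library only. It is not how retrieve_rdb is implemented; parse_rdb is a hypothetical helper and the sample text is made up.

```python
import csv
import io


def parse_rdb(text):
    """Parse RDB text into a list of dicts, one per data row."""
    # Drop blank lines and comment lines, which start with "#".
    lines = [ln for ln in text.splitlines() if ln.strip() and not ln.startswith("#")]
    if len(lines) < 2:
        return []
    reader = csv.reader(io.StringIO("\n".join(lines)), delimiter="\t")
    header = next(reader)
    next(reader)  # Skip the column-format row, such as "5s" or "15s".
    return [dict(zip(header, row)) for row in reader]


# Made-up sample that mimics the layout of an RDB response.
sample = (
    "# Example comment line\n"
    "agency_cd\tsite_no\tstation_nm\n"
    "5s\t15s\t50s\n"
    "USGS\t01031500\tExample River near Example Town\n"
)
rows = parse_rdb(sample)
```

Every value is kept as a string, which preserves leading zeros in station numbers.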

class pygeohydro.waterdata.WaterQuality#

Water Quality Web Service https://www.waterqualitydata.us.

Notes

This class has a number of convenience methods for retrieving data from the Water Quality Web Service. Since many parameter combinations can be used to retrieve data, two general methods are also provided for retrieving data from any of the valid endpoints: get_json returns station information as a geopandas.GeoDataFrame, and get_csv returns station data as a pandas.DataFrame. To use them, construct a dictionary of parameters and pass it to one of these methods. For more information on the parameters, consult the Water Quality Data documentation.
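A parameter dictionary for the general methods might be assembled as in the sketch below. The keys mimeType, bBox, zip, and characteristicName, as well as the endpoint name "Station", are assumptions taken from the Water Quality Web Service and should be checked against its documentation. The helpers station_kwds and fetch_stations are hypothetical, and fetch_stations is not invoked here because it needs pygeohydro and network access.

```python
def station_kwds(bbox, characteristic=None):
    """Build keyword arguments for a station search (hypothetical helper)."""
    west, south, east, north = bbox
    kwds = {
        "mimeType": "geojson",  # assumed value for a GeoJSON response
        "bBox": f"{west},{south},{east},{north}",
        "zip": "no",
    }
    if characteristic is not None:
        kwds["characteristicName"] = characteristic
    return kwds


def fetch_stations(kwds):
    """Pass the dictionary to WaterQuality.get_json (endpoint name is assumed)."""
    from pygeohydro import WaterQuality

    return WaterQuality().get_json("Station", kwds)


kwds = station_kwds((-69.77, 45.07, -69.31, 45.45), characteristic="Nitrate")
```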

data_bystation(station_ids, wq_kwds)#

Retrieve data for one or more stations.

Parameters
  • station_ids (str or list of str) – Station ID(s). The IDs should have the format “Agency code-Station ID”.

  • wq_kwds (dict, optional) – Water Quality Web Service keyword arguments. Defaults to None.

Returns

pandas.DataFrame – DataFrame of data for the stations.
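The “Agency code-Station ID” format can be produced with a small helper such as the one sketched below. The function wq_station_id is hypothetical and the station numbers are example values only.

```python
def wq_station_id(agency_code, station_id):
    """Join an agency code and a station ID with a hyphen (hypothetical helper)."""
    agency = agency_code.strip()
    station = station_id.strip()
    if not agency or not station:
        raise ValueError("Both agency code and station ID are required.")
    return f"{agency}-{station}"


# Keep station numbers as strings so that leading zeros are preserved.
ids = [wq_station_id("USGS", sid) for sid in ("01031500", "01031450")]
```

The resulting list could then be passed as station_ids to data_bystation.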

get_csv(endpoint, kwds, request_method='GET')#

Get the CSV response from the Water Quality Web Service.

Parameters
  • endpoint (str) – Endpoint of the Water Quality Web Service.

  • kwds (dict) – Water Quality Web Service keyword arguments.

  • request_method (str, optional) – HTTP request method. Defaults to GET.

Returns

pandas.DataFrame – The web service response as a DataFrame.

get_json(endpoint, kwds, request_method='GET')#

Get the JSON response from the Water Quality Web Service.

Parameters
  • endpoint (str) – Endpoint of the Water Quality Web Service.

  • kwds (dict) – Water Quality Web Service keyword arguments.

  • request_method (str, optional) – HTTP request method. Defaults to GET.

Returns

geopandas.GeoDataFrame – The web service response as a GeoDataFrame.

get_param_table()#

Get the parameter table from the USGS Water Quality Web Service.

lookup_domain_values(endpoint)#

Get the domain values for the target endpoint.

station_bybbox(bbox, wq_kwds)#

Retrieve station info within bounding box.

Parameters
  • bbox (tuple of float) – Bounding box coordinates (west, south, east, north) in epsg:4326.

  • wq_kwds (dict, optional) – Water Quality Web Service keyword arguments. Defaults to None.

Returns

geopandas.GeoDataFrame – GeoDataFrame of station info within the bounding box.
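Because the coordinate order (west, south, east, north) is easy to get wrong, a quick check before sending a request can help. The sketch below is one possible validation for a bounding box in epsg:4326. The function validate_bbox is hypothetical, and pygeohydro may perform its own checks.

```python
def validate_bbox(bbox):
    """Return bbox as floats if it is a valid (west, south, east, north) box in degrees."""
    if len(bbox) != 4:
        raise ValueError("bbox must have exactly four values.")
    west, south, east, north = (float(v) for v in bbox)
    if not (-180.0 <= west < east <= 180.0):
        raise ValueError("Expected -180 <= west < east <= 180.")
    if not (-90.0 <= south < north <= 90.0):
        raise ValueError("Expected -90 <= south < north <= 90.")
    return (west, south, east, north)


bbox = validate_bbox((-69.77, 45.07, -69.31, 45.45))
```

A box with west and east swapped, such as (-69.31, 45.07, -69.77, 45.45), raises ValueError under this check.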

station_bydistance(lon, lat, radius, wq_kwds)#

Retrieve stations within a radius (decimal miles) of a point.

Parameters
  • lon (float) – Longitude of point.

  • lat (float) – Latitude of point.

  • radius (float) – Radius (decimal miles) of search.

  • wq_kwds (dict, optional) – Water Quality Web Service keyword arguments. Defaults to None.

Returns

geopandas.GeoDataFrame – GeoDataFrame of station info within the radius of the point.
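Since radius is expressed in decimal miles, a search distance known in kilometers has to be converted first. The sketch below shows the arithmetic, using the exact definition of the international mile. The function km_to_miles is hypothetical and not part of pygeohydro.

```python
KM_PER_MILE = 1.609344  # exact definition of the international mile


def km_to_miles(distance_km):
    """Convert a distance in kilometers to decimal miles."""
    if distance_km < 0:
        raise ValueError("Distance must be non-negative.")
    return distance_km / KM_PER_MILE


# A 10 km search radius expressed in decimal miles.
radius_mi = km_to_miles(10.0)
```

The value of radius_mi, roughly 6.21, could then be passed as radius to station_bydistance.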

pygeohydro.waterdata.interactive_map(bbox, crs=DEF_CRS, nwis_kwds=None)#

Generate an interactive map including all USGS stations within a bounding box.

Parameters
  • bbox (tuple) – Bounding box coordinates in this order: (west, south, east, north).

  • crs (str, optional) – CRS of the input bounding box, defaults to EPSG:4326.

  • nwis_kwds (dict, optional) – Optional keywords to include in the NWIS request as a dictionary, for example {"hasDataTypeCd": "dv,iv", "outputDataTypeCd": "dv,iv", "parameterCd": "00060"}. Defaults to None.

Returns

folium.Map – Interactive map within a bounding box.

Examples

>>> import pygeohydro as gh
>>> nwis_kwds = {"hasDataTypeCd": "dv,iv", "outputDataTypeCd": "dv,iv"}
>>> m = gh.interactive_map((-69.77, 45.07, -69.31, 45.45), nwis_kwds=nwis_kwds)
>>> n_stations = len(m.to_dict()["children"]) - 1
>>> n_stations
10