pygeohydro.pygeohydro#

Accessing data from the supported databases through their APIs.

Module Contents#

class pygeohydro.pygeohydro.EHydro(data_type='points', cache_dir='ehydro_cache')#

Access USACE Hydrographic Surveys (eHydro).

Notes

For more info visit: https://navigation.usace.army.mil/Survey/Hydro

bygeom(geom, geo_crs=4326, sql_clause='', distance=None, return_m=False, return_geom=True)#

Get features within a geometry, optionally filtered by a SQL WHERE clause.

byids(field, fids, return_m=False, return_geom=True)#

Get features by object IDs.

bysql(sql_clause, return_m=False, return_geom=True)#

Get features using a valid SQL 92 WHERE clause.

Parameters:
  • data_type (str, optional) – Type of the survey data to retrieve, defaults to points. Note that the points data type gets the best available point cloud data, i.e., if SurveyPointHD is available, it will be returned, otherwise SurveyPoint will be returned. Available types are:

    • points: Point clouds

    • outlines: Polygons of survey outlines

    • contours: Depth contours

    • bathymetry: Bathymetry data

    Note that point clouds are not available for all surveys.

  • cache_dir (str or pathlib.Path, optional) – Directory to store the downloaded raw data, defaults to ./ehydro_cache.

property survey_grid: geopandas.GeoDataFrame#

Full survey availability on hexagonal grid cells of 35 km resolution.
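As a rough illustration of how the class might be used, the sketch below validates `data_type` against the four types listed above and then queries by bounding box. The helper names `make_ehydro` and `surveys_in_bbox` are hypothetical and not part of the library. The sketch also assumes that `bygeom` accepts a bounding-box tuple as `geom`, which this page does not state.

```python
# Data types documented for EHydro on this page.
EHYDRO_DATA_TYPES = ("points", "outlines", "contours", "bathymetry")


def make_ehydro(data_type="points", cache_dir="ehydro_cache"):
    """Validate ``data_type`` and build an EHydro client (hypothetical helper)."""
    if data_type not in EHYDRO_DATA_TYPES:
        raise ValueError(
            f"data_type must be one of {EHYDRO_DATA_TYPES}, got {data_type!r}"
        )
    # Imported lazily so the check above works without pygeohydro installed.
    from pygeohydro.pygeohydro import EHydro

    return EHydro(data_type=data_type, cache_dir=cache_dir)


def surveys_in_bbox(bbox, data_type="outlines"):
    """Query features inside ``bbox``, given in EPSG:4326 (hypothetical helper).

    Assumption: ``bygeom`` accepts a bounding-box tuple as ``geom``.
    """
    ehydro = make_ehydro(data_type)
    return ehydro.bygeom(bbox, geo_crs=4326)
```

Calling `surveys_in_bbox` needs network access and writes raw downloads to the cache directory, so it is best tried on a small bounding box first.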

pygeohydro.pygeohydro.get_camels()#

Get streamflow and basin attributes of all 671 stations in the CAMELS dataset.

Notes

For more info on CAMELS visit: https://ral.ucar.edu/solutions/products/camels

Returns:

tuple of geopandas.GeoDataFrame and xarray.Dataset – The first is basin attributes as a geopandas.GeoDataFrame and the second is streamflow data and basin attributes as an xarray.Dataset.

Return type:

tuple[geopandas.GeoDataFrame, xarray.Dataset]

pygeohydro.pygeohydro.soil_gnatsgo(layers, geometry, crs=4326)#

Get US soil data from the gNATSGO dataset.

Notes

This function uses Microsoft’s Planetary Computer service to get the data. The dataset’s description and its supported soil properties can be found at: https://planetarycomputer.microsoft.com/dataset/gnatsgo-rasters

Parameters:
  • layers (list of str or str) – Target layer(s). Available layers are listed on the dataset’s website (see the link in Notes).

  • geometry (Polygon, MultiPolygon, or tuple of length 4) – Geometry or bounding box of the region of interest.

  • crs (int, str, or pyproj.CRS, optional) – The input geometry CRS, defaults to epsg:4326.

Returns:

xarray.Dataset – Requested soil properties.

Return type:

xarray.Dataset

pygeohydro.pygeohydro.soil_properties(properties='*', soil_dir='cache')#

Get soil properties dataset in the United States from ScienceBase.

Notes

This function downloads the source zip files from ScienceBase, extracts the included .tif files, and returns them as an xarray.Dataset.

Parameters:
  • properties (list of str or str, optional) – Soil properties to extract, defaults to “*”, i.e., all the properties. Available properties are awc for available water capacity, fc for field capacity, and por for porosity.

  • soil_dir (str or pathlib.Path, optional) – Directory in which to store the zip files; if they already exist there, they are read instead of downloaded. Defaults to ./cache.
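The `properties` argument can be checked before any download starts. The sketch below expands “*” and validates names against the three properties documented above. `resolve_properties` and `load_soil_properties` are hypothetical helpers, not library functions.

```python
# Property codes and their meanings, as documented on this page.
SOIL_PROPERTIES = {
    "awc": "available water capacity",
    "fc": "field capacity",
    "por": "porosity",
}


def resolve_properties(properties="*"):
    """Expand "*" and validate names against the documented properties."""
    if properties == "*":
        return list(SOIL_PROPERTIES)
    # Accept a single name or an iterable of names.
    names = [properties] if isinstance(properties, str) else list(properties)
    unknown = [name for name in names if name not in SOIL_PROPERTIES]
    if unknown:
        raise ValueError(f"Unknown soil properties: {unknown}")
    return names


def load_soil_properties(properties="*", soil_dir="cache"):
    """Download (or read cached) soil properties (hypothetical helper)."""
    # Imported lazily so the validation above works without pygeohydro installed.
    from pygeohydro.pygeohydro import soil_properties

    return soil_properties(resolve_properties(properties), soil_dir)
```

For example, `load_soil_properties("por")` would request only porosity and reuse any zip files already present in `./cache`.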

pygeohydro.pygeohydro.soil_soilgrids(layers, geometry, geo_crs=4326)#

Get soil data from SoilGrids for the area of interest.

Notes

For more information on the SoilGrids dataset, visit ISRIC.

Parameters:
  • layers (list of str) – SoilGrids layers to get. Available options are: bdod_*, cec_*, cfvo_*, clay_*, nitrogen_*, ocd_*, ocs_*, phh2o_*, sand_*, silt_*, and soc_* where * is the depth in cm and can be one of 5, 15, 30, 60, 100, or 200. For example, bdod_5 is the mean bulk density of the fine earth fraction at 0-5 cm depth, and bdod_200 is the mean bulk density of the fine earth fraction at 100-200 cm depth.

  • geometry (Polygon, MultiPolygon, or tuple of length 4) – Geometry of the area of interest. It can be a polygon or a bounding box of the form (xmin, ymin, xmax, ymax).

  • geo_crs (int, str, or pyproj.CRS, optional) – CRS of the input geometry, defaults to epsg:4326.

Returns:

xarray.Dataset – The requested SoilGrids layers for the area of interest.

Return type:

xarray.Dataset
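The layer naming scheme described above (property prefix plus bottom depth in cm) can be generated programmatically. The sketch below builds layer names and maps each depth code to its interval. It assumes each interval starts where the previous one ends, which agrees with the two examples given (bdod_5 for 0-5 cm and bdod_200 for 100-200 cm). The function names are hypothetical.

```python
# Property prefixes and depth codes listed in the parameter description.
SOILGRIDS_PROPERTIES = (
    "bdod", "cec", "cfvo", "clay", "nitrogen", "ocd",
    "ocs", "phh2o", "sand", "silt", "soc",
)
SOILGRIDS_DEPTHS = (5, 15, 30, 60, 100, 200)


def depth_interval(bottom_cm):
    """Return (top, bottom) in cm for a depth code, e.g. 200 -> (100, 200)."""
    # tuple.index raises ValueError for unsupported depth codes.
    idx = SOILGRIDS_DEPTHS.index(bottom_cm)
    top = 0 if idx == 0 else SOILGRIDS_DEPTHS[idx - 1]
    return top, bottom_cm


def layer_names(properties, depths=SOILGRIDS_DEPTHS):
    """Build names such as "bdod_5" for every property/depth combination."""
    names = []
    for prop in properties:
        if prop not in SOILGRIDS_PROPERTIES:
            raise ValueError(f"Unknown SoilGrids property: {prop!r}")
        for depth in depths:
            depth_interval(depth)  # validates the depth code
            names.append(f"{prop}_{depth}")
    return names
```

The resulting list can be passed as the `layers` argument, for instance `layer_names(["clay", "sand"], [5, 15])`.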

pygeohydro.pygeohydro.ssebopeta_bycoords(coords, dates, crs=4326)#

Get daily actual ET (in mm/day) from the SSEBop database for a dataframe of coordinates.

Parameters:
  • coords (pandas.DataFrame) – A dataframe with id, x, y columns.

  • dates (tuple or list) – Start and end dates as a tuple (start, end) or a list of years [2001, 2010, …].

  • crs (str, int, or pyproj.CRS, optional) – The CRS of the input coordinates, defaults to epsg:4326.

Returns:

xarray.Dataset – Daily actual ET in mm/day as a dataset with time and location_id dimensions. The location_id dimension is the same as the id column in the input dataframe.

Return type:

xarray.Dataset
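The input dataframe needs exactly the id, x, and y columns described above. A minimal sketch of preparing it from a mapping of station IDs to coordinates follows. It assumes x is longitude and y is latitude under the default EPSG:4326. `coords_columns` and `actual_et_at_points` are hypothetical helpers.

```python
def coords_columns(stations):
    """Turn {id: (x, y)} into the id/x/y columns expected by the function."""
    return {
        "id": list(stations),
        "x": [x for x, _ in stations.values()],
        "y": [y for _, y in stations.values()],
    }


def actual_et_at_points(stations, dates):
    """Fetch daily actual ET for the given points (hypothetical helper)."""
    # Third-party imports are deferred so coords_columns stays dependency-free.
    import pandas as pd
    from pygeohydro.pygeohydro import ssebopeta_bycoords

    coords = pd.DataFrame(coords_columns(stations))
    return ssebopeta_bycoords(coords, dates, crs=4326)
```

The returned dataset would then carry the dictionary keys along its location_id dimension, as described under Returns.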

pygeohydro.pygeohydro.ssebopeta_bygeom(geometry, dates, geo_crs=4326)#

Get daily actual ET for a region from the SSEBop database.

Notes

No web service is currently available for subsetting SSEBop, so the data for the requested period is first downloaded in full and then masked locally by the region of interest. This function is therefore slower than the others, and download speed is the likely bottleneck.

Parameters:
  • geometry (shapely.Polygon or tuple) – The geometry used for clipping the downloaded data. For a tuple bbox, the order should be (west, south, east, north).

  • dates (tuple or list) – Start and end dates as a tuple (start, end) or a list of years [2001, 2010, …].

  • geo_crs (str, int, or pyproj.CRS, optional) – The CRS of the input geometry, defaults to epsg:4326.

Returns:

xarray.DataArray – Daily actual ET within a geometry in mm/day at 1 km resolution.

Return type:

xarray.DataArray
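Because whole rasters are downloaded before clipping, a bounding box in the wrong order is only discovered after the download. The sketch below checks the (west, south, east, north) order up front. `check_bbox` and `actual_et_for_bbox` are hypothetical helpers, and the check assumes the box does not cross the antimeridian.

```python
def check_bbox(bbox):
    """Validate a (west, south, east, north) bounding box and return it."""
    west, south, east, north = bbox
    if not (west < east and south < north):
        raise ValueError(
            f"Expected (west, south, east, north) with west < east and "
            f"south < north, got {tuple(bbox)}"
        )
    return (west, south, east, north)


def actual_et_for_bbox(bbox, dates):
    """Fetch daily actual ET clipped to ``bbox`` (hypothetical helper)."""
    # Imported lazily so check_bbox works without pygeohydro installed.
    from pygeohydro.pygeohydro import ssebopeta_bygeom

    return ssebopeta_bygeom(check_bbox(bbox), dates, geo_crs=4326)
```

A call such as `actual_et_for_bbox((-69.77, 45.07, -69.31, 45.45), [2010])` would request a single year. Keeping the period short limits the download that dominates the run time.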