Getting Started

Why HyRiver?

Some major capabilities of HyRiver are as follows:

  • Easy access to many web services for subsetting data on the server side and returning the results as masked Datasets or GeoDataFrames.

  • Splitting large requests into smaller chunks under the hood, since web services often limit the number of features per request. As a result, the only bottleneck for subsetting the data is your local machine's memory.

  • Navigating and subsetting the NHDPlus database (both medium- and high-resolution) using web services.

  • Cleaning up the vector NHDPlus data, fixing some common issues, and computing vector-based accumulation through a river network.

  • A URL inventory for some popular (and tested) web services.

  • Some utilities for manipulating and visualizing the obtained data.
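
The request-splitting idea mentioned in the list above can be illustrated with a minimal sketch. This is not HyRiver's internal implementation: the helper `chunked`, the feature IDs, and the limit of 3 features per request are all hypothetical, chosen only to show how a long list of IDs might be divided into batches that each stay under a service's per-request limit.

```python
from collections.abc import Iterator, Sequence


def chunked(items: Sequence[str], max_per_request: int) -> Iterator[Sequence[str]]:
    """Yield consecutive slices of ``items`` with at most ``max_per_request`` elements."""
    if max_per_request < 1:
        raise ValueError("max_per_request must be a positive integer")
    for start in range(0, len(items), max_per_request):
        yield items[start : start + max_per_request]


# Hypothetical feature IDs and a hypothetical service limit of 3 per request.
feature_ids = [f"id-{i}" for i in range(8)]
batches = list(chunked(feature_ids, max_per_request=3))
print([len(batch) for batch in batches])  # batch sizes: [3, 3, 2]
```

Each batch could then be sent as its own request and the partial results concatenated locally, which is why memory on the local machine becomes the limiting factor rather than the service's feature limit.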

Software Stack

Installation

You can install all the packages using pip:

$ pip install py3dep pynhd pygeohydro pydaymet pygridmet pynldas2 hydrosignatures pygeoogc pygeoutils async-retriever

Note that installation with pip fails if libgdal is not installed on your system, so you should install it manually beforehand. For example, on Ubuntu-based distros the required package is libgdal-dev. If it is installed, you should be able to run gdal-config --version successfully.
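
As a convenience, the gdal-config check described above can also be run from Python before attempting the pip installation. The following is a minimal sketch using only the standard library; the function name `gdal_config_version` is hypothetical, and the check only reports whether the gdal-config executable is on the PATH and runs successfully.

```python
import shutil
import subprocess
from typing import Optional


def gdal_config_version() -> Optional[str]:
    """Return the output of ``gdal-config --version``, or None if it is unavailable."""
    exe = shutil.which("gdal-config")
    if exe is None:
        return None
    try:
        result = subprocess.run(
            [exe, "--version"], capture_output=True, text=True, check=True
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout.strip()


version = gdal_config_version()
if version is None:
    print("gdal-config not found; install libgdal (e.g. libgdal-dev on Ubuntu) first")
else:
    print(f"Found GDAL {version}")
```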

Alternatively, you can install them using conda:

$ conda install -c conda-forge py3dep pynhd pygeohydro pydaymet pygridmet pynldas2 hydrosignatures pygeoogc pygeoutils async-retriever

or mambaforge (recommended):

$ mamba install py3dep pynhd pygeohydro pydaymet pygridmet pynldas2 hydrosignatures pygeoogc pygeoutils async-retriever

Additionally, you can use mambaforge and the provided environment.yml file to create a new environment named hyriver, with all the packages and optional dependencies installed:

$ mamba env create -f ./environment.yml
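
Whichever installation route you choose, you may want to confirm which packages actually ended up in the active environment. The sketch below uses the standard-library `importlib.metadata` module; the package names are taken from the install commands above, while the helper `installed_versions` is a hypothetical name introduced for illustration.

```python
from importlib.metadata import PackageNotFoundError, version

# Distribution names taken from the install commands above.
PACKAGES = [
    "py3dep",
    "pynhd",
    "pygeohydro",
    "pydaymet",
    "pygridmet",
    "pynldas2",
    "hydrosignatures",
    "pygeoogc",
    "pygeoutils",
    "async-retriever",
]


def installed_versions(names):
    """Map each distribution name to its installed version, or None if missing."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


report = installed_versions(PACKAGES)
for name, ver in report.items():
    print(f"{name:<16} {ver or 'NOT INSTALLED'}")
```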

Dependencies

async-retriever:

  • aiohttp[speedups]>=3.8.3

  • aiohttp-client-cache>=0.12.3

  • aiosqlite

  • cytoolz

  • nest-asyncio

  • ujson

pygeoogc:

  • async-retriever<0.19,>=0.18

  • cytoolz

  • defusedxml

  • joblib

  • multidict

  • owslib>=0.27.2

  • pyproj>=3.0.1

  • requests

  • requests-cache>=0.9.6

  • shapely>=2

  • typing-extensions

  • ujson

  • url-normalize>=1.4

  • urllib3

  • yarl

pygeoutils:

  • cytoolz

  • geopandas>=1

  • netcdf4

  • numpy>=2

  • pyproj>=3.0.1

  • rasterio>=1.2

  • rioxarray>=0.11

  • scipy

  • shapely>=2

  • ujson

  • xarray>=2023.1

pynhd:

  • async-retriever<0.19,>=0.18

  • cytoolz

  • geopandas>=1

  • networkx

  • numpy>=2

  • pandas>=1

  • pyarrow>=1.0.1

  • pygeoogc<0.19,>=0.18

  • pygeoutils<0.19,>=0.18

  • shapely>=2

py3dep:

  • async-retriever<0.19,>=0.18

  • click>=0.7

  • cytoolz

  • geopandas>=1

  • numpy>=1.17

  • pygeoogc<0.19,>=0.18

  • pygeoutils<0.19,>=0.18

  • rasterio>=1.2

  • rioxarray>=0.11

  • scipy

  • shapely>=2

  • xarray>=2023.1

pygeohydro:

  • async-retriever<0.19,>=0.18

  • cytoolz

  • defusedxml

  • folium

  • geopandas>=1

  • h5netcdf

  • hydrosignatures<0.19,>=0.18

  • matplotlib>=3.5

  • numpy>=2

  • pandas>=1

  • pygeoogc<0.19,>=0.18

  • pygeoutils<0.19,>=0.18

  • pynhd<0.19,>=0.18

  • pyproj>=3.0.1

  • rioxarray>=0.11

  • scipy

  • shapely>=2

  • ujson

  • xarray>=2023.1

pydaymet:

  • async-retriever<0.19,>=0.18

  • click>=0.7

  • geopandas>=1

  • numpy>=2

  • pandas>=1

  • py3dep<0.19,>=0.18

  • pygeoogc<0.19,>=0.18

  • pygeoutils<0.19,>=0.18

  • pyproj>=3.0.1

  • scipy

  • shapely>=2

  • xarray>=2023.1

pygridmet:

  • async-retriever<0.19,>=0.18

  • click>=0.7

  • geopandas>=1

  • numpy>=2

  • pandas>=1

  • pygeoogc<0.19,>=0.18

  • pygeoutils<0.19,>=0.18

  • pyproj>=3.0.1

  • shapely>=2

  • xarray>=2023.1

pynldas2:

  • async-retriever<0.19,>=0.18

  • h5netcdf

  • numpy>=2

  • pandas>=1

  • pygeoutils<0.19,>=0.18

  • pyproj>=3.0.1

  • rioxarray>=0.11

  • xarray>=2023.1

hydrosignatures:

  • numpy>=2

  • pandas>=1

  • scipy

  • xarray>=2023.1

You can also install bottleneck and numba to improve the performance of some computations. Installing pyogrio is highly recommended, since it speeds up working with vector data. For NHDPlus, py7zr and pyogrio are required. For retrieving soil data, you should install planetary-computer and pystac-client.
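
To see which of these optional packages are missing from the current environment, a short standard-library check is sufficient. In the sketch below, the mapping from package names to import names reflects how these packages are normally imported (for example, planetary-computer is imported as `planetary_computer`); the helper `missing_optional` is a hypothetical name, not part of HyRiver.

```python
from importlib.util import find_spec

# Package name (as installed) -> top-level module name (as imported).
OPTIONAL_MODULES = {
    "bottleneck": "bottleneck",
    "numba": "numba",
    "pyogrio": "pyogrio",
    "py7zr": "py7zr",
    "planetary-computer": "planetary_computer",
    "pystac-client": "pystac_client",
}


def missing_optional(modules):
    """Return the package names whose import module cannot be found."""
    return [pkg for pkg, mod in modules.items() if find_spec(mod) is None]


missing = missing_optional(OPTIONAL_MODULES)
print("Missing optional packages:", ", ".join(missing) or "none")
```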