Update the link to version 2.0 of the ENHD dataset in
prepare_nhdplus now supports NHDPlus HR in addition to NHDPlus MR. It automatically detects the NHDPlus version based on the ID column name:
nhdplusid for HR and
Convert relative imports to absolute with
Improve performance of
pandas.merge instead of applying a function to each row of the dataframe.
Add support for EPA’s new StreamCat RESTful API, which provides around 600 NHDPlus catchment-level metrics. One class, called StreamCat, is added for getting the service properties such as valid metrics. You can use the
streamcat function to get the metrics as a
show_versions function to improve performance and print the output in a nicer table-like format.
Skip version 0.13.9 so that the minor version of all HyRiver packages becomes the same.
Modify the codebase based on the latest changes in geopandas related to empty dataframes.
Add a new function, called nhdplus_attrs_s3, for accessing the recently released NHDPlus derived attributes on a USGS S3 bucket. The attributes are provided in parquet files, so getting them is faster than with nhdplus_attrs. Also, you can request multiple attributes at once, whereas with nhdplus_attrs you had to request each attribute one at a time. This function will replace nhdplus_attrs in a future release, as soon as all data that are available on the ScienceBase version are also accessible from the S3 bucket.
Add two new functions called
enhd_flowlines_nx. These functions generate a networkx directed graph object of NHD HUC12 water boundaries and flowlines, respectively. They also return a dictionary mapping of COMID and HUC12 to the corresponding networkx node. Additionally, a topologically sorted list of COMIDs/HUC12s is returned. The generated data are useful for doing US-scale network analysis and flow accumulation on the NHD network. The NHD graph has about 2.7 million edges and the mainstem HUC12 graph has about 80K edges.
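
To illustrate why a topologically sorted list of IDs is useful for flow accumulation, here is a minimal sketch on a toy network. It uses Python's standard-library graphlib rather than networkx, and all IDs and runoff values are made up for illustration; it is not how these functions are implemented.

```python
from graphlib import TopologicalSorter

# Toy network (hypothetical IDs): each key drains into its value.
# 1 -> 3, 2 -> 3, 3 -> 4, and 4 is the outlet (no downstream segment).
downstream = {1: 3, 2: 3, 3: 4, 4: None}
# Local runoff contributed by each segment (made-up numbers).
runoff = {1: 1.0, 2: 2.0, 3: 0.5, 4: 0.25}

# TopologicalSorter expects node -> predecessors, i.e. segment -> upstream segments.
upstream = {node: set() for node in downstream}
for node, to_node in downstream.items():
    if to_node is not None:
        upstream[to_node].add(node)

# Headwaters come first, the outlet comes last.
order = list(TopologicalSorter(upstream).static_order())

# Visit segments from upstream to downstream and pass the total on.
accumulated = dict(runoff)
for node in order:
    to_node = downstream[node]
    if to_node is not None:
        accumulated[to_node] += accumulated[node]

print(order[-1], accumulated)
```

Because every segment is visited only after all of its upstream segments, a single pass over the sorted list is enough to accumulate flow for the whole network.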
Add a new function for getting the entire NHDPlus dataset for CONUS (Lower 48), called nhdplus_l48. The entire NHDPlus dataset is downloaded from here. This 7.3 GB file will take a while to download, depending on your internet connection. The first time you run this function, the file will be downloaded and stored in the ./cache directory. Subsequent calls will use the cached file. Moreover, there are two additional dependencies for using this function: pyogrio and py7zr. These dependencies can be installed using pip install pyogrio py7zr or conda install -c conda-forge pyogrio py7zr.
vector_accumulation for significant performance improvements.
Modify the codebase based on Refurb suggestions.
Add a new function called epa_nhd_catchments to access one of the EPA’s HMS endpoints called WSCatchment. You can use this function to access 414 catchment-scale characteristics for all the NHDPlus catchments, including a 16-day average curve number. More information on the curve number dataset can be found at its project page here.
Fix a bug in NHDTools where, due to recent changes in pandas exception handling, NHDTools fails to convert columns with NaN values to integer type. Now,
astype method on a column.
Use the pyupgrade package to update the type hinting annotations to Python 3.10 style.
Add the missing PyPI classifiers for the supported Python versions.
Append “Error” to all exception classes to conform to PEP 8 naming conventions.
Bump the minimum versions of
pygeoutils to 0.13.5 and that of
Fix an issue in
enhd_attrs functions where, if the cache folder does not exist, it would not be created, resulting in an error.
Use the new async_retriever.stream_write function to download files in
enhd_attrs functions. This is more memory efficient.
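
The two entries above concern creating a missing cache folder and writing downloads in a streaming fashion. The general idea can be sketched with the standard library only. The helper name stream_to_file is hypothetical, and an in-memory buffer stands in for a network response; this is not the async_retriever implementation.

```python
import io
import tempfile
from pathlib import Path


def stream_to_file(source, path, chunk_size=1024 * 1024):
    """Copy a binary stream to `path` chunk by chunk and return the bytes written."""
    path = Path(path)
    # Create the target folder (for example ./cache) if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    written = 0
    with path.open("wb") as f:
        # Only one chunk is held in memory at any time.
        while chunk := source.read(chunk_size):
            written += f.write(chunk)
    return written


payload = b"x" * 2500
with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp, "cache", "example.bin")
    n_bytes = stream_to_file(io.BytesIO(payload), target, chunk_size=1024)
    round_trip = target.read_bytes()

print(n_bytes)
```

Peak memory use is bounded by chunk_size instead of the file size, which matters for large downloads.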
Convert the type of the list of not found items in NLDI.feature_byloc from a list of strings to a list of coordinate tuples. This makes the type of the returned not found coordinates match that of the inputs.
Fix an issue with NLDI that was caused by the recent changes in the NLDI web service’s error handling. The NLDI web service now returns more descriptive error messages in a json format instead of returning the usual status errors.
Slice the ENHD dataframe in NHDTools.clean_flowlines before updating the flowline dataframe to reduce the required memory for the
Set the minimum supported version of Python to 3.8 since many of the dependencies such as rioxarray have dropped support for Python 3.7.
Add support for all the GeoConnex web service endpoints. There are two ways to use it. For a single query, you can use the geoconnex function and for multiple queries, it’s more efficient to use the
Add support for passing any of the supported NLDI feature sources to the get_basins method of the NLDI class. The default is nwissite to retain backward compatibility.
Set the type of “ReachCode” column to
Add two new functions called
network_resample for resampling a flowline or network of flowlines based on a given spacing. This is useful for smoothing jagged flowlines similar to those in the NHDPlus database.
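
As a rough illustration of what resampling a line at a given spacing means, the sketch below places points at fixed distances along a polyline using linear interpolation. The function name resample_polyline is hypothetical and the approach is a simplification; the library's own method may differ (for example, by fitting a smooth curve).

```python
import math


def resample_polyline(coords, spacing):
    """Return points every `spacing` units along a polyline, plus its end point."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    points = [coords[0]]
    next_dist = spacing  # distance along the line of the next point to emit
    travelled = 0.0  # distance along the line at the start of the current segment
    for (x0, y0), (x1, y1) in zip(coords[:-1], coords[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        while seg_len > 0 and next_dist <= travelled + seg_len:
            t = (next_dist - travelled) / seg_len
            points.append((x0 + t * (x1 - x0), y0 + t * (y1 - y0)))
            next_dist += spacing
        travelled += seg_len
    # Keep the original end point so the line does not get shorter.
    if points[-1] != coords[-1]:
        points.append(coords[-1])
    return points


line = [(0.0, 0.0), (3.0, 0.0), (3.0, 4.0)]
resampled = resample_polyline(line, 2.0)
print(resampled)
```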
Add support for the new NLDI endpoint called “hydrolocation”. The NLDI class now has two methods for getting features by coordinates: feature_byloc and comid_byloc. The feature_byloc method returns the flowline that is associated with the closest NHDPlus feature to the given coordinates. The comid_byloc method returns a point on the closest downstream flowline to the given coordinates.
Add a new function called pygeoapi for calling the API in batch mode. This function accepts the input coordinates as a geopandas.GeoDataFrame. It is more performant than calling its counterpart PyGeoAPI multiple times. It’s recommended to switch to this new batch function instead of the PyGeoAPI class. Users just need to prepare an input data frame that has all the required service parameters as columns.
Add a new step to
Add support for the simplified flag of NLDI’s get_basins function. The default value is True to retain the old behavior.
Remove caching-related arguments from all functions since they can now be set globally via three environment variables:
HYRIVER_CACHE_NAME: Path to the caching SQLite database.
HYRIVER_CACHE_EXPIRE: Expiration time for cached requests in seconds.
HYRIVER_CACHE_DISABLE: Disable reading/writing from/to the cache file.
You can do this like so:
import os

os.environ["HYRIVER_CACHE_NAME"] = "path/to/file.sqlite"
os.environ["HYRIVER_CACHE_EXPIRE"] = "3600"
os.environ["HYRIVER_CACHE_DISABLE"] = "true"
Add a new class called NHD for accessing the latest National Hydrography Dataset. More info regarding this data can be found here.
Add two new functions for getting cross-sections along a single flowline via flowline_xsection or throughout a network of flowlines via network_xsection. You can specify spacing and width parameters to control their location. For more information and examples please consult the documentation.
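
To make the roles of the spacing and width parameters concrete, the following sketch computes perpendicular cross-section lines along a polyline with plain Python. The function name cross_sections and its signature are hypothetical and only illustrate the geometry; they are not the library's API.

```python
import math


def cross_sections(coords, spacing, width):
    """Return (start, end) point pairs of lines perpendicular to a polyline."""
    half = width / 2.0
    sections = []
    next_dist = spacing  # distance along the line of the next cross-section
    travelled = 0.0  # distance along the line at the start of the current segment
    for (x0, y0), (x1, y1) in zip(coords[:-1], coords[1:]):
        seg_len = math.hypot(x1 - x0, y1 - y0)
        if seg_len == 0:
            continue
        ux, uy = (x1 - x0) / seg_len, (y1 - y0) / seg_len  # unit tangent
        nx, ny = -uy, ux  # unit normal: tangent rotated by 90 degrees
        while next_dist <= travelled + seg_len:
            d = next_dist - travelled
            cx, cy = x0 + d * ux, y0 + d * uy  # center of the cross-section
            sections.append(
                ((cx - half * nx, cy - half * ny), (cx + half * nx, cy + half * ny))
            )
            next_dist += spacing
        travelled += seg_len
    return sections


sections = cross_sections([(0.0, 0.0), (10.0, 0.0)], spacing=5.0, width=2.0)
print(sections)
```

Here spacing controls how far apart the cross-sections are along the line and width controls their total length across it.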
Add a new property to
service_info to include some useful info about the service including feature_types which can be handy for converting numeric values of types to their string equivalent.
Use the new PyGeoAPI API.
prepare_nhdplus for improving the performance and robustness of determining tocomid within a network of NHD flowlines.
Add empty geometries that NLDI.getbasins returns to the list of not found IDs. This is because the NLDI service does not include non-network flowlines and instead returns an empty geometry for these flowlines. (GH#48)
Use the three new ar.retrieve_* functions instead of the old ar.retrieve function to improve type hinting and to make the API more consistent.
Revert to the original PyGeoAPI base URL.
ScienceBase to make it applicable for working with other ScienceBase items. A new function has been added for staging the Additional NHDPlus attributes items called
AGRBase to remove unnecessary functions and make them more general.
PyGeoAPI class to conform to the new pygeoapi API. This web service is undergoing some changes at the time of this release and the API is not stable, so it might not work as expected. As soon as the web service is stable, a new version will be released.
WaterData.byid show a warning if there are any missing feature IDs that are requested but are not available in the dataset.
ZeroMatched exception if no features are found.
disable_caching arguments to all functions that use async_retriever. Set the default request caching expiration time to never expire. You can use disable_caching if you don’t want to use the cached responses. Please refer to documentation of the functions for more details.
prepare_nhdplus to reduce code complexity by grouping all the NHDPlus tools as a private class.
AGRBase to reflect the latest API changes in
prepare_nhdplus by creating a private class that includes all the previously used private functions. This will make the code more readable and easier to maintain.
Add all the missing types so
Add a new argument to
split_catchment that, if set to True, will split the basin geometry at the watershed outlet.
Catch service errors in PyGeoAPI and show useful error messages.
Use importlib-metadata for getting the version instead of pkg_resources to decrease import time, as discussed in this issue.
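
A minimal sketch of this kind of version lookup is shown below. It uses the standard-library importlib.metadata module (available since Python 3.8), which provides the same interface as the importlib-metadata backport. The helper name get_version and the placeholder value are illustrative assumptions, not the package's actual code.

```python
from importlib.metadata import PackageNotFoundError, version


def get_version(dist_name):
    """Return the installed version of a distribution, or a placeholder if missing."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        # The distribution is not installed (e.g. running from a source checkout).
        return "unknown"


missing = get_version("a-distribution-that-is-not-installed")
print(missing)
```

Unlike pkg_resources, this lookup does not scan every installed distribution at import time, which is where the import-time saving comes from.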
More robust handling of inputs and outputs of
Use an alternative download link for NHDPlus VAA file on Hydroshare.
Restructure the codebase to reduce the complexity of the pynhd.py file by dividing it into three files: pynhd, which contains all classes that provide access to the supported web services, core, which includes base classes, and nhdplus_derived, which has functions for getting databases that provide additional attributes for the NHDPlus database.
Add support for PyGeoAPI. It offers four functionalities:
Add a function for getting all NHD FCodes as a data frame, called
prepare_nhdplus function by removing all coastlines and better detection of the terminal point in a network.
Migrate to using AsyncRetriever for handling communications with web services.
NLDI and raise a ServiceError instead, so the user knows that data cannot be returned due to the out-of-service status of the server, not
nhdplus_vaa to access NHDPlus Value Added Attributes for all its flowlines.
To see a list of available layers in NHDPlus HR, you can instantiate its class without passing any argument like so
Drop support for Python 3.6 since many of the dependencies such as pandas have done so.
Use persistent caching for all requests which can help speed up network responses significantly.
Improve documentation and testing.
Add an announcement regarding the new name for the software stack, HyRiver.
pip installation and release workflow.
The first release after renaming hydrodata to PyGeoHydro.
Make mypy checks more strict, fix all the errors, and prevent possible bugs.
Speed up CI testing by using
Bump version to the same version as PyGeoHydro.
Add a new function for getting basin geometries for a list of USGS station IDs. The function is a method of
get_basins. So, now the NLDI.getfeature_byid function does not have a basin flag. This change makes getting geometries easier and faster.
NLDI and make a standalone function called nhdplus_attrs for accessing NHDPlus attributes directly from ScienceBase.
Add support for using the hydro or edits web services for getting NHDPlus High-Resolution using the NHDPlusHR function. The new arguments are
autos_switch flag for automatically switching to the other service if the ones passed by
Add a new argument to
edge_attr that allows adding attribute(s) to the returned Networkx Graph. By default, it is
A new base class, AGRBase, for connecting to ArcGISRESTful-based services such as National Map and EPA’s WaterGEOS.
Add support for setting the buffer distance for the input geometries to
NLDI class for getting ComIDs of the closest flowlines from a list of lon/lat coordinates.
WaterData for getting features within a given radius of a point.
NLDI function to use API v3 of the NLDI service.
WaterData now is the target CRS of the output dataframe. The service CRS is now EPSG:4269 for all the layers.
NLDI since it’s not applicable anymore.
Added support for NHDPlus High Resolution for getting features by geometry, IDs, or SQL where clause.
The following functions are added to
getcharacteristic_byid: Getting characteristics of NHDPlus catchments.
navigate_byloc: Getting the nearest ComID to a coordinate and performing navigation.
characteristics_dataframe: Getting all the available catchment-scale characteristics as a data frame.
get_validchars: Getting a list of available characteristic IDs for a specified characteristic type.
The following functions are added to
byfilter: Getting data based on any valid CQL filter.
bygeom: Getting data within a geometry (polygon and multipolygon).
Add support for Python 3.9 and tests for Windows.
WaterData to fix the CRS inconsistencies (#1).
Use orjson to speed up JSON operations.
Add a show_versions function for showing versions of the installed deps.
WaterData to improve readability.
First release on PyPI.