This page was generated from dam_impact.ipynb. Interactive online version:

Working with Geospatial Hydrologic Data Using Web Services#

Dam Impact on streamflow: Python

Author	Affiliation
Taher Chegini	University of Houston
Mike Johnson	Lynker
Marc Weber	US Environmental Protection Agency

Introduction#

This is the second part of the “Working with Geospatial Hydrologic Data Using Web Services” workshop. In the first part, we covered some basic concepts related to geospatial analysis. Although R is used in the first part, the introduced concepts are applicable to Python as well. So, in this tutorial, we won’t repeat the same concepts, instead we focus on equivalent tools in Python and provide an in-depth hands-on example geospatial analysis workflow.

Geospatial Tools in Python#

This is non-exhaustive list of geospatial tools that we directly/indirectly use in this tutorial:

PyNHD: Navigate and subset NHDPlus (MR and HR) dataset using web services.
Py3DEP: Access topographic data through National Map’s 3DEP web service.
PyGeoHydro: Access NWIS, NID, WQP, HCDN 2009, NLCD, CAMELS, and SSEBop databases.
PyDaymet: Access Daymet for daily climate data both single pixel and gridded.
xarray: An open-source project and Python package that makes working with labeled multi-dimensional arrays simple, efficient, and fun!
rioxarray: Rasterio xarray extension.
GeoPandas: An open-source project to make working with geospatial data in python easier.
Folium: Makes it easy to visualize data that’s been manipulated in Python on an interactive leaflet map. It enables both the binding of data to a map for choropleth visualizations and passing rich vector/raster/HTML visualizations as markers on the map.

You can find more geospatial tools at Awesome Geospatial repository.

The primary goal of this hands-on tutorial is to introduce a handful of geospatial web services for conducting scientific studies. For this purpose, we’re going to take a look at the impact of building dams on their downstream streamflow. Here are some example peer-reviewed papers on this topic.

We set the area of interest (AOI) to Texas and study dams that have been built in the 1995-2005 period.

Workflow:

Get gauge stations within Texas that have enough streamflow data
Get dams within Texas that have been built have been built between 1995-2005
Filter out stations that don’t have any dams in their upstream up to 10 km
Get streamflow data for the remaining stations
Visually inspect the streamflow data for dam impacts
Check land use/land cover type around the dams

Streamflow Gauges#

First, we use the National Water Information System (NWIS) service to check streamflow data availability in our AOI and 10 years before and after the period of our study, i.e., the 1985-2015 period.

Let’s start by importing the required libraries.

[1]:

from pathlib import Path

import geopandas as gpd
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

import pygeohydro as gh
import pygeoutils as geoutils
from pygeohydro import NID, NWIS
from pynhd import NLDI, WaterData, ZeroMatchedError

[2]:

start = "1985-01-01"
end = "2015-01-01"
texas = gh.helpers.get_us_states("TX")

[3]:

_, ax = plt.subplots(dpi=100, figsize=(6, 6))
texas.plot(ax=ax, facecolor="b", edgecolor="r", linewidth=2, alpha=0.8, figsize=(10, 10))
ax.set_axis_off()

../../_images/examples_notebooks_dam_impact_8_0.png

[4]:

nwis = NWIS()
query = {
    "stateCd": "TX",
    "startDt": start,
    "endDt": end,
    "outputDataTypeCd": "dv",  # daily values
    "hasDataTypeCd": "dv",  # daily values
    "parameterCd": "00060",  # discharge
}
sites = nwis.get_info(query, nhd_info=True)
sites.shape[0]

/Users/tchegini/.local/apps/mambaforge/envs/hyriver-dev/lib/python3.11/site-packages/pynhd/pynhd.py:985: UserWarning: 95 of 3717 requests failed.. Indices of the failed requests are [735, 736, 737, 738, 739, 740, 741, 742, 743, 969, 970, 971, 972, 973, 974, 987, 988, 989, 990, 991, 992, 1570, 1571, 1572, 1573, 1574, 1575, 2395, 2396, 2397, 2398, 2399, 2400, 2401, 2402, 2403, 2404, 2405, 2406, 2407, 2408, 2409, 2410, 2411, 2412, 2413, 2414, 2415, 2416, 2417, 2418, 2419, 2420, 2421, 2422, 2423, 2796, 2797, 2798, 2799, 2806, 2856, 2857, 2858, 2859, 2860, 2861, 2862, 2863, 2864, 2970, 2971, 2972, 2973, 2974, 2975, 2986, 2987, 2988, 2989, 2990, 2991, 3639, 3640, 3641, 3642, 3643, 3644, 3686, 3694, 3695, 3696, 3697, 3698, 3699]
  return self._get_urls(urls, False)

[4]:

We can see that there are 30604 streamflow gauge stations in Texas that fit our criteria.

[5]:

_, ax = plt.subplots(dpi=100, figsize=(6, 6))
texas.plot(ax=ax, facecolor="b", edgecolor="r", linewidth=2, alpha=0.8, figsize=(10, 10))
sites.plot(ax=ax, marker="*", markersize=90, color="w", edgecolor="k", label="NWIS sites")
ax.legend(loc="upper left")
ax.set_axis_off()

../../_images/examples_notebooks_dam_impact_11_0.png

Now, let’s filter these stations a step further to include only those stations that have around 30 years of daily streamflow data with drainage area of larger than 10 km\(^2\) that have been impacted by human activities.

[6]:

sites = sites[
    (sites.parm_cd == "00060")  # discharge
    & (sites.stat_cd == "00003")  # mean
    & (sites.count_nu >= 30 * 365)  # at least 30 years of data
    & ~sites.hcdn_2009  # not pristine
    & (sites.nhd_areasqkm > 10)
].copy()
sites.shape[0]

[6]:

Upon applying this filter, we’re left with 280 stations.

[7]:

_, ax = plt.subplots(dpi=100, figsize=(6, 6))
texas.plot(ax=ax, facecolor="b", edgecolor="r", linewidth=2, alpha=0.8, figsize=(10, 10))
sites.plot(ax=ax, marker="*", markersize=90, color="w", edgecolor="k", label="NWIS sites")
ax.legend(loc="upper left")
ax.set_axis_off()

../../_images/examples_notebooks_dam_impact_15_0.png

Dams#

Next, we need to retrieve the dams in Texas that have been built between 1995 and 2005. We use National Inventory of Dams (NID) web service for this purpose. First, let’s check out the fields that NID exposes through their web service.

[8]:

nid = NID()
nid.valid_fields[nid.valid_fields.str.contains("year", case=False)]

[8]:

55                yearCompleted
56              yearCompletedId
57                yearsModified
178        hydroApprovedIdfYear
183    hydroOrigIdfApprovedYear
188        hydroPmfApprovedYear
194           hydroOtAepDevYear
200            hydroTopInfoYear
Name: name, dtype: object

[9]:

nid.valid_fields[nid.valid_fields.str.contains("state", case=False)]

[9]:

16                stateFedId
19                     state
21               countyState
27          stateRegulatedId
30     stateRegulatoryAgency
91                 cityState
250                 stateKey
Name: name, dtype: object

Based on these, we should use the stateKey and yearCompleted fields.

[10]:

dam_list = nid.get_byfilter(
    [
        {
            "stateKey": ["TX"],
            "yearCompleted": ["[1995 2005]"],
        },
    ],
)
dams = dam_list[0]

[11]:

_, ax = plt.subplots(dpi=100, figsize=(6, 6))
texas.plot(ax=ax, facecolor="b", edgecolor="r", linewidth=2, alpha=0.8, figsize=(10, 10))
sites.plot(ax=ax, marker="*", markersize=110, color="w", edgecolor="k", label="NWIS sites")
dams.plot(ax=ax, marker="o", markersize=20, color="g", edgecolor="k", label="NID dams")
ax.legend(loc="upper left")
ax.set_axis_off()

../../_images/examples_notebooks_dam_impact_21_0.png

As is evident from the plot above, there are many stations that don’t have any dams in their vicinity. One way to eliminate these stations is using a spatial query based on a search radius. We can determine an estimation for our search radius based on the upstream drainage area distribution of the streamflow gauges.

[12]:

ax = sites.hist("nhd_areasqkm", figsize=(4, 2.5), bins="auto")
_ = ax[0][0].set_title("Drainage Area (km$^2$) Histogram")

../../_images/examples_notebooks_dam_impact_23_0.png

We can see that most stations have a drainage area of less than 15000 km\(^2\). Since they’re not very large a search radius of 10 km should be sufficient. Now, we define a function that carries out an efficient spatial query to find the stations that have at least one dam within a 10-km radius.

[13]:

def distance_filter(gdf1, gdf2, dist_km):
    """Filter gdf1 to only include points within distance of gdf2."""
    buff = gdf1.to_crs(5070).buffer(dist_km * 1e3)
    idx2, idx1 = buff.sindex.query(gdf2.to_crs(5070).geometry)
    return gdf1.iloc[pd.unique(idx1)].reset_index(drop=True), gdf2.iloc[
        pd.unique(idx2)
    ].reset_index(drop=True)

[14]:

sites, dams = distance_filter(sites, dams, 10)

ax = sites.hist("nhd_areasqkm", figsize=(4, 2.5), bins="auto")
_ = ax[0][0].set_title("Drainage Area (km$^2$) Histogram")

../../_images/examples_notebooks_dam_impact_26_0.png

[15]:

_, ax = plt.subplots(dpi=100, figsize=(6, 6))
texas.plot(ax=ax, facecolor="b", edgecolor="r", linewidth=2, alpha=0.8, figsize=(10, 10))
sites.plot(ax=ax, marker="*", markersize=110, color="w", edgecolor="k", label="NWIS sites")
dams.plot(ax=ax, marker="o", markersize=20, color="g", edgecolor="k", label="NID dams")
ax.legend(loc="upper left")
ax.set_axis_off()

../../_images/examples_notebooks_dam_impact_27_0.png

Flowlines and Hydrolinking#

So far, we obtained only the stations that have at least one dam in their 10-km radius, but we didn’t find if those dams are in their upstream or downstream. We use the Hydro Network-Linked Data Index (NLDI) web service to obtain the upstream flowlines of the streamflow gauges up to 10 km. Note that this 10 km is the distance along the flowlines.

[16]:

nldi = NLDI()

[17]:

flw_up = {}
noflw = []
for agency, fid in sites[["agency_cd", "site_no"]].itertuples(index=False, name=None):
    try:
        flw_up[fid] = nldi.navigate_byid(
            fsource="nwissite",
            fid=f"{agency}-{fid}",
            navigation="upstreamTributaries",
            source="flowlines",
            distance=10,
        )
    except ZeroMatchedError:
        noflw.append(fid)
noflw

[17]:

[]

[18]:

sites = sites[sites.site_no.isin(flw_up)].copy()
sites, dams = distance_filter(sites, dams, 10)

We now have flowlines and dams that are closer than 10 km, but we still don’t have a way of linking these features. For this purpose, we use NHDPlus’s Common Identifier (comid). One way to obtain the first downstream flowline of coordinate is to use NLDI’s hydrolocation endpoint.

[19]:

coords = list(dams[["longitude", "latitude"]].astype(float).itertuples(index=False, name=None))
nid_comids = nldi.feature_byloc(coords)
dams["nhdplus_comid"] = nid_comids.comid.values

[21]:

coords = list(sites[["dec_long_va", "dec_lat_va"]].astype(float).itertuples(index=False, name=None))
site_comids = nldi.feature_byloc(coords)
sites["nhdplus_comid"] = site_comids.comid.values

We can then use these obtained comids to hydrolink the dams, stations, and flowlines.

[22]:

flw_all = pd.concat(flw_up.values(), keys=flw_up.keys()).reset_index().drop(columns="level_1")
flw_all = gpd.GeoDataFrame(
    flw_all.rename(columns={"level_0": "usgs_id"}), crs=flw_up["07312200"].crs
)
flw_all.head()

[22]:

	usgs_id	geometry	nhdplus_comid
0	08044000	LINESTRING (-97.69269 33.23604, -97.69307 33.2...	1305635
1	08044000	LINESTRING (-97.69154 33.23755, -97.69228 33.2...	1305593
2	08044000	LINESTRING (-97.71624 33.23529, -97.71599 33.2...	1305595
3	08044000	LINESTRING (-97.69438 33.24547, -97.69389 33.2...	1305591
4	08044000	LINESTRING (-97.69002 33.24999, -97.68994 33.2...	1305589

Finally, upon hydrolinking these features we can determine the stations that have at least one dam in their upstream up to 10 km.

[23]:

flw_dam = pd.merge(
    flw_all[["usgs_id", "nhdplus_comid"]],
    dams[["federalId", "nhdplus_comid"]].rename(columns={"federalId": "dam_id"}),
    on="nhdplus_comid",
    how="left",
)
flw_dam = flw_dam[~flw_dam.dam_id.isna()].reset_index(drop=True)
flw_dam

[23]:

	usgs_id	nhdplus_comid	dam_id
0	08158920	5781345	TX07494
1	08158920	5781345	TX07223
2	08095000	5531532	TX07074
3	08095000	5531532	TX07061
4	08178700	10839872	TX07211
5	08057445	1260363	TX09216
6	08177300	1638809	TX07200
7	07312200	13730353	TX07042

[24]:

site_dam = flw_dam.usgs_id.unique()
site_dam

[24]:

array(['08158920', '08095000', '08178700', '08057445', '08177300',
       '07312200'], dtype=object)

Streamflow Observations#

Upon finalizing the stations that satisfy our criteria, we use NWIS to obtain the daily streamflow data. But first, we need to obtain the year that the construction of dams were completed, so we can exactly specify the required streamflow data period.

[25]:

dams_info = nid.inventory_byid(flw_dam.dam_id.to_list())
dams_info.head()

[25]:

	associatedStructuresCount	conditionAssessDate	conditionAssessDetail	conditionAssessId	damHeight	damLength	distance	drainageArea	eapId	dsacAssignedDate	...	purposeIds	dsacId	femaRegion	lastEapExcerDate	dataUpdated	usaceDistrict	noConsequencesId	secretaryAgricultureBuiltId	nrcsWatershedAuthorizationId	geometry
id
544499	0	2014-06-18T12:00:00.000Z	NOT RATED	5	22.0	0	NaN	1.13	3	None	...	None	None	6	None	2023-09-18T12:00:00.000Z	Fort Worth District	0	0	5	POINT (-97.37830 28.77720)
543218	0	2014-06-18T12:00:00.000Z	FAIR	2	12.0	300	NaN	0.90	1	None	...	9	None	6	None	2023-09-18T12:00:00.000Z	Fort Worth District	0	0	5	POINT (-96.66611 32.73923)
543202	0	2022-11-01T12:00:00.000Z	FAIR	5	28.1	825	NaN	NaN	2	None	...	None	None	6	None	2023-09-18T12:00:00.000Z	Galveston District	0	0	5	POINT (-97.86973 30.24981)
542209	0	2021-04-30T12:00:00.000Z	SATISFACTORY	2	49.0	6615	0.0	16.72	1	None	...	9	None	6	None	2023-09-18T12:00:00.000Z	Galveston District	0	1	3	POINT (-98.45066 29.55025)
542221	0	2016-10-13T12:00:00.000Z	POOR	3	26.0	730	0.0	1.34	1	None	...	9	None	6	None	2023-09-18T12:00:00.000Z	Galveston District	0	0	5	POINT (-97.89934 30.24614)

5 rows × 96 columns

We use the yearCompleted field to determine the required study period by looking at 15 years before and after the year completed for each dam. The reason that we opted for 15 years is the fact that during the construction the streamflow is temporarily diverted from its original path and also upon completion it usually takes some time for dams to be filled and reach their operational capacity. So we consider a 5-year buffer around the year completed.

[26]:

flw_dam = pd.merge(
    flw_dam,
    dams_info[["federalId", "nidStorage", "nidHeight", "yearCompleted"]].rename(
        columns={"federalId": "dam_id"}
    ),
    on="dam_id",
    how="left",
)
flw_dam["start"] = pd.to_datetime(flw_dam.yearCompleted - 15, format="%Y")
flw_dam["end"] = pd.to_datetime(flw_dam.yearCompleted + 15, format="%Y") + pd.DateOffset(
    years=1, days=-1
)
flw_dam.head()

[26]:

	usgs_id	nhdplus_comid	dam_id	nidStorage	nidHeight	yearCompleted	start	end
0	08158920	5781345	TX07494	189.00	28.1	1996	1981-01-01	2011-12-31
1	08158920	5781345	TX07223	156.31	26.0	1999	1984-01-01	2014-12-31
2	08095000	5531532	TX07074	731.00	73.0	2000	1985-01-01	2015-12-31
3	08095000	5531532	TX07061	410.00	22.6	1998	1983-01-01	2013-12-31
4	08178700	10839872	TX07211	8741.00	53.0	2004	1989-01-01	2019-12-31

We then use these dates to obtain the streamflow.

[27]:

dates = (flw_dam.start.min(), flw_dam.end.max())
qobs = nwis.get_streamflow(site_dam, dates)

Now, let’s take a look at the data and assess the station by visual inspection. Note the there are several methods for quantifying the impact of dams on streamflow such as Reservoir Index, but in this tutorial we just rely on visual inspection for brevity.

[28]:

ma = qobs.groupby(pd.Grouper(freq="Y")).mean()
yc = flw_dam.set_index("usgs_id").yearCompleted
fig, axs = plt.subplots(
    nrows=ma.shape[1], figsize=(8, 10), sharex=True, constrained_layout=True, dpi=100
)

for i, s in enumerate(ma):
    (line1,) = axs[i].plot(ma[s], label="Mean annual flow (cms)")
    axs[i].set_ylim(0, ma[s].quantile(0.95) * 1.7)
    axs[i].set_title(s)
    yrs = yc.loc[s.replace("USGS-", "")]
    yrs = [yrs] if isinstance(yrs, np.int64) else yrs
    for yr in yrs:
        line2 = axs[i].axvline(
            pd.to_datetime(yr, format="%Y"),
            color="r",
            linestyle="--",
            linewidth=2,
            label="Dam Year Completed",
        )
_ = axs[0].legend(handles=[line1, line2], loc="best")

../../_images/examples_notebooks_dam_impact_47_0.png

We can see that based on the available data and visual inspection, the first station shows a noticeable difference before and after the dam construction. Next we take a closer look at this station, USGS-07312200.

Land Use/Land Cover (LULC)#

To gain a better understanding of the land features around and near the USGS-07312200 station and its dams, we use the LULC data. For this purpose, first, we obtain the local catchments up to 30 km upstream and downstream of the station. The WaterData web service has several layers one of which is local catchments. Let’s use the COMIDs to obtain their corresponding catchments.

[29]:

station_id = "07312200"
flw_ut = nldi.navigate_byid(
    fsource="nwissite",
    fid=f"USGS-{station_id}",
    navigation="upstreamTributaries",
    source="flowlines",
    distance=30,
)
flw_dd = nldi.navigate_byid(
    fsource="nwissite",
    fid=f"USGS-{station_id}",
    navigation="downstreamDiversions",
    source="flowlines",
    distance=30,
)
flowlines = gpd.GeoDataFrame(pd.concat([flw_ut, flw_dd]), crs=4326)
wd = WaterData("catchmentsp")
catchment = wd.byid("featureid", flowlines.nhdplus_comid.to_list())
flw_dd["nhdplus_comid"] = flw_dd.nhdplus_comid.astype(catchment.featureid.dtype)
catchment["direction"] = "upstream"
catchment.loc[catchment.featureid.isin(flw_dd.nhdplus_comid), "direction"] = "downstream"
catchment.head()

[29]:

	geometry	gridcode	featureid	sourcefc	areasqkm	shape_length	shape_area	direction
0	MULTIPOLYGON (((-98.95702 34.00904, -98.95757 ...	1347768	13728581	NHDFlowline	3.667051	0.082690	3.579443e-04	upstream
1	MULTIPOLYGON (((-98.95302 34.00727, -98.95292 ...	1347933	13728583	NHDFlowline	0.005346	0.003188	5.217419e-07	upstream
2	MULTIPOLYGON (((-98.95483 34.00319, -98.95537 ...	1348044	13728585	NHDFlowline	0.150286	0.018680	1.466643e-05	upstream
3	MULTIPOLYGON (((-98.94654 34.00018, -98.94817 ...	1347921	13728587	NHDFlowline	0.010543	0.004362	1.028886e-06	upstream
4	MULTIPOLYGON (((-98.94831 34.00390, -98.94784 ...	1347981	13728589	NHDFlowline	0.142392	0.027063	1.389548e-05	upstream

[30]:

_, ax = plt.subplots(dpi=100, figsize=(6, 6))
catchment.plot(ax=ax, column="direction", edgecolor="k", lw=0.5)
flowlines.plot(ax=ax, color="b", linewidth=1.5, zorder=1)
sites[sites.site_no.isin([station_id])].plot(
    ax=ax, edgecolor="k", c="r", marker="*", markersize=150, label="Station"
)
dams[dams.nhdplus_comid.isin(flowlines.nhdplus_comid)].plot(
    ax=ax, edgecolor="k", c="m", marker="^", markersize=50, label="Dam"
)
ax.legend(loc="upper left")
ax.set_axis_off()

../../_images/examples_notebooks_dam_impact_51_0.png

Now, we can retrieve the LULC data at 30-m resolution within the union of these catchments using the Multi-Resolution Land Characteristics (MRLC) Consortium web service.

[31]:

geom = gpd.GeoSeries([catchment.unary_union], crs=catchment.crs, index=[station_id])
nlcd = gh.nlcd_bygeom(geom, 30, years={"cover": 2019})[station_id]
nlcd

[32]:

meta = gh.helpers.nlcd_helper()
pd.Series(meta["classes"])

[32]:

11     Open Water - All areas of open water, generall...
12     Perennial Ice/Snow - All areas characterized b...
21     Developed, Open Space - Includes areas with a ...
22     Developed, Low Intensity -Includes areas with ...
23     Developed, Medium Intensity - Includes areas w...
24     Developed, High Intensity - Includes highly de...
31     Barren Land (Rock/Sand/Clay) - Barren areas of...
41     Deciduous Forest - Areas dominated by trees ge...
42     Evergreen Forest - Areas dominated by trees ge...
43     Mixed Forest - Areas dominated by trees genera...
45     Shrub-Forest - Areas identified as currently s...
46     Herbaceous-Forest - Areas identified as curren...
51     Dwarf Scrub - Alaska only areas dominated by s...
52     Shrub/Scrub - Areas dominated by shrubs; less ...
71     Grassland/Herbaceous - Areas dominated by gram...
72     Sedge/Herbaceous - Alaska only areas dominated...
73     Lichens - Alaska only areas dominated by fruti...
74     Moss - Alaska only areas dominated by mosses, ...
81     Pasture/Hay - Areas of grasses, legumes, or gr...
82     Cultivated Crops - Areas used for the producti...
90     Woody Wetlands - Areas where forest or shrub l...
95     Emergent Herbaceous Wetlands - Areas where per...
127                                     Background value
dtype: object

[33]:

cmap, norm, levels = gh.plot.cover_legends()
fig, ax = plt.subplots(figsize=(8, 4), dpi=100)
cover = nlcd.cover_2019.where(nlcd.cover_2019 != nlcd.cover_2019.rio.nodata)
cover.plot(ax=ax, cmap=cmap, levels=levels, cbar_kwargs={"ticks": levels[:-1]})
ax.set_xlabel("")
ax.set_ylabel("")
ax.set_title("Land Use/Land Cover 2019")
dams[dams.nhdplus_comid.isin(flw_up[station_id].nhdplus_comid)].plot(
    ax=ax, edgecolor="k", c="m", marker="^", markersize=50, label="Dam"
)
ax.legend(loc="upper left")
ax.set_axis_off()

fig.savefig(Path("_static", "dam_impact.png"), bbox_inches="tight", facecolor="w")

../../_images/examples_notebooks_dam_impact_55_0.png

Now, let’s take a look at the differences in land cover types, upstream and downstream of the station. We use cover_statistics function of PyGeoHydro to get the statistics for each land cover types. For this purpose, first we need to mask the NLCD dataset based on the direction attribute of the catchment geo-dataframe using xarray_geomask function from PyGeoUtils.

[34]:

ut_geom = catchment[catchment.direction == "upstream"].unary_union
ut_cover = gh.cover_statistics(geoutils.xarray_geomask(nlcd.cover_2019, ut_geom, catchment.crs))

dd_geom = catchment[catchment.direction == "downstream"].unary_union
dd_cover = gh.cover_statistics(geoutils.xarray_geomask(nlcd.cover_2019, dd_geom, catchment.crs))

[35]:

stats = pd.DataFrame([ut_cover.categories, dd_cover.categories]).T
stats.columns = ["Upstream", "Downstream"]
stats["Upstream - Downstream"] = stats.Upstream - stats.Downstream
stats.round(1)

[35]:

	Upstream	Downstream	Upstream - Downstream
Water	1.1	0.4	0.7
Developed	1.4	3.6	-2.2
Barren	1.6	0.4	1.2
Forest	0.8	7.6	-6.8
Shrubland	28.9	12.9	16.0
Herbaceous	62.5	50.4	12.1
Planted/Cultivated	3.6	24.4	-20.8
Wetlands	0.1	0.3	-0.2

Notably we see that the upstream area has significantly more grassland than downstream, while the downstream catchments have significantly more cropland. The implication is that these dams were constructed to support the transition of grassland to agricultural land.

Final Remarks#

This tutorial demonstrated the application of various web services for obtaining hydrolinked geospatial data using only web services. These web services are powerful tools that make big geospatial data more equitable and the rich ecosystem of tools that are available in R and Python facilitate accessing them.