pygeoutils.pygeoutils

Some utilities for manipulating GeoSpatial data.

Module Contents

pygeoutils.pygeoutils.arcgis2geojson(arcgis, id_attr=None)

Convert ESRIGeoJSON format to GeoJSON.

Notes

Based on arcgis2geojson.

Parameters:
  • arcgis (str or binary) – The ESRIGeoJSON content as a str (or binary).

  • id_attr (str, optional) – ID of the attribute of interest, defaults to None.

Returns:

dict – A GeoJSON dictionary readable by GeoPandas.

Return type:

dict[str, Any]
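To make the conversion concrete, here is a minimal pure-Python sketch that handles only point features. It is not the library's implementation: the function name esri_points_to_geojson is hypothetical, and the input layout (a "features" list whose items carry "attributes" and an "x"/"y" geometry) is an assumption based on the usual Esri JSON feature-set shape.

```python
from __future__ import annotations

import json
from typing import Any


def esri_points_to_geojson(arcgis: str | bytes, id_attr: str | None = None) -> dict[str, Any]:
    """Convert an Esri JSON feature set of points into a GeoJSON FeatureCollection.

    Hypothetical helper for illustration; only point geometries are handled.
    """
    data = json.loads(arcgis)  # json.loads accepts both str and bytes
    features = []
    for feat in data.get("features", []):
        attrs = feat.get("attributes") or {}
        geom = feat.get("geometry") or {}
        if "x" in geom and "y" in geom:
            geometry = {"type": "Point", "coordinates": [geom["x"], geom["y"]]}
        else:
            geometry = None  # GeoJSON allows features without a geometry
        geojson_feat: dict[str, Any] = {
            "type": "Feature",
            "geometry": geometry,
            "properties": attrs,
        }
        # Promote the attribute of interest to the feature-level "id" member.
        if id_attr is not None and id_attr in attrs:
            geojson_feat["id"] = attrs[id_attr]
        features.append(geojson_feat)
    return {"type": "FeatureCollection", "features": features}
```

In this sketch, id_attr only decides which attribute is copied into the GeoJSON "id" member; all attributes remain in "properties".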

pygeoutils.pygeoutils.geodf2xarray(geodf, resolution, attr_col=None, fill=0, projected_crs=5070)

Rasterize a geopandas.GeoDataFrame to an xarray.Dataset.

Parameters:
  • geodf (geopandas.GeoDataFrame or geopandas.GeoSeries) – GeoDataFrame or GeoSeries to rasterize.

  • resolution (float) – Target resolution of the output raster in the projected_crs unit. Since the default projected_crs is EPSG:5070, the default unit for the resolution is meters.

  • attr_col (str, optional) – Column name of the attribute to use as the variable, defaults to None, i.e., the variable will be a boolean mask where 1 indicates the presence of a geometry. Note that the attribute must be numeric and have one of the following numpy types: int16, int32, uint8, uint16, uint32, float32, and float64.

  • fill (int or float, optional) – Value to use for filling the missing values (mask) of the output raster, defaults to 0.

  • projected_crs (int, str, or pyproj.CRS, optional) – A projected CRS to use for the output raster, defaults to EPSG:5070.

Returns:

xarray.Dataset – The xarray Dataset with a single variable.

Return type:

xarray.Dataset
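The roles of resolution and fill can be illustrated with a small pure-Python rasterizer. This is a sketch under stated assumptions, not the library's code: the helper names are hypothetical, the polygon is a plain list of (x, y) vertices in projected units, and a cell is burned when its center falls inside the polygon.

```python
import math


def point_in_polygon(x, y, ring):
    """Ray-casting test: True if (x, y) lies inside the ring of (x, y) vertices."""
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can be crossed.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def rasterize_polygon(ring, resolution, value=1, fill=0):
    """Burn `value` into cells whose center is inside `ring`; use `fill` elsewhere."""
    xs = [p[0] for p in ring]
    ys = [p[1] for p in ring]
    xmin, xmax, ymin, ymax = min(xs), max(xs), min(ys), max(ys)
    # The grid covers the polygon bounds; resolution is the cell size.
    ncols = math.ceil((xmax - xmin) / resolution)
    nrows = math.ceil((ymax - ymin) / resolution)
    grid = []
    for r in range(nrows):
        yc = ymax - (r + 0.5) * resolution  # row 0 is the northernmost row
        row = []
        for c in range(ncols):
            xc = xmin + (c + 0.5) * resolution
            row.append(value if point_in_polygon(xc, yc, ring) else fill)
        grid.append(row)
    return grid
```

With value=1 and fill=0 the result plays the role of the boolean mask described for attr_col=None; passing a numeric value mimics burning an attribute column.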

pygeoutils.pygeoutils.gtiff2vrt(file_list, vrt_path)

Create a VRT file from a list of (Geo)Tiff files.

Note

This function requires gdal to be installed.

Parameters:
  • file_list (list) – List of paths to the GeoTiff files.

  • vrt_path (str or Path) – Path to the output VRT file.
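A typical call might first gather the input files and then hand them to gtiff2vrt. The following is a minimal sketch: collect_geotiffs and build_vrt are hypothetical helpers, and the file-extension filter is an assumption about how the inputs are named. The pygeoutils import is deferred so that the file-collection step works even when pygeoutils or gdal is not installed.

```python
from __future__ import annotations

from pathlib import Path


def collect_geotiffs(folder: str | Path) -> list[Path]:
    """Return the (Geo)Tiff files under `folder`, sorted for a reproducible order."""
    return sorted(
        p
        for p in Path(folder).rglob("*")
        if p.is_file() and p.suffix.lower() in {".tif", ".tiff"}
    )


def build_vrt(folder: str | Path, vrt_path: str | Path) -> Path:
    """Create a VRT from every (Geo)Tiff found under `folder`."""
    # Deferred import: requires pygeoutils and gdal to be installed.
    from pygeoutils.pygeoutils import gtiff2vrt

    file_list = collect_geotiffs(folder)
    if not file_list:
        raise FileNotFoundError(f"No (Geo)Tiff files found under {folder}")
    gtiff2vrt(file_list, vrt_path)
    return Path(vrt_path)
```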

pygeoutils.pygeoutils.gtiff2xarray(r_dict, geometry=None, geo_crs=None, ds_dims=None, driver=None, all_touched=False, nodata=None, drop=True)

Convert (Geo)Tiff byte responses to xarray.Dataset.

Parameters:
  • r_dict (dict) – Dictionary of (Geo)Tiff byte responses where the keys are names used for labeling each response and the values are bytes.

  • geometry (Polygon, MultiPolygon, or tuple, optional) – The geometry to mask the data that should be in the same CRS as the r_dict. Defaults to None.

  • geo_crs (int, str, or pyproj.CRS, optional) – The spatial reference of the input geometry, defaults to None. This argument should be given when geometry is given.

  • ds_dims (tuple of str, optional) – The names of the vertical and horizontal dimensions (in that order) of the target dataset, defaults to None. If None, dimension names are determined from a list of common names.

  • driver (str, optional) – A GDAL driver for reading the content, defaults to automatic detection. A list of the drivers can be found in the GDAL documentation.

  • all_touched (bool, optional) – Include a pixel in the mask if it touches any of the shapes. If False (default), include a pixel only if its center is within one of the shapes, or if it is selected by Bresenham’s line algorithm.

  • nodata (float or int, optional) – The nodata value of the raster, defaults to None, i.e., it is determined from the raster.

  • drop (bool, optional) – If True, drop the data outside of the extent of the mask geometries. Otherwise, it will return the same raster with the data masked. Default is True.

Returns:

xarray.Dataset or xarray.DataArray – Requested dataset or dataarray.

Return type:

xarray.DataArray | xarray.Dataset
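The expected shape of r_dict is a mapping from names to raw bytes. As a sketch, the hypothetical helper files_to_r_dict below builds such a mapping from files on disk, standing in for byte responses fetched from a web service. Using the file stem as the key is an assumption made for illustration, and the pygeoutils import is deferred so the helper can be used on its own.

```python
from __future__ import annotations

from pathlib import Path


def files_to_r_dict(paths) -> dict[str, bytes]:
    """Map each file's stem to its raw bytes, mimicking named byte responses."""
    return {Path(p).stem: Path(p).read_bytes() for p in paths}


def tiffs_to_xarray(paths, geometry=None, geo_crs=None):
    """Read (Geo)Tiff files into xarray, optionally masked by `geometry`."""
    # Deferred import: requires pygeoutils to be installed.
    from pygeoutils.pygeoutils import gtiff2xarray

    r_dict = files_to_r_dict(paths)
    # geo_crs should be passed whenever geometry is given.
    return gtiff2xarray(r_dict, geometry=geometry, geo_crs=geo_crs)
```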

pygeoutils.pygeoutils.json2geodf(content, in_crs=4326, crs=4326)

Create GeoDataFrame from (Geo)JSON.

Parameters:
  • content (dict or list of dict) – A (Geo)JSON dictionary e.g., response.json() or a list of them.

  • in_crs (int, str, or pyproj.CRS, optional) – CRS of the content, defaults to epsg:4326.

  • crs (int, str, or pyproj.CRS, optional) – The target CRS of the output GeoDataFrame, defaults to epsg:4326.

Returns:

geopandas.GeoDataFrame – The geo-data frame generated from the (Geo)JSON content.

Return type:

geopandas.GeoDataFrame
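Because content may be a single dictionary or a list of them, it can help to picture the input as one flat list of features. The sketch below assumes that a list of responses is combined into a single frame, which the single GeoDataFrame return value suggests but the text above does not state. The helper names flatten_features and responses_to_geodf are hypothetical, and the pygeoutils import is deferred.

```python
def flatten_features(content):
    """Flatten a GeoJSON dict, or a list of them, into one list of features."""
    items = content if isinstance(content, list) else [content]
    features = []
    for item in items:
        kind = item.get("type")
        if kind == "FeatureCollection":
            features.extend(item.get("features", []))
        elif kind == "Feature":
            features.append(item)
        else:
            raise ValueError(f"Unsupported GeoJSON type: {kind!r}")
    return features


def responses_to_geodf(content, crs=4326):
    """Convert EPSG:4326 GeoJSON content to a GeoDataFrame in `crs`."""
    # Deferred import: requires pygeoutils to be installed.
    from pygeoutils.pygeoutils import json2geodf

    return json2geodf(content, in_crs=4326, crs=crs)
```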

pygeoutils.pygeoutils.xarray2geodf(da, dtype, mask_da=None, connectivity=8)

Vectorize an xarray.DataArray to a geopandas.GeoDataFrame.

Parameters:
  • da (xarray.DataArray) – The dataarray to vectorize.

  • dtype (type) – The data type of the dataarray. Valid types are int16, int32, uint8, uint16, and float32.

  • mask_da (xarray.DataArray, optional) – The dataarray to use as a mask, defaults to None.

  • connectivity (int, optional) – Use 4 or 8 pixel connectivity for grouping pixels into features, defaults to 8.

Returns:

geopandas.GeoDataFrame – The vectorized dataarray.

Return type:

geopandas.GeoDataFrame
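The effect of connectivity is easiest to see on a tiny grid. The following pure-Python sketch counts groups of equal-valued neighboring pixels with a flood fill; count_regions is a hypothetical helper that illustrates the grouping rule only and is not how the library vectorizes data.

```python
def count_regions(grid, connectivity=8):
    """Count groups of equal-valued, connected pixels in a non-empty 2D list."""
    if connectivity not in (4, 8):
        raise ValueError("connectivity must be 4 or 8")
    # 4-connectivity: edge neighbors only; 8-connectivity adds the diagonals.
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    if connectivity == 8:
        offsets += [(-1, -1), (-1, 1), (1, -1), (1, 1)]
    nrows, ncols = len(grid), len(grid[0])
    seen = set()
    regions = 0
    for r in range(nrows):
        for c in range(ncols):
            if (r, c) in seen:
                continue
            # Start a new region and flood-fill all connected equal pixels.
            regions += 1
            value = grid[r][c]
            seen.add((r, c))
            stack = [(r, c)]
            while stack:
                cr, cc = stack.pop()
                for dr, dc in offsets:
                    nr, nc = cr + dr, cc + dc
                    if (
                        0 <= nr < nrows
                        and 0 <= nc < ncols
                        and (nr, nc) not in seen
                        and grid[nr][nc] == value
                    ):
                        seen.add((nr, nc))
                        stack.append((nr, nc))
    return regions
```

For a checkerboard such as [[1, 0], [0, 1]], 4-connectivity treats every pixel as its own group, while 8-connectivity joins the diagonal pixels that share a value.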

pygeoutils.pygeoutils.xarray_geomask(ds, geometry, crs, all_touched=False, drop=True, from_disk=False)

Mask an xarray.Dataset based on a geometry.

Parameters:
  • ds (xarray.Dataset or xarray.DataArray) – The dataset (or dataarray) to be masked.

  • geometry (Polygon, MultiPolygon, or tuple of length 4) – The geometry to mask the data.

  • crs (int, str, or pyproj.CRS) – The spatial reference of the input geometry.

  • all_touched (bool, optional) – Include a pixel in the mask if it touches any of the shapes. If False (default), include a pixel only if its center is within one of the shapes, or if it is selected by Bresenham’s line algorithm.

  • drop (bool, optional) – If True, drop the data outside of the extent of the mask geometries. Otherwise, it will return the same raster with the data masked. Default is True.

  • from_disk (bool, optional) – If True, it will clip from disk using rasterio.mask.mask if possible. This is beneficial when the size of the data is larger than memory. Default is False.

Returns:

xarray.Dataset or xarray.DataArray – The input dataset with the mask applied; masked cells are set to np.nan.

Return type:

xarray.Dataset | xarray.DataArray
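The interplay of all_touched and drop can be sketched in pure Python for the simplest geometry, a bounding box. The helpers bbox_mask and apply_mask are hypothetical, the tuple order (west, south, east, north) is an assumption, and "touches" is simplified here to a strict overlap between the cell and the box.

```python
import math


def bbox_mask(nrows, ncols, cell, origin, bbox, all_touched=False):
    """Boolean mask of grid cells selected by `bbox`.

    `origin` is the (x, y) of the grid's upper-left corner and `cell` the cell size.
    """
    x0, y0 = origin
    west, south, east, north = bbox
    mask = []
    for r in range(nrows):
        top = y0 - r * cell
        bottom = top - cell
        row = []
        for c in range(ncols):
            left = x0 + c * cell
            right = left + cell
            if all_touched:
                # Keep any cell whose area overlaps the box.
                keep = left < east and right > west and bottom < north and top > south
            else:
                # Keep a cell only if its center lies within the box.
                xc, yc = (left + right) / 2, (top + bottom) / 2
                keep = west <= xc <= east and south <= yc <= north
            row.append(keep)
        mask.append(row)
    return mask


def apply_mask(data, mask, drop=True):
    """Set unselected cells to NaN; with drop=True, also crop to the mask extent."""
    masked = [
        [v if k else math.nan for v, k in zip(drow, mrow)]
        for drow, mrow in zip(data, mask)
    ]
    if not drop:
        return masked
    rows = [i for i, mrow in enumerate(mask) if any(mrow)]
    cols = [j for j in range(len(mask[0])) if any(mrow[j] for mrow in mask)]
    return [[masked[i][j] for j in cols] for i in rows]
```

With drop=False the output keeps the input shape and only the values change; with drop=True rows and columns that contain no selected cell are removed.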