pygeoogc.utils
Some utilities for PyGeoOGC.
Module Contents
- class pygeoogc.utils.RetrySession(retries=3, backoff_factor=0.3, status_to_retry=(500, 502, 504), prefixes=('https://',), cache_name=None, expire_after=EXPIRE_AFTER, disable=False, ssl=True)
Configures the passed-in session to retry on failed requests.
Notes
Failures can be due to connection errors, specific HTTP response codes, and 30X redirections. The code was originally based on bustawin/retry-requests.
- Parameters:
retries (int, optional) – The maximum number of retries before raising an exception, defaults to 3.
backoff_factor (float, optional) – A factor used to compute the waiting time between retries, defaults to 0.3.
status_to_retry (tuple, optional) – A tuple of status codes that trigger the retry behaviour, defaults to (500, 502, 504).
prefixes (tuple, optional) – The URL prefixes to consider, defaults to ("https://",).
cache_name (str, optional) – Path to a folder for caching the session, defaults to None, which uses the system's temp directory.
expire_after (int, optional) – Expiration time for the cache in seconds, defaults to -1 (never expire).
disable (bool, optional) – If True, temporarily disable caching requests/responses, defaults to False.
ssl (bool, optional) – If True, verify SSL certificates, defaults to True.
- close()
Close the session.
- get(url, payload=None, params=None, headers=None, stream=None)
Retrieve data from a url by GET and return the Response.
- head(url, params=None, data=None, json=None, headers=None)
Retrieve data from a url by HEAD and return the Response.
- post(url, payload=None, data=None, json=None, headers=None, stream=None)
Retrieve data from a url by POST and return the Response.
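As a rough illustration of how retries, backoff_factor, and status_to_retry interact, the sketch below computes a wait schedule and a retry decision in plain Python. It assumes an exponential schedule of the form backoff_factor * 2**attempt, which is a common convention for retry logic; this page does not state the exact formula RetrySession uses. The helpers backoff_delays and should_retry are hypothetical and are not part of PyGeoOGC.

```python
from __future__ import annotations


def backoff_delays(retries: int = 3, backoff_factor: float = 0.3) -> list[float]:
    """Return the assumed wait time in seconds before each retry.

    Assumption (not confirmed by the PyGeoOGC docs): the delay before
    retry number ``attempt`` (0-based) is ``backoff_factor * 2**attempt``.
    """
    return [backoff_factor * 2**attempt for attempt in range(retries)]


def should_retry(
    status_code: int,
    attempt: int,
    retries: int = 3,
    status_to_retry: tuple[int, ...] = (500, 502, 504),
) -> bool:
    """Retry only while attempts remain and the status code is in the retry set."""
    return attempt < retries and status_code in status_to_retry


if __name__ == "__main__":
    # Print the assumed schedule for the default signature values.
    for attempt, delay in enumerate(backoff_delays(), start=1):
        print(f"retry {attempt}: wait {delay:.1f} s")
```

Under this assumption, raising backoff_factor stretches every wait proportionally, while raising retries adds further, longer waits at the end of the schedule.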
- pygeoogc.utils.match_crs(geom, in_crs, out_crs)
Reproject a geometry to another CRS.
- Parameters:
geom (list or tuple or geometry) – Input geometry, which can be a list of coordinates such as [(x1, y1), ...], a bounding box such as (xmin, ymin, xmax, ymax), or any valid shapely geometry such as Polygon, MultiPolygon, etc.
in_crs (str, int, or pyproj.CRS) – Spatial reference of the input geometry.
out_crs (str, int, or pyproj.CRS) – Target spatial reference.
- Returns:
same type as the input geometry – Transformed geometry in the target CRS.
- Return type:
GeomType
Examples
>>> from shapely import Point
>>> point = Point(-7766049.665, 5691929.739)
>>> match_crs(point, 3857, 4326).xy
(array('d', [-69.7636111130079]), array('d', [45.44549114818127]))
>>> bbox = (-7766049.665, 5691929.739, -7763049.665, 5696929.739)
>>> match_crs(bbox, 3857, 4326)
(-69.7636111130079, 45.44549114818127, -69.73666165448431, 45.47699468552394)
>>> coords = [(-7766049.665, 5691929.739)]
>>> match_crs(coords, 3857, 4326)
[(-69.7636111130079, 45.44549114818127)]
- pygeoogc.utils.streaming_download(urls: str, kwds: dict[str, dict[Any, Any]] | None = None, fnames: str | pathlib.Path | None = None, root_dir: str | pathlib.Path | None = None, file_prefix: str = '', file_extention: str = '', method: Literal['GET', 'POST', 'get', 'post'] = 'GET', ssl: bool = True, chunk_size: int = CHUNK_SIZE, n_jobs: int = MAX_CONN) -> pathlib.Path | None
- pygeoogc.utils.streaming_download(urls: list[str], kwds: list[dict[str, dict[Any, Any]]] | None = None, fnames: collections.abc.Sequence[str | pathlib.Path] | None = None, root_dir: str | pathlib.Path | None = None, file_prefix: str = '', file_extention: str = '', method: Literal['GET', 'POST', 'get', 'post'] = 'GET', ssl: bool = True, chunk_size: int = CHUNK_SIZE, n_jobs: int = MAX_CONN) -> list[pathlib.Path | None]
Download and store files in parallel from a list of URLs/Keywords.
Notes
This function runs asynchronously in parallel using n_jobs threads.
- Parameters:
kwds (tuple or list, optional) – A list of keywords associated with each URL, e.g., ({"params": …, "headers": …}, …). Defaults to None.
fnames (tuple or list, optional) – A list of filenames associated with each URL, e.g., ("file1.zip", …). Defaults to None. If not provided, random unique filenames will be generated based on URL and keyword pairs.
root_dir (str or Path, optional) – Root directory to store the files, defaults to None, which uses HyRiver's cache directory. Note that you should provide either root_dir or fnames. If both are provided, root_dir will be ignored.
file_prefix (str, optional) – Prefix to add to filenames when storing the files, defaults to an empty string, i.e., no prefix. This argument is only used if fnames is not passed.
file_extention (str, optional) – Extension to use for storing the files, defaults to an empty string, i.e., no extension. This argument is only used if fnames is not passed.
method (str, optional) – HTTP method to use, i.e., GET or POST, defaults to "GET".
ssl (bool, optional) – Whether to use SSL verification, defaults to True.
chunk_size (int, optional) – Chunk size to use when downloading, defaults to 100 * 1024 * 1024, i.e., 100 MB.
n_jobs (int, optional) – The maximum number of concurrent downloads, defaults to 10.
- Returns:
list – A list of pathlib.Path objects associated with URLs in the same order.
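The combination of chunk_size, n_jobs, and order-preserving results described above can be sketched with the standard library alone. This is not PyGeoOGC's implementation: in-memory byte streams stand in for HTTP responses so that the sketch needs no network access, and store_stream and store_all are hypothetical names introduced only for illustration.

```python
from __future__ import annotations

import io
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path
from typing import BinaryIO, Sequence


def store_stream(source: BinaryIO, target: Path, chunk_size: int) -> Path:
    """Copy a binary stream to ``target`` one chunk at a time."""
    with target.open("wb") as out:
        # Read at most ``chunk_size`` bytes per iteration so that a large
        # payload is never held in memory all at once.
        while chunk := source.read(chunk_size):
            out.write(chunk)
    return target


def store_all(
    sources: Sequence[BinaryIO],
    fnames: Sequence[str],
    root_dir: str | Path,
    chunk_size: int = 1024,
    n_jobs: int = 4,
) -> list[Path]:
    """Store all streams using at most ``n_jobs`` worker threads."""
    root = Path(root_dir)
    root.mkdir(parents=True, exist_ok=True)
    targets = [root / name for name in fnames]
    with ThreadPoolExecutor(max_workers=n_jobs) as pool:
        # Executor.map yields results in input order, so the returned paths
        # line up with ``sources`` even if the jobs finish out of order.
        return list(pool.map(store_stream, sources, targets, [chunk_size] * len(targets)))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        streams = [io.BytesIO(b"a" * 2500), io.BytesIO(b"b" * 10)]
        for path in store_all(streams, ["first.bin", "second.bin"], tmp):
            print(path.name, path.stat().st_size)
```

Returning results through Executor.map rather than collecting futures as they complete is what keeps the output aligned with the input list in this sketch.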
- pygeoogc.utils.traverse_json(json_data, ipath)
Extract an element from a JSON-like object along the path specified by ipath.
This function is based on bcmullins.
Examples
>>> data = [
...     {"employees": [
...         {"name": "Alice", "role": "dev", "nbr": 1},
...         {"name": "Bob", "role": "dev", "nbr": 2},
...     ],},
...     {"firm": {"name": "Charlie's Waffle Emporium", "location": "CA"}},
... ]
>>> traverse_json(data, ["employees", "name"])
[['Alice', 'Bob'], [None]]
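To make the traversal rule in the example easier to follow, here is a minimal pure-Python sketch that reproduces the same result for the same input. It is an illustrative reimplementation under the assumption that lists are fanned out element by element and that a missing key yields None; traverse_json_sketch and _extract are hypothetical names, and the library's actual code may differ.

```python
from __future__ import annotations

from typing import Any


def _extract(node: Any, path: list[str], depth: int, found: list[Any]) -> list[Any]:
    """Collect the values reached by following ``path`` from ``node``."""
    key = path[depth]
    is_last = depth == len(path) - 1
    if isinstance(node, list):
        if not node:
            found.append(None)
        # Fan out over list items without consuming a path element.
        for item in node:
            _extract(item, path, depth, found)
    elif isinstance(node, dict):
        if is_last:
            found.append(node.get(key))
        elif key in node:
            _extract(node[key], path, depth + 1, found)
        else:
            # The path cannot be followed any further in this branch.
            found.append(None)
    else:
        found.append(None)
    return found


def traverse_json_sketch(json_data: dict | list, ipath: list[str]) -> list[Any]:
    """Follow ``ipath`` through a dict, or through each item of a list."""
    if isinstance(json_data, dict):
        return _extract(json_data, ipath, 0, [])
    return [_extract(item, ipath, 0, []) for item in json_data]


data = [
    {"employees": [
        {"name": "Alice", "role": "dev", "nbr": 1},
        {"name": "Bob", "role": "dev", "nbr": 2},
    ]},
    {"firm": {"name": "Charlie's Waffle Emporium", "location": "CA"}},
]
print(traverse_json_sketch(data, ["employees", "name"]))
```

Each top-level item produces its own result list, which is why the second item, having no "employees" key, contributes [None] rather than being dropped.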