async_retriever.async_retriever
Core async functions.
Module Contents
- async_retriever.async_retriever.delete_url_cache(url, request_method='get', cache_name=None, **kwargs)
Delete the cached response associated with url, along with its history (if applicable).
- Parameters:
  url (str) – URL to be deleted from the cache.
  request_method (str, optional) – HTTP request method to be deleted from the cache, defaults to GET.
  cache_name (str, optional) – Path to a file for caching the session, defaults to ./cache/aiohttp_cache.sqlite.
  kwargs (dict, optional) – Keywords to pass to cache.delete_url().
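A minimal sketch of one way delete_url_cache might be combined with retrieve_text to force a fresh download of a single URL. The helper name refetch_text is hypothetical, and the import path follows the module path shown in the signatures on this page. How cached entries are matched when extra request keywords were used is not covered here.

```python
def refetch_text(url, cache_name=None):
    """Drop the cached GET response for ``url`` and request it again.

    ``refetch_text`` is a hypothetical helper, not part of async_retriever.
    """
    # Imported lazily so the sketch can be loaded without the package installed.
    from async_retriever.async_retriever import delete_url_cache, retrieve_text

    # Remove the cached response (and its history, if any) for this URL.
    delete_url_cache(url, request_method="get", cache_name=cache_name)

    # retrieve_text returns one item per input URL, in input order.
    return retrieve_text([url], cache_name=cache_name)[0]
```

Passing the same cache_name to both calls keeps the deletion and the new request pointed at the same cache file.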
- async_retriever.async_retriever.retrieve(urls, read_method, request_kwds=None, request_method='get', limit_per_host=MAX_HOSTS, cache_name=None, timeout=TIMEOUT, expire_after=EXPIRE_AFTER, ssl=True, disable=False, raise_status=True)
Send async requests.
- Parameters:
  read_method (str) – Method for reading the response; one of binary, json, and text.
  request_kwds (list of dict, optional) – List of request keywords corresponding to the input URLs (one-to-one mapping), defaults to None. For example, [{"params": {...}, "headers": {...}}, ...].
  request_method (str, optional) – Request type; GET (get) or POST (post). Defaults to GET.
  limit_per_host (int, optional) – Maximum number of simultaneous connections per host, defaults to 4.
  cache_name (str, optional) – Path to a file for caching the session, defaults to ./cache/aiohttp_cache.sqlite.
  timeout (int, optional) – Request timeout in seconds, defaults to 120.
  expire_after (int, optional) – Expiration time for response caching in seconds, defaults to 2592000 (30 days).
  ssl (bool or SSLContext, optional) – SSLContext to use for the connection, defaults to True. Set to False to disable SSL certificate verification.
  disable (bool, optional) – If True, temporarily disable caching requests and get new responses from the server, defaults to False.
  raise_status (bool, optional) – Raise an exception if the response status is not 200. If False, return None instead. Defaults to True.
- Returns:
  list – List of responses in the order of input URLs.
- Return type:
  list[str | bytes | dict[str, Any] | list[dict[str, Any]] | None]
Examples
>>> import async_retriever as ar
>>> stations = ["01646500", "08072300", "11073495"]
>>> url = "https://waterservices.usgs.gov/nwis/site"
>>> urls, kwds = zip(
...     *[
...         (url, {"params": {"format": "rdb", "sites": s, "siteStatus": "all"}})
...         for s in stations
...     ]
... )
>>> resp = ar.retrieve(urls, "text", request_kwds=kwds)
>>> resp[0].split("\n")[-2].split("\t")[1]
'01646500'
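The one-to-one mapping between urls and request_kwds, and the None placeholders returned when raise_status=False, can be sketched as follows. build_requests and fetch_by_station are hypothetical helpers, and the query parameters are copied from the example above.

```python
def build_requests(base_url, stations):
    """Return parallel lists: one URL and one keyword dict per station."""
    urls = [base_url] * len(stations)
    kwds = [
        {"params": {"format": "rdb", "sites": s, "siteStatus": "all"}}
        for s in stations
    ]
    return urls, kwds


def fetch_by_station(base_url, stations):
    """Map each station ID to its text response, skipping failed requests."""
    # Imported lazily so build_requests can be used without the package installed.
    from async_retriever.async_retriever import retrieve

    urls, kwds = build_requests(base_url, stations)
    # With raise_status=False, a failed request yields None instead of raising.
    responses = retrieve(urls, "text", request_kwds=kwds, raise_status=False)
    # Responses come back in input order, so zip pairs them with the stations.
    return {s: r for s, r in zip(stations, responses) if r is not None}
```

Keying the result by station ID avoids relying on list positions once failed requests have been dropped.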
- async_retriever.async_retriever.retrieve_binary(urls, request_kwds=None, request_method='get', limit_per_host=MAX_HOSTS, cache_name=None, timeout=TIMEOUT, expire_after=EXPIRE_AFTER, ssl=True, disable=False, raise_status=True)
Send async requests and get the response as bytes.
- Parameters:
  request_kwds (list of dict, optional) – List of request keywords corresponding to the input URLs (one-to-one mapping), defaults to None. For example, [{"params": {...}, "headers": {...}}, ...].
  request_method (str, optional) – Request type; GET (get) or POST (post). Defaults to GET.
  limit_per_host (int, optional) – Maximum number of simultaneous connections per host, defaults to 4.
  cache_name (str, optional) – Path to a file for caching the session, defaults to ./cache/aiohttp_cache.sqlite.
  timeout (int, optional) – Request timeout in seconds, defaults to 120.
  expire_after (int, optional) – Expiration time for response caching in seconds, defaults to 2592000 (30 days).
  ssl (bool or SSLContext, optional) – SSLContext to use for the connection, defaults to True. Set to False to disable SSL certificate verification.
  disable (bool, optional) – If True, temporarily disable caching requests and get new responses from the server, defaults to False.
  raise_status (bool, optional) – Raise an exception if the response status is not 200. If False, return None instead. Defaults to True.
- Returns:
  list – List of responses, each as bytes, in the order of input URLs.
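Because retrieve_binary returns raw bytes, a common follow-up is writing each response to disk. The sketch below assumes a hypothetical helper save_binary and an arbitrary file-naming scheme; neither comes from the library.

```python
from pathlib import Path


def save_binary(urls, out_dir):
    """Download ``urls`` as bytes and write each successful response to a file."""
    # Imported lazily so the sketch can be loaded without the package installed.
    from async_retriever.async_retriever import retrieve_binary

    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)

    # raise_status=False turns failed requests into None entries.
    responses = retrieve_binary(list(urls), raise_status=False)

    saved = []
    for index, content in enumerate(responses):
        if content is None:
            continue  # skip failed requests
        # The index keeps file names aligned with the input order.
        path = out / f"response_{index}.bin"
        path.write_bytes(content)
        saved.append(path)
    return saved
```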
- async_retriever.async_retriever.retrieve_json(urls, request_kwds=None, request_method='get', limit_per_host=MAX_HOSTS, cache_name=None, timeout=TIMEOUT, expire_after=EXPIRE_AFTER, ssl=True, disable=False, raise_status=True)
Send async requests and get the response as json.
- Parameters:
  request_kwds (list of dict, optional) – List of request keywords corresponding to the input URLs (one-to-one mapping), defaults to None. For example, [{"params": {...}, "headers": {...}}, ...].
  request_method (str, optional) – Request type; GET (get) or POST (post). Defaults to GET.
  limit_per_host (int, optional) – Maximum number of simultaneous connections per host, defaults to 4.
  cache_name (str, optional) – Path to a file for caching the session, defaults to ./cache/aiohttp_cache.sqlite.
  timeout (int, optional) – Request timeout in seconds, defaults to 120.
  expire_after (int, optional) – Expiration time for response caching in seconds, defaults to 2592000 (30 days).
  ssl (bool or SSLContext, optional) – SSLContext to use for the connection, defaults to True. Set to False to disable SSL certificate verification.
  disable (bool, optional) – If True, temporarily disable caching requests and get new responses from the server, defaults to False.
  raise_status (bool, optional) – Raise an exception if the response status is not 200. If False, return None instead. Defaults to True.
- Returns:
  list – List of responses, each as parsed JSON, in the order of input URLs.
- Return type:
  list[dict[str, Any] | None] | list[list[dict[str, Any]] | None]
Examples
>>> import async_retriever as ar
>>> urls = ["https://api.water.usgs.gov/api/nldi/linked-data/comid/position"]
>>> kwds = [
...     {
...         "params": {
...             "f": "json",
...             "coords": "POINT(-68.325 45.0369)",
...         },
...     },
... ]
>>> r = ar.retrieve_json(urls, kwds)
>>> print(r[0]["features"][0]["properties"]["identifier"])
2675320
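The return type allows each item to be a dict, a list of dicts, or None (when raise_status=False). One way to normalize such a result into a flat list of dicts is sketched below. flatten_json is a hypothetical helper that works on plain Python objects and does not call the library, and the sample data is a stand-in, not real service output.

```python
def flatten_json(responses):
    """Flatten JSON responses into a single list of dicts, dropping None entries."""
    records = []
    for response in responses:
        if response is None:
            continue  # failed request
        if isinstance(response, dict):
            records.append(response)  # a single JSON object
        else:
            records.extend(response)  # a JSON array of objects
    return records


# Stand-in for the output of retrieve_json; not real service data.
sample = [{"id": 1}, None, [{"id": 2}, {"id": 3}]]
print(flatten_json(sample))  # [{'id': 1}, {'id': 2}, {'id': 3}]
```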
- async_retriever.async_retriever.retrieve_text(urls, request_kwds=None, request_method='get', limit_per_host=MAX_HOSTS, cache_name=None, timeout=TIMEOUT, expire_after=EXPIRE_AFTER, ssl=True, disable=False, raise_status=True)
Send async requests and get the response as text.
- Parameters:
  request_kwds (list of dict, optional) – List of request keywords corresponding to the input URLs (one-to-one mapping), defaults to None. For example, [{"params": {...}, "headers": {...}}, ...].
  request_method (str, optional) – Request type; GET (get) or POST (post). Defaults to GET.
  limit_per_host (int, optional) – Maximum number of simultaneous connections per host, defaults to 4.
  cache_name (str, optional) – Path to a file for caching the session, defaults to ./cache/aiohttp_cache.sqlite.
  timeout (int, optional) – Request timeout in seconds, defaults to 120.
  expire_after (int, optional) – Expiration time for response caching in seconds, defaults to 2592000 (30 days).
  ssl (bool or SSLContext, optional) – SSLContext to use for the connection, defaults to True. Set to False to disable SSL certificate verification.
  disable (bool, optional) – If True, temporarily disable caching requests and get new responses from the server, defaults to False.
  raise_status (bool, optional) – Raise an exception if the response status is not 200. If False, return None instead. Defaults to True.
- Returns:
  list – List of responses, each as text, in the order of input URLs.
Examples
>>> import async_retriever as ar
>>> stations = ["01646500", "08072300", "11073495"]
>>> url = "https://waterservices.usgs.gov/nwis/site"
>>> urls, kwds = zip(
...     *[
...         (url, {"params": {"format": "rdb", "sites": s, "siteStatus": "all"}})
...         for s in stations
...     ]
... )
>>> resp = ar.retrieve_text(urls, kwds)
>>> resp[0].split("\n")[-2].split("\t")[1]
'01646500'
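The caching and connection parameters can be combined in a single call. The sketch below wraps retrieve_text in a hypothetical helper fetch_text; the cache path, expiry, timeout, and connection limit are illustrative values, not recommendations from the library.

```python
def fetch_text(urls, fresh=False):
    """Fetch ``urls`` as text using a project-specific cache."""
    # Imported lazily so the sketch can be loaded without the package installed.
    from async_retriever.async_retriever import retrieve_text

    return retrieve_text(
        list(urls),
        cache_name="cache/my_project.sqlite",  # hypothetical cache file
        expire_after=3600,  # keep cached responses for one hour
        timeout=30,  # request timeout in seconds
        limit_per_host=2,  # at most two simultaneous connections per host
        disable=fresh,  # True skips the cache and asks the server again
    )
```

Calling fetch_text(urls, fresh=True) requests new responses for that call only, while the default reuses cached responses that have not yet expired.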