async_retriever.streaming

Download multiple files concurrently by streaming their content to disk.

Module Contents

async_retriever.streaming.generate_filename(url, params=None, data=None, prefix=None, file_extension='')

Generate a unique filename using SHA-256 from a query.

Parameters:

url (str) – The URL for the request.
params (dict, multidict.MultiDict, optional) – Query parameters for the request, default is None.
data (dict, str, optional) – Data or JSON to include in the hash, default is None.
prefix (str, optional) – A custom prefix to attach to the filename, default is None.
file_extension (str, optional) – The file extension to append to the filename, default is "".

Returns:

str – A unique filename with the SHA-256 hash, optional prefix, and the file extension.

Return type:

str
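The idea of deriving a filename from a request can be sketched with the standard library alone. This is only an illustration: `sketch_filename` is a hypothetical helper, and the way it serializes the inputs before hashing (sorted-key JSON) is an assumption, so its output will not necessarily match what `generate_filename` produces.

```python
import hashlib
import json


def sketch_filename(url, params=None, data=None, prefix=None, file_extension=""):
    """Hypothetical stand-in showing a SHA-256 based filename scheme."""
    # Assumption: serialize the request deterministically so that the same
    # url/params/data always hash to the same digest.
    payload = json.dumps(
        {"url": url, "params": params, "data": data},
        sort_keys=True,
        default=str,
    )
    digest = hashlib.sha256(payload.encode("utf-8")).hexdigest()
    # Accept the extension with or without a leading dot.
    ext = f".{file_extension.lstrip('.')}" if file_extension else ""
    return f"{prefix or ''}{digest}{ext}"


name = sketch_filename(
    "https://example.com/data",
    params={"id": 1},
    prefix="dem_",
    file_extension="tif",
)
print(name)
```

Because the name depends only on the request, repeating the same query yields the same filename, which makes it convenient for checking whether a file has already been downloaded.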

async_retriever.streaming.stream_write(urls, file_paths, chunk_size=CHUNK_SIZE, limit_per_host=MAX_HOSTS, timeout=600, raise_status=True)

Download multiple files concurrently by streaming their content to disk.

Parameters:

urls (list of str) – URLs to download.
file_paths (list of pathlib.Path) – Paths to save the downloaded files.
chunk_size (int, optional) – Size of the chunks to download, by default 1 MB.
limit_per_host (int, optional) – Maximum number of concurrent connections per host, by default 4.
timeout (int, optional) – Request timeout in seconds, by default 600 (10 minutes).
raise_status (bool, optional) – Raise an exception if a request fails, by default True. Otherwise, the exception is logged and the function continues.
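A possible way to call `stream_write` is sketched below. The helpers `target_paths` and `download_all`, the example URLs, and the output directory are all hypothetical. The sketch also assumes that `stream_write` is invoked as a regular blocking function, as the signature above suggests, and that the destination directory should exist beforehand. The import is deferred so that the path-building part runs without the library installed.

```python
from pathlib import Path
from urllib.parse import urlparse


def target_paths(urls, out_dir):
    """Hypothetical helper: map each URL to out_dir/<last path segment>."""
    out_dir = Path(out_dir)
    return [out_dir / Path(urlparse(url).path).name for url in urls]


def download_all(urls, out_dir):
    """Hypothetical wrapper around stream_write; requires async_retriever."""
    # Deferred import so the rest of this sketch runs without the library.
    from async_retriever.streaming import stream_write

    paths = target_paths(urls, out_dir)
    # Assumption: create the destination directory up front.
    Path(out_dir).mkdir(parents=True, exist_ok=True)
    stream_write(
        urls,
        paths,
        limit_per_host=4,    # documented default
        timeout=600,         # seconds
        raise_status=False,  # log failed requests instead of raising
    )
    return paths


# Placeholder URLs for illustration only.
urls = [
    "https://example.com/files/a.tif",
    "https://example.com/files/b.tif",
]
print(target_paths(urls, "downloads"))
```

With `raise_status=False`, a failed request is logged rather than raised, so it is worth checking afterwards which of the returned paths actually exist.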
