energy_fault_detector.utils.data_downloads
- download_file(session, url, dest)
Download a single file to disk using streaming.
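The streaming approach mentioned above can be sketched as follows. This is a minimal illustration and not the library's actual implementation: the name `stream_to_file`, the `chunk_size` and timeout values, and the return value are assumptions, and `session` is only assumed to behave like a `requests.Session`.

```python
from pathlib import Path
from typing import Any


def stream_to_file(session: Any, url: str, dest: Path, chunk_size: int = 1 << 20) -> Path:
    """Hypothetical sketch: write the body of `url` to `dest` chunk by chunk.

    `session` is assumed to expose the requests.Session interface.
    """
    dest = Path(dest)
    dest.parent.mkdir(parents=True, exist_ok=True)
    # stream=True defers fetching the body until it is iterated,
    # so the whole file never has to fit in memory.
    with session.get(url, stream=True, timeout=60) as resp:
        resp.raise_for_status()
        with open(dest, "wb") as fh:
            for chunk in resp.iter_content(chunk_size=chunk_size):
                if chunk:  # skip empty keep-alive chunks
                    fh.write(chunk)
    return dest
```

Writing in fixed-size chunks keeps memory use bounded regardless of the file size, which matters for multi-gigabyte dataset archives.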
- download_zenodo_data(identifier='10.5281/zenodo.15846963', dest='./downloads', overwrite=False)
Download a Zenodo record via API and unzip any .zip files.
- Parameters:
- Returns:
- List of paths to the extracted content of all downloaded zip files. If only one
zip file is downloaded, a single path is returned instead of a list.
- Return type:
Union[List[Path], Path]
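Because the return type is either a single path or a list of paths, calling code may want to normalize the result before iterating. The helper below is a hypothetical convenience, not part of the documented module; `as_path_list` is an illustrative name.

```python
from pathlib import Path
from typing import List, Union


def as_path_list(result: Union[List[Path], Path]) -> List[Path]:
    """Hypothetical helper: always return a list of paths."""
    if isinstance(result, (str, Path)):
        return [Path(result)]
    return [Path(p) for p in result]


# Intended usage (not executed here, since it needs network access):
# extracted = download_zenodo_data(identifier="10.5281/zenodo.15846963",
#                                  dest="./downloads", overwrite=False)
# for path in as_path_list(extracted):
#     print(path)
```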
- fetch_record(session, record_id)
Fetch record metadata from Zenodo’s REST API.
- list_files(session, record_json)
Return a list of file descriptors for a Zenodo record.
Supports both embedded ‘files’ in the record JSON and the ‘links.files’ endpoint (newer API).
- Parameters:
session (Session) – A requests.Session to use for any follow-up call.
record_json (dict) – Record JSON as returned by fetch_record().
- Return type:
- Returns:
A list of file dicts with at least ‘links’ and ‘key’/‘filename’.
- Raises:
RuntimeError – If no files are found.
requests.HTTPError – If loading the files listing endpoint fails.
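The two lookup paths described above could be handled roughly as shown below. This is a sketch under stated assumptions, not the library's code: the name `list_files_sketch` is hypothetical, and the shape of the listing response (a bare list, or a dict with an `entries` key) is an assumption about the Zenodo API that should be checked against its documentation.

```python
from typing import Any, Dict, List


def list_files_sketch(session: Any, record_json: Dict[str, Any]) -> List[Dict[str, Any]]:
    """Hypothetical sketch: find file descriptors for a record."""
    # Case 1: file descriptors embedded directly in the record JSON.
    files = record_json.get("files")
    if isinstance(files, list) and files:
        return files

    # Case 2: follow the 'links.files' endpoint with a second request.
    files_url = (record_json.get("links") or {}).get("files")
    if files_url:
        resp = session.get(files_url, timeout=60)
        resp.raise_for_status()  # raises requests.HTTPError on a real session
        payload = resp.json()
        # Assumption: the listing is a bare list or a dict with "entries".
        entries = payload if isinstance(payload, list) else payload.get("entries", [])
        if entries:
            return list(entries)

    raise RuntimeError("No files found for this record.")
```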
- parse_record_id(identifier)
Extract a Zenodo record ID from an ID, DOI, or URL.
- Accepts:
Numeric ID (e.g., “15846963”)
DOI (e.g., “10.5281/zenodo.15846963”)
Record URL (e.g., “https://zenodo.org/records/15846963”)
- Parameters:
identifier (str) – Input string containing an ID, DOI, or URL.
- Return type:
- Returns:
The numeric record ID as a string.
- Raises:
ValueError – If a record ID cannot be parsed from the input.
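The three accepted input forms can be illustrated with a small regex-based parser. This is a minimal sketch, assuming the patterns shown; the actual function may accept more variants or use different rules, and `parse_record_id_sketch` is a hypothetical name.

```python
import re

# Assumed patterns covering the three documented input forms.
_DOI_RE = re.compile(r"10\.5281/zenodo\.([0-9]+)", re.IGNORECASE)
_URL_RE = re.compile(r"zenodo\.org/records?/([0-9]+)", re.IGNORECASE)


def parse_record_id_sketch(identifier: str) -> str:
    """Hypothetical sketch: extract the numeric record ID as a string."""
    text = identifier.strip()
    # Plain numeric ID, e.g. "15846963".
    if re.fullmatch(r"[0-9]+", text):
        return text
    # DOI or record URL: take the trailing digits captured by the pattern.
    for pattern in (_DOI_RE, _URL_RE):
        match = pattern.search(text)
        if match:
            return match.group(1)
    raise ValueError(f"Could not parse a Zenodo record ID from {identifier!r}")
```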
- prepare_output_dir(out_dir, overwrite)
Ensure the output directory is ready.
If the directory exists and overwrite is True, its contents are removed and the directory is recreated empty. If it exists and overwrite is False, it is left as is. If it does not exist, it is created.
- Parameters:
- Raises:
OSError – If filesystem operations fail.
RuntimeError – If out_dir points to an unsafe path to remove.
- Return type:
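The three cases described for prepare_output_dir (exists with overwrite, exists without overwrite, does not exist) might be handled as in the sketch below. The safety rule used here, refusing to remove the filesystem root or the current working directory, is an assumption; the documentation does not say which paths the real function treats as unsafe. The name `prepare_output_dir_sketch` is hypothetical.

```python
import shutil
from pathlib import Path


def prepare_output_dir_sketch(out_dir: Path, overwrite: bool) -> None:
    """Hypothetical sketch: make sure `out_dir` exists, emptying it on overwrite."""
    resolved = Path(out_dir).resolve()
    if resolved.exists() and overwrite:
        # Assumed safety rule: never delete the filesystem root or the
        # current working directory.
        unsafe = {Path(resolved.anchor), Path.cwd().resolve()}
        if resolved in unsafe:
            raise RuntimeError(f"Refusing to remove unsafe path: {resolved}")
        shutil.rmtree(resolved)
    # Creates the directory if missing; leaves an existing one untouched.
    resolved.mkdir(parents=True, exist_ok=True)
```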
- safe_extract_zip(zip_path, dest_dir)
Extract a ZIP archive safely, preventing path traversal (zip-slip).
Validates that each member will extract under dest_dir before extraction.
- Parameters:
- Raises:
RuntimeError – If an unsafe member path is detected.
zipfile.BadZipFile – If the archive is invalid or corrupted.
OSError – If filesystem operations fail.
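The validate-then-extract approach described for safe_extract_zip can be sketched with the standard library alone. This is an illustrative version under assumptions, not the library's implementation: `safe_extract_zip_sketch` is a hypothetical name, and the real function may check members differently.

```python
import zipfile
from pathlib import Path


def safe_extract_zip_sketch(zip_path: Path, dest_dir: Path) -> None:
    """Hypothetical sketch: extract only if every member stays under dest_dir."""
    dest_root = Path(dest_dir).resolve()
    dest_root.mkdir(parents=True, exist_ok=True)
    # ZipFile raises zipfile.BadZipFile for invalid or corrupted archives.
    with zipfile.ZipFile(zip_path) as archive:
        # Validate every member before writing anything to disk.
        for member in archive.infolist():
            target = (dest_root / member.filename).resolve()
            if target != dest_root and dest_root not in target.parents:
                raise RuntimeError(f"Unsafe member path in archive: {member.filename!r}")
        archive.extractall(dest_root)
```

Checking all members first means a malicious archive is rejected before any file is written, rather than leaving a partially extracted directory behind.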