energy_fault_detector.evaluation.predist_dataset

class PreDistDataset(path, download_dataset=False)

Bases: object

Loader and preprocessor for the PreDist dataset.

The data can be downloaded manually from https://doi.org/10.5281/zenodo.17522254 (in which case pass its location as path), or automatically by setting download_dataset to True.

Parameters:
  • path (Union[str, Path]) – Path to the dataset root.

  • download_dataset (bool) – If True, downloads the PreDist dataset from Zenodo.

events

Preloaded events DataFrame for each manufacturer, keyed by manufacturer number.

Type:

Dict[int, pd.DataFrame]
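A minimal usage sketch, assuming the package is installed and the dataset is available at the given path. The helper names open_predist and count_events are hypothetical and not part of the library. They rely only on the constructor signature and the events attribute documented above.

```python
def open_predist(path, download=False):
    """Build a PreDistDataset (hypothetical convenience wrapper)."""
    # Imported lazily so the sketch can be read without the package installed.
    from energy_fault_detector.evaluation.predist_dataset import PreDistDataset
    return PreDistDataset(path, download_dataset=download)


def count_events(dataset):
    """Return the number of event rows per manufacturer.

    Assumes only that dataset.events maps a manufacturer number to a
    table-like object that supports len(), as a pd.DataFrame does.
    """
    return {manufacturer: len(events) for manufacturer, events in dataset.events.items()}
```

With a real dataset this would be called as count_events(open_predist("path/to/predist")), where the path is a placeholder for the local dataset root.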

FAULT_HOURS_AFTER = 24

FAULT_HOURS_BEFORE = 48

create_normal_flag(data, manufacturer, substation_id)

Create a boolean flag marking normal operation, derived from recorded disturbances, so that normal behaviour can be selected for model training.

Parameters:
  • data (pd.DataFrame) – Dataframe containing sensor data for a specific substation.

  • manufacturer (int) – Manufacturer dataset, 1 or 2.

  • substation_id (int) – ID of the substation to load data from.

Returns:

Boolean normal flag based on disturbances, sharing the timestamp index of data.

Return type:

pd.Series
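The idea of a disturbance-based normal flag can be sketched with the standard library alone. This is an illustration and not the library's implementation. It assumes that FAULT_HOURS_BEFORE and FAULT_HOURS_AFTER define a window around each disturbance that is excluded from normal behaviour, which the documentation above does not confirm. The function normal_flag and the sample disturbance time are hypothetical, and the real method returns a pd.Series rather than a list.

```python
from datetime import datetime, timedelta

# Values documented on PreDistDataset; their use as an exclusion
# window around each disturbance is an assumption of this sketch.
FAULT_HOURS_BEFORE = 48
FAULT_HOURS_AFTER = 24


def normal_flag(timestamps, disturbance_times,
                hours_before=FAULT_HOURS_BEFORE,
                hours_after=FAULT_HOURS_AFTER):
    """Return one bool per timestamp: True when outside every disturbance window."""
    before = timedelta(hours=hours_before)
    after = timedelta(hours=hours_after)
    windows = [(t - before, t + after) for t in disturbance_times]
    return [
        not any(start <= ts <= end for start, end in windows)
        for ts in timestamps
    ]


# Hourly timestamps over five days with one hypothetical disturbance.
start = datetime(2024, 1, 1)
timestamps = [start + timedelta(hours=h) for h in range(5 * 24)]
flags = normal_flag(timestamps, [datetime(2024, 1, 3, 12)])
```

Under these assumptions, samples from 48 hours before to 24 hours after the disturbance are flagged as not normal, and only the remaining samples would be kept for training.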

get_event_data(manufacturer, event_id, max_training_days=730)

Extracts training and test slices for a specific event row (fault or normal).

Return type:

Dict[str, Any]
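The keys of the returned dictionary are not listed here, so a cautious first step is to inspect them before relying on any particular name. The helper event_keys below is a hypothetical sketch that uses only the documented signature of get_event_data.

```python
def event_keys(dataset, manufacturer, event_id, max_training_days=730):
    """Return the sorted keys of the dictionary produced by get_event_data.

    The dataset argument is expected to be a PreDistDataset, but any object
    with a matching get_event_data method works for this sketch.
    """
    event = dataset.get_event_data(
        manufacturer, event_id, max_training_days=max_training_days
    )
    return sorted(event)
```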

load_substation_data(manufacturer, substation_id)

Loads raw CSV, maps string values, and cleans indices.

Return type:

pd.DataFrame