energy_fault_detector.fault_detector

Main fault detection class

class FaultDetector(config=None, model_directory='fault_detector_model', model_subdir=None)

Bases: FaultDetectionModel

Main class for fault detection in renewable energy assets and power grids.

Parameters:

config (Optional[Config]) – Config object with fault detection configuration. Defaults to None. If None, the models need to be loaded from a path using the load_models method.
model_directory (str, optional) – Directory to save models to. Defaults to ‘fault_detector_model’.
model_subdir (Optional[Any], optional) – Deprecated. This argument will be removed in future versions. Defaults to None.

anomaly_score: AnomalyScore object.

autoencoder: Autoencoder object.

threshold_selector: ThresholdSelector object.

data_preprocessor: DataPreprocessorPipeline object.

save_timestamps: a list of string timestamps indicating when the model was saved.

fit(sensor_data, normal_index=None, save_models=True, overwrite_models=False, fit_autoencoder_only=False, fit_preprocessor=True, **kwargs)

Fit the models on the given sensor_data, optionally save them locally, and return the training metadata.

Parameters:

sensor_data (pd.DataFrame) – DataFrame with the sensor data of one asset for a specific time window. The timestamp should be the index and the sensor values as columns.
normal_index (Optional[pd.Series]) – Series indicating normal behavior as boolean with the timestamp as index. Optional; if not provided, assumes all sensor_data represents normal behavior.
save_models (bool, optional) – Whether to save models. Defaults to True.
overwrite_models (bool, optional) – If True, existing model directories can be overwritten. Defaults to False.
fit_autoencoder_only (bool, optional) – If True, only fit the data preprocessor and autoencoder objects. Defaults to False.
fit_preprocessor (bool, optional) – If True, the preprocessor is fitted. Defaults to True.

Returns:

Metadata of the trained model: model_date, model_path, and the model reconstruction errors on the training and validation data.

Return type:

ModelMetadata
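The input shapes described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the library's reference usage: the sensor names, the synthetic values, and the helper names `make_example_inputs` and `fit_detector` are invented for the example, and `config` is assumed to be a Config object constructed elsewhere. Only the `FaultDetector` constructor and `fit` arguments come from the signatures documented here.

```python
import numpy as np
import pandas as pd


def make_example_inputs(n_rows=96, seed=0):
    """Build synthetic inputs in the shape fit() expects (hypothetical data)."""
    rng = np.random.default_rng(seed)
    # Timestamps form the index; each sensor is one column.
    index = pd.date_range("2024-01-01", periods=n_rows, freq="15min", name="timestamp")
    sensor_data = pd.DataFrame(
        {
            "wind_speed": rng.normal(8.0, 2.0, n_rows),
            "power": rng.normal(1500.0, 300.0, n_rows),
        },
        index=index,
    )
    # Boolean Series on the same index: True marks normal behavior.
    normal_index = pd.Series(True, index=index, name="normal")
    normal_index.iloc[-8:] = False  # pretend the last two hours were faulty
    return sensor_data, normal_index


def fit_detector(config, sensor_data, normal_index, model_directory="fault_detector_model"):
    """Train a FaultDetector; `config` is assumed to be a ready Config object."""
    # Imported lazily so the data helper above works without the package installed.
    from energy_fault_detector.fault_detector import FaultDetector

    detector = FaultDetector(config=config, model_directory=model_directory)
    metadata = detector.fit(
        sensor_data,
        normal_index=normal_index,
        save_models=False,  # keep the sketch free of side effects on disk
    )
    return detector, metadata


sensor_data, normal_index = make_example_inputs()
```

With a real configuration, `detector, metadata = fit_detector(config, sensor_data, normal_index)` would train the models; `save_models=False` is chosen here only so that the sketch writes nothing to disk.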

predict(sensor_data, model_path=None, root_cause_analysis=False, track_losses=False, track_bias=False)

Predict with the given models for a specific asset.

Parameters:

sensor_data (pd.DataFrame) – DataFrame with the sensor data of one asset for a specific time window. The timestamp should be the index and the sensor values as columns.
model_path (Optional[str], optional) – Path to the models to be applied. If None, assumes the attributes data_preprocessor, autoencoder, anomaly_score, and threshold_selector contain fitted instances.
root_cause_analysis (bool, optional) – Whether to run ARCANA. Defaults to False.
track_losses (bool, optional) – If True, ARCANA losses are tracked over the iterations. Defaults to False.
track_bias (bool, optional) – If True, ARCANA bias is tracked over the iterations. Defaults to False.

Returns:

A FaultDetectionResult with the following fields:

predicted_anomalies: DataFrame with a column ‘anomaly’ (bool).
reconstruction: DataFrame with reconstruction of the sensor data with timestamp as index.
deviations: DataFrame with reconstruction errors.
anomaly_score: DataFrame with anomaly scores for each timestamp.
bias_data: DataFrame with ARCANA results with timestamp as index. None if ARCANA was not run.
arcana_losses: DataFrame containing recorded values for all losses in ARCANA. None if ARCANA was not run.
tracked_bias: List of DataFrames. None if ARCANA was not run.

Return type:

FaultDetectionResult
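A short sketch of how the returned fields might be consumed. It assumes the fields listed above are exposed as attributes of the result object, which this page does not state explicitly; `summarize_result` is a hypothetical helper, and a `SimpleNamespace` stands in for a real result so the snippet runs without a trained model.

```python
from types import SimpleNamespace

import pandas as pd


def summarize_result(result):
    """Summarize a prediction result (attribute access is an assumption)."""
    flags = result.predicted_anomalies["anomaly"]
    return {
        "n_timestamps": int(len(flags)),
        "n_anomalies": int(flags.sum()),
        # bias_data is documented as None when ARCANA was not run.
        "arcana_was_run": result.bias_data is not None,
    }


# Stand-in object carrying two of the documented fields (hypothetical values).
index = pd.date_range("2024-01-01", periods=4, freq="15min")
fake_result = SimpleNamespace(
    predicted_anomalies=pd.DataFrame({"anomaly": [False, True, True, False]}, index=index),
    bias_data=None,
)
summary = summarize_result(fake_result)
print(summary)
```

With a fitted detector, the same helper would be applied to `detector.predict(sensor_data)` in place of the stand-in object.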

predict_anomalies(scores, x_prepped=None)

Predict anomalies based on anomaly scores.

Return type:

Series

predict_anomaly_score(sensor_data)

Predict the anomaly score.

Return type:

Series

preprocess_train_data(sensor_data, normal_index, fit_preprocessor=True)

Preprocess the training data using the configured data_preprocessor.

Parameters:

sensor_data (pd.DataFrame) – Unprocessed training data.
normal_index (pd.Series) – Unprocessed normal index.
fit_preprocessor (bool, optional) – If True, the preprocessor is fitted. If False, the preprocessor is not fitted, and a ready-to-use preprocessor must be provided by loading models before calling this method. Defaults to True.

Return type:

Tuple[DataFrame, DataFrame, Series]

Returns:

Tuple of (pd.DataFrame, pd.DataFrame, pd.Series):

x_prepped (pd.DataFrame): preprocessed normal training data.
x: ordered training data (unprocessed), needed for _fit_threshold.
y: ordered normal_index (unprocessed), needed for _fit_threshold.

run_root_cause_analysis(sensor_data, track_losses=False, track_bias=False)

Run the ARCANA root cause analysis.

Parameters:

sensor_data (pd.DataFrame) – DataFrame containing the sensor data to be analyzed.
track_losses (bool, optional) – If True, the ARCANA losses are tracked over the iterations. Defaults to False.
track_bias (bool, optional) – If True, the ARCANA bias is tracked over the iterations. Defaults to False.

Return type:

Tuple[DataFrame, DataFrame, List[DataFrame]]

Returns:

Tuple of (pd.DataFrame, pd.DataFrame, List[pd.DataFrame]):

df_arcana_bias: pandas DataFrame containing the ARCANA bias.
arcana_losses: dictionary containing loss names as keys and lists of loss values as values.
tracked_bias: list of pandas DataFrames containing the ARCANA bias recorded over the iterations.
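One way the returned bias could be inspected is sketched below. It rests on two assumptions that this page does not confirm: that df_arcana_bias has one column per sensor, and that a larger absolute bias points to a sensor contributing more to the anomaly. The helper `rank_sensors_by_bias` and the example values are hypothetical.

```python
import pandas as pd


def rank_sensors_by_bias(df_arcana_bias, top_n=3):
    """Rank sensors by mean absolute ARCANA bias (one possible reading of the output)."""
    mean_abs_bias = df_arcana_bias.abs().mean(axis=0)
    return mean_abs_bias.sort_values(ascending=False).head(top_n)


# Hypothetical values standing in for the first element returned by run_root_cause_analysis().
index = pd.date_range("2024-01-01", periods=3, freq="15min")
example_bias = pd.DataFrame(
    {
        "wind_speed": [0.1, -0.1, 0.0],
        "power": [1.0, 2.0, 3.0],
        "temperature": [-0.5, 0.5, -0.5],
    },
    index=index,
)
ranking = rank_sensors_by_bias(example_bias, top_n=2)
print(ranking)
```

Averaging the absolute values avoids positive and negative biases cancelling out within the analyzed window; other aggregations may be more appropriate depending on the asset.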

tune(sensor_data, normal_index=None, pretrained_model_path=None, new_learning_rate=0.0001, tune_epochs=10, tune_method='full', save_models=True, overwrite_models=False, data_preprocessor=None)

Fine-tune the FaultDetector using one of the following methods:

‘full’: all autoencoder weights, the threshold, and the anomaly-score scaling are adapted.
‘decoder’: only the decoder weights and the threshold are adapted.
‘threshold’: only the threshold and the anomaly-score scaling are adapted.

Notes: Choose tune_epochs and new_learning_rate carefully, because the risk of overfitting depends on how similar the tuning data is to the training data.

Parameters:

sensor_data (pd.DataFrame) – DataFrame with the sensor data of one asset for a specific time window. The timestamp should be the index and the sensor values as columns.
normal_index (pd.Series, optional) – Series indicating normal behavior as boolean with the timestamp as index. If not provided, all data in sensor_data is assumed to represent normal behavior. Defaults to None.
pretrained_model_path (Optional[str], optional) – Path to pretrained model. If None, assumes attributes data_preprocessor, autoencoder, anomaly_score, and threshold_selector contain fitted instances.
tune_epochs (int, optional) – Number of epochs to fine-tune. Defaults to 10.
new_learning_rate (float, optional) – Learning rate to tune the autoencoder with. Defaults to 0.0001.
tune_method (str, optional) – One of ‘full’ (all autoencoder weights, the threshold, and the anomaly-score scaling are adapted), ‘decoder’ (only the decoder weights and the threshold are adapted), or ‘threshold’ (only the threshold and the anomaly-score scaling are adapted). Defaults to ‘full’.
save_models (bool, optional) – Whether to save models. Defaults to True.
overwrite_models (bool, optional) – If True, existing model directories can be overwritten. Defaults to False.
data_preprocessor (Optional[DataPreprocessor], optional) – Optional prefitted data preprocessor. Useful when using a generic preprocessor for all models.

Returns:

Metadata of the trained model: model_date, model_path, and the model reconstruction errors on the training and validation data.

Return type:

ModelMetadata
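The tuning options above can be checked on the caller's side before training starts. The sketch below is a hypothetical convenience wrapper, not part of the library: `build_tune_kwargs` and `tune_detector` are invented names, the validation rules are assumptions about sensible inputs, and `detector` is assumed to be an already fitted FaultDetector. Only the keyword names and the three method names come from the signature documented here.

```python
TUNE_METHODS = ("full", "decoder", "threshold")


def build_tune_kwargs(tune_method="full", tune_epochs=10, new_learning_rate=0.0001):
    """Validate and collect keyword arguments for FaultDetector.tune()."""
    if tune_method not in TUNE_METHODS:
        raise ValueError(f"tune_method must be one of {TUNE_METHODS}, got {tune_method!r}")
    if tune_epochs < 1:
        raise ValueError("tune_epochs must be at least 1")
    if new_learning_rate <= 0:
        raise ValueError("new_learning_rate must be positive")
    return {
        "tune_method": tune_method,
        "tune_epochs": tune_epochs,
        "new_learning_rate": new_learning_rate,
        "save_models": False,  # keep the sketch free of side effects on disk
    }


def tune_detector(detector, sensor_data, normal_index=None, **options):
    """Fine-tune an already fitted detector with validated options."""
    kwargs = build_tune_kwargs(**options)
    return detector.tune(sensor_data, normal_index=normal_index, **kwargs)


# Example: adapt only the decoder weights and the threshold for a few epochs.
decoder_kwargs = build_tune_kwargs(tune_method="decoder", tune_epochs=5)
print(decoder_kwargs)
```

A call such as `tune_detector(detector, sensor_data, tune_method="decoder", tune_epochs=5)` would then forward the validated options to `tune`. Failing early on a misspelled method name is cheaper than discovering it after data preprocessing has already run.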