energy_fault_detector.root_cause_analysis.arcana

ARCANA implementation, as described in https://doi.org/10.1016/j.egyai.2021.100065.

class Arcana(model, learning_rate=0.001, init_x_bias='recon', alpha=0.8, num_iter=400, epsilon=1e-08, verbose=False, max_sample_threshold=1000, **kwargs)[source]

Bases: object

Anomaly root cause analysis. Tries to find which of the sensors/inputs caused the reconstruction error of an autoencoder model. Implementation details are found in https://doi.org/10.1016/j.egyai.2021.100065.

This method minimizes the loss function:

(1 - alpha) * L2(X_corr - autoencoder(X_corr)) + alpha * L1(X_corr - X_obs)

where alpha is a hyperparameter between 0 and 1, X_corr is the corrected observation signal (X_obs + X_bias, should be without anomaly after optimization) and X_obs the original signal. We are interested in finding the Arcana-correction X_bias, which indicates the deviation of each sensor causing the reconstruction error (the deviation from the case without an anomaly, still close to the original observation X_obs).

Minimizing the L2 term drives the corrected signal X_corr towards one that the autoencoder reconstructs well, i.e. a signal without the anomaly. The L1 term penalizes the size of x_bias, which keeps the solution close to the original sensor values and effectively keeps the number of inputs held responsible for the reconstruction error small.
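
The loss above can be sketched in NumPy to make the trade-off between the two terms concrete. This is only an illustration under assumptions: L2 is taken to be the mean squared error and L1 the mean absolute value, and `arcana_loss` and `toy_autoencoder` are hypothetical names that are not part of the library.

```python
import numpy as np

def arcana_loss(x_obs, x_bias, autoencoder, alpha=0.8):
    """Return (combined, reconstruction, regularization) losses.

    Assumption: L2 is the mean squared error, L1 the mean absolute value.
    """
    x_corr = x_obs + x_bias
    recon_loss = np.mean((x_corr - autoencoder(x_corr)) ** 2)  # L2 term
    reg_loss = np.mean(np.abs(x_corr - x_obs))                 # L1 term, i.e. mean |x_bias|
    combined = (1 - alpha) * recon_loss + alpha * reg_loss
    return combined, recon_loss, reg_loss

def toy_autoencoder(x):
    # Hypothetical stand-in for a trained model: replaces each row by its mean.
    return np.repeat(x.mean(axis=1, keepdims=True), x.shape[1], axis=1)

x_obs = np.array([[1.0, 1.0, 4.0]])  # the third sensor deviates from the others

# Without a correction the reconstruction term carries the whole loss.
no_fix = arcana_loss(x_obs, np.zeros_like(x_obs), toy_autoencoder)

# Correcting only the third sensor removes the reconstruction error,
# at the price of a non-zero L1 term.
fixed = arcana_loss(x_obs, np.array([[0.0, 0.0, -3.0]]), toy_autoencoder)
```

With a high alpha such as 0.8, the L1 penalty of the full correction outweighs the removed reconstruction error in this toy case, which is why the optimizer tends to settle on smaller x_bias values.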

The optimization itself uses the Adam optimizer from tensorflow.keras.optimizers.

Parameters:

model (Union[Autoencoder, SequenceAutoencoder]) – Autoencoder model to consider. Must have a __call__ method that accepts input data and returns a tf.Tensor.
learning_rate (float) – Learning rate for the Adam optimizer.
init_x_bias (str) – Starting point for x_bias, one of ‘recon’ (reconstruction error), ‘zero’ (a zero vector), ‘weightedA’ (alpha * reconstruction error) or ‘weightedB’ ((1 - alpha) * reconstruction error). Default ‘recon’.
alpha (float) – Hyperparameter of the ARCANA loss function. A high alpha value means the L1 loss is weighted more, which results in an x_bias with smaller values.
num_iter (int) – Number of optimization steps to run with the Adam optimizer.
epsilon (float) – Small number to prevent division by zero in the Adam optimizer, default 1e-8
verbose (bool) – Whether to log loss values every 50 iterations, default False
max_sample_threshold (int) – Maximum number of samples which are analyzed by ARCANA. This parameter ensures that ARCANA calculations are sufficiently fast. Default is 1000
kwargs – Any other arguments of the optimizer

opt: Adam optimizer object

Configuration example:

root_cause_analysis:
  alpha: 0.8
  init_x_bias: recon
  num_iter: 200
  max_sample_threshold: 1000
  verbose: false

draw_samples(x, conditions=None)[source]

Selects the samples to be analyzed by ARCANA: from the index values 0 to data_length, the indexes with the highest anomaly score are chosen.

Parameters:

x (ndarray) – Data of which samples should be drawn
conditions (ndarray) – (optional) Array of conditional features values

Return type:

ndarray

Returns:

array of booleans defining the selected samples.
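
The selection can be pictured as a top-k boolean mask. The sketch below is an assumption-laden illustration, not the library's code: how the anomaly score is computed is not described here, so it is passed in as a ready-made array, and `select_top_scores` and `max_samples` are hypothetical names (the latter playing the role of max_sample_threshold).

```python
import numpy as np

def select_top_scores(anomaly_score, max_samples):
    """Boolean mask marking the `max_samples` rows with the highest score."""
    anomaly_score = np.asarray(anomaly_score)
    mask = np.zeros(len(anomaly_score), dtype=bool)
    # Sort descending and keep the first `max_samples` positions;
    # if there are fewer rows than `max_samples`, every row is kept.
    top_idx = np.argsort(-anomaly_score)[:max_samples]
    mask[top_idx] = True
    return mask

scores = np.array([0.1, 2.5, 0.3, 1.7, 0.2])
mask = select_top_scores(scores, max_samples=2)
selected_scores = scores[mask]  # rows that would be passed on for analysis
```

Returning a mask rather than a list of indexes keeps the original row order, so the mask can be applied to the data and to any conditional features alike.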

find_arcana_bias(x, track_losses=False, track_bias=False)[source]

Finds the correction to the input data x that minimizes the ARCANA loss function. A large (absolute) correction in one of the inputs means that this sensor/input is important for the reconstruction error. This identifies which inputs are probably anomalous and caused the anomaly.

Parameters:

x (DataFrame) – pandas DataFrame containing data with timestamp as index.
track_losses (bool) – If True, the combined loss, loss 1 and loss 2 are recorded every 50th iteration and returned.
track_bias (bool) – If True, the ARCANA bias is recorded every 50th iteration and returned as a list.

Return type:

Tuple[DataFrame, DataFrame, List[DataFrame]]

Returns: A tuple with the following three objects

x_bias: pandas DataFrame

tracked_losses: A dataframe containing the combined loss, loss 1 (reconstruction) and loss 2 (regularization) for each 50th iteration (if track_losses is False this dataframe is empty)

tracked_bias: A list of dataframes, each representing x_bias at a 50th iteration
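
Since a large absolute correction points to an important input, x_bias can be condensed into a ranking of sensors. The sketch below assumes the mean absolute bias over time as the aggregation, which is only one possible choice; `rank_inputs_by_bias` and the sensor names are hypothetical, and a plain array stands in for the returned DataFrame.

```python
import numpy as np

def rank_inputs_by_bias(x_bias, names):
    """Sort inputs by mean absolute bias, largest first."""
    importance = np.abs(x_bias).mean(axis=0)  # one value per input column
    order = np.argsort(-importance)           # descending order
    return [(names[i], float(importance[i])) for i in order]

# Rows are timestamps, columns are inputs (made-up values for illustration).
x_bias = np.array([[0.0, -2.0, 0.2],
                   [0.1, -3.0, 0.0]])
ranking = rank_inputs_by_bias(x_bias, ["wind_speed", "gearbox_temp", "power"])
top_input = ranking[0][0]
```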

initialize_x_bias(x, conditions=None)[source]

Initialize the ARCANA bias vector.

Parameters:

x (ndarray) – numpy array containing input data.
conditions (ndarray) – numpy array containing conditional data - for conditional autoencoders. Defaults to None.

Return type:

Tensor

Returns:

initial x_bias values
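
The four init_x_bias options can be illustrated as follows. This is a sketch under assumptions: the reconstruction error is taken as reconstruction minus input, so that adding it to x yields the reconstruction, and `init_bias` is a hypothetical helper working on NumPy arrays instead of tensors.

```python
import numpy as np

def init_bias(x, reconstruction, mode="recon", alpha=0.8):
    """Starting value of x_bias for the given init_x_bias mode."""
    recon_error = reconstruction - x  # assumed sign convention
    if mode == "recon":
        return recon_error
    if mode == "zero":
        return np.zeros_like(x)
    if mode == "weightedA":
        return alpha * recon_error
    if mode == "weightedB":
        return (1 - alpha) * recon_error
    raise ValueError(f"unknown init_x_bias mode: {mode!r}")

x = np.array([[1.0, 4.0]])
reconstruction = np.array([[1.0, 2.0]])  # made-up autoencoder output

starts = {mode: init_bias(x, reconstruction, mode)
          for mode in ("recon", "zero", "weightedA", "weightedB")}
```

Starting from ‘recon’ begins at a point with a small reconstruction term, while ‘zero’ begins at a point with no regularization cost; the weighted variants lie in between.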

update_x_bias(x, x_bias, conditions=None)[source]

Computes the ARCANA loss (full_loss) for the current x_bias and the gradient of that loss with respect to the Variable x + x_bias, using a TensorFlow GradientTape.

Parameters:

x (Tensor) – tensorflow tensor containing input data
x_bias (Variable) – tensorflow variable containing the current ARCANA bias.
conditions (Tensor) – (optional) tensor containing the conditional feature values.

Returns:

A tuple with the following three objects

x_bias: (tf.Variable) the ARCANA bias after this update.

losses: Tuple(tf.Variable, tf.Variable, tf.Variable) contains the losses of this x_bias update.

grad: (tf.Variable) contains the computed gradient for this x_bias update.

Return type:

Tuple[Variable, Tuple[Tensor, Tensor, Tensor], Variable]
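
To show what a single update amounts to, here is a NumPy stand-in for one step. It is not the library's implementation, which relies on a TensorFlow GradientTape and the Keras Adam optimizer. The assumptions are a linear toy autoencoder x -> x @ W, mean-based L1/L2 terms, and a hand-written gradient and Adam rule; `arcana_step`, `W` and `state` are hypothetical names.

```python
import numpy as np

def arcana_step(x, x_bias, W, state, alpha=0.8, learning_rate=0.001,
                epsilon=1e-8, beta_1=0.9, beta_2=0.999):
    """One Adam update of x_bias for a linear toy autoencoder x -> x @ W."""
    n = x.size
    M = np.eye(W.shape[0]) - W
    residual = (x + x_bias) @ M  # x_corr - autoencoder(x_corr)
    recon_loss = np.mean(residual ** 2)
    reg_loss = np.mean(np.abs(x_bias))
    full_loss = (1 - alpha) * recon_loss + alpha * reg_loss

    # Analytic (sub)gradient of full_loss with respect to x_bias.
    grad = (1 - alpha) * (2.0 / n) * residual @ M.T + alpha * np.sign(x_bias) / n

    # Standard Adam moment estimates with bias correction.
    state["t"] += 1
    state["m"] = beta_1 * state["m"] + (1 - beta_1) * grad
    state["v"] = beta_2 * state["v"] + (1 - beta_2) * grad ** 2
    m_hat = state["m"] / (1 - beta_1 ** state["t"])
    v_hat = state["v"] / (1 - beta_2 ** state["t"])
    new_bias = x_bias - learning_rate * m_hat / (np.sqrt(v_hat) + epsilon)

    # The losses describe x_bias before this update was applied.
    return new_bias, (full_loss, recon_loss, reg_loss), grad

W = np.full((3, 3), 1.0 / 3.0)    # toy model: every output is the row mean
x = np.array([[1.0, 1.0, 4.0]])   # the third input deviates
x_bias = np.zeros_like(x)
state = {"t": 0, "m": np.zeros_like(x), "v": np.zeros_like(x)}

first_losses = None
for _ in range(200):
    x_bias, losses, grad = arcana_step(x, x_bias, W, state,
                                       alpha=0.2, learning_rate=0.05)
    if first_losses is None:
        first_losses = losses
```

The loop mirrors the role of num_iter: each call returns the updated bias together with the three loss values and the gradient, matching the shape of the tuple described above.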