energy_fault_detector.root_cause_analysis.arcana
ARCANA implementation, as described in https://doi.org/10.1016/j.egyai.2021.100065.
- class Arcana(model, learning_rate=0.001, init_x_bias='recon', alpha=0.8, num_iter=400, epsilon=1e-08, verbose=False, max_sample_threshold=1000, **kwargs)
Bases: object
Anomaly root cause analysis. Tries to find which of the sensors/inputs caused the reconstruction error of an autoencoder model. Implementation details are found in https://doi.org/10.1016/j.egyai.2021.100065.
This method minimizes the loss function
(1 - alpha) * L2(X_corr - autoencoder(X_corr)) + alpha * L1(X_corr - X_obs)
where alpha is a hyperparameter between 0 and 1, X_corr = X_obs + X_bias is the corrected observation signal (which should be free of the anomaly after optimization) and X_obs is the original signal. We are interested in finding the ARCANA correction X_bias, which indicates the deviation of each sensor causing the reconstruction error (the deviation from the case without anomaly, while staying close to the original observation X_obs).
Minimizing the L2 (reconstruction) term removes the anomaly from the corrected signal, while the L1 term keeps the solution close to the original sensor values (small x_bias values), effectively keeping the number of inputs held responsible for the reconstruction error small.
For the optimization itself, the Adam optimizer from tensorflow.keras.optimizers is used.
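As a rough sketch of this objective (a hypothetical helper, not the library's implementation; in particular, whether the L1/L2 terms are means or sums is an assumption here):

```python
import numpy as np

def arcana_loss(x_obs, x_bias, autoencoder, alpha=0.8):
    """Sketch of (1 - alpha) * L2(X_corr - autoencoder(X_corr))
    + alpha * L1(X_corr - X_obs), with X_corr = X_obs + X_bias."""
    x_corr = x_obs + x_bias
    l2 = np.mean((x_corr - autoencoder(x_corr)) ** 2)  # reconstruction term
    l1 = np.mean(np.abs(x_corr - x_obs))               # closeness term (= mean |x_bias|)
    return (1 - alpha) * l2 + alpha * l1

# With an identity "autoencoder" the L2 term vanishes, so the loss
# reduces to alpha * mean(|x_bias|):
loss = arcana_loss(np.zeros((4, 3)), np.full((4, 3), 0.5), lambda x: x, alpha=0.8)
# loss == 0.8 * 0.5 == 0.4
```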
- Parameters:
  - model (Autoencoder) – Autoencoder model to consider. Must have a __call__ method that expects input data and returns a tf.Tensor.
  - learning_rate (float) – Learning rate for the Adam optimizer.
  - init_x_bias (str) – Where to start: one of 'recon' (reconstruction error), 'zero' (a zero vector), 'weightedA' (alpha * reconstruction error) or 'weightedB' ((1 - alpha) * reconstruction error). Default 'recon'.
  - alpha (float) – Hyperparameter of the ARCANA loss function. A high alpha value means the L1 loss is weighted more, which results in an x_bias with smaller values.
  - num_iter (int) – Number of iterations of the Adam optimizer.
  - epsilon (float) – Small number to prevent division by zero in the Adam optimizer. Default 1e-8.
  - verbose (bool) – Whether to log loss values every 50 iterations. Default False.
  - max_sample_threshold (int) – Maximum number of samples analyzed by ARCANA. This parameter ensures that ARCANA calculations are sufficiently fast. Default 1000.
  - kwargs – Any other arguments of the optimizer.
- opt
Adam optimizer object
Configuration example:
root_cause_analysis:
  alpha: 0.8
  init_x_bias: recon
  num_iter: 200
  max_sample_threshold: 1000
  verbose: false
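A toy end-to-end sketch of the minimization (everything here is hypothetical: plain (sub)gradient descent instead of Adam, and a projection onto the "all sensors equal" direction as a stand-in autoencoder):

```python
import numpy as np

def autoencoder(x):
    # toy stand-in: reconstructs every sensor as the mean over all sensors
    return np.full_like(x, x.mean())

x_obs = np.array([1.0, 1.0, 1.0, 5.0])   # sensor 3 deviates from the rest
alpha, lr, n = 0.8, 0.1, x_obs.size
x_bias = np.zeros_like(x_obs)            # corresponds to init_x_bias='zero'

for _ in range(400):
    x_corr = x_obs + x_bias
    residual = x_corr - autoencoder(x_corr)
    # subgradient of (1 - alpha) * mean(residual**2) + alpha * mean(|x_bias|);
    # for this projection-style autoencoder the chain rule for the L2 term
    # collapses to (2 / n) * residual
    grad = (1 - alpha) * (2.0 / n) * residual + alpha * np.sign(x_bias) / n
    x_bias -= lr * grad

# The correction concentrates on sensor 3, flagging it as the root cause.
```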
- draw_samples(x, conditions=None)
Selects sample indexes (from 0 to data_length) with the highest anomaly scores, in order to define the ARCANA samples.
- Parameters:
x (np.ndarray) – Data of which samples should be drawn
conditions (np.ndarray) – (optional) Array of conditional features values
- Return type:
  ndarray
- Returns:
  Array of booleans defining the selected samples.
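A hypothetical re-implementation of this selection logic (names and tie-breaking behavior are assumptions, not the library's code):

```python
import numpy as np

def draw_samples_sketch(anomaly_scores, max_sample_threshold=1000):
    """Boolean mask selecting the max_sample_threshold samples with the
    highest anomaly score."""
    k = min(max_sample_threshold, len(anomaly_scores))
    top = np.argsort(anomaly_scores)[-k:]   # indexes of the k highest scores
    mask = np.zeros(len(anomaly_scores), dtype=bool)
    mask[top] = True
    return mask

mask = draw_samples_sketch(np.array([0.1, 0.9, 0.3, 0.7]), max_sample_threshold=2)
# mask -> [False, True, False, True]
```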
- find_arcana_bias(x, track_losses=False, track_bias=False)
Find the correction to input data x necessary to minimize the ARCANA loss function. A large (absolute) correction in one of the inputs means that this sensor/input contributes strongly to the reconstruction error, i.e. the method detects which inputs are probably anomalous and caused the anomaly.
- Parameters:
  - x (DataFrame) – pandas DataFrame containing data with timestamp as index.
  - track_losses (bool) – If True, the losses are returned as a dictionary containing lists of the combined loss, loss 1 and loss 2 for every 50th iteration.
  - track_bias (bool) – If True, the bias is returned as a list of ARCANA biases for every 50th iteration.
- Returns:
  A tuple with the following three objects:
  - x_bias: pandas DataFrame containing the ARCANA correction.
  - tracked_losses: A DataFrame containing the combined loss, loss 1 (reconstruction) and loss 2 (regularization) for every 50th iteration (empty if track_losses is False).
  - tracked_bias: A list of DataFrames representing x_bias at every 50th iteration (empty if track_bias is False).
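Once find_arcana_bias has produced an x_bias DataFrame, ranking sensors by mean absolute correction points to the likely root cause (toy data and sensor names are hypothetical):

```python
import pandas as pd

# shaped like the x_bias returned by find_arcana_bias: one column per
# sensor, timestamps as index
x_bias = pd.DataFrame(
    {"temperature": [0.01, -0.02], "pressure": [1.4, 1.6], "flow": [0.0, 0.05]},
    index=pd.to_datetime(["2021-01-01 00:00", "2021-01-01 00:10"]),
)
ranking = x_bias.abs().mean().sort_values(ascending=False)
root_cause = ranking.index[0]
# root_cause -> 'pressure' (largest corrections)
```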
- initialize_x_bias(x, conditions=None)
Initialize the ARCANA bias vector.
- Parameters:
  - x (ndarray) – numpy array containing input data.
  - conditions (ndarray) – numpy array containing conditional data, for conditional autoencoders. Defaults to None.
- Return type:
  Tensor
- Returns:
  Initial x_bias values.
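The four documented init options could look like this (a hypothetical sketch; in particular, the sign convention of the reconstruction error is an assumption):

```python
import numpy as np

def initialize_x_bias_sketch(x, model, init_x_bias="recon", alpha=0.8):
    recon_error = x - model(x)   # assumed sign convention
    if init_x_bias == "recon":
        return recon_error
    if init_x_bias == "zero":
        return np.zeros_like(x)
    if init_x_bias == "weightedA":
        return alpha * recon_error
    if init_x_bias == "weightedB":
        return (1 - alpha) * recon_error
    raise ValueError(f"unknown init_x_bias: {init_x_bias}")

x = np.array([[2.0, 4.0]])
halve = lambda a: 0.5 * a        # toy "autoencoder"
bias_a = initialize_x_bias_sketch(x, halve, "weightedA", alpha=0.8)
# reconstruction error is [[1.0, 2.0]], so bias_a -> [[0.8, 1.6]]
```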
- update_x_bias(x, x_bias, conditions=None)
This function builds a tensor which calculates the ARCANA loss (full_loss) and computes the gradient of that loss with respect to the variable x + x_bias using tensorflow's GradientTape.
- Parameters:
  - x (Variable) – tensorflow variable containing input data.
  - x_bias (Variable) – tensorflow variable containing the current ARCANA bias.
  - conditions (Tensor) – (optional) tensor containing the conditional feature values.
- Returns:
  - x_corrected (tf.Variable) – the corrected input (x + x_bias).
  - losses (Tuple[tf.Variable, tf.Variable, tf.Variable]) – the losses of this x_bias update.
  - grad (tf.Variable) – the computed gradient for this x_bias update.
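A numpy sketch of what one update step evaluates (the library uses tf.GradientTape for the gradient; the forward differences here are only for illustration, and all names are assumptions):

```python
import numpy as np

def update_x_bias_sketch(x, x_bias, model, alpha=0.8, eps=1e-6):
    """Evaluate the loss terms and a numerical gradient of the full
    ARCANA loss with respect to x_bias."""
    def losses(b):
        x_corr = x + b
        l2 = np.mean((x_corr - model(x_corr)) ** 2)   # reconstruction loss
        l1 = np.mean(np.abs(x_corr - x))              # regularization loss
        return (1 - alpha) * l2 + alpha * l1, l2, l1

    full, l2, l1 = losses(x_bias)
    grad = np.zeros_like(x_bias)
    for i in range(x_bias.size):                      # forward differences
        db = np.zeros_like(x_bias)
        db.ravel()[i] = eps
        grad.ravel()[i] = (losses(x_bias + db)[0] - full) / eps
    return (full, l2, l1), grad

zero_model = lambda a: np.zeros_like(a)               # toy "autoencoder"
(full, l2, l1), grad = update_x_bias_sketch(
    np.array([[3.0]]), np.array([[1.0]]), zero_model)
# l2 = 16.0, l1 = 1.0, full = 0.2 * 16 + 0.8 * 1 = 4.0; grad ~ 2.4
```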