energy_fault_detector.data_preprocessing.data_clipper

Clip data before standardization or normalization
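
To illustrate why clipping comes before scaling, the sketch below standardizes a small series with and without clipping. It uses plain pandas rather than this module. The sample values and the `standardize` helper are made up for illustration, and the 0.01/0.99 bounds simply mirror the documented defaults.

```python
import pandas as pd

# Hypothetical sensor readings: 99 regular values plus one extreme outlier.
values = pd.Series([10.0 + 0.1 * i for i in range(99)] + [500.0])


def standardize(s: pd.Series) -> pd.Series:
    """Zero-mean, unit-variance scaling (illustrative helper)."""
    return (s - s.mean()) / s.std()


# Without clipping, the outlier inflates the standard deviation, so the
# regular readings are squeezed into a narrow band after scaling.
raw_scaled = standardize(values)

# Clip to empirical quantile bounds first, then standardize.
lower, upper = values.quantile(0.01), values.quantile(0.99)
clipped_scaled = standardize(values.clip(lower=lower, upper=upper))

# Spread of the 99 regular readings after scaling, in each variant.
spread_raw = raw_scaled.iloc[:-1].max() - raw_scaled.iloc[:-1].min()
spread_clipped = clipped_scaled.iloc[:-1].max() - clipped_scaled.iloc[:-1].min()
print(spread_raw, spread_clipped)
```

In this toy data the regular readings keep a much wider spread after scaling when the outlier has been clipped first.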

class DataClipper(lower_percentile=0.01, upper_percentile=0.99, features_to_exclude=None, features_to_clip=None)

Bases: DataTransformer

Clip data to remove outliers.

Parameters:
  • lower_percentile (float) – The lower percentile for clipping, given as a fraction between 0 and 1 (default: 0.01).

  • upper_percentile (float) – The upper percentile for clipping, given as a fraction between 0 and 1 (default: 0.99).

  • features_to_exclude (List[str] | None) – Column names that should not be clipped.

  • features_to_clip (List[str] | None) – Column names that should be clipped (mutually exclusive with features_to_exclude).
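
The interplay of features_to_exclude and features_to_clip can be sketched as a small column-selection helper. This is not the library's code: `select_columns_to_clip` is a hypothetical name, and raising a `ValueError` when both options are set is an assumption about how "mutually exclusive" might be enforced.

```python
from typing import Iterable, List, Optional


def select_columns_to_clip(
    columns: Iterable[str],
    features_to_exclude: Optional[List[str]] = None,
    features_to_clip: Optional[List[str]] = None,
) -> List[str]:
    """Return the columns that would be clipped (hypothetical helper)."""
    # Assumption: setting both options at once is treated as an error.
    if features_to_exclude is not None and features_to_clip is not None:
        raise ValueError("Set only one of features_to_exclude and features_to_clip.")
    if features_to_clip is not None:
        # Allow-list: clip only the named columns that are present.
        return [c for c in columns if c in features_to_clip]
    # Deny-list (or nothing set): clip everything except the excluded columns.
    excluded = set(features_to_exclude or [])
    return [c for c in columns if c not in excluded]


cols = ["power", "wind_speed", "do_not_clip_this_feature"]
print(select_columns_to_clip(cols, features_to_exclude=["do_not_clip_this_feature"]))
print(select_columns_to_clip(cols, features_to_clip=["power"]))
```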

Configuration example:

train:
  data_clipping:
    lower_percentile: 0.001
    upper_percentile: 0.999
    features_to_exclude:
      - do_not_clip_this_feature

fit(x, y=None)

Sets the input and output feature names.

Return type:

DataClipper

get_feature_names_out(input_features=None)

Returns the list of feature names in the output.

Return type:

List[str]

inverse_transform(x)

Not implemented for the data clipper: clipped values cannot be restored, so an inverse is not useful.

Return type:

DataFrame
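
A short pandas sketch of why clipping has no meaningful inverse: two different inputs can map to the same clipped output, so the original values are not recoverable. The series and the fixed bounds here are made up for illustration.

```python
import pandas as pd

# Two inputs that differ only in how extreme the last value is.
a = pd.Series([1.0, 5.0, 20.0])
b = pd.Series([1.0, 5.0, 1000.0])

# After clipping to the same bounds, both become [1.0, 5.0, 10.0].
clipped_a = a.clip(lower=0.0, upper=10.0)
clipped_b = b.clip(lower=0.0, upper=10.0)

print(clipped_a.equals(clipped_b))  # True
```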

transform(x)

Clips the data to remove outliers, excluding angles.

Parameters:

x (pd.DataFrame) – The input DataFrame.

Returns:

The clipped DataFrame.

Return type:

pd.DataFrame
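
The clipping step can be sketched with plain pandas as follows. This is a minimal illustration, not the library's implementation: `clip_outliers` and the `exclude` argument are hypothetical names, the `wind_direction` column is a made-up stand-in for an angle feature, and computing the bounds from the frame passed in is an assumption.

```python
import pandas as pd


def clip_outliers(x, lower_percentile=0.01, upper_percentile=0.99, exclude=None):
    """Clip numeric columns of x to their own empirical quantiles (sketch)."""
    excluded = set(exclude or [])
    numeric = x.select_dtypes(include="number").columns
    cols = [c for c in numeric if c not in excluded]
    out = x.copy()  # leave the caller's frame untouched
    if not cols:
        return out
    # Per-column bounds, assumed here to come from the data being transformed.
    lower = out[cols].quantile(lower_percentile)
    upper = out[cols].quantile(upper_percentile)
    # axis=1 aligns the bound Series with the column labels.
    out[cols] = out[cols].clip(lower=lower, upper=upper, axis=1)
    return out


frame = pd.DataFrame({
    "power": [float(v) for v in range(101)],
    # Hypothetical angle column, excluded from clipping below.
    "wind_direction": [float(v % 360) for v in range(0, 404, 4)],
})
clipped = clip_outliers(frame, exclude=["wind_direction"])
print(clipped["power"].min(), clipped["power"].max())
```

Excluded columns pass through unchanged, and the input frame is copied rather than modified in place.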