energy_fault_detector.data_preprocessing.column_selector
- class ColumnSelector(max_nan_frac_per_col=0.05, features_to_exclude=None, features_to_select=None)
Bases:
DataTransformer
Class for selecting columns, using the provided list of features to exclude/drop and the fraction of NaNs.
- Parameters:
max_nan_frac_per_col (float) – maximum fraction of NaN values allowed per column. Defaults to 0.05. If the fraction exceeds max_nan_frac_per_col, the column is dropped.
features_to_exclude (Optional[List[str]]) – columns to drop (case-insensitive).
features_to_select (Optional[List[str]]) – columns to keep (case-insensitive). Mutually exclusive with features_to_exclude.
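The case-insensitive matching and the mutual exclusivity of features_to_exclude and features_to_select can be illustrated with a small pure-Python sketch. The helper name resolve_columns is hypothetical, and raising ValueError when both lists are given is an assumption; the library may handle that case differently.

```python
from typing import List, Optional


def resolve_columns(
    columns: List[str],
    features_to_exclude: Optional[List[str]] = None,
    features_to_select: Optional[List[str]] = None,
) -> List[str]:
    """Hypothetical helper: apply exclude/select lists case-insensitively."""
    if features_to_exclude is not None and features_to_select is not None:
        # Assumption: the real class may signal this conflict differently.
        raise ValueError(
            "features_to_exclude and features_to_select are mutually exclusive"
        )
    if features_to_select is not None:
        wanted = {name.lower() for name in features_to_select}
        # Keep the original column order and spelling.
        return [col for col in columns if col.lower() in wanted]
    if features_to_exclude is not None:
        unwanted = {name.lower() for name in features_to_exclude}
        return [col for col in columns if col.lower() not in unwanted]
    return list(columns)


columns = ["Power", "wind_speed", "Temp"]
kept_after_exclude = resolve_columns(columns, features_to_exclude=["POWER"])
kept_after_select = resolve_columns(columns, features_to_select=["temp", "power"])
```

In this sketch the returned names keep the spelling and order of the input columns; only the comparison is lowercased.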
- feature_names_in_
list of column names in input.
- n_features_in_
number of columns in input.
- feature_names_out_
list of column names to keep / selected.
- columns_dropped_
list of columns that were dropped.
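As a rough illustration of how the NaN-fraction rule could populate the attributes listed above, the following sketch computes them with plain pandas. It is not the library's implementation; the example data and the threshold of 0.3 are made up for the illustration.

```python
import numpy as np
import pandas as pd

# Made-up example data: one clean column, one with 25% NaN, one with 75% NaN.
df = pd.DataFrame({
    "power": [1.0, 2.0, 3.0, 4.0],
    "wind_speed": [5.0, np.nan, 7.0, 8.0],
    "temp": [np.nan, np.nan, np.nan, 1.0],
})
max_nan_frac_per_col = 0.3  # illustrative threshold, not the default of 0.05

# Fraction of NaN values per column.
nan_frac = df.isna().mean()

feature_names_in = list(df.columns)
n_features_in = len(feature_names_in)
# A column is dropped only if its NaN fraction exceeds the threshold.
feature_names_out = [
    col for col in feature_names_in if nan_frac[col] <= max_nan_frac_per_col
]
columns_dropped = [
    col for col in feature_names_in if col not in feature_names_out
]
```

Here wind_speed is kept because 0.25 does not exceed 0.3, while temp is dropped.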
- fit(x, y=None)
Find the columns to keep, based on the training data.
- Parameters:
x (DataFrame) – data to filter based on NaN fractions.
y (Optional[array]) – target variable, currently unused.
- Return type:
- get_feature_names_out(input_features=None)
Returns the list of feature names in the output.
- inverse_transform(x)
Inverse transform does nothing for the column selector, since dropped columns are not reconstructed.
- Return type:
DataFrame
- transform(x)
Drop columns from dataframe x.
- Return type:
DataFrame
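The fit / transform / inverse_transform flow described above can be sketched with a small stand-in class. SketchColumnSelector is hypothetical and only mimics the documented behaviour with pandas; it omits features_to_exclude and features_to_select, and having fit return self is an assumption borrowed from the usual scikit-learn convention.

```python
import numpy as np
import pandas as pd


class SketchColumnSelector:
    """Illustrative stand-in, not the library's ColumnSelector."""

    def __init__(self, max_nan_frac_per_col: float = 0.05):
        self.max_nan_frac_per_col = max_nan_frac_per_col

    def fit(self, x: pd.DataFrame, y=None) -> "SketchColumnSelector":
        # Decide which columns to keep from the NaN fractions in x.
        nan_frac = x.isna().mean()
        self.feature_names_in_ = list(x.columns)
        self.n_features_in_ = len(self.feature_names_in_)
        self.feature_names_out_ = [
            col for col in self.feature_names_in_
            if nan_frac[col] <= self.max_nan_frac_per_col
        ]
        self.columns_dropped_ = [
            col for col in self.feature_names_in_
            if col not in self.feature_names_out_
        ]
        return self  # assumption: scikit-learn style

    def transform(self, x: pd.DataFrame) -> pd.DataFrame:
        # Keep only the columns chosen during fit.
        return x[self.feature_names_out_]

    def inverse_transform(self, x: pd.DataFrame) -> pd.DataFrame:
        # Dropped columns cannot be reconstructed, so x is returned as-is.
        return x


train = pd.DataFrame({
    "a": [1.0, 2.0, 3.0, 4.0],
    "b": [np.nan, np.nan, 1.0, 2.0],  # 50% NaN in the training data
})
selector = SketchColumnSelector(max_nan_frac_per_col=0.25).fit(train)

# New data is filtered with the columns chosen on the training data,
# even though column "b" has no NaN values here.
new_data = pd.DataFrame({"a": [9.0], "b": [8.0]})
filtered = selector.transform(new_data)
restored = selector.inverse_transform(filtered)
```

The point of the sketch is that the column choice is fixed at fit time: transform applies the same selection to any later data, and inverse_transform does not bring dropped columns back.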