energy_fault_detector.data_splitting.data_splitter

class BlockDataSplitter(train_block_size=5040, val_block_size=1680)

Bases: object

Splits data into training and validation blocks. Blocks are assigned iteratively, block by block, over the data; their sizes are set by train_block_size and val_block_size.

Parameters:

train_block_size (int) – number of samples in one training block
val_block_size (int) – number of samples in one validation block

train_selection: array containing booleans which indicate which sample of the input x belongs to the training blocks.

val_selection: array containing booleans which indicate which sample of the input x belongs to the validation blocks.

Configuration example:

train:
  data_splitter:
    train_block_size: 5040
    val_block_size: 1680

split(x, y=None)

Constructs a boolean array indicating which samples belong to the training data. First, the number of full blocks (one training block plus one validation block) that fit into the data is determined. Each full block is then divided into a training block and a validation block. If a remainder too short for a full block is left at the end of the data, it is assigned to the training data. Once all training samples have been identified, the validation data is defined as everything that is not training data.

Parameters:

x (array) – data to split.
y (Optional[array]) – data labels (optional).

Return type:

Tuple
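
The block assignment described for split can be illustrated with a minimal standalone sketch. It is not the library's implementation: the function name block_split_masks is hypothetical, plain Python lists stand in for arrays, and it assumes that the training block comes first inside each full block, which the description above implies but does not state explicitly.

```python
def block_split_masks(n_samples, train_block_size=5040, val_block_size=1680):
    """Hypothetical helper: return (train_mask, val_mask) as lists of booleans."""
    if train_block_size < 1 or val_block_size < 1:
        raise ValueError("block sizes must be positive")

    full_block_size = train_block_size + val_block_size
    # Number of complete (training + validation) blocks that fit into the data.
    n_full_blocks = n_samples // full_block_size

    # Start with everything marked as training, so that any remainder at the
    # end of the data stays in the training set.
    train_mask = [True] * n_samples

    for block in range(n_full_blocks):
        # Assumption: the validation part is the tail of each full block.
        val_start = block * full_block_size + train_block_size
        for i in range(val_start, val_start + val_block_size):
            train_mask[i] = False

    # Validation data is everything that is not training data.
    val_mask = [not is_train for is_train in train_mask]
    return train_mask, val_mask


# Small example: 25 samples, training blocks of 7, validation blocks of 3.
train_mask, val_mask = block_split_masks(25, train_block_size=7, val_block_size=3)
val_indices = [i for i, is_val in enumerate(val_mask) if is_val]
print(val_indices)  # [7, 8, 9, 17, 18, 19]
```

In this example two full blocks of 10 samples fit into the data, and the 5 leftover samples (indices 20 to 24) are assigned to training, which matches the remainder rule described above.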