calpit

Package Contents

Classes

CalPit

class CalPit(model, input_dim=None, hidden_layers=None, **args)

fit(x_calib, y_calib=None, cde_calib=None, y_grid=None, pit_calib=None, oversample=1, n_cov_val=201, patience=20, n_epochs=1000, lr=0.001, weight_decay=1e-05, batch_size=2048, frac_train=0.9, lr_decay=0.99, trace_func=print, seed=299792458, num_workers=1, checkpt_path='_results/checkpoint_.pt')

Train the model using the calibration data.

Parameters:
  • x_calib (numpy.ndarray) – The input features for calibration data.

  • y_calib (numpy.ndarray, optional) – The target values for calibration data.

  • cde_calib (numpy.ndarray, optional) – The conditional density estimates for calibration.

  • y_grid (numpy.ndarray, optional) – The grid of target values for calibration.

  • pit_calib (numpy.ndarray, optional) – The probability integral transforms for the given CDEs evaluated at y_calib. Either pit_calib, or all of y_calib, cde_calib and y_grid, must be provided.

  • oversample (int, optional) – The oversampling factor for the training data, used to upsample the number of coverage values used for training. Default is 1.

  • n_cov_val (int, optional) – The number of coverage values to use for validation. Default is 201.

  • patience (int, optional) – The number of epochs to wait for improvement in validation loss before early stopping. Default is 20.

  • n_epochs (int, optional) – The maximum number of epochs for training. Default is 1000.

  • lr (float, optional) – The initial learning rate for the optimizer (AdamW). Default is 0.001.

  • weight_decay (float, optional) – The weight decay for the optimizer. Default is 1e-5.

  • batch_size (int, optional) – The batch size for training and validation. Default is 2048.

  • frac_train (float, optional) – The fraction of data to use for training. The rest forms the validation set used to determine when to stop training. Default is 0.9.

  • lr_decay (float, optional) – The learning rate decay factor for the rule that sets learning_rate as a function of epoch. Default is 0.99.

  • trace_func (function, optional) – The function used for printing training progress. Default is print.

  • seed (int, optional) – The random seed for reproducibility. Default is 299792458.

  • num_workers (int, optional) – The number of CPU worker threads for data loading. Default is 1.

  • checkpt_path (str, optional) – The path to save the checkpoint of the best model. Default is “_results/checkpoint_.pt”.
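The relationship between y_calib, cde_calib, y_grid and pit_calib can be illustrated with a small NumPy sketch. It assumes each row of cde_calib is a density tabulated on y_grid and that the PIT value is that row's CDF evaluated at the matching y_calib entry. The helper name pit_from_cde is hypothetical and is not part of this package.

```python
import numpy as np

def pit_from_cde(cde, y_grid, y_true):
    """Hypothetical helper: evaluate each row's CDF at its own observed target."""
    # Cumulative trapezoidal integral of each density along the grid.
    increments = 0.5 * (cde[:, 1:] + cde[:, :-1]) * np.diff(y_grid)
    cdf = np.concatenate(
        [np.zeros((cde.shape[0], 1)), np.cumsum(increments, axis=1)], axis=1
    )
    # Normalize so that every CDF ends at exactly 1.
    cdf = cdf / cdf[:, -1:]
    # Linearly interpolate each CDF at the corresponding target value.
    return np.array([np.interp(y, y_grid, row) for y, row in zip(y_true, cdf)])

# Toy calibration set: three identical standard normal densities.
y_grid = np.linspace(-5.0, 5.0, 501)
normal_pdf = np.exp(-0.5 * y_grid**2) / np.sqrt(2.0 * np.pi)
cde_calib = np.tile(normal_pdf, (3, 1))
y_calib = np.array([-1.0, 0.0, 1.0])

pit_calib = pit_from_cde(cde_calib, y_grid, y_calib)
```

Under these assumptions, a target at the median of its density gets a PIT value of 0.5, and targets in the lower or upper tail get values near 0 or 1.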

Returns:

The trained model.

Return type:

torch.nn.Module
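The roles of frac_train and patience can be sketched in plain NumPy. This is a generic illustration of a seeded train/validation split and patience-based early stopping, not the package's own training loop. The function names split_indices and early_stopping_epoch are hypothetical, and the real implementation may seed and shuffle differently.

```python
import numpy as np

def split_indices(n, frac_train=0.9, seed=299792458):
    """Hypothetical helper: shuffle n indices and split them into train/validation."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_train = int(round(frac_train * n))
    return perm[:n_train], perm[n_train:]

def early_stopping_epoch(val_losses, patience=20):
    """Hypothetical helper: return (stop_epoch, best_epoch) for a loss sequence."""
    best_loss, best_epoch, waited = np.inf, -1, 0
    for epoch, loss in enumerate(val_losses):
        if loss < best_loss:
            # Improvement: remember this epoch and reset the counter.
            best_loss, best_epoch, waited = loss, epoch, 0
        else:
            waited += 1
            if waited >= patience:
                return epoch, best_epoch
    # The loss sequence ended before patience ran out.
    return len(val_losses) - 1, best_epoch

train_idx, val_idx = split_indices(1000, frac_train=0.9)
stop_epoch, best_epoch = early_stopping_epoch(
    [1.0, 0.8, 0.7, 0.75, 0.72, 0.71, 0.9], patience=3
)
```

In this toy run the best validation loss occurs at epoch 2, and training stops at epoch 5 after three epochs without improvement. A checkpoint of the best model, as described for checkpt_path, would correspond to epoch 2.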

predict(x_test, cov_grid, batch_size=2048)

Predicts the conditional PIT values for the given test data and coverage grid.

Parameters:
  • x_test (numpy.ndarray) – The input features of the test data.

  • cov_grid (numpy.ndarray) – The coverage grid at which the PIT values are to be evaluated.

  • batch_size (int, optional) – The batch size for prediction. Defaults to 2048.

Returns:

The predicted conditional PIT values.

Return type:

numpy.ndarray
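To give a sense of what a PIT value on a coverage grid represents, the sketch below computes the unconditional analogue: the empirical fraction of PIT values at or below each coverage level. It assumes the conditional PIT value at coverage α is the probability that the PIT is at most α given the features. The 201-point grid only mirrors the n_cov_val default, and empirical_pit_cdf is a hypothetical helper.

```python
import numpy as np

def empirical_pit_cdf(pit, cov_grid):
    """Hypothetical helper: fraction of PIT values <= each coverage level."""
    return (pit[:, None] <= cov_grid[None, :]).mean(axis=0)

rng = np.random.default_rng(0)
cov_grid = np.linspace(0.0, 1.0, 201)

# Well-calibrated CDEs give uniformly distributed PIT values.
pit_good = rng.uniform(size=20000)
# Miscalibrated CDEs give non-uniform PIT values (Beta(2, 1) used as a toy case).
pit_biased = rng.beta(2.0, 1.0, size=20000)

cal_good = empirical_pit_cdf(pit_good, cov_grid)
cal_biased = empirical_pit_cdf(pit_biased, cov_grid)
```

For calibrated estimates the curve follows the diagonal, so cal_good stays close to cov_grid, while cal_biased falls below it.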

transform(x_test, cde_test, y_grid, batch_size=2048)

Transforms the input CDEs for a test data set to calibrated CDEs.

Parameters:
  • x_test (array-like) – The input features of the test data.

  • cde_test (array-like) – The initial CDEs for the test data that is to be transformed.

  • y_grid (array-like) – The grid of values for the CDEs.

  • batch_size (int, optional) – The batch size for prediction. Defaults to 2048.

Returns:

The transformed (calibrated) CDEs for the given test data.

Return type:

numpy.ndarray
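One way to picture a CDE transformation of this kind is sketched below. It assumes recalibration works by passing each initial CDF through a monotone map from [0, 1] to [0, 1] and differentiating the result on y_grid. The function recalibrate_cde and the maps used here are hypothetical. The package learns a feature-dependent map, and its actual implementation may differ.

```python
import numpy as np

def cumulative_trapezoid(f, y_grid):
    """Row-wise cumulative trapezoidal integral, starting at 0."""
    inc = 0.5 * (f[:, 1:] + f[:, :-1]) * np.diff(y_grid)
    return np.concatenate([np.zeros((f.shape[0], 1)), np.cumsum(inc, axis=1)], axis=1)

def recalibrate_cde(cde, y_grid, pit_cdf):
    """Hypothetical helper: remap each CDF with pit_cdf, then differentiate."""
    cdf = cumulative_trapezoid(cde, y_grid)
    cdf = cdf / cdf[:, -1:]              # force each CDF to end at 1
    new_cdf = pit_cdf(cdf)               # monotone map applied to the CDF values
    new_cde = np.gradient(new_cdf, y_grid, axis=1)
    new_cde = np.clip(new_cde, 0.0, None)  # guard against small negative values
    norm = cumulative_trapezoid(new_cde, y_grid)[:, -1:]
    return new_cde / norm                # renormalize to unit area

y_grid = np.linspace(-5.0, 5.0, 501)
normal_pdf = np.exp(-0.5 * y_grid**2) / np.sqrt(2.0 * np.pi)
cde_test = np.tile(normal_pdf, (2, 1))

# The identity map leaves the densities essentially unchanged.
cde_same = recalibrate_cde(cde_test, y_grid, lambda a: a)
# A toy non-identity map shifts probability mass toward larger y.
cde_new = recalibrate_cde(cde_test, y_grid, lambda a: a**2)
```

The output keeps the shape of cde_test, and each row still integrates to 1 over y_grid.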

abstract fit_transform(**args)

Fit the model and transform the data in one go.