calpit.diagnostics_and_calibration
Module Contents
Classes
- class CalPit(model, input_dim=None, hidden_layers=None, **args)
- fit(x_calib, y_calib=None, cde_calib=None, y_grid=None, pit_calib=None, oversample=1, n_cov_val=201, patience=20, n_epochs=1000, lr=0.001, weight_decay=1e-05, batch_size=2048, frac_train=0.9, lr_decay=0.99, trace_func=print, seed=299792458, num_workers=1, checkpt_path='_results/checkpoint_.pt')
Train the model using the calibration data.
- Parameters:
x_calib (numpy.ndarray) – The input features for calibration data.
y_calib (numpy.ndarray, optional) – The target values for calibration data.
cde_calib (numpy.ndarray, optional) – The conditional density estimates for calibration.
y_grid (numpy.ndarray, optional) – The grid of target values for calibration.
pit_calib (numpy.ndarray, optional) – The probability integral transforms for the given CDEs evaluated at y_calib. Either pit_calib, or y_calib, cde_calib and y_grid together, must be provided.
oversample (int, optional) – The oversampling factor for the training data, used to upsample the number of coverage values used for training. Default is 1.
n_cov_val (int, optional) – The number of coverage values to use for validation. Default is 201.
patience (int, optional) – The number of epochs to wait for improvement in validation loss before early stopping. Default is 20.
n_epochs (int, optional) – The maximum number of epochs for training. Default is 1000.
lr (float, optional) – The initial learning rate for the optimizer (AdamW). Default is 0.001.
weight_decay (float, optional) – The weight decay for the optimizer. Default is 1e-5.
batch_size (int, optional) – The batch size for training and validation. Default is 2048.
frac_train (float, optional) – The fraction of data to use for training. The rest is used for the validation set that determines when to stop training. Default is 0.9.
lr_decay (float, optional) – The learning rate decay factor for the rule that sets learning_rate as a function of epoch. Default is 0.99.
trace_func (function, optional) – The function used for printing training progress. Default is print.
seed (int, optional) – The random seed for reproducibility. Default is 299792458.
num_workers (int, optional) – The number of CPU worker threads for data loading. Default is 1.
checkpt_path (str, optional) – The path to save the checkpoint of the best model. Default is “_results/checkpoint_.pt”.
- Returns:
The trained model.
- Return type:
torch.nn.Module
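The pit_calib argument is described above as the CDEs' probability integral transforms evaluated at y_calib. As a rough illustration of what that quantity is, the sketch below integrates each density on y_grid and reads off its CDF at the observed target. The helper name cde_to_pit and the toy data are assumptions made for this example; they are not part of calpit, and the library may compute PIT values differently.

```python
import numpy as np

def cde_to_pit(cde, y_grid, y):
    """Hypothetical helper: evaluate each CDE's CDF at its observed target."""
    cde = np.asarray(cde, dtype=float)
    y_grid = np.asarray(y_grid, dtype=float)
    y = np.asarray(y, dtype=float)
    # Cumulative trapezoidal integral of each density along the grid.
    dy = np.diff(y_grid)
    areas = 0.5 * (cde[:, 1:] + cde[:, :-1]) * dy
    cdf = np.concatenate(
        [np.zeros((cde.shape[0], 1)), np.cumsum(areas, axis=1)], axis=1
    )
    # Normalize so that every CDF ends at exactly 1.
    cdf /= cdf[:, -1:]
    # Interpolate each row's CDF at the matching observed value.
    return np.array(
        [np.interp(y_i, y_grid, cdf_i) for y_i, cdf_i in zip(y, cdf)]
    )

# Toy data (assumed): uniform densities on [0, 1], so the PIT equals y itself.
y_grid = np.linspace(0.0, 1.0, 101)
cde_calib = np.ones((3, y_grid.size))
y_calib = np.array([0.1, 0.5, 0.9])
pit_calib = cde_to_pit(cde_calib, y_grid, y_calib)
```

An array like pit_calib here has one value per calibration object, which matches the alternative of passing pit_calib instead of y_calib, cde_calib and y_grid.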
- predict(x_test, cov_grid, batch_size=2048)
Predicts the conditional PIT values for the given test data and coverage grid.
- Parameters:
x_test (numpy.ndarray) – The input features of the test data.
cov_grid (numpy.ndarray) – The coverage grid at which the PIT values are to be evaluated.
batch_size (int, optional) – The batch size for prediction. Defaults to 2048.
- Returns:
The predicted conditional PIT values.
- Return type:
numpy.ndarray
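To give a feel for what PIT values on a coverage grid look like, the sketch below computes the unconditional analogue: the fraction of PIT values at or below each coverage level. This assumes that the conditional PIT values are the conditional counterpart of this quantity, which the text above does not spell out. The function empirical_pit_cdf and the simulated PIT samples are illustrative assumptions, not calpit API.

```python
import numpy as np

def empirical_pit_cdf(pit, cov_grid):
    """Hypothetical helper: fraction of PIT values <= each coverage level."""
    pit = np.asarray(pit, dtype=float)
    cov_grid = np.asarray(cov_grid, dtype=float)
    # Compare every PIT value against every coverage level, then average.
    return (pit[:, None] <= cov_grid[None, :]).mean(axis=0)

rng = np.random.default_rng(0)
cov_grid = np.linspace(0.05, 0.95, 19)

# Simulated PIT values (assumed): uniform for calibrated CDEs,
# skewed toward 1 for miscalibrated ones.
pit_uniform = rng.uniform(size=20000)
pit_biased = rng.beta(2.0, 1.0, size=20000)

cdf_uniform = empirical_pit_cdf(pit_uniform, cov_grid)
cdf_biased = empirical_pit_cdf(pit_biased, cov_grid)
```

For calibrated CDEs the PIT is uniformly distributed, so cdf_uniform tracks cov_grid closely, while cdf_biased falls below it. Departures of this kind are what a diagnostic on a coverage grid is meant to expose.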
- transform(x_test, cde_test, y_grid, batch_size=2048)
Transforms the input CDEs for a test data set to calibrated CDEs.
- Parameters:
x_test (array-like) – The input features of the test data.
cde_test (array-like) – The initial CDEs for the test data that is to be transformed.
y_grid (array-like) – The grid of values for the CDEs.
batch_size (int, optional) – The batch size for prediction. Defaults to 2048.
- Returns:
The calibrated CDEs for the given test data.
- Return type:
numpy.ndarray
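One general way to turn initial CDEs into calibrated ones is to map each CDE's CDF through a PIT distribution function and differentiate the result on y_grid. The sketch below illustrates that idea only; whether transform works this way internally is not stated here. The names recalibrate_cde and pit_cdf are hypothetical, and the quadratic map stands in for a learned function.

```python
import numpy as np

def recalibrate_cde(cde, y_grid, pit_cdf):
    """Hypothetical sketch: push each CDF through pit_cdf, then differentiate."""
    cde = np.asarray(cde, dtype=float)
    y_grid = np.asarray(y_grid, dtype=float)
    # Cumulative trapezoidal integral gives the CDF of each initial CDE.
    dy = np.diff(y_grid)
    areas = 0.5 * (cde[:, 1:] + cde[:, :-1]) * dy
    cdf = np.concatenate(
        [np.zeros((cde.shape[0], 1)), np.cumsum(areas, axis=1)], axis=1
    )
    cdf /= cdf[:, -1:]
    # Recalibrated CDF: the PIT distribution evaluated at the original CDF.
    new_cdf = pit_cdf(cdf)
    # Differentiate along the grid to get a density back; clip tiny negatives.
    new_cde = np.gradient(new_cdf, y_grid, axis=1)
    return np.clip(new_cde, 0.0, None)

# Toy data (assumed): a uniform CDE on [0, 1] and a stand-in PIT map a -> a**2.
y_grid = np.linspace(0.0, 1.0, 101)
cde_test = np.ones((1, y_grid.size))
cde_new = recalibrate_cde(cde_test, y_grid, lambda a: a ** 2)
```

With these toy inputs the recalibrated CDF is y squared, so cde_new is close to the density 2y, apart from small finite-difference error at the two grid edges.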