Output Contracts
This page documents the return types and output shapes for key public functions. Use it alongside the Input Data Contracts to understand the full data flow through each workflow.
Loss Forecasting
forecast_lifetime_loss
from cranalytics import forecast_lifetime_loss
loss = forecast_lifetime_loss(portfolio_df, transition_input)
Returns: float — total expected lifetime loss across the portfolio in dollar terms.
summarize_lifetime_loss
from cranalytics import summarize_lifetime_loss
summary = summarize_lifetime_loss(portfolio_df, transition_input)
Returns: dict[str, float]
| Key | Type | Description |
|---|---|---|
| total_portfolio_balance | float | Sum of all principal values |
| estimated_lifetime_loss | float | Total expected loss |
| reserve_ratio | float | estimated_lifetime_loss / total_portfolio_balance |
| lgd_assumption | float | The LGD value used in the calculation |
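The reserve_ratio definition in the table can be checked against the other keys. The following is a minimal sketch, assuming a summary dict that carries the documented keys. The numbers are made up for illustration, and `check_summary` is a hypothetical helper, not part of cranalytics.

```python
import math

# Illustrative values only; a real summary would come from summarize_lifetime_loss.
summary = {
    "total_portfolio_balance": 1_000_000.0,
    "estimated_lifetime_loss": 45_000.0,
    "reserve_ratio": 0.045,
    "lgd_assumption": 0.6,
}


def check_summary(summary: dict[str, float]) -> bool:
    """Return True if reserve_ratio equals loss / balance (hypothetical helper)."""
    balance = summary["total_portfolio_balance"]
    loss = summary["estimated_lifetime_loss"]
    return math.isclose(summary["reserve_ratio"], loss / balance, rel_tol=1e-9)


summary_is_consistent = check_summary(summary)
```

A check like this is useful in tests that consume the summary, since it catches a stale or hand-edited ratio.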
forecast_portfolio_states
from cranalytics import forecast_portfolio_states
states_df = forecast_portfolio_states(matrix, initial_states, n_periods=24)
Returns: pd.DataFrame
- Index: period, an integer from 0 (current) to n_periods inclusive
- Columns: One column per state in the transition matrix
| Column | Type | Description |
|---|---|---|
| (state names, e.g. Current, Delinquent, Charged Off) | float | Count or weight of loans in that state at each period |
Shape: (n_periods + 1, n_states)
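To see where the (n_periods + 1, n_states) shape comes from, here is a rough sketch of a Markov-style projection in plain Python. It assumes a row-stochastic matrix (row i holds the transition probabilities out of state i). The helper `propagate_states` and the matrix values are illustrative assumptions and say nothing about how cranalytics implements the forecast internally.

```python
def propagate_states(matrix, initial_states, n_periods):
    """Return one row of state weights per period, starting with period 0."""
    rows = [list(initial_states)]
    for _ in range(n_periods):
        current = rows[-1]
        n_states = len(current)
        # Row vector times matrix: weight flowing into state j from every state i.
        nxt = [
            sum(current[i] * matrix[i][j] for i in range(n_states))
            for j in range(n_states)
        ]
        rows.append(nxt)
    return rows


# Hypothetical three-state matrix: Current, Delinquent, Charged Off (absorbing).
matrix = [
    [0.95, 0.05, 0.00],
    [0.30, 0.50, 0.20],
    [0.00, 0.00, 1.00],
]
rows = propagate_states(matrix, [100.0, 0.0, 0.0], n_periods=24)
shape = (len(rows), len(rows[0]))
```

Period 0 is the initial state vector itself, which is why the result has n_periods + 1 rows rather than n_periods.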
Portfolio & Segmentation
segment_fico
from cranalytics import segment_fico
segmented_df = segment_fico(df)
Returns: pd.DataFrame — input DataFrame with two columns added.
| New Column | Type | Values |
|---|---|---|
| fico_band | str | <600, 600-649, 650-699, 700-749, 750-799, 800+ |
| risk_grade | int | 1 (lowest risk, 800+) → 6 (highest risk, <600) |
All original columns are preserved.
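The band labels imply the cut points. A minimal sketch of the mapping follows, assuming grades 2 through 5 follow the band order between the two documented endpoints (1 for 800+, 6 for <600). `fico_band_and_grade` is a hypothetical helper written for this page, not the library function.

```python
def fico_band_and_grade(fico: int) -> tuple[str, int]:
    """Map a FICO score to (fico_band, risk_grade) using the documented labels."""
    # (lower bound, band label, risk grade), checked from highest to lowest.
    bands = [
        (800, "800+", 1),
        (750, "750-799", 2),
        (700, "700-749", 3),
        (650, "650-699", 4),
        (600, "600-649", 5),
    ]
    for lower, label, grade in bands:
        if fico >= lower:
            return label, grade
    return "<600", 6


examples = {score: fico_band_and_grade(score) for score in (812, 705, 650, 599)}
```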
calculate_lgd
from cranalytics import calculate_lgd
lgd_series = calculate_lgd(loan_df)
Returns: pd.Series of float — one LGD value per loan, same index as input.
- Range: 0.0 – 1.0 (decimal, not percent)
- Formula: LGD = 1 − (collateral_value × (1 − haircut)) / principal
- Unsecured loans: LGD = 1.0
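The formula can be worked through for a single loan. This sketch assumes haircut is a decimal fraction. It clips the result to [0, 1] to match the documented range, which is an assumption about how out-of-range values are handled. The function name `lgd` and its defaults are illustrative only.

```python
def lgd(principal: float, collateral_value: float = 0.0, haircut: float = 0.0) -> float:
    """LGD = 1 - (collateral_value * (1 - haircut)) / principal, clipped to [0, 1]."""
    if principal <= 0:
        raise ValueError("principal must be positive")
    recovery = collateral_value * (1.0 - haircut)
    raw = 1.0 - recovery / principal
    # Clipping is an assumption that matches the documented 0.0 to 1.0 range.
    return min(1.0, max(0.0, raw))


secured = lgd(principal=20_000, collateral_value=15_000, haircut=0.2)
unsecured = lgd(principal=20_000)  # no collateral, so the formula yields 1.0
overcollateralized = lgd(principal=10_000, collateral_value=30_000, haircut=0.1)
```

With zero collateral the recovery term vanishes, which is consistent with the stated LGD of 1.0 for unsecured loans.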
Vintage Curve Fitting
run_vintage_analysis_session
from cranalytics import run_vintage_analysis_session
session = run_vintage_analysis_session(vintage_df, min_maturity_months=12)
Returns: VintageAnalysisSessionResult — mapping-compatible dataclass
| Attribute | Type | Description |
|---|---|---|
| triangle | pd.DataFrame | Vintage triangle used to detect maturity coverage |
| incomplete_vintages | list[str] | Vintage names flagged as incomplete |
| selected_vintage | pd.DataFrame | Long-format curve chosen for comparison and validation |
| validation_issues | pd.DataFrame | Issue table surfaced from the contract boundary |
| comparison_metrics | pd.DataFrame | Per-method metric table from smoothing comparison |
| rankings | pd.DataFrame | Ranked method scores, best first |
| validation_summary | pd.DataFrame | Temporal validation summary by smoothing method |
| summary_table | str | Human-readable validation summary table |
| tail_projections | pd.DataFrame | Projected rows for incomplete vintages when requested |
Compatibility note: Prefer attribute access such as session.rankings. Legacy
dict-style access such as session["rankings"] is still supported.
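One way a dataclass can support both access styles is to route item lookups to attributes. The sketch below is a hypothetical stand-in with two fields. It is not the actual VintageAnalysisSessionResult, whose implementation is not shown on this page.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SessionResultSketch:
    """Hypothetical mapping-compatible result; not the cranalytics class."""

    rankings: list = field(default_factory=list)
    incomplete_vintages: list = field(default_factory=list)

    def __getitem__(self, key: str):
        # Dict-style access delegates to the attribute of the same name.
        if key not in self.keys():
            raise KeyError(key)
        return getattr(self, key)

    def keys(self):
        return [f.name for f in fields(self)]


session = SessionResultSketch(rankings=["spline", "moving_average"])
same_object = session.rankings is session["rankings"]
as_dict = dict(session)  # works because keys() and __getitem__ are defined
```

Attribute access remains the better default because editors and type checkers can verify the attribute names.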
CurveFitter.predict / CurveFitter.forecast
from cranalytics import CurveFitter
fitter = CurveFitter(model="weibull")
fitter.fit(mob_array, loss_rate_array)
predictions = fitter.predict(future_mob_array)
Returns: np.ndarray — 1-D array of predicted cumulative loss rates.
- Shape: (n_samples,), matching the input future_mob_array
- Range: [0, fitter.ultimate_], clipped at the fitted ultimate loss
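To illustrate the shape and range guarantees, the sketch below evaluates a Weibull-style cumulative curve in plain Python. The parameterization ultimate × (1 − exp(−(t / scale)^shape)) is a common textbook form and is an assumption here. The parameter names and values are made up, and the form CurveFitter actually fits may differ.

```python
import math


def weibull_cumulative_loss(mob, ultimate, scale, shape):
    """Evaluate an assumed Weibull cumulative loss curve, clipped to [0, ultimate]."""
    preds = []
    for t in mob:
        t = max(float(t), 0.0)  # treat negative months on book as month 0
        raw = ultimate * (1.0 - math.exp(-((t / scale) ** shape)))
        preds.append(min(max(raw, 0.0), ultimate))
    return preds


future_mob = [0, 6, 12, 24, 60, 120]
predictions = weibull_cumulative_loss(future_mob, ultimate=0.08, scale=18.0, shape=1.4)
```

The output has one value per input month, starts at zero, and approaches the ultimate loss rate without exceeding it.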
Key fitted attributes after fit():
| Attribute | Type | Description |
|---|---|---|
| ultimate_ | float | Fitted ultimate (terminal) loss rate |
| params_ | np.ndarray | Raw curve parameters |
smooth_vintage
from cranalytics import smooth_vintage
result = smooth_vintage(vintage_df, method="moving_average", window=3)
Returns: SmoothedCurve dataclass
| Attribute | Type | Description |
|---|---|---|
| mob | np.ndarray | Months on book (same as input) |
| smoothed_values | np.ndarray | Smoothed cumulative loss rates |
| original_values | np.ndarray | Original (unsmoothed) cumulative loss rates |
| method_name | str | Human-readable method name and parameters |
| parameters | dict | Fitted parameters (method-specific) |
| fitted_model | Any \| None | Underlying model object (spline, isotonic, etc.) |
Key method: result.forecast(future_mob) → np.ndarray — extrapolate to future months.
Rollforward
fit_flow_hazard_curves
from cranalytics import fit_flow_hazard_curves
curves_df = fit_flow_hazard_curves(flow_df)
Returns: pd.DataFrame
| Column | Type | Description |
|---|---|---|
| segment_id | str | Segment identifier |
| month_on_book | int | Month on book |
| payment_hazard_rate | float | Monthly payment hazard (0.0–1.0) |
| chargeoff_hazard_rate | float | Monthly charge-off hazard (0.0–1.0) |
Constraint: payment_hazard_rate + chargeoff_hazard_rate ≤ 1.0 in every row.
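The constraint is easy to assert on any table of hazard curves. Here is a small sketch over plain dict rows. The helper `find_hazard_violations`, the tolerance, and the sample rows are illustrative assumptions.

```python
def find_hazard_violations(rows, tol=1e-9):
    """Return rows whose hazards fall outside [0, 1] or sum to more than 1."""
    bad = []
    for row in rows:
        p = row["payment_hazard_rate"]
        c = row["chargeoff_hazard_rate"]
        in_range = 0.0 <= p <= 1.0 and 0.0 <= c <= 1.0
        # The small tolerance absorbs floating-point noise in the sum.
        if not in_range or p + c > 1.0 + tol:
            bad.append(row)
    return bad


# Made-up rows: the second one deliberately breaks the constraint.
curves = [
    {"segment_id": "A", "month_on_book": 1,
     "payment_hazard_rate": 0.04, "chargeoff_hazard_rate": 0.01},
    {"segment_id": "A", "month_on_book": 2,
     "payment_hazard_rate": 0.70, "chargeoff_hazard_rate": 0.40},
]
violations = find_hazard_violations(curves)
```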
forecast_balance_flows
from cranalytics import forecast_balance_flows
forecast_df = forecast_balance_flows(known_actuals, hazard_curves, max_month=36)
Returns: pd.DataFrame
| Column | Type | Description |
|---|---|---|
| segment_id | str | Segment identifier |
| month_on_book | int | Month on book |
| outstanding_balance | float | Balance at start of period ($) |
| payments | float | Payments received in period ($) |
| chargeoffs | float | Charge-offs in period ($) |
| payment_hazard_rate | float | Hazard rate used |
| chargeoff_hazard_rate | float | Hazard rate used |
| forecast_flag | str | "Actual" or "Forecast" |
| outstanding_balance_ratio | float | outstanding_balance / amtloan |
| payments_ratio | float | payments / amtloan |
| chargeoffs_ratio | float | chargeoffs / amtloan |
Shape: One row per segment per month from 0 to max_month.
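The dollar and ratio columns relate to each other roughly as sketched below for one segment. The sketch assumes each hazard rate applies to the start-of-period balance and that the next balance is the current balance minus payments and charge-offs. This is a plausible reading of the column descriptions, not a confirmed description of the library's mechanics. `roll_forward` is a hypothetical helper, and amtloan is taken to be the original loan amount.

```python
def roll_forward(start_balance, amtloan, hazards):
    """hazards: list of (payment_hazard_rate, chargeoff_hazard_rate), one per month."""
    rows = []
    balance = start_balance
    for mob, (pay_h, co_h) in enumerate(hazards):
        payments = balance * pay_h      # assumed: hazard applies to start balance
        chargeoffs = balance * co_h
        rows.append({
            "month_on_book": mob,
            "outstanding_balance": balance,
            "payments": payments,
            "chargeoffs": chargeoffs,
            "outstanding_balance_ratio": balance / amtloan,
            "payments_ratio": payments / amtloan,
            "chargeoffs_ratio": chargeoffs / amtloan,
        })
        # Balance carried into the next month on book.
        balance = balance - payments - chargeoffs
    return rows


forecast_rows = roll_forward(1000.0, amtloan=1000.0, hazards=[(0.10, 0.02)] * 3)
```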
run_rollforward_workflow
from pathlib import Path
from cranalytics import run_rollforward_workflow
result = run_rollforward_workflow(flow_df, output_dir=Path("./out"))
Returns: RollforwardWorkflowResult dataclass (frozen)
| Attribute | Type | Description |
|---|---|---|
| status | str | "ok", "data_issues", or "insufficient_data" |
| output_dir | Path | Directory where artifacts were written |
| champion | str | Name of selected champion variant |
| challengers | list[str] | Names of challenger variants |
| promotion_reason | str | Human-readable explanation for champion selection |
| run_metadata | dict | Portfolio KPIs: total_outstanding_balance, total_chargeoffs_last_month, n_segments, max_mob, champion_variant, data_issue_count |
Side effects — files written to output_dir: See CLI Reference — rollforward-workflow output files.
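Callers will usually branch on status before touching the artifacts. The sketch below uses a hypothetical frozen dataclass with a subset of the documented attributes. The message attached to each status is an assumption about its meaning, inferred only from the status names.

```python
from dataclasses import FrozenInstanceError, dataclass
from pathlib import Path


@dataclass(frozen=True)
class WorkflowResultSketch:
    """Hypothetical stand-in with a subset of the documented attributes."""

    status: str
    output_dir: Path
    champion: str


def describe(result: WorkflowResultSketch) -> str:
    if result.status == "ok":
        return f"champion {result.champion}; artifacts in {result.output_dir}"
    if result.status == "data_issues":
        return "data issues reported; review before using the forecast"
    if result.status == "insufficient_data":
        return "not enough data to run the workflow"
    raise ValueError(f"unexpected status: {result.status!r}")


result = WorkflowResultSketch(status="ok", output_dir=Path("out"), champion="baseline")
message = describe(result)

try:
    result.status = "data_issues"  # frozen dataclasses reject attribute assignment
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because the result is frozen, code that needs a modified copy would use dataclasses.replace rather than assigning to attributes.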
Predictive Modeling
run_predictive_modeling_session
from cranalytics import run_predictive_modeling_session
session = run_predictive_modeling_session(
    df,
    feature_cols=feature_cols,
    target_col="fpf30_flag",
    split_col="origination_month",
    model_family="logistic",
)
Returns: PredictiveModelingSessionResult — mapping-compatible dataclass
| Attribute | Type | Description |
|---|---|---|
| estimator | sklearn-compatible | Fitted estimator trained on the validated session frame |
| training_metadata | dict | Training configuration and target metadata |
| training_diagnostics | pd.DataFrame | In-sample diagnostics for the trained estimator |
| validation_issues | pd.DataFrame | Issue table surfaced from the predictive contract |
| scored_data | pd.DataFrame | Scored DataFrame with the requested output column |
| backtest | pd.DataFrame | Fold-level temporal backtest output |
| backtest_summary | pd.DataFrame | Aggregate backtest summary |
Compatibility note: Prefer attribute access such as session.backtest. Legacy
dict-style access such as session["backtest"] is still supported.
train_binary_model
from cranalytics import train_binary_model
estimator, metadata, diagnostics = train_binary_model(df, feature_cols, target_col="fpf30_flag")
Returns: tuple[estimator, dict, pd.DataFrame]
| Element | Type | Description |
|---|---|---|
| estimator | sklearn-compatible | Fitted model object with .predict_proba() |
| metadata | dict | model_family, n_train, n_features, target_col, feature_cols, train_date |
| diagnostics | pd.DataFrame | feature, importance: one row per feature, sorted by importance |
score_model
from cranalytics import score_model
scored_df = score_model(df, estimator, feature_cols, output_col="pd_score")
Returns: pd.DataFrame — input DataFrame with one column appended.
| New Column | Type | Description |
|---|---|---|
| (user-specified output_col) | float | Predicted probability (0.0–1.0) or class label depending on prediction_type |
run_predictive_backtest / summarize_predictive_backtest
from cranalytics import run_predictive_backtest, summarize_predictive_backtest
results_df = run_predictive_backtest(df, feature_cols, target_col, split_col)
summary_df = summarize_predictive_backtest(results_df)
run_predictive_backtest returns: pd.DataFrame — one row per temporal fold
| Column | Type | Description |
|---|---|---|
| split | str | Temporal split label |
| n_train | int | Training set size |
| n_test | int | Test set size |
| auc | float | ROC-AUC on test fold |
| ks | float | KS statistic |
| gini | float | Gini coefficient |
summarize_predictive_backtest returns: pd.DataFrame — aggregate across all folds
| Column | Type | Description |
|---|---|---|
| metric | str | auc, ks, gini |
| mean | float | Mean across folds |
| std | float | Standard deviation across folds |
| min | float | Minimum across folds |
| max | float | Maximum across folds |
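The fold-level table rolls up into the summary roughly as follows. This sketch works on plain dicts with invented metric values. It uses the sample standard deviation (n − 1 denominator), which is an assumption because the page does not say which convention the library applies. `summarize_folds` is a hypothetical helper.

```python
import statistics

# Invented fold results for illustration only.
folds = [
    {"split": "2023-01", "auc": 0.70, "ks": 0.30, "gini": 0.40},
    {"split": "2023-02", "auc": 0.72, "ks": 0.34, "gini": 0.44},
    {"split": "2023-03", "auc": 0.74, "ks": 0.32, "gini": 0.48},
]


def summarize_folds(folds, metrics=("auc", "ks", "gini")):
    """Build one summary row per metric with mean, std, min, and max."""
    rows = []
    for metric in metrics:
        values = [fold[metric] for fold in folds]
        rows.append({
            "metric": metric,
            "mean": statistics.mean(values),
            # Sample standard deviation; the library's convention is an assumption.
            "std": statistics.stdev(values),
            "min": min(values),
            "max": max(values),
        })
    return rows


fold_summary = summarize_folds(folds)
```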
Score Monitoring
compute_psi
from cranalytics import compute_psi
psi_table, total_psi = compute_psi(expected_scores, actual_scores)
Returns: tuple[pd.DataFrame, float]
DataFrame (PSI table) columns:
| Column | Type | Description |
|---|---|---|
| bin | str | Bin interval, e.g. "(0.1, 0.2]" |
| expected_pct | float | Fraction of expected population in this bin |
| actual_pct | float | Fraction of actual population in this bin |
| psi_contribution | float | Per-bin PSI: (actual − expected) × ln(actual / expected) |
Scalar (total PSI): Sum of all psi_contribution values.
| PSI range | Interpretation |
|---|---|
| < 0.10 | Distribution is stable |
| 0.10 – 0.25 | Moderate shift — investigate |
| > 0.25 | Significant shift — model may need recalibration |
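The per-bin formula and the interpretation bands can be worked through on already-binned fractions. This sketch assumes the bin fractions are given. Flooring empty bins at a small eps so the logarithm is defined is an assumed choice, and both helper names are hypothetical.

```python
import math


def psi_from_fractions(expected_pct, actual_pct, eps=1e-6):
    """Return (per-bin contributions, total PSI) from bin fractions."""
    contributions = []
    for e, a in zip(expected_pct, actual_pct):
        # Floor empty bins at eps so the log is defined; eps is an assumed choice.
        e, a = max(e, eps), max(a, eps)
        contributions.append((a - e) * math.log(a / e))
    return contributions, sum(contributions)


def interpret_psi(total_psi):
    """Apply the interpretation bands from the table above."""
    if total_psi < 0.10:
        return "stable"
    if total_psi <= 0.25:
        return "moderate shift"
    return "significant shift"


expected = [0.25, 0.25, 0.25, 0.25]
actual = [0.20, 0.25, 0.25, 0.30]
contributions, total_psi = psi_from_fractions(expected, actual)
label = interpret_psi(total_psi)
```

Each contribution is non-negative because (actual − expected) and ln(actual / expected) always share the same sign, so total PSI is zero only when the two distributions match bin for bin.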
score_performance_monitoring_report
from cranalytics import score_performance_monitoring_report
report = score_performance_monitoring_report(scored_df, ...)
Returns: dict with keys:
| Key | Type | Description |
|---|---|---|
| psi_table | pd.DataFrame | PSI table (see compute_psi above) |
| total_psi | float | Total PSI scalar |
| ae_table | pd.DataFrame | Actual vs expected by score band |
| calibration_table | pd.DataFrame | Observed vs predicted event rates by bin |
| summary | dict | psi, mean_ae, calibration_slope, assessment string |
See also
- Input Data Contracts — what each function expects as input
- API Reference — full function signatures and docstrings