
Output Contracts

This page documents the return types and output shapes for key public functions. Use it alongside the Input Data Contracts to understand the full data flow through each workflow.


Loss Forecasting

forecast_lifetime_loss

from cranalytics import forecast_lifetime_loss

loss = forecast_lifetime_loss(portfolio_df, transition_input)

Returns: float — total expected lifetime loss across the portfolio in dollar terms.


summarize_lifetime_loss

from cranalytics import summarize_lifetime_loss

summary = summarize_lifetime_loss(portfolio_df, transition_input)

Returns: dict[str, float]

| Key | Type | Description |
| --- | --- | --- |
| total_portfolio_balance | float | Sum of all principal values |
| estimated_lifetime_loss | float | Total expected loss |
| reserve_ratio | float | estimated_lifetime_loss / total_portfolio_balance |
| lgd_assumption | float | The LGD value used in the calculation |
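As a quick sanity check on the returned dictionary, the documented relationship between the keys can be recomputed by hand. The sketch below uses a hypothetical, hand-written `summary` dict (the values are made up for illustration and do not come from the library) and verifies that `reserve_ratio` matches its definition.

```python
# Hypothetical values shaped like the documented return dict; not real library output.
summary = {
    "total_portfolio_balance": 1_000_000.0,
    "estimated_lifetime_loss": 25_000.0,
    "reserve_ratio": 0.025,
    "lgd_assumption": 0.6,
}

# reserve_ratio is documented as estimated_lifetime_loss / total_portfolio_balance.
recomputed = summary["estimated_lifetime_loss"] / summary["total_portfolio_balance"]
ratio_is_consistent = abs(recomputed - summary["reserve_ratio"]) < 1e-9

print(f"reserve ratio: {recomputed:.2%}, consistent: {ratio_is_consistent}")
```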

forecast_portfolio_states

from cranalytics import forecast_portfolio_states

states_df = forecast_portfolio_states(matrix, initial_states, n_periods=24)

Returns: pd.DataFrame

  • Index: period — integer from 0 (current) to n_periods inclusive
  • Columns: One column per state in the transition matrix
| Column | Type | Description |
| --- | --- | --- |
| (state names, e.g. Current, Delinquent, Charged Off) | float | Count or weight of loans in that state at each period |

Shape: (n_periods + 1, n_states)
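To make the documented shape concrete, here is a minimal sketch of a Markov-chain projection that produces a frame with the same layout. It is not the library's implementation: the transition matrix, the initial counts, and the assumption that rows are "from" states and columns are "to" states are all illustrative.

```python
import numpy as np
import pandas as pd

states = ["Current", "Delinquent", "Charged Off"]

# Hypothetical row-stochastic matrix: row = state this period, column = state next period.
matrix = np.array([
    [0.95, 0.04, 0.01],
    [0.30, 0.50, 0.20],
    [0.00, 0.00, 1.00],  # Charged Off is absorbing
])
initial_states = np.array([1000.0, 50.0, 0.0])
n_periods = 24

# Period 0 is the current distribution; each later period applies the matrix once more.
rows = [initial_states]
for _ in range(n_periods):
    rows.append(rows[-1] @ matrix)

states_df = pd.DataFrame(np.vstack(rows), columns=states)
states_df.index.name = "period"

print(states_df.shape)  # (n_periods + 1, n_states)
```

Because every row of the matrix sums to 1, the total count across states stays constant from period to period, which is a convenient check on any projection of this kind.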


Portfolio & Segmentation

segment_fico

from cranalytics import segment_fico

segmented_df = segment_fico(df)

Returns: pd.DataFrame — input DataFrame with two columns added.

| New Column | Type | Values |
| --- | --- | --- |
| fico_band | str | <600, 600-649, 650-699, 700-749, 750-799, 800+ |
| risk_grade | int | 1 (lowest risk, 800+) → 6 (highest risk, <600) |

All original columns are preserved.
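The band and grade mapping above can be reproduced with `pandas.cut`. The following is a sketch under stated assumptions rather than the library's code: the input column name `fico` and the sample rows are hypothetical, and only the band labels and grade ordering come from the table.

```python
import numpy as np
import pandas as pd

# Hypothetical input; the score column name "fico" is an assumption.
df = pd.DataFrame({"loan_id": [1, 2, 3, 4], "fico": [580, 649, 700, 812]})

edges = [-np.inf, 600, 650, 700, 750, 800, np.inf]
labels = ["<600", "600-649", "650-699", "700-749", "750-799", "800+"]
# Grade 6 is the riskiest band (<600), grade 1 the safest (800+).
grade_by_band = {label: grade for grade, label in zip(range(6, 0, -1), labels)}

segmented = df.copy()  # original columns are preserved
# right=False makes each bin include its lower edge, so 600 falls in "600-649".
segmented["fico_band"] = pd.cut(
    segmented["fico"], bins=edges, labels=labels, right=False
).astype(str)
segmented["risk_grade"] = segmented["fico_band"].map(grade_by_band).astype(int)

print(segmented)
```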


calculate_lgd

from cranalytics import calculate_lgd

lgd_series = calculate_lgd(loan_df)

Returns: pd.Series of float — one LGD value per loan, same index as input.

  • Range: 0.0 – 1.0 (decimal, not percent)
  • Formula: LGD = 1 − (collateral_value × (1 − haircut)) / principal
  • Unsecured loans: LGD = 1.0
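The formula and range above can be checked on a few rows with plain pandas. This sketch assumes the input columns are named `principal`, `collateral_value`, and `haircut` (taken from the formula), that an unsecured loan is represented by a collateral value of 0, and that the 0.0 to 1.0 range is enforced by clipping. None of these details are confirmed by this page.

```python
import pandas as pd

# Hypothetical loans; column names follow the formula's terms.
loan_df = pd.DataFrame(
    {
        "principal": [10_000.0, 20_000.0, 5_000.0],
        "collateral_value": [8_000.0, 30_000.0, 0.0],  # 0.0 stands in for "unsecured"
        "haircut": [0.25, 0.20, 0.0],
    },
    index=["A", "B", "C"],
)

# LGD = 1 - (collateral_value * (1 - haircut)) / principal
recovery = loan_df["collateral_value"] * (1 - loan_df["haircut"])
# Clipping is an assumption about how the documented range is enforced.
lgd = (1 - recovery / loan_df["principal"]).clip(lower=0.0, upper=1.0)

print(lgd)
```

Loan A recovers 6,000 on 10,000 of principal (LGD 0.4), loan B is over-collateralized so the raw value is negative and clips to 0.0, and loan C has no collateral so its LGD is 1.0.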

Vintage Curve Fitting

run_vintage_analysis_session

from cranalytics import run_vintage_analysis_session

session = run_vintage_analysis_session(vintage_df, min_maturity_months=12)

Returns: VintageAnalysisSessionResult — mapping-compatible dataclass

| Attribute | Type | Description |
| --- | --- | --- |
| triangle | pd.DataFrame | Vintage triangle used to detect maturity coverage |
| incomplete_vintages | list[str] | Vintage names flagged as incomplete |
| selected_vintage | pd.DataFrame | Long-format curve chosen for comparison and validation |
| validation_issues | pd.DataFrame | Issue table surfaced from the contract boundary |
| comparison_metrics | pd.DataFrame | Per-method metric table from smoothing comparison |
| rankings | pd.DataFrame | Ranked method scores, best first |
| validation_summary | pd.DataFrame | Temporal validation summary by smoothing method |
| summary_table | str | Human-readable validation summary table |
| tail_projections | pd.DataFrame | Projected rows for incomplete vintages when requested |

Compatibility note: Prefer attribute access such as session.rankings. Legacy dict-style access such as session["rankings"] is still supported.
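For readers unfamiliar with the "mapping-compatible dataclass" pattern, the sketch below shows one common way a dataclass can support both access styles. `SessionResultSketch` is a hypothetical stand-in written for this example. It is not the library's `VintageAnalysisSessionResult`, and the real class may be implemented differently.

```python
from dataclasses import dataclass, field, fields


@dataclass
class SessionResultSketch:
    """Hypothetical stand-in illustrating attribute plus dict-style access."""

    rankings: list = field(default_factory=list)
    incomplete_vintages: list = field(default_factory=list)

    def __getitem__(self, key):
        # Dict-style access delegates to the attribute of the same name.
        if key not in self.keys():
            raise KeyError(key)
        return getattr(self, key)

    def keys(self):
        return [f.name for f in fields(self)]


session = SessionResultSketch(
    rankings=["spline", "moving_average"], incomplete_vintages=["2024-03"]
)

# Both styles return the very same object.
same_object = session.rankings is session["rankings"]
print(same_object, session.keys())
```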


CurveFitter.predict / CurveFitter.forecast

from cranalytics import CurveFitter

fitter = CurveFitter(model="weibull")
fitter.fit(mob_array, loss_rate_array)
predictions = fitter.predict(future_mob_array)

Returns: np.ndarray — 1-D array of predicted cumulative loss rates.

  • Shape: (n_samples,) matching input future_mob_array
  • Range: [0, fitter.ultimate_] — clipped at fitted ultimate loss
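The shape and range guarantees can be illustrated with a hand-rolled curve. The sketch below assumes a Weibull-type cumulative form, `ultimate * (1 - exp(-(mob / scale) ** shape))`, with made-up parameter values. The parameterization actually used by `CurveFitter` is not stated on this page, so treat this only as an example of what "clipped at fitted ultimate loss" means.

```python
import numpy as np

# Hypothetical fitted values; the real params_ layout is not documented here.
ultimate, scale, shape = 0.08, 18.0, 1.5

future_mob = np.array([0, 6, 12, 24, 60, 120], dtype=float)

# Assumed Weibull-type cumulative loss curve that saturates at `ultimate`.
raw = ultimate * (1.0 - np.exp(-((future_mob / scale) ** shape)))

# Clip so predictions stay within [0, ultimate].
predictions = np.clip(raw, 0.0, ultimate)

print(predictions.shape, predictions.round(4))
```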

Key fitted attributes after fit():

| Attribute | Type | Description |
| --- | --- | --- |
| ultimate_ | float | Fitted ultimate (terminal) loss rate |
| params_ | np.ndarray | Raw curve parameters |

smooth_vintage

from cranalytics import smooth_vintage

result = smooth_vintage(vintage_df, method="moving_average", window=3)

Returns: SmoothedCurve dataclass

| Attribute | Type | Description |
| --- | --- | --- |
| mob | np.ndarray | Months on book (same as input) |
| smoothed_values | np.ndarray | Smoothed cumulative loss rates |
| original_values | np.ndarray | Original (unsmoothed) cumulative loss rates |
| method_name | str | Human-readable method name and parameters |
| parameters | dict | Fitted parameters (method-specific) |
| fitted_model | Any \| None | Underlying model object (spline, isotonic, etc.) |

Key method: result.forecast(future_mob) → np.ndarray, which extrapolates the smoothed curve to future months.


Rollforward

fit_flow_hazard_curves

from cranalytics import fit_flow_hazard_curves

curves_df = fit_flow_hazard_curves(flow_df)

Returns: pd.DataFrame

| Column | Type | Description |
| --- | --- | --- |
| segment_id | str | Segment identifier |
| month_on_book | int | Month on book |
| payment_hazard_rate | float | Monthly payment hazard (0.0–1.0) |
| chargeoff_hazard_rate | float | Monthly charge-off hazard (0.0–1.0) |

Constraint: payment_hazard_rate + chargeoff_hazard_rate ≤ 1.0 in every row.
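A downstream consumer may want to assert this constraint before using the curves. The sketch below builds a small hypothetical frame with the documented columns, including one deliberately invalid row, and shows a check that flags it. The data and the floating-point tolerance are illustrative choices.

```python
import pandas as pd

# Hypothetical curves; the last row intentionally violates the constraint.
curves_df = pd.DataFrame(
    {
        "segment_id": ["A", "A", "B"],
        "month_on_book": [1, 2, 1],
        "payment_hazard_rate": [0.05, 0.06, 0.10],
        "chargeoff_hazard_rate": [0.01, 0.02, 0.95],
    }
)

total_hazard = curves_df["payment_hazard_rate"] + curves_df["chargeoff_hazard_rate"]

# Small tolerance so rows that sum to exactly 1.0 are not flagged by rounding noise.
violations = curves_df[total_hazard > 1.0 + 1e-12]

print(violations[["segment_id", "month_on_book"]])
```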


forecast_balance_flows

from cranalytics import forecast_balance_flows

forecast_df = forecast_balance_flows(known_actuals, hazard_curves, max_month=36)

Returns: pd.DataFrame

| Column | Type | Description |
| --- | --- | --- |
| segment_id | str | Segment identifier |
| month_on_book | int | Month on book |
| outstanding_balance | float | Balance at start of period ($) |
| payments | float | Payments received in period ($) |
| chargeoffs | float | Charge-offs in period ($) |
| payment_hazard_rate | float | Hazard rate used |
| chargeoff_hazard_rate | float | Hazard rate used |
| forecast_flag | str | "Actual" or "Forecast" |
| outstanding_balance_ratio | float | outstanding_balance / amtloan |
| payments_ratio | float | payments / amtloan |
| chargeoffs_ratio | float | chargeoffs / amtloan |

Shape: One row per segment per month from 0 to max_month.
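To show how the balance, flow, and ratio columns relate, here is a single-segment roll-forward sketch in plain Python. It assumes that each period's payments and charge-offs equal the hazard rate times the start-of-period balance and that the next balance is what remains. That mechanic, the flat hazard rates, and the loan amount are assumptions for illustration, not a description of the library's internals.

```python
amtloan = 100_000.0  # hypothetical original loan amount for segment "A"
payment_hazard, chargeoff_hazard = 0.05, 0.01  # flat hazards, for illustration only
max_month = 36

rows = []
balance = amtloan
for mob in range(max_month + 1):  # months 0..max_month inclusive
    # Assumed mechanic: flows are hazard rate times start-of-period balance.
    payments = balance * payment_hazard
    chargeoffs = balance * chargeoff_hazard
    rows.append(
        {
            "segment_id": "A",
            "month_on_book": mob,
            "outstanding_balance": balance,
            "payments": payments,
            "chargeoffs": chargeoffs,
            "forecast_flag": "Forecast",
            "outstanding_balance_ratio": balance / amtloan,
            "payments_ratio": payments / amtloan,
            "chargeoffs_ratio": chargeoffs / amtloan,
        }
    )
    balance -= payments + chargeoffs

print(len(rows), rows[1]["outstanding_balance"])
```

With one segment and `max_month=36`, the sketch yields 37 rows, matching the "one row per segment per month from 0 to max_month" shape.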


run_rollforward_workflow

from pathlib import Path

from cranalytics import run_rollforward_workflow

result = run_rollforward_workflow(flow_df, output_dir=Path("./out"))

Returns: RollforwardWorkflowResult dataclass (frozen)

| Attribute | Type | Description |
| --- | --- | --- |
| status | str | "ok", "data_issues", or "insufficient_data" |
| output_dir | Path | Directory where artifacts were written |
| champion | str | Name of selected champion variant |
| challengers | list[str] | Names of challenger variants |
| promotion_reason | str | Human-readable explanation for champion selection |
| run_metadata | dict | Portfolio KPIs: total_outstanding_balance, total_chargeoffs_last_month, n_segments, max_mob, champion_variant, data_issue_count |

Side effects — files written to output_dir: See CLI Reference — rollforward-workflow output files.


Predictive Modeling

run_predictive_modeling_session

from cranalytics import run_predictive_modeling_session

session = run_predictive_modeling_session(
    df,
    feature_cols=feature_cols,
    target_col="fpf30_flag",
    split_col="origination_month",
    model_family="logistic",
)

Returns: PredictiveModelingSessionResult — mapping-compatible dataclass

| Attribute | Type | Description |
| --- | --- | --- |
| estimator | sklearn-compatible | Fitted estimator trained on the validated session frame |
| training_metadata | dict | Training configuration and target metadata |
| training_diagnostics | pd.DataFrame | In-sample diagnostics for the trained estimator |
| validation_issues | pd.DataFrame | Issue table surfaced from the predictive contract |
| scored_data | pd.DataFrame | Scored DataFrame with the requested output column |
| backtest | pd.DataFrame | Fold-level temporal backtest output |
| backtest_summary | pd.DataFrame | Aggregate backtest summary |

Compatibility note: Prefer attribute access such as session.backtest. Legacy dict-style access such as session["backtest"] is still supported.


train_binary_model

from cranalytics import train_binary_model

estimator, metadata, diagnostics = train_binary_model(df, feature_cols, target_col="fpf30_flag")

Returns: tuple[estimator, dict, pd.DataFrame]

| Element | Type | Description |
| --- | --- | --- |
| estimator | sklearn-compatible | Fitted model object with .predict_proba() |
| metadata | dict | model_family, n_train, n_features, target_col, feature_cols, train_date |
| diagnostics | pd.DataFrame | feature, importance: one row per feature, sorted by importance |

score_model

from cranalytics import score_model

scored_df = score_model(df, estimator, feature_cols, output_col="pd_score")

Returns: pd.DataFrame — input DataFrame with one column appended.

| New Column | Type | Description |
| --- | --- | --- |
| (user-specified output_col) | float | Predicted probability (0.0–1.0) or class label depending on prediction_type |

run_predictive_backtest / summarize_predictive_backtest

from cranalytics import run_predictive_backtest, summarize_predictive_backtest

results_df = run_predictive_backtest(df, feature_cols, target_col, split_col)
summary_df = summarize_predictive_backtest(results_df)

run_predictive_backtest returns: pd.DataFrame — one row per temporal fold

| Column | Type | Description |
| --- | --- | --- |
| split | str | Temporal split label |
| n_train | int | Training set size |
| n_test | int | Test set size |
| auc | float | ROC-AUC on test fold |
| ks | float | KS statistic |
| gini | float | Gini coefficient |

summarize_predictive_backtest returns: pd.DataFrame — aggregate across all folds

| Column | Type | Description |
| --- | --- | --- |
| metric | str | auc, ks, gini |
| mean | float | Mean across folds |
| std | float | Standard deviation across folds |
| min | float | Minimum across folds |
| max | float | Maximum across folds |
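The relationship between the fold-level and summary frames can be sketched with a plain pandas aggregation. The fold values below are hypothetical, and the use of pandas' default sample standard deviation (`ddof=1`) is an assumption. This page does not say which convention the library uses.

```python
import pandas as pd

# Hypothetical fold-level results with the documented columns.
results_df = pd.DataFrame(
    {
        "split": ["2023-01", "2023-02", "2023-03"],
        "n_train": [5000, 5400, 5800],
        "n_test": [400, 400, 400],
        "auc": [0.70, 0.72, 0.74],
        "ks": [0.30, 0.32, 0.34],
        "gini": [0.40, 0.44, 0.48],  # consistent with Gini = 2 * AUC - 1
    }
)

# Aggregate each metric across folds, then pivot to one row per metric.
summary_df = (
    results_df[["auc", "ks", "gini"]]
    .agg(["mean", "std", "min", "max"])
    .T.rename_axis("metric")
    .reset_index()
)

print(summary_df)
```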

Score Monitoring

compute_psi

from cranalytics import compute_psi

psi_table, total_psi = compute_psi(expected_scores, actual_scores)

Returns: tuple[pd.DataFrame, float]

DataFrame (PSI table) columns:

| Column | Type | Description |
| --- | --- | --- |
| bin | str | Bin interval, e.g. "(0.1, 0.2]" |
| expected_pct | float | Fraction of expected population in this bin |
| actual_pct | float | Fraction of actual population in this bin |
| psi_contribution | float | Per-bin PSI: (actual − expected) × ln(actual / expected) |

Scalar (total PSI): Sum of all psi_contribution values.

| PSI range | Interpretation |
| --- | --- |
| < 0.10 | Distribution is stable |
| 0.10 – 0.25 | Moderate shift; investigate |
| > 0.25 | Significant shift; model may need recalibration |
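The per-bin formula and the interpretation thresholds can be worked through with NumPy alone. This is a minimal sketch, not the library's implementation: the ten equal-width bins on [0, 1], the small floor used to avoid division by zero in empty bins, and the simulated scores are all assumptions made for the example.

```python
import numpy as np

# Simulated score samples standing in for expected (baseline) and actual populations.
rng = np.random.default_rng(0)
expected_scores = rng.beta(2.0, 5.0, size=5000)
actual_scores = rng.beta(2.5, 5.0, size=5000)

# Assumed binning: ten equal-width bins over the score range [0, 1].
edges = np.linspace(0.0, 1.0, 11)
expected_pct = np.histogram(expected_scores, bins=edges)[0] / expected_scores.size
actual_pct = np.histogram(actual_scores, bins=edges)[0] / actual_scores.size

# Floor tiny fractions so the log and the ratio stay finite for empty bins.
eps = 1e-6
expected_pct = np.clip(expected_pct, eps, None)
actual_pct = np.clip(actual_pct, eps, None)

# Per-bin PSI: (actual - expected) * ln(actual / expected); total is the sum.
psi_contribution = (actual_pct - expected_pct) * np.log(actual_pct / expected_pct)
total_psi = float(psi_contribution.sum())

if total_psi < 0.10:
    assessment = "stable"
elif total_psi <= 0.25:
    assessment = "moderate shift"
else:
    assessment = "significant shift"

print(f"total PSI = {total_psi:.4f} ({assessment})")
```

Each per-bin term is non-negative because the difference and the log ratio always share the same sign, so the total can only grow as the two distributions diverge.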

score_performance_monitoring_report

from cranalytics import score_performance_monitoring_report

report = score_performance_monitoring_report(scored_df, ...)

Returns: dict with keys:

| Key | Type | Description |
| --- | --- | --- |
| psi_table | pd.DataFrame | PSI table (see compute_psi above) |
| total_psi | float | Total PSI scalar |
| ae_table | pd.DataFrame | Actual vs expected by score band |
| calibration_table | pd.DataFrame | Observed vs predicted event rates by bin |
| summary | dict | psi, mean_ae, calibration_slope, assessment string |

See also