Vintage & Smoothing API Reference

Vintage Curve Fitting and Analysis

Use the vintage facade when you have cohort-level cumulative loss observations by month on book. Start with `vintage.run()` for comparison, validation, and optional tail projection; use `CurveFitter` directly only for a narrow parametric fit.

`cranalytics.vintage`

Vintage analysis compatibility re-export surface.

The main entry point for the vintage workflow is `cranalytics.vintage.run` (defined in `cranalytics.vintage._session`), which orchestrates triangle construction, smoothing comparison, ranking, and validation behind one call. Prefer it for end-to-end analysis; it returns a result with `.summary()` and `.plot()`.

This module carries no logic of its own: it only re-exports the focused `vintage_*` submodules (transforms, fitting, smoothing, validation) so that existing `from cranalytics.vintage import ...` call sites keep working. Its pure re-export status is pinned by `tests/test_vintage_reexport_guard.py`; add new behavior to the appropriate `vintage_*` submodule, not here.

`CurveFitter`

Bases: BaseEstimator, RegressorMixin

Fit a parametric cumulative-loss curve to vintage observations.

Parameters:

Name	Type	Description	Default
`method`	`str`	Curve family. One of `weibull`, `gompertz`, `lognormal`, or `burr`.	`'weibull'`

Examples:

>>> import numpy as np
>>> from cranalytics import CurveFitter
>>> mob = np.arange(1, 13)
>>> losses = 0.08 * (1.0 - np.exp(-mob / 6.0))
>>> fitter = CurveFitter("weibull").fit(mob, losses)
>>> fitter.predict(np.array([12, 18])).shape
(2,)

Warning: Macro-Level Approximation Only

This estimator fits cumulative loss rates over time. Iteratively predicting loss extrapolates cumulative impact based on the original starting cohort size. It is blind to actual principal paydown, scheduled amortization, and constant prepayment rates (CPR). It should be used for high-level scenario approximation, not as a substitute for true loan-level cashflow modeling.
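The approximation described in the warning above can be illustrated with plain NumPy. This is a minimal sketch with made-up numbers, not library code: `cumulative_rate` and `original_balance` are hypothetical inputs, and the point is only that every month's dollar loss is scaled by the original cohort balance.

```python
import numpy as np

# Hypothetical inputs: cumulative loss rates at MOB 1..6 and the original cohort balance.
cumulative_rate = np.array([0.005, 0.012, 0.020, 0.027, 0.032, 0.035])
original_balance = 1_000_000.0

# Incremental rate per month: the first month's rate, then month-over-month differences.
incremental_rate = np.diff(cumulative_rate, prepend=0.0)

# Dollar losses use the ORIGINAL balance in every month. Paydown, amortization,
# and prepayment never shrink the base, which is the limitation the warning names.
incremental_loss = incremental_rate * original_balance
cumulative_loss = cumulative_rate * original_balance
```

A loan-level cashflow model would instead apply loss rates to a balance that declines over time, so the two approaches diverge as the cohort pays down.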

`fit(X: ArrayLike, y: ArrayLike, sample_weight: ArrayLike | None = None) -> CurveFitter`

Fit the selected curve family.

Parameters:

Name	Type	Description	Default
`X`	`ArrayLike`	Months-on-book observations.	required
`y`	`ArrayLike`	Observed cumulative loss rates.	required
`sample_weight`	`ArrayLike \| None`	Optional non-negative observation weights.	`None`

Returns:

Type	Description
`CurveFitter`	Fitted estimator with `params_` and `ultimate_` attributes.

Raises:

Type	Description
`ValueError`	If the method is unsupported.
`RuntimeError`	If numerical optimization fails.
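To give a feel for what a parametric fit of this kind involves, here is a dependency-light sketch of a Weibull-shaped least-squares fit. It is not the `CurveFitter` implementation: the parameterization `ultimate * (1 - exp(-(mob / scale) ** shape))`, the helper names, and the coarse grid search are all assumptions made for illustration, and the library's optimizer and the layout of `params_` may differ.

```python
import numpy as np

def weibull_unit_curve(mob, scale, shape):
    """Unit-height Weibull curve: 1 - exp(-(mob / scale) ** shape)."""
    return 1.0 - np.exp(-(mob / scale) ** shape)

def grid_fit_weibull(mob, losses, scales, shapes):
    """Hypothetical helper: return (ultimate, scale, shape) minimising squared error."""
    best = None
    for scale in scales:
        for shape in shapes:
            basis = weibull_unit_curve(mob, scale, shape)
            # For fixed scale and shape the model is linear in the ultimate loss,
            # so the least-squares ultimate has a closed form.
            ultimate = float(basis @ losses / (basis @ basis))
            sse = float(np.sum((ultimate * basis - losses) ** 2))
            if best is None or sse < best[0]:
                best = (sse, ultimate, float(scale), float(shape))
    return best[1:]

mob = np.arange(1, 13, dtype=float)
losses = 0.08 * (1.0 - np.exp(-mob / 6.0))  # synthetic data from the CurveFitter example
ultimate, scale, shape = grid_fit_weibull(
    mob, losses, scales=np.arange(2.0, 12.5, 0.5), shapes=np.arange(0.5, 2.01, 0.25)
)
```

Because the synthetic data is an exact Weibull curve with shape 1, the grid recovers an ultimate loss of about 0.08 with scale 6. Real data would call for a continuous optimizer rather than a grid.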

`predict(X: ArrayLike) -> np.ndarray`

Predict cumulative loss rates for months on book.

Parameters:

Name	Type	Description	Default
`X`	`ArrayLike`	Months-on-book values to evaluate.	required

Returns:

Type	Description
`ndarray`	Predicted cumulative loss rates.

Raises:

Type	Description
`ValueError`	If the estimator is not fitted or the method is unsupported.

`forecast(months: np.ndarray) -> np.ndarray`

Alias for predict().

`VintageAnalysisSessionResult` `dataclass`

Bases: _SessionResultMapping

`summary() -> pd.DataFrame`

Compact, best-first ranking of the smoothing methods considered.

`plot(**kwargs: Any) -> Any`

Plotly heatmap of the vintage triangle. Requires the viz extra.

`BaseSmoother`

Base class for smoothing methods.

`run(df: pd.DataFrame, *, vintage_col: str = 'vintage_date', mob_col: str = 'months_on_book', loss_col: str = 'cumulative_loss_rate', balance_col: str | None = None, segment_col: str | None = None, vintage_name: str | None = None, smoothers: list[str | Any] | None = None, min_maturity_months: int | None = None, include_cv: bool = True, strict: bool = False, extrapolate_tails: bool = False, tail_max_mob: int | None = None) -> VintageAnalysisSessionResult`

Run one representative vintage analysis workflow from raw data.

Parameters:

Name	Type	Description	Default
`extrapolate_tails`	`bool`	When True and incomplete vintages are present, project each immature vintage's tail forward using the best-ranked smoother. Results are returned on `tail_projections` as a long-format DataFrame with an `is_projected` bool column.	`False`
`tail_max_mob`	`int \| None`	Project tails to this month on book. Defaults to the maximum MOB observed in the selected (complete) vintage.	`None`
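The long-format input that `run` expects can be sketched as follows, using the default column names from the signature above. The data is synthetic, the string vintage labels are an assumption about what `vintage_col` accepts, and the `run` call itself is left commented out because it needs `cranalytics` installed.

```python
import numpy as np
import pandas as pd

# Build long-format input using the default column names from the run() signature.
rows = []
for vintage, n_months in [("2022-01", 12), ("2022-07", 6)]:  # second vintage is immature
    mob = np.arange(1, n_months + 1)
    rate = 0.08 * (1.0 - np.exp(-mob / 6.0))  # synthetic cumulative loss rates
    rows.append(pd.DataFrame({
        "vintage_date": vintage,
        "months_on_book": mob,
        "cumulative_loss_rate": rate,
    }))
df = pd.concat(rows, ignore_index=True)

# With cranalytics installed, the call would look like this (not executed here):
# from cranalytics import vintage
# result = vintage.run(df, extrapolate_tails=True, tail_max_mob=12)
# result.summary()
```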

`smooth_curve(mob: np.ndarray, values: np.ndarray, method: str | BaseSmoother, weights: np.ndarray | None = None, **kwargs) -> SmoothedCurve`

Functional wrapper to smooth a curve by name or smoother instance.

`aggregate_by_dollar_weights(df: pd.DataFrame, vintage_col: str = 'vintage_date', mob_col: str = 'months_on_book', segment_col: str | None = 'fico_band', loss_col: str = 'charge_off_amount', balance_col: str = 'outstanding_balance') -> pd.DataFrame`

Aggregate segmented data using dollar-weighted averaging.

Aggregation is always performed by vintage and months-on-book. When segment_col is provided and present in the DataFrame, results are further segmented by that column.

`create_vintage_triangle(df: pd.DataFrame, vintage_col: str = 'vintage_date', mob_col: str = 'months_on_book', loss_col: str = 'cumulative_loss_rate', balance_col: str | None = None) -> pd.DataFrame`

Create a vintage triangle from loan-level or aggregated data.

When balance_col is provided, loss rates are dollar-weighted (weighted average by balance). Otherwise a simple mean is used.
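The difference between the simple mean and the dollar-weighted mean can be sketched with pandas. `sketch_vintage_triangle` is a hypothetical stand-in, not the library function; in particular, the output orientation (vintages as rows, months on book as columns) is an assumption.

```python
import numpy as np
import pandas as pd

def sketch_vintage_triangle(df, vintage_col="vintage_date", mob_col="months_on_book",
                            loss_col="cumulative_loss_rate", balance_col=None):
    """Hypothetical stand-in: pivot long data into a vintage x MOB grid of loss rates."""
    keys = [vintage_col, mob_col]
    if balance_col is None:
        # Simple mean of the loss rate within each (vintage, MOB) cell.
        cell = df.groupby(keys)[loss_col].mean()
    else:
        # Dollar-weighted mean: sum(rate * balance) / sum(balance) per cell.
        tmp = df.assign(_num=df[loss_col] * df[balance_col])
        sums = tmp.groupby(keys)[["_num", balance_col]].sum()
        cell = sums["_num"] / sums[balance_col]
    return cell.unstack(mob_col)

df = pd.DataFrame({
    "vintage_date": ["2022-01"] * 4 + ["2022-07"] * 2,
    "months_on_book": [1, 1, 2, 2, 1, 1],
    "cumulative_loss_rate": [0.01, 0.03, 0.02, 0.06, 0.02, 0.04],
    "balance": [100.0, 300.0, 100.0, 300.0, 50.0, 50.0],
})
simple = sketch_vintage_triangle(df)
weighted = sketch_vintage_triangle(df, balance_col="balance")
```

For vintage `2022-01` at MOB 1, the simple mean is 0.02 while the dollar-weighted mean is 0.025, because the larger balance carries the higher loss rate. Cells with no observations, such as `2022-07` at MOB 2, come out as NaN.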

`detect_incomplete_vintages(triangle: pd.DataFrame, min_maturity_months: int = 24) -> list[str]`

Detect vintages that have dropped off (e.g., after charge-off).
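The exact detection rule is not spelled out here, so the following is only one plausible reading, sketched with pandas: a vintage counts as incomplete when it has fewer than `min_maturity_months` observed months. The helper name, the rule, and the triangle orientation (vintages as rows) are all assumptions.

```python
import numpy as np
import pandas as pd

def sketch_detect_incomplete(triangle, min_maturity_months=24):
    """Hypothetical rule: flag vintages observed at fewer than min_maturity_months MOBs."""
    observed_months = triangle.notna().sum(axis=1)
    flagged = observed_months.index[observed_months < min_maturity_months]
    return [str(v) for v in flagged]

# Toy triangle: rows are vintages, columns are months on book.
triangle = pd.DataFrame(
    [[0.01, 0.02, 0.03], [0.01, 0.02, np.nan]],
    index=["2022-01", "2022-07"],
    columns=[1, 2, 3],
)
incomplete = sketch_detect_incomplete(triangle, min_maturity_months=3)
```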

`normalize_vintage_data(df: pd.DataFrame, vintage_col: str = 'vintage_date', mob_col: str = 'months_on_book', loss_col: str = 'cumulative_loss_rate', segment_col: str | None = None) -> list[pd.DataFrame]`

Normalize raw data into a list of standardized DataFrames for curve fitting. Each DataFrame corresponds to a unique vintage-segment and contains 'mob' and 'cumulative_loss_rate'.
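The output shape described above can be sketched with a pandas `groupby`. `sketch_normalize` is a hypothetical stand-in; sorting each frame by `mob` and the ordering of the returned list are assumptions, not documented behavior.

```python
import pandas as pd

def sketch_normalize(df, vintage_col="vintage_date", mob_col="months_on_book",
                     loss_col="cumulative_loss_rate", segment_col=None):
    """Hypothetical stand-in: one two-column frame per vintage (and segment)."""
    keys = [vintage_col] + ([segment_col] if segment_col else [])
    frames = []
    for _, group in df.groupby(keys, sort=True):
        frame = group[[mob_col, loss_col]].rename(
            columns={mob_col: "mob", loss_col: "cumulative_loss_rate"})
        frames.append(frame.sort_values("mob").reset_index(drop=True))
    return frames

df = pd.DataFrame({
    "vintage_date": ["A", "A", "B"],
    "months_on_book": [2, 1, 1],
    "cumulative_loss_rate": [0.02, 0.01, 0.015],
})
frames = sketch_normalize(df)
```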

`project_incomplete_vintage_tails(df: pd.DataFrame, incomplete_vintages: list[str], smoother: BaseSmoother | str, max_mob: int, vintage_col: str = 'vintage_date', mob_col: str = 'months_on_book', loss_col: str = 'cumulative_loss_rate') -> pd.DataFrame`

Return projected tail rows for incomplete vintages.

For each vintage in incomplete_vintages, fits smoother to the observed mob/loss pairs and forecasts forward to max_mob. Projected values are clipped to be non-decreasing and bounded by 1.0 (cumulative loss semantics). If a vintage has fewer than two observations, a constant tail (last observed value held flat) is used instead.

Parameters:

Name	Type	Description	Default
`df`	`DataFrame`	Long-format vintage data (same format passed to `cranalytics.vintage.run`).	required
`incomplete_vintages`	`list[str]`	Vintage names as they appear in `df[vintage_col]`.	required
`smoother`	`BaseSmoother \| str`	A `cranalytics.vintage.BaseSmoother` instance or a method name string accepted by `cranalytics.vintage.create_smoother` (e.g. `"moving_average"`, `"spline"`).	required
`max_mob`	`int`	The final month on book to project to.	required
`vintage_col`, `mob_col`, `loss_col`	`str`	Column names matching those used in `df`.	`'vintage_date'`, `'months_on_book'`, `'cumulative_loss_rate'`

Returns:

Type	Description
`DataFrame`	Long-format DataFrame with synthetic rows for each (vintage, mob) pair beyond the observed range. Includes an `is_projected` bool column marking all returned rows.
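The clipping step (non-decreasing, bounded by 1.0) can be sketched in NumPy. `clip_projected_tail` is a hypothetical helper, and using the last observed value as the floor for the projection is an assumption about how the library anchors the tail.

```python
import numpy as np

def clip_projected_tail(last_observed, raw_projection):
    """Hypothetical helper: make a projected tail non-decreasing and bounded by 1.0."""
    # Prepend the last observed value so the running maximum acts as a floor.
    path = np.concatenate(([last_observed], np.asarray(raw_projection, dtype=float)))
    monotone = np.maximum.accumulate(path)
    # Drop the prepended anchor and cap at 1.0 (a cumulative rate cannot exceed 100%).
    return np.clip(monotone[1:], None, 1.0)

tail = clip_projected_tail(0.05, [0.048, 0.055, 0.053, 1.2])
# tail is [0.05, 0.055, 0.055, 1.0]
```

The running maximum removes dips in the raw forecast, and the final clip handles a smoother that overshoots 100% cumulative loss.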

`cross_validate_smoother(smoother: Any, mob: np.ndarray, values: np.ndarray, weights: np.ndarray | None = None, n_folds: int = 5, strict: bool = True, cv_method: str = 'rolling_origin') -> tuple[float, float]`

Cross-validate a smoother and return (mean_mse, std_mse).

Parameters:

Name	Type	Description	Default
`smoother`	`Any`	A fitted smoother object with `fit` and `forecast` methods.	required
`mob`	`ndarray`	Months-on-book array (time index). Must be monotonically non-decreasing for `rolling_origin` to be meaningful.	required
`values`	`ndarray`	Observed curve values.	required
`weights`	`ndarray \| None`	Optional per-observation weights.	`None`
`n_folds`	`int`	Number of CV folds.	`5`
`strict`	`bool`	If True, raise on fold failure; if False, skip failed folds.	`True`
`cv_method`	`str`	Fold strategy. `"rolling_origin"` (default): walk-forward / expanding-window folds. Train always ends before test begins; no temporal leakage. Recommended for time-indexed vintage curves. `"kfold"`: standard k-fold with non-adjacent test blocks. Allows future data into training; retained for backward compatibility and non-time-ordered inputs.	`'rolling_origin'`
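The rolling-origin idea (train always ends before test begins) can be sketched as an index generator. This is an illustration of expanding-window splitting in general, not the library's fold logic: `rolling_origin_splits` and its `min_train` parameter are hypothetical.

```python
import numpy as np

def rolling_origin_splits(n_obs, n_folds=5, min_train=2):
    """Hypothetical helper: yield (train_idx, test_idx) for expanding-window folds."""
    # Split the indices after the initial training window into n_folds test blocks.
    test_blocks = np.array_split(np.arange(min_train, n_obs), n_folds)
    for block in test_blocks:
        if block.size == 0:
            continue  # too few observations to fill every fold
        # Train on everything strictly before the test block begins.
        yield np.arange(block[0]), block

splits = list(rolling_origin_splits(n_obs=12, n_folds=5))
```

With 12 observations the first fold trains on indices 0 and 1 and tests on 2 and 3; the last fold trains on indices 0 through 9 and tests on 10 and 11. A plain k-fold split would instead let later observations into the training set for earlier test blocks.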
