skpro

skpro is an open-source Python library for supervised probabilistic prediction.

skpro provides scikit-learn-like, scikit-base-compatible interfaces for uncertainty-aware modeling. It supports probabilistic regression, interval and quantile prediction, full distribution prediction, and survival analysis through a unified, extensible API.

The library is developed by the sktime community and is designed to integrate seamlessly with the broader Python machine learning ecosystem.

Purpose

Many machine learning workflows focus solely on point predictions, ignoring uncertainty. skpro addresses this limitation by making probabilistic prediction a first-class concept, allowing models to output distributions, intervals, and quantiles rather than single values.

Its goal is to improve decision-making, risk awareness, and model evaluation by providing robust tools for uncertainty quantification across tabular and time-to-event prediction tasks.

Core Capabilities

  • Probabilistic tabular regression

  • Interval, quantile, and full distribution prediction

  • Time-to-event and survival prediction

  • Probabilistic performance metrics

  • Reductions that convert classical regressors into probabilistic models

  • Pipeline construction and hyperparameter tuning using probabilistic metrics

  • Symbolic probability distributions with pandas-compatible interfaces

Probabilistic Prediction

Tabular Regression

skpro supports probabilistic regression for tabular data, enabling predictions in multiple modes:

  • Mean and variance

  • Prediction intervals

  • Quantiles

  • Full predictive distributions

This allows users to quantify uncertainty directly alongside predictions.
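
The four modes are different views of the same underlying object, a predictive distribution per instance. The sketch below illustrates this with Python's standard-library `NormalDist`. It does not use skpro's API, and the Gaussian with mean 12.0 and standard deviation 2.0 is a made-up example prediction.

```python
from statistics import NormalDist

# Hypothetical predictive distribution for a single test instance
# (assumed Gaussian purely for illustration).
pred = NormalDist(mu=12.0, sigma=2.0)

# Mode 1: mean and variance
mean, variance = pred.mean, pred.variance

# Mode 2: central prediction interval at 90% nominal coverage
coverage = 0.9
lower = pred.inv_cdf((1 - coverage) / 2)
upper = pred.inv_cdf(1 - (1 - coverage) / 2)

# Mode 3: quantiles at chosen probability levels
quantiles = {alpha: pred.inv_cdf(alpha) for alpha in (0.05, 0.5, 0.95)}

# Mode 4: the full distribution, queried through its pdf and cdf
density_at_mean = pred.pdf(12.0)
prob_below_10 = pred.cdf(10.0)

print(f"mean={mean}, variance={variance}")
print(f"90% interval: [{lower:.2f}, {upper:.2f}]")
print(f"quantiles: { {a: round(q, 2) for a, q in quantiles.items()} }")
print(f"P(y < 10) = {prob_below_10:.3f}")
```

The endpoints of the 90% interval coincide with the 0.05 and 0.95 quantiles, which is why interval and quantile prediction can share one underlying model.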

Survival and Time-to-Event Prediction

The library includes tools for probabilistic survival analysis, producing instance-level survival distributions rather than single risk scores or point estimates.
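
To make "instance-level survival distribution" concrete, the following plain-Python sketch evaluates a survival function S(t) = P(T > t) for two instances. The Weibull family, the parameter values, and the instance names are assumptions chosen for illustration; this is not skpro's API and does not describe which distributions its survival models return.

```python
import math


def weibull_survival(t, scale, shape):
    """Survival function S(t) = P(T > t) of a Weibull distribution."""
    return math.exp(-((t / scale) ** shape))


def weibull_quantile(p, scale, shape):
    """Time t at which the event probability P(T <= t) reaches p."""
    return scale * (-math.log(1.0 - p)) ** (1.0 / shape)


# Hypothetical per-instance parameters, standing in for a fitted model's output
predicted = {
    "instance_a": {"scale": 10.0, "shape": 1.5},
    "instance_b": {"scale": 4.0, "shape": 1.5},
}

for name, params in predicted.items():
    s5 = weibull_survival(5.0, **params)
    median = weibull_quantile(0.5, **params)
    print(f"{name}: S(5) = {s5:.3f}, median time-to-event = {median:.2f}")
```

A single risk score would only rank the two instances. The distribution additionally answers questions such as the survival probability at a given time or the median time-to-event.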

Model Reductions and Pipelines

Probabilistic Reductions

skpro provides reductions that wrap classical scikit-learn regressors and extend them with probabilistic outputs, including:

  • Bootstrap-based methods

  • Conformal prediction techniques

  • Residual-based probabilistic modeling

This enables uncertainty-aware modeling without abandoning familiar estimators.
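
As an illustration of the reduction idea, here is a minimal sketch of split conformal prediction wrapped around a point regressor, written in plain Python. It is not skpro's implementation: the helper names `fit_line` and `split_conformal`, the 50/50 split, and the synthetic data are all assumptions made for this example.

```python
import math
import random


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; stands in for any point regressor."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sxy / sxx
    a = my - b * mx
    return lambda x: a + b * x


def split_conformal(xs, ys, coverage=0.9, seed=0):
    """Wrap the point regressor so that it returns prediction intervals."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    half = len(idx) // 2
    train, calib = idx[:half], idx[half:]

    # Fit the point regressor on the training half only
    predict = fit_line([xs[i] for i in train], [ys[i] for i in train])

    # Absolute residuals on the held-out calibration half
    scores = sorted(abs(ys[i] - predict(xs[i])) for i in calib)
    # Finite-sample corrected rank ceil((n + 1) * coverage); capping at n is a
    # simplification (strictly, the interval would be unbounded in that case)
    k = min(len(scores), math.ceil((len(scores) + 1) * coverage))
    radius = scores[k - 1]

    def predict_interval(x):
        centre = predict(x)
        return centre - radius, centre + radius

    return predict_interval


# Synthetic data: y = 2 + 3x plus unit-variance Gaussian noise
rng = random.Random(42)
xs = [rng.uniform(0, 10) for _ in range(400)]
ys = [2.0 + 3.0 * x + rng.gauss(0, 1.0) for x in xs]

predict_interval = split_conformal(xs, ys, coverage=0.9)
lower, upper = predict_interval(5.0)
print(f"90% interval at x=5: [{lower:.2f}, {upper:.2f}]")
```

The point regressor is untouched. The wrapper only needs its predictions on held-out data, so the same recipe applies to any estimator that exposes a prediction function.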

Pipelines and Composite Models

Models can be combined into pipelines and composite estimators, with full support for tuning and evaluation using probabilistic performance metrics.

Probability Distributions

Symbolic Distribution Objects

skpro includes symbolic probability distributions with:

  • Explicit mathematical representations

  • pandas DataFrame–based value domains

  • pandas-like interfaces for manipulation and inspection

These distributions can be used consistently across prediction, evaluation, and downstream analysis.

Evaluation and Metrics

Probabilistic Performance Metrics

The library provides a comprehensive set of metrics for evaluating probabilistic predictions, including:

  • Pinball loss

  • Empirical coverage

  • Continuous Ranked Probability Score (CRPS)

  • Survival-specific loss functions

These metrics score the predicted quantile, interval, or distribution itself rather than only a point estimate, so model assessment accounts for uncertainty.
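
For reference, the first three metrics can be written in a few lines of plain Python. The sketch below uses their standard textbook definitions, not skpro's metric classes; the function names and toy numbers are made up for illustration, and the CRPS uses the closed form that holds only for Gaussian forecasts.

```python
import math
from statistics import NormalDist


def pinball_loss(y_true, q_pred, alpha):
    """Mean pinball (quantile) loss of predicted alpha-quantiles."""
    losses = [
        alpha * (y - q) if y >= q else (1 - alpha) * (q - y)
        for y, q in zip(y_true, q_pred)
    ]
    return sum(losses) / len(losses)


def empirical_coverage(y_true, lower, upper):
    """Fraction of observations that fall inside their prediction interval."""
    hits = [lo <= y <= hi for y, lo, hi in zip(y_true, lower, upper)]
    return sum(hits) / len(hits)


def crps_gaussian(y, mu, sigma):
    """Closed-form CRPS of a Gaussian forecast N(mu, sigma^2) at observation y."""
    std = NormalDist()
    z = (y - mu) / sigma
    return sigma * (z * (2 * std.cdf(z) - 1) + 2 * std.pdf(z) - 1 / math.sqrt(math.pi))


y_true = [1.0, 2.0, 3.0, 4.0]

pinball = pinball_loss(y_true, [1.5, 1.5, 2.5, 5.0], alpha=0.5)
cover = empirical_coverage(y_true, [0.0, 2.5, 2.0, 3.0], [2.0, 3.0, 4.0, 5.0])
crps = crps_gaussian(0.0, mu=0.0, sigma=1.0)

print(pinball)  # 0.3125
print(cover)    # 0.75
print(round(crps, 4))  # 0.2337
```

Lower is better for pinball loss and CRPS. Empirical coverage is instead compared against the nominal level of the interval, for example 0.9 for a 90% interval.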

Ecosystem Compatibility

Integration with scikit-learn and sktime

skpro is fully compatible with scikit-learn and sktime, enabling hybrid workflows such as:

  • Building probabilistic forecasters from deterministic regressors

  • Combining time series and tabular probabilistic models

  • Reusing existing estimators with added uncertainty modeling

Interoperability with External Libraries

The project curates interfaces to third-party probabilistic libraries such as cyclic-boosting, MAPIE, and ngboost.

Use Cases

  • Uncertainty-aware regression modeling

  • Risk-sensitive decision support systems

  • Survival and time-to-event analysis

  • Probabilistic forecasting pipelines

  • Research in uncertainty quantification and model evaluation

Open Source

skpro is released under the BSD 3-Clause License and developed openly by the sktime community. It follows modern open-source practices, includes extensive documentation and tutorials, and welcomes contributions of all kinds.

GC.OS supports skpro as an open-source project that advances interoperable, reliable, and transparent probabilistic machine learning.