pgmpy

pgmpy is an open-source Python library for causal and probabilistic inference using Bayesian Networks and DAGs.

pgmpy is an open-source Python package for causal inference and probabilistic inference based on Directed Acyclic Graphs (DAGs) and Bayesian Networks. It provides a modular and extensible framework for modeling, learning, and reasoning under uncertainty.

The library is widely used in research and applied settings and implements a comprehensive collection of algorithms for structure learning, parameter estimation, inference, simulation, and causal analysis.

Purpose

Reasoning about uncertainty and causality is a core challenge in many scientific and industrial domains. pgmpy addresses this challenge by offering a unified toolkit for building probabilistic graphical models, learning them from data, and performing both probabilistic and causal inference.

Its design emphasizes clarity, modularity, and extensibility, making it suitable for experimentation, education, and production-grade research workflows.

Core Capabilities

  • Modeling of Bayesian Networks and DAGs

  • Causal discovery and structure learning

  • Parameter estimation from data

  • Exact and approximate probabilistic inference

  • Causal inference using intervention-based reasoning

  • Simulation of probabilistic models

Supported Data Types

Categorical Data

Fully supported across causal discovery, parameter estimation, probabilistic inference, causal inference, and simulation workflows.

Continuous Data

Supported for structure learning, parameter estimation, probabilistic inference, and simulation, with partial support for causal inference.

Mixed Data

Supported for causal discovery and simulation use cases.

Time Series Data

Supported for parameter estimation, probabilistic inference, approximate causal inference, and simulation.

Algorithms

Causal Discovery and Structure Learning

pgmpy includes multiple algorithms for learning graph structures from data, including:

  • PC algorithm and variants

  • Greedy Equivalence Search (GES)

  • Hill-Climb Search

  • Max-Min Hill-Climb

  • Tree-based search methods

  • Exhaustive search approaches

  • Expert-in-the-loop workflows
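Score-based searches such as Hill-Climb Search repeatedly compare candidate graphs by a score. The sketch below illustrates that single comparison step in plain Python, without using pgmpy's API: it scores an empty graph and the graph X -> Y with the Bayesian Information Criterion (BIC). The data, variable names, and helper functions are made up for illustration.

```python
import math
from collections import Counter

# Toy binary observations of (X, Y); the counts are invented so that
# X and Y are clearly dependent.
data = [(0, 0)] * 40 + [(0, 1)] * 10 + [(1, 0)] * 10 + [(1, 1)] * 40


def log_lik_empty(rows):
    """Log-likelihood under the empty graph: P(X) P(Y)."""
    n = len(rows)
    count_x = Counter(x for x, _ in rows)
    count_y = Counter(y for _, y in rows)
    return sum(math.log(count_x[x] / n) + math.log(count_y[y] / n)
               for x, y in rows)


def log_lik_edge(rows):
    """Log-likelihood under the graph X -> Y: P(X) P(Y | X)."""
    n = len(rows)
    count_x = Counter(x for x, _ in rows)
    count_xy = Counter(rows)
    return sum(math.log(count_x[x] / n) + math.log(count_xy[(x, y)] / count_x[x])
               for x, y in rows)


def bic(log_lik, num_params, n):
    """BIC in 'higher is better' form: fit minus a complexity penalty."""
    return log_lik - 0.5 * num_params * math.log(n)


n = len(data)
# Free parameters: empty graph has 1 for P(X) and 1 for P(Y);
# X -> Y has 1 for P(X) and 2 for P(Y | X).
bic_empty = bic(log_lik_empty(data), 2, n)
bic_edge = bic(log_lik_edge(data), 3, n)

print(f"BIC(empty) = {bic_empty:.2f}, BIC(X -> Y) = {bic_edge:.2f}")
```

On this toy data the graph with the edge scores higher, so a greedy search would accept adding it. A full search would repeat this comparison over edge additions, deletions, and reversals until no change improves the score.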

Parameter Estimation

Model parameters can be learned using established statistical techniques, including:

  • Maximum Likelihood Estimation

  • Bayesian Estimation

  • Expectation Maximization (EM)
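For intuition, the first two techniques can be sketched for a single conditional probability table in plain Python. This is a minimal illustration and not pgmpy's API: the data, the `estimate_cpd` helper, and the `pseudo_count` parameter are hypothetical. With a pseudo-count of zero the estimate is the maximum likelihood one (relative frequencies); a positive pseudo-count corresponds to a symmetric Dirichlet prior, a simple form of Bayesian estimation.

```python
from collections import Counter

# Toy observations of (Rain, Grass); the values are made up.
data = [
    ("yes", "wet"), ("yes", "wet"), ("yes", "dry"),
    ("no", "dry"), ("no", "dry"), ("no", "wet"), ("no", "dry"),
]
child_states = ["wet", "dry"]


def estimate_cpd(rows, child_states, pseudo_count=0.0):
    """Estimate P(child | parent) from (parent, child) pairs.

    pseudo_count=0 yields the maximum likelihood estimate;
    pseudo_count>0 adds that many imaginary observations per cell.
    """
    joint = Counter(rows)
    parent_totals = Counter(parent for parent, _ in rows)
    cpd = {}
    for parent, total in parent_totals.items():
        denom = total + pseudo_count * len(child_states)
        cpd[parent] = {
            state: (joint[(parent, state)] + pseudo_count) / denom
            for state in child_states
        }
    return cpd


mle = estimate_cpd(data, child_states)
bayes = estimate_cpd(data, child_states, pseudo_count=1.0)

print("MLE:     ", mle)
print("Bayesian:", bayes)
```

The Bayesian estimate is pulled toward a uniform distribution, which matters most when a parent configuration has few observations.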

Probabilistic Inference

Both exact and approximate inference methods are available, such as:

  • Variable Elimination

  • Belief Propagation

  • Message Passing Linear Programming (MPLP)

  • Sampling-based inference methods
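To show the idea behind Variable Elimination, here is a small sketch in plain Python for a chain A -> B -> C with binary variables. It does not use pgmpy's API, and the probabilities and function names are assumptions chosen for illustration. The marginal of C is computed by summing out one variable at a time and checked against brute-force enumeration of the joint distribution.

```python
from itertools import product

# Chain A -> B -> C; all numbers are made up.
p_a = {0: 0.7, 1: 0.3}
p_b_given_a = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # indexed [a][b]
p_c_given_b = {0: {0: 0.6, 1: 0.4}, 1: {0: 0.1, 1: 0.9}}  # indexed [b][c]
states = (0, 1)


def marginal_c_by_elimination():
    # Eliminate A: tau(b) = sum_a P(a) * P(b | a)
    tau_b = {b: sum(p_a[a] * p_b_given_a[a][b] for a in states) for b in states}
    # Eliminate B: P(c) = sum_b tau(b) * P(c | b)
    return {c: sum(tau_b[b] * p_c_given_b[b][c] for b in states) for c in states}


def marginal_c_by_enumeration():
    # Sum the full joint over every assignment of (A, B, C).
    result = {c: 0.0 for c in states}
    for a, b, c in product(states, repeat=3):
        result[c] += p_a[a] * p_b_given_a[a][b] * p_c_given_b[b][c]
    return result


ve = marginal_c_by_elimination()
brute = marginal_c_by_enumeration()
print("Variable elimination:", ve)
print("Enumeration:         ", brute)
```

Both routes give the same marginal, but elimination only ever builds factors over one variable at a time, whereas enumeration touches every joint assignment. That difference is what makes elimination practical on larger networks.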

Causal Inference

pgmpy supports causal reasoning through:

  • Do-calculus and do-operations

  • Identification of adjustment sets

  • Intervention-based effect estimation
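As an illustration of how an adjustment set turns an interventional query into an observational one, the sketch below applies the backdoor adjustment formula P(Y | do(X=x)) = sum_z P(Y | x, z) P(z) to a confounded graph Z -> X, Z -> Y, X -> Y. It is plain Python and not pgmpy's API; the probabilities and function names are made up for this example.

```python
# Confounded graph: Z -> X, Z -> Y, X -> Y (all binary; numbers are made up).
p_z = {0: 0.5, 1: 0.5}
p_x1_given_z = {0: 0.2, 1: 0.8}               # P(X=1 | Z=z)
p_y1_given_xz = {(0, 0): 0.1, (0, 1): 0.5,    # P(Y=1 | X=x, Z=z), keyed (x, z)
                 (1, 0): 0.3, (1, 1): 0.7}


def p_x_given_z(x, z):
    return p_x1_given_z[z] if x == 1 else 1.0 - p_x1_given_z[z]


def interventional(x):
    """P(Y=1 | do(X=x)) via backdoor adjustment over Z."""
    return sum(p_y1_given_xz[(x, z)] * p_z[z] for z in p_z)


def observational(x):
    """P(Y=1 | X=x), which mixes the causal effect with confounding by Z."""
    p_x = sum(p_x_given_z(x, z) * p_z[z] for z in p_z)
    joint = sum(p_y1_given_xz[(x, z)] * p_x_given_z(x, z) * p_z[z] for z in p_z)
    return joint / p_x


causal_effect = interventional(1) - interventional(0)
naive_difference = observational(1) - observational(0)
print(f"Adjusted effect:  {causal_effect:.2f}")
print(f"Naive difference: {naive_difference:.2f}")
```

In this toy model the naive conditional difference overstates the effect of X on Y because Z raises both; adjusting for Z removes that bias.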

Workflows and Usage

pgmpy supports end-to-end workflows ranging from data ingestion and model specification to inference, simulation, and evaluation. Users can define

Team

Ankur Ankan

Radboud University