pgmpy
pgmpy is an open-source Python package for causal inference and probabilistic inference based on Directed Acyclic Graphs (DAGs) and Bayesian Networks. It provides a modular and extensible framework for modeling, learning, and reasoning under uncertainty.
The library is widely used in research and applied settings. It implements algorithms for structure learning, parameter estimation, inference, simulation, and causal analysis.
Purpose
Reasoning about uncertainty and causality is a core challenge in many scientific and industrial domains. pgmpy addresses this challenge by offering a unified toolkit for building probabilistic graphical models, learning them from data, and performing both probabilistic and causal inference.
Its design emphasizes clarity, modularity, and extensibility, making it suitable for experimentation, education, and production-grade research workflows.
Core Capabilities
Modeling of Bayesian Networks and DAGs
Causal discovery and structure learning
Parameter estimation from data
Exact and approximate probabilistic inference
Causal inference using intervention-based reasoning
Simulation of probabilistic models
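To make the modeling and simulation capabilities concrete, here is a minimal, library-independent sketch of what a discrete Bayesian Network encodes: a DAG plus one conditional probability table (CPT) per node. It deliberately does not use pgmpy's API. The two-node Rain -> WetGrass network, the names parents, cpts, joint_probability, and forward_sample, and all probabilities are illustrative assumptions.

```python
import random

# Hypothetical network: Rain -> WetGrass. All numbers are made up.
parents = {"Rain": [], "WetGrass": ["Rain"]}

# cpts[node][tuple of parent values] -> {state: probability}
cpts = {
    "Rain": {(): {True: 0.2, False: 0.8}},
    "WetGrass": {
        (True,): {True: 0.9, False: 0.1},
        (False,): {True: 0.1, False: 0.9},
    },
}

def joint_probability(assignment):
    """Multiply P(node | parents) over all nodes (DAG factorization)."""
    prob = 1.0
    for node, pa in parents.items():
        key = tuple(assignment[p] for p in pa)
        prob *= cpts[node][key][assignment[node]]
    return prob

def forward_sample(rng):
    """Draw one joint sample, visiting parents before children."""
    sample = {}
    for node in ["Rain", "WetGrass"]:  # a topological order of the DAG
        key = tuple(sample[p] for p in parents[node])
        dist = cpts[node][key]
        states = list(dist)
        weights = [dist[s] for s in states]
        sample[node] = rng.choices(states, weights=weights)[0]
    return sample

# P(Rain=True, WetGrass=True) = P(Rain=True) * P(WetGrass=True | Rain=True)
p = joint_probability({"Rain": True, "WetGrass": True})

rng = random.Random(0)
samples = [forward_sample(rng) for _ in range(5)]
```

The same factorization underlies both inference (summing joint probabilities) and simulation (sampling node by node).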
Supported Data Types
Categorical Data
Fully supported across causal discovery, parameter estimation, probabilistic inference, causal inference, and simulation workflows.
Continuous Data
Supported for structure learning, parameter estimation, probabilistic inference, and simulation, with partial support for causal inference.
Mixed Data
Supported for causal discovery and simulation use cases.
Time Series Data
Supported for parameter estimation, probabilistic inference, approximate causal inference, and simulation.
Algorithms
Causal Discovery and Structure Learning
pgmpy includes multiple algorithms for learning graph structures from data, including:
PC algorithm and variants
Greedy Equivalence Search (GES)
Hill-Climb Search
Max-Min Hill-Climb
Tree-based search methods
Exhaustive search approaches
Expert-in-the-loop workflows
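Score-based methods such as Hill-Climb Search work by comparing candidate graphs under a scoring function. The sketch below shows that idea with the BIC score, one common choice, on a toy binary dataset. It is a library-independent illustration and not pgmpy's implementation; the function names, the structure encoding (node -> list of parents), and the data are assumptions made for the example.

```python
import math
from collections import Counter

def log_likelihood(data, node, pa):
    """Maximized log-likelihood of `node` given its parent set `pa`."""
    joint = Counter((tuple(row[p] for p in pa), row[node]) for row in data)
    parent_counts = Counter(tuple(row[p] for p in pa) for row in data)
    return sum(n * math.log(n / parent_counts[key]) for (key, _), n in joint.items())

def bic(data, structure, cardinality):
    """BIC = log-likelihood minus 0.5 * (free parameters) * log(sample size)."""
    n_rows = len(data)
    score = 0.0
    for node, pa in structure.items():
        n_params = (cardinality[node] - 1) * math.prod(cardinality[p] for p in pa)
        score += log_likelihood(data, node, pa) - 0.5 * n_params * math.log(n_rows)
    return score

def make_data(n00, n01, n10, n11):
    """Build rows from counts of each (X, Y) combination."""
    rows = []
    for (x, y), n in {(0, 0): n00, (0, 1): n01, (1, 0): n10, (1, 1): n11}.items():
        rows += [{"X": x, "Y": y}] * n
    return rows

cardinality = {"X": 2, "Y": 2}
empty_graph = {"X": [], "Y": []}   # no edges
with_edge = {"X": [], "Y": ["X"]}  # X -> Y

dependent = make_data(40, 10, 10, 40)    # X and Y mostly agree
independent = make_data(25, 25, 25, 25)  # X tells nothing about Y

edge_wins = bic(dependent, with_edge, cardinality) > bic(dependent, empty_graph, cardinality)
empty_wins = bic(independent, empty_graph, cardinality) > bic(independent, with_edge, cardinality)
```

A search procedure would repeatedly propose edge additions, removals, or reversals and keep the change that improves the score most; the penalty term is what stops it from adding edges that the data do not support.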
Parameter Estimation
Model parameters can be learned using established statistical techniques, including:
Maximum Likelihood Estimation
Bayesian Estimation
Expectation Maximization (EM)
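The difference between the first two techniques can be seen on a single CPT. The following library-independent sketch estimates P(node | parent) from counts: with no pseudo-counts it is the maximum likelihood estimate, and with a positive pseudo-count it is the posterior mean under a symmetric Dirichlet prior, one simple form of Bayesian estimation. It does not use pgmpy's API; estimate_cpt, its parameters, and the toy data are hypothetical.

```python
from collections import Counter

def estimate_cpt(data, node, parent, states, pseudo_count=0.0):
    """Estimate P(node | parent) from complete data.

    pseudo_count == 0 -> maximum likelihood (relative frequencies)
    pseudo_count > 0  -> Dirichlet-smoothed Bayesian estimate
    """
    counts = Counter((row[parent], row[node]) for row in data)
    cpt = {}
    for pv in sorted({row[parent] for row in data}):
        total = sum(counts[(pv, s)] for s in states) + pseudo_count * len(states)
        cpt[pv] = {s: (counts[(pv, s)] + pseudo_count) / total for s in states}
    return cpt

# Toy data: WetGrass=1 is never observed when Rain=0.
data = (
    [{"Rain": 1, "WetGrass": 1}] * 3
    + [{"Rain": 1, "WetGrass": 0}] * 1
    + [{"Rain": 0, "WetGrass": 0}] * 2
)

mle = estimate_cpt(data, "WetGrass", "Rain", states=[0, 1])
bayes = estimate_cpt(data, "WetGrass", "Rain", states=[0, 1], pseudo_count=1.0)
```

Here the maximum likelihood estimate assigns probability zero to the unseen combination, while the smoothed estimate keeps it small but positive, which is the usual motivation for a prior when data are sparse.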
Probabilistic Inference
Both exact and approximate inference methods are available, such as:
Variable Elimination
Belief Propagation
Message Passing Linear Programming (MPLP)
Sampling-based inference methods
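As a rough illustration of the exact versus approximate distinction, the sketch below answers the same query twice on a hypothetical three-node network (Rain -> WetGrass <- Sprinkler): once exactly by summing out the hidden variable, which is the brute-force idea that Variable Elimination organizes efficiently, and once approximately by rejection sampling. It is library-independent and not pgmpy's API; all names and probabilities are assumptions.

```python
import random

# Hypothetical CPTs; all numbers are made up.
p_rain = {True: 0.2, False: 0.8}
p_sprinkler = {True: 0.1, False: 0.9}
p_wet_true = {  # P(WetGrass=True | Rain, Sprinkler)
    (True, True): 0.99,
    (True, False): 0.9,
    (False, True): 0.8,
    (False, False): 0.0,
}

def joint(rain, sprinkler, wet):
    p_wet = p_wet_true[(rain, sprinkler)]
    return p_rain[rain] * p_sprinkler[sprinkler] * (p_wet if wet else 1.0 - p_wet)

def posterior_rain(wet):
    """Exact P(Rain | WetGrass=wet): sum out Sprinkler, then normalize."""
    unnormalized = {
        rain: sum(joint(rain, s, wet) for s in (True, False))
        for rain in (True, False)
    }
    z = sum(unnormalized.values())
    return {rain: v / z for rain, v in unnormalized.items()}

def rejection_sample_rain(wet, n_samples, seed=0):
    """Approximate P(Rain=True | WetGrass=wet) by discarding mismatching samples."""
    rng = random.Random(seed)
    kept = 0
    rain_true = 0
    for _ in range(n_samples):
        rain = rng.random() < p_rain[True]
        sprinkler = rng.random() < p_sprinkler[True]
        wet_sample = rng.random() < p_wet_true[(rain, sprinkler)]
        if wet_sample == wet:
            kept += 1
            rain_true += int(rain)
    return rain_true / kept

exact = posterior_rain(True)[True]
approx = rejection_sample_rain(True, n_samples=20000)
```

The exact answer costs a sum over every hidden configuration, which grows exponentially with network size; sampling trades that cost for an estimate whose accuracy improves with the number of accepted samples.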
Causal Inference
pgmpy supports causal reasoning through:
Do-calculus and do-operations
Identification of adjustment sets
Intervention-based effect estimation
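The role of an adjustment set can be shown with the backdoor adjustment formula, P(Y | do(X=x)) = sum over z of P(Y | X=x, Z=z) P(Z=z). The sketch below applies it to a hypothetical confounded model (Z -> X, Z -> Y, X -> Y) and contrasts the result with ordinary conditioning. It is a library-independent illustration, not pgmpy's API; the variable names and probabilities are assumptions.

```python
# Hypothetical binary model: Z confounds the effect of X on Y.
p_z = {1: 0.5, 0: 0.5}                       # P(Z=z)
p_x1_given_z = {1: 0.8, 0: 0.2}              # P(X=1 | Z=z)
p_y1_given_xz = {                            # P(Y=1 | X=x, Z=z), keyed by (x, z)
    (1, 1): 0.9,
    (1, 0): 0.6,
    (0, 1): 0.7,
    (0, 0): 0.2,
}

def p_x_given_z(x, z):
    return p_x1_given_z[z] if x == 1 else 1.0 - p_x1_given_z[z]

def interventional(x):
    """P(Y=1 | do(X=x)) via backdoor adjustment over the set {Z}."""
    return sum(p_y1_given_xz[(x, z)] * p_z[z] for z in p_z)

def observational(x):
    """P(Y=1 | X=x) by ordinary conditioning, which mixes in Z's influence."""
    num = sum(p_y1_given_xz[(x, z)] * p_x_given_z(x, z) * p_z[z] for z in p_z)
    den = sum(p_x_given_z(x, z) * p_z[z] for z in p_z)
    return num / den

causal_effect = interventional(1) - interventional(0)
naive_difference = observational(1) - observational(0)
```

In this toy model the naive difference overstates the causal effect because Z raises both the chance of X=1 and the chance of Y=1; adjusting for Z removes that spurious association.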
Workflows and Usage
pgmpy supports end-to-end workflows ranging from data ingestion and model specification to inference, simulation, and evaluation. Users can define