Inference of Graphical Causal Models: Representing the Meaningful Information of Probability Distributions
Jan Lemeire and Kris Steenhaut; JMLR W&CP 6:107-120, 2010.
Abstract
This paper studies the feasibility and interpretation of learning the causal structure from observational data with the
principles behind the Kolmogorov Minimal Sufficient Statistic (KMSS). The KMSS provides a generic solution to inductive inference.
It states that we should seek the minimal model that captures all regularities of the data. The conditional independencies
following from the system's causal structure are the regularities incorporated in a graphical causal model. The meaningful information
provided by a Bayesian network corresponds to the decomposition of the description of the system into Conditional Probability Distributions (CPDs).
The decomposition is described by the Directed Acyclic Graph (DAG). For a causal interpretation of the DAG, the decomposition should
imply modularity of the CPDs. The CPDs should correspond to distinct parts of reality that can be changed independently.
We argue that if the shortest description of the joint distribution is given by separate descriptions of the conditional distributions
for each variable given its parents in the DAG, the decomposition given by the DAG should be considered as the top-ranked causal hypothesis.
Even when the causal interpretation is faulty, it serves as a reference model. Modularity becomes, however, implausible if the concatenation
of the descriptions of some CPDs is compressible. In that case there might be a meta-mechanism governing some of the mechanisms, or a
single mechanism responsible for setting the state of multiple variables.
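
The decomposition argument can be written out compactly. The following is only a sketch in notation assumed here for illustration, not taken from the abstract: $\mathrm{Pa}(X_i)$ denotes the parents of $X_i$ in the DAG $G$, $K(\cdot)$ denotes Kolmogorov complexity, and $\stackrel{+}{=}$ denotes equality up to an additive constant.

```latex
% Assumed notation: Pa(X_i) = parents of X_i in the DAG G,
% K(.) = Kolmogorov complexity, \stackrel{+}{=} = equality up to a constant.

% Decomposition of the joint distribution into CPDs, as described by the DAG:
\[
  P(X_1, \dots, X_n) \;=\; \prod_{i=1}^{n} P\bigl(X_i \mid \mathrm{Pa}(X_i)\bigr)
\]

% Case in which the DAG is the top-ranked causal hypothesis: the shortest
% description of the joint distribution consists of the DAG plus a separate
% description of each CPD.
\[
  K(P) \;\stackrel{+}{=}\; K(G) \;+\; \sum_{i=1}^{n} K\bigl(P(X_i \mid \mathrm{Pa}(X_i))\bigr)
\]

% Case in which modularity becomes implausible: for some subset S of the
% variables, describing the CPDs jointly is substantially shorter than
% describing them one by one.
\[
  K\bigl(\{P(X_i \mid \mathrm{Pa}(X_i))\}_{i \in S}\bigr)
  \;\ll\; \sum_{i \in S} K\bigl(P(X_i \mid \mathrm{Pa}(X_i))\bigr)
\]
```

Under these assumptions, the second relation expresses that no CPD carries information about another, which is what a modular, independently changeable set of mechanisms would suggest. The third expresses the compressible case in which a meta-mechanism or a shared mechanism may be at work.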