SNAP: Sequential Non-Ancestor Pruning for Targeted Causal Effect Estimation
Mátyás Schubert¹, Tom Claassen², Sara Magliacane¹
¹University of Amsterdam
²Radboud University Nijmegen
Artificial Intelligence and Statistics (AISTATS), 2025
Abstract
Causal discovery can be computationally demanding for large numbers of variables. If we only wish to estimate the causal effects on a small subset of target variables, we might not need to learn the causal graph for all variables, but only a small subgraph that includes the targets and their adjustment sets. In this paper, we focus on identifying causal effects between target variables in a computationally and statistically efficient way. This task combines causal discovery and effect estimation, aligning the discovery objective with the effects to be estimated. We show that definite non-ancestors of the targets are unnecessary to learn causal relations between the targets and to identify efficient adjustment sets. We sequentially identify and prune these definite non-ancestors with our Sequential Non-Ancestor Pruning (SNAP) framework, which can be used either as a preprocessing step to standard causal discovery methods, or as a standalone sound and complete causal discovery algorithm. Our results on synthetic and real data show that both approaches substantially reduce the number of independence tests and the computation time without compromising the quality of causal effect estimations.
Targeted Causal Effect Estimation with an Unknown Graph
Estimating causal effects, for example to understand how human activities drive climate change, is fundamental to our scientific understanding and practical decision-making. The gold standard for this is to conduct experiments, but these can be expensive or unethical. For instance, deliberately increasing greenhouse gas emissions to study their effects would be irresponsible. Fortunately, causal effects can also be estimated from observational data, using adjustment variables. These adjustment sets are determined by the causal graph of the underlying process, which can also be estimated by causal discovery methods.
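To make the role of adjustment variables concrete, here is a minimal sketch of covariate adjustment in a linear-Gaussian model: once a valid adjustment set is known, the total causal effect is the regression coefficient of the treatment after including the adjustment variables. The toy data-generating process and the helper `effect_via_adjustment` are our own illustration, not part of the paper's method.

```python
import numpy as np

def effect_via_adjustment(X, Y, Z):
    """Estimate the causal effect of X on Y by linear regression of Y
    on X and the adjustment variables Z. In a linear-Gaussian model
    with a valid adjustment set Z, the coefficient on X equals the
    total causal effect."""
    n = len(X)
    design = np.column_stack([X, Z, np.ones(n)])  # intercept included
    coef, *_ = np.linalg.lstsq(design, Y, rcond=None)
    return coef[0]

# Toy example: Z confounds X and Y; the true effect of X on Y is 2.0.
rng = np.random.default_rng(0)
n = 50_000
Z = rng.normal(size=(n, 1))
X = 1.5 * Z[:, 0] + rng.normal(size=n)
Y = 2.0 * X - 1.0 * Z[:, 0] + rng.normal(size=n)

naive = effect_via_adjustment(X, Y, np.empty((n, 0)))  # no adjustment: biased
adjusted = effect_via_adjustment(X, Y, Z)              # adjusting for Z
```

The naive regression absorbs the confounding path through Z into the coefficient on X, while adjusting for Z recovers the true effect (up to sampling noise).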
However, discovering the full causal graph can be computationally expensive for large numbers of variables. If we are only interested in the causal effects between a small set of target variables, we may not need to learn the entire causal graph, but just a smaller subgraph that includes the targets and their statistically efficient adjustment sets.
We formalize this problem as targeted causal effect estimation with an unknown graph, which focuses on identifying the causal effects between all pairs of target variables in a computationally and statistically efficient way.
Possible Ancestors are All You Need
Discovering uninformative parts of the causal graph wastes computational resources. Conversely, removing all non-target variables risks confounded causal effect estimates and leaves fewer causal relations identifiable. Even with hints about a valid adjustment set, causal discovery might still not be able to verify all relevant causal relations.
Our goal is to find a small subset of variables, such that causal discovery on this subset still identifies efficient adjustment sets the same way as discovering the full graph would. To this end, we show that definite non-ancestors of the targets can be safely removed, leaving only the set of possible ancestors, which always contains all efficient adjustment sets. While finding the exact set of possible ancestors might require reconstructing the full graph, we can overestimate it with a possibly ancestral set, i.e. a set of variables that also contains all possible ancestors of the variables in it. We prove that causal discovery on a possibly ancestral set yields the same graph as first discovering the full graph and then restricting it to the possibly ancestral set.
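As a rough illustration of what a possibly ancestral set looks like computationally, the sketch below collects the possible ancestors of the targets in a partially directed graph by backward reachability over edges that do not point away from the targets (i.e., edges X → Y or X — Y). The graph representation and the helper `possible_ancestors` are hypothetical simplifications, not SNAP's actual routine.

```python
from collections import deque

def possible_ancestors(directed, undirected, targets):
    """Nodes with a possibly directed path to a target, i.e., a path
    on which every edge is either directed toward the target side or
    undirected. directed: set of (u, v) pairs meaning u -> v;
    undirected: set of frozenset({u, v}) meaning u - v."""
    # preds[v] = nodes u that could be a direct cause of v (u -> v or u - v)
    preds = {}
    for u, v in directed:
        preds.setdefault(v, set()).add(u)
    for e in undirected:
        u, v = tuple(e)
        preds.setdefault(v, set()).add(u)
        preds.setdefault(u, set()).add(v)
    reached = set(targets)
    queue = deque(targets)
    while queue:  # backward breadth-first search from the targets
        node = queue.popleft()
        for u in preds.get(node, ()):
            if u not in reached:
                reached.add(u)
                queue.append(u)
    return reached

# Toy graph: 0 -> 1 -> 2 - 3 and 2 -> 4; with target 2, node 4 is a
# definite non-ancestor and can be pruned.
anc = possible_ancestors({(0, 1), (1, 2), (2, 4)}, {frozenset({2, 3})}, {2})
```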
If the possibly ancestral set is significantly smaller than the total number of variables, we can save substantial computational effort without sacrificing the quality of adjustment sets. The challenge is to efficiently identify as many definite non-ancestors as possible.
Sequential Non-Ancestor Pruning
We introduce Sequential Non-Ancestor Pruning (SNAP), which iteratively identifies and removes definite non-ancestors, and provide two implementations: SNAP$(k)$ and SNAP$(\infty)$. SNAP$(k)$ is a constraint-based causal discovery algorithm that adapts the PC-style skeleton search. It iteratively increases the conditioning set size of conditional independence (CI) tests, starting from order 0. At every iteration, it orients v-structures using the intermediate skeleton and the discovered separating sets to identify and prune some definite non-ancestors of the targets before proceeding to the next iteration.
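The outer loop of a SNAP$(k)$-style search could be sketched as follows. Here `ci_test` and `find_non_ancestors` are hypothetical callbacks standing in for the CI oracle and for SNAP's pruning rule, and this simplified sketch omits the orientation subtleties the paper discusses below.

```python
from itertools import combinations

def snap_skeleton(nodes, targets, ci_test, max_order, find_non_ancestors):
    """Simplified SNAP(k)-style skeleton search. ci_test(x, y, S) returns
    True if x and y are independent given S; find_non_ancestors inspects
    the current skeleton and separating sets and returns definite
    non-ancestors of the targets, which are pruned between orders."""
    adj = {v: set(nodes) - {v} for v in nodes}  # start fully connected
    sepsets = {}
    active = set(nodes)
    for order in range(max_order + 1):
        for x in list(active):
            for y in list(adj[x] & active):
                # test conditioning sets of the current order only
                for S in combinations((adj[x] & active) - {y}, order):
                    if ci_test(x, y, set(S)):
                        adj[x].discard(y); adj[y].discard(x)
                        sepsets[frozenset({x, y})] = set(S)
                        break
        # prune definite non-ancestors before the next, costlier order
        active -= find_non_ancestors(adj, sepsets, targets) - set(targets)
    return {v: adj[v] & active for v in active}, sepsets

def chain_ci(x, y, S):
    # d-separation oracle for the toy chain 0 -> 1 -> 2
    return {x, y} == {0, 2} and 1 in S

skel, seps = snap_skeleton([0, 1, 2], {2}, chain_ci, max_order=1,
                           find_non_ancestors=lambda adj, seps, t: set())
```

On the toy chain, the spurious edge between 0 and 2 is removed at order 1 with separating set {1}, leaving the true skeleton 0 — 1 — 2.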
Orienting v-structures based on intermediate skeletons and CI test results restricted to a maximum order requires particular care. For instance, we need to handle conflicting v-structures, and additional orientation steps are needed from order 2 onwards. We discuss these details and provide the complete pseudocode in the full paper.
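For intuition, the basic v-structure rule orients x → z ← y for every unshielded triple x — z — y whose middle node is absent from the separating set of x and y. The sketch below shows only this classic rule; SNAP's conflict handling and order-restricted corrections are omitted, and the dictionary-based graph representation is our own.

```python
def orient_v_structures(adj, sepsets):
    """Return arrowheads as (u, v) pairs, meaning an arrowhead at v on
    the edge u - v. For an unshielded triple x - z - y (x, y not
    adjacent), orient x -> z <- y whenever z is not in sepset(x, y).
    Conflicting orientations simply accumulate here, a crude stand-in
    for the conflict handling discussed above."""
    arrows = set()
    for z in adj:
        nbrs = sorted(adj[z])
        for i, x in enumerate(nbrs):
            for y in nbrs[i + 1:]:
                if y not in adj[x] and \
                        z not in sepsets.get(frozenset({x, y}), set()):
                    arrows.add((x, z))
                    arrows.add((y, z))
    return arrows

# Toy collider 0 -> 2 <- 1: skeleton 0 - 2, 1 - 2, with 0 and 1
# separated by the empty set, so 2 is not in their sepset.
arrows = orient_v_structures({0: {2}, 1: {2}, 2: {0, 1}},
                             {frozenset({0, 1}): set()})
```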
SNAP$(k)$ considers fewer and fewer variables as the conditioning set size increases, significantly reducing the number of higher-order CI tests. We can stop SNAP$(k)$ at any iteration $k$ and run a standard causal discovery algorithm on the remaining variables, which always form a possibly ancestral set, to identify valid and efficient adjustment sets. We call this approach prefiltering with SNAP$(k)$.
SNAP$(\infty)$ extends SNAP$(k)$ into a stand-alone causal discovery algorithm. It runs SNAP$(k)$ until completion, then applies Meek's rules, and identifies and removes non-ancestors one final time. The result is the full graph restricted to the possible ancestors of the targets, ensuring the same valid and efficient adjustment sets as full-graph discovery.
Experiments
We compare SNAP$(\infty)$ with global and local causal discovery algorithms on synthetic and semi-synthetic graphs across multiple domains. SNAP$(\infty)$ consistently ranks among the best in the number of CI tests and computation time, while maintaining a comparable intervention distance. Other methods vary in performance depending on the domain.
On the right, we show that prefiltering with SNAP$(0)$ substantially speeds up discovery. Increasing the prefiltering iterations $k$ further reduces the number of CI tests, especially in denser graphs. SNAP variants also demonstrate significant efficiency gains on semi-synthetic data generated from the MAGIC-NIAB network.