SNAP: Sequential Non-Ancestor Pruning

SNAP: Sequential Non-Ancestor Pruning
for Targeted Causal Effect Estimation

¹University of Amsterdam
²Radboud University Nijmegen

Artificial Intelligence and Statistics (AISTATS), 2025

Abstract

Causal discovery can be computationally demanding for large numbers of variables. If we only wish to estimate the causal effects on a small subset of target variables, we might not need to learn the causal graph for all variables, but only a small subgraph that includes the targets and their adjustment sets. In this paper, we focus on identifying causal effects between target variables in a computationally and statistically efficient way. This task combines causal discovery and effect estimation, aligning the discovery objective with the effects to be estimated. We show that definite non-ancestors of the targets are unnecessary to learn causal relations between the targets and to identify efficient adjustments sets. We sequentially identify and prune these definite non-ancestors with our Sequential Non-Ancestor Pruning (SNAP) framework, which can be used either as a preprocessing step to standard causal discovery methods, or as a standalone sound and complete causal discovery algorithm. Our results on synthetic and real data show that both approaches substantially reduce the number of independence tests and the computation time without compromising the quality of causal effect estimations.

Targeted Causal Effect Estimation with an Unknown Graph

Estimating causal effects, for example to understand how human activities drive climate change, is fundamental to our scientific understanding and practical decision-making. The gold standard for this is to conduct experiments, but these can be expensive or unethical. For instance, deliberately increasing greenhouse gas emissions to study their effects would be irresponsible. Fortunately, causal effects can also be estimated from observational data, using adjustment variables. These adjustment sets are determined by the causal graph of the underlying process, which can also be estimated by causal discovery methods.

Causal discovery estimates the causal graph, from which valid adjustment sets can be read off for causal effect estimation.

However, discovering the full causal graph can be computationally expensive for large numbers of variables. If we are only interested in the causal effects between a small set of target variables, we may not need to learn the entire causal graph, but just a smaller subgraph that includes the targets and their statistically efficient adjustment sets.

Removing \(V_3\) and \(V_5\) in the previous example decreases the cost of causal discovery, while yielding the same efficient adjustment sets.

We formalize this problem as targeted causal effect estimation with an unknown graph, which focuses on identifying the causal effects between all pairs of target variables in a computationally and statistically efficient way.

Possible Ancestors are All You Need

Discovering uninformative parts of the causal graph wastes computational resources. Conversely, removing all non-target variables risks confounded causal effects and fewer identifiable causal relations. Even with hints about a valid adjustment set, causal discovery might still not be able to verify all relevant causal relations.

Considering only the targets for causal discovery might not identify their causal relations or valid adjustment sets.

Also considering a valid adjustment set for causal discovery might still not identify all relevant causal relations.

Our goal is to find a small subset of variables, such that causal discovery on this subset still identifies efficient adjustment sets the same way as discovering the full graph would. To this end, we show that definite non-ancestors of the targets can be safely removed, leaving only the set of possible ancestors, which always contains all efficient adjustment sets. While finding the exact set of possible ancestors might require reconstructing the full graph, we can overestimate it with a possibly ancestral set, i.e. a set of variables that also contains all possible ancestors of the variables in it. We prove that causal discovery on a possibly ancestral set yields the same graph as first discovering the full graph and then restricting it to the possibly ancestral set.

Causal discovery on a possibly ancestral set yields the same graph as first discovering the full graph and then restricting it to that set.

If the possibly ancestral set is significantly smaller than the total number of variables, we can save substantial computational effort without sacrificing the quality of adjustment sets. The challenge is efficiently identifying as many definite non-ancestors as possible.

Sequential Non-Ancestor Pruning

We introduce Sequential Non-Ancestor Pruning (SNAP), which iteratively identifies and removes definite non-ancestors, and provide two implementations: SNAP(\(k\)) and SNAP(\(\infty\)). SNAP(\(k\)) is a constraint-based causal discovery algorithm that adapts the PC-style skeleton search. It iteratively increases conditioning set sizes of conditional independence (CI) tests starting from order 0. At every iteration, it orients v-structures using the intermediate skeleton and discovered separating sets to identify and prune some definite non-ancestors of the targets before proceeding to the next iteration.

SNAP(\(k\)) progressively identifies and prunes non-ancestors by orienting v-structures at every order of the PC-style skeleton search.

Orienting v-structures based with intermediate skeletons and CI test results restricted to a maximum order requires particular care. For instance, we need to handle conflicting v-structures, and additional steps are needed for proper orientation starting from order 2. We discuss these details and provide the complete pseudocode in the full paper.

SNAP(\(k\)) considers fewer and fewer variables as conditioning set sizes increase, reducing higher-order CI tests significantly. We can stop SNAP(\(k\)) at any iteration k and run a standard causal discovery algorithm on the remaining variables, which always form a possibly ancestral set, to identify valid and efficient adjustment sets. We call this approach pre-filtering with SNAP(\(k\)).

Pre-filtering with SNAP(\(k\)) can efficiently reduce the number of variables before causal discovery.

SNAP(\(\infty\)) extends SNAP(\(k\)) into a stand-alone causal discovery algorithm. It runs SNAP(\(k\)) until completion, then applies Meek's rules, and identifies and removes non-ancestors one final time. The result is the full graph restricted to the possible ancestors of the targets, ensuring the same valid and efficient adjustment sets as full-graph discovery.

SNAP(\(\infty\)) needs much fewer CI tests than PC to find optimal adjustment sets for the causal effects between targets, by progressively removing definite non-ancestors.

Experiments

We compare SNAP(\(\infty\)) with global and local causal discovery algorithms on synthetic and semisynthetic graphs across multiple domains. SNAP(\(\infty\)) consistently ranks among the best in the number of CI tests and computation time, while maintaining a comparable intervention distance. Other methods vary in performance depending on the domain. Furthermore, pre-filtering with SNAP(0) can substantially speed up discovery. Interact with the plots below to see how pre-filtering with SNAP(0) impacts baseline methods.

Number of CI tests and computation time over synthetic graphs with different number of nodes and 4 targets.

SNAP variants also demonstrate significant efficiency gains on semi-synthetic data generated from realistic networks.

	CI tests	Time
PC	\(12807\ (\pm 2086)\)	\(79.3\ (\pm 24.7)\)
PC-SNAP(0)	\(\mathbf{955\ (\pm 10)}\)	\(0.5\ (\pm 0.1)\)
MARVEL	\(8873\ (\pm 3056)\)	\(27.3\ (\pm 9.4)\)
MARVEL-SNAP(0)	\(960\ (\pm 5)\)	\(0.6\ (\pm 0.1)\)
LDECC*	\(18142\ (\pm 2608)\)	\(19.2\ (\pm 4.0)\)
LDECC*-SNAP(0)	\(981\ (\pm 23)\)	\(0.8\ (\pm 0.1)\)
MB-by-MB*	\(11464\ (\pm 1995)\)	\(25.7\ (\pm 4.7)\)
MB-by-MB*-SNAP(0)	\(972\ (\pm 17)\)	\(0.7\ (\pm 0.2)\)
FGES	\(-\)	\(0.7\ (\pm 0.1)\)
FGES-SNAP(0)	\(-\)	\(\mathbf{0.4\ (\pm 0.1)}\)
\(\text{SNAP}(\infty)\)	\(\mathbf{955\ (\pm 10)}\)	\(0.6\ (\pm 0.2)\)

Results for the MAGIC-NIAB network using semi-synthetic data.

Citation

@inproceedings{schubert2025snap, title={{SNAP}: Sequential Non-Ancestor Pruning for Targeted Causal Effect Estimation With an Unknown Graph}, author={M{\'a}ty{\'a}s Schubert and Tom Claassen and Sara Magliacane}, booktitle={The 28th International Conference on Artificial Intelligence and Statistics}, year={2025}, url={https://openreview.net/forum?id=0gEjlLdjK9}}