Process Tracing by Hand

Process Tracing by Hand: A Walkthrough

What is this tool?

This tool helps you perform process tracing using causal models. Process tracing involves:

  • Starting with a causal model (DAG) representing your theory
  • Observing data on a single case (values of nodes in your model)
  • Asking a causal question (query) about that case
  • Calculating the probability that your query is true, given the observed data

The Logic Behind It

The calculation works by:

  1. Identifying causal types: Enumerating every way the case could work under the model. A causal type is one combination of nodal types, with one nodal type per node
  2. Filtering to types consistent with data: Which types could have generated the observed data?
  3. Calculating the denominator: Total probability of all types consistent with data
  4. Calculating the numerator: Probability of types consistent with data and the query
  5. Computing the posterior: Numerator / Denominator = probability query is true given the data
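
The five steps above can be written as a single expression. The notation here is an assumption introduced for illustration and does not come from the tool: θ ranges over causal types, p(θ) is the prior probability of a type, D is the observed data, Q is the query, and 1[·] equals 1 when the condition holds and 0 otherwise.

```latex
\Pr(Q \mid D)
  = \frac{\sum_{\theta} p(\theta)\,\mathbf{1}[\theta \text{ consistent with } D]\,\mathbf{1}[\theta \text{ satisfies } Q]}
         {\sum_{\theta} p(\theta)\,\mathbf{1}[\theta \text{ consistent with } D]}
```

The denominator sums over every type that could have produced the data; the numerator keeps only those types for which the query also holds.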

Example Walkthrough: SXCRY Model

Let's work through an example step by step using the model from the solutions document.

Step 1: Define the Model

Enter this model specification:

S -> C -> Y <- R <- X; X -> C -> R

This creates a model with five nodes (S, X, C, R, and Y) and six edges:

  • S → C and X → C (C depends on S and X)
  • X → R and C → R (R depends on X and C)
  • C → Y and R → Y (Y depends on C and R)

Click "Create Model" to build the model.
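
To check how the specification string breaks down into individual edges, here is a minimal Python sketch. It is not the tool's own parser: the function names `parse_dag` and `parents_of` are hypothetical, and sorting each node's parents alphabetically is a choice made here for readability.

```python
import re

def parse_dag(spec):
    """Return a sorted list of (parent, child) edges from a spec string."""
    edges = set()
    for chain in spec.split(";"):
        # Alternating node names and arrows, e.g. ["S", "->", "C", "->", "Y"]
        tokens = re.findall(r"<-|->|\w+", chain)
        nodes, arrows = tokens[0::2], tokens[1::2]
        for left, arrow, right in zip(nodes, arrows, nodes[1:]):
            # "A -> B" means A causes B; "A <- B" means B causes A.
            edges.add((left, right) if arrow == "->" else (right, left))
    return sorted(edges)

def parents_of(edges):
    """Map each node to the sorted list of its parents."""
    parents = {}
    for parent, child in edges:
        parents.setdefault(parent, [])
        parents.setdefault(child, []).append(parent)
    return {node: sorted(ps) for node, ps in parents.items()}

edges = parse_dag("S -> C -> Y <- R <- X; X -> C -> R")
parents = parents_of(edges)

for node in sorted(parents):
    print(node, "<-", parents[node])
```

S and X come out with no parents, while C, R, and Y each have two. A binary node with two binary parents has 2^4 = 16 possible nodal types, which is why the type labels used in the next step have four digits.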

Step 2: Set Restrictions (Optional)

You can restrict which nodal types are allowed for each node, which rules out causal mechanisms your theory treats as impossible. For this example, we'll restrict:

  • C: Keep types "1110" and "1111"
  • R: Keep types "0001" and "0000"
  • Y: Keep type "0001"

To do this:

  1. Select "C" from the node dropdown
  2. Choose "Keep selected types"
  3. Check "1110" and "1111"
  4. Click "Apply Restriction"
  5. Repeat for R and Y nodes
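
To see how much these restrictions shrink the space of causal types, the sketch below counts the types before and after. It is an illustration only and does not call the tool: the parent counts are those implied by the model specification, and `nodal_types` and `count_causal_types` are hypothetical helper names.

```python
from itertools import product

# Number of parents per node, as implied by the model specification.
n_parents = {"S": 0, "X": 0, "C": 2, "R": 2, "Y": 2}

def nodal_types(k):
    """All nodal-type labels for a binary node with k binary parents."""
    return ["".join(bits) for bits in product("01", repeat=2 ** k)]

all_types = {node: nodal_types(k) for node, k in n_parents.items()}

# Restrictions from the walkthrough ("Keep selected types").
keep = {"C": {"1110", "1111"}, "R": {"0001", "0000"}, "Y": {"0001"}}
restricted = {
    node: [t for t in types if t in keep.get(node, types)]
    for node, types in all_types.items()
}

def count_causal_types(types_by_node):
    """A causal type picks one nodal type per node, so the counts multiply."""
    total = 1
    for types in types_by_node.values():
        total *= len(types)
    return total

before = count_causal_types(all_types)
after = count_causal_types(restricted)
print(before, "causal types before restrictions;", after, "after")
```

Under these assumptions the unrestricted model has 2 × 2 × 16 × 16 × 16 = 16384 causal types, and the restricted model has 2 × 2 × 2 × 2 × 1 = 16.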

Step 3: Input Data for One Case

For this example, let's say we observe:

  • S = 1
  • X = 1
  • C = 0
  • R = 0
  • Y = 0

Set these values using the radio buttons that appear after creating the model.

Step 4: Define a Query

Enter a causal query. For example:

Y[S=1] < Y[S=0]

This asks: "Is Y lower when S is set to 1 than when S is set to 0?" In other words, does S have a negative effect on Y in this case?

Other example queries you could try:

  • Y[S=1] > Y[S=0] - Does S=1 cause Y to be higher?
  • Y[X=1] == Y[X=0] - Does X have no effect on Y?
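
A query such as Y[S=1] < Y[S=0] is evaluated one causal type at a time: set S to each value, let the other nodes respond according to their nodal types, and compare the resulting values of Y. The sketch below does this for a single causal type. It rests on an assumed label convention (digit i of a label is the node's response when its parents' values, with the first parent as the lowest bit, encode the number i); the tool's actual convention may differ, although the labels used in this walkthrough have identical middle digits and so are not sensitive to parent order. The names `respond` and `simulate` are hypothetical.

```python
PARENTS = {"S": [], "X": [], "C": ["S", "X"], "R": ["X", "C"], "Y": ["C", "R"]}
ORDER = ["S", "X", "C", "R", "Y"]  # every node appears after its parents

def respond(label, parent_values):
    """Value a node takes under nodal type `label`, given its parents' values.

    ASSUMPTION: the first parent is the lowest bit of the digit index.
    """
    index = sum(v << i for i, v in enumerate(parent_values))
    return int(label[index])

def simulate(causal_type, do=None):
    """Compute every node's value; `do` forces chosen nodes to fixed values."""
    do = do or {}
    values = {}
    for node in ORDER:
        if node in do:
            values[node] = do[node]
        else:
            inputs = [values[p] for p in PARENTS[node]]
            values[node] = respond(causal_type[node], inputs)
    return values

# One causal type allowed by the Step 2 restrictions.
causal_type = {"S": "1", "X": "1", "C": "1110", "R": "0001", "Y": "0001"}

y_s1 = simulate(causal_type, do={"S": 1})["Y"]
y_s0 = simulate(causal_type, do={"S": 0})["Y"]
negative_effect = y_s1 < y_s0  # Y[S=1] < Y[S=0]
print(y_s1, y_s0, negative_effect)
```

For this type, setting S to 0 switches C on, which switches R on and then Y on, so the query holds. Replacing the R type with "0000" breaks that chain and the query no longer holds.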

Step 5: Calculate and Interpret Results

Click "Calculate" to see the results:

  • Denominator: The total probability of all causal types that are consistent with the observed data (S=1, X=1, C=0, R=0, Y=0). This represents all the ways the world could work that would produce this data pattern.
  • Numerator: The probability of causal types that are consistent with the data and satisfy the query. This represents the ways the world could work that produce the data and make the query true.
  • Posterior: Numerator / Denominator. This is the probability that your query is true, given the observed data. A value close to 1 means the query is very likely true; close to 0 means it's very unlikely.

Understanding the Detailed Table

The detailed results table shows:

  • All causal types consistent with the observed data
  • Whether each type satisfies the query (highlighted in green if Yes)
  • The prior probability of each type
  • The rescaled prior of each type (its prior divided by the denominator, so that the rescaled values sum to 1 across the types consistent with the data)

The posterior is simply the sum of rescaled priors for all types where "in_query" is Yes.
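
The whole calculation for the SXCRY example can be reproduced by hand or with a short script. The sketch below is an independent illustration, not the tool's code. It assumes flat priors over the nodal types that survive the Step 2 restrictions, and the same label convention as the tool (first parent as the lowest bit of the digit index). The column names other than "in_query" are hypothetical, and the tool's own table may be laid out differently.

```python
from fractions import Fraction
from itertools import product

PARENTS = {"S": [], "X": [], "C": ["S", "X"], "R": ["X", "C"], "Y": ["C", "R"]}
ORDER = ["S", "X", "C", "R", "Y"]
# Nodal types kept after the Step 2 restrictions (S and X are unrestricted).
TYPES = {"S": ["0", "1"], "X": ["0", "1"], "C": ["1110", "1111"],
         "R": ["0000", "0001"], "Y": ["0001"]}
DATA = {"S": 1, "X": 1, "C": 0, "R": 0, "Y": 0}

def simulate(ctype, do=None):
    """Values of all nodes under a causal type, with optional interventions."""
    do = do or {}
    values = {}
    for node in ORDER:
        if node in do:
            values[node] = do[node]
        else:
            # ASSUMPTION: first parent is the lowest bit of the digit index.
            index = sum(values[p] << i for i, p in enumerate(PARENTS[node]))
            values[node] = int(ctype[node][index])
    return values

def in_query(ctype):
    """Query from Step 4: Y[S=1] < Y[S=0]."""
    return simulate(ctype, {"S": 1})["Y"] < simulate(ctype, {"S": 0})["Y"]

rows = []
for combo in product(*(TYPES[n] for n in ORDER)):
    ctype = dict(zip(ORDER, combo))
    if simulate(ctype) != DATA:
        continue  # this type could not have produced the observed data
    prior = Fraction(1)
    for n in ORDER:
        prior *= Fraction(1, len(TYPES[n]))  # ASSUMPTION: flat priors
    rows.append({"type": ".".join(combo), "in_query": in_query(ctype),
                 "prior": prior})

denominator = sum(r["prior"] for r in rows)
numerator = sum(r["prior"] for r in rows if r["in_query"])
for r in rows:
    r["rescaled"] = r["prior"] / denominator
posterior = numerator / denominator

for r in rows:
    print(r["type"], "Yes" if r["in_query"] else "No", r["prior"], r["rescaled"])
print("denominator =", denominator, "numerator =", numerator,
      "posterior =", posterior)
```

Under these assumptions two causal types are consistent with the data, differing only in the type of R, and each has prior 1/16. Only the type with R = "0001" satisfies the query, so the denominator is 1/8, the numerator is 1/16, and the posterior is 1/2. Different priors would change these numbers.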


Key Insights

  • Process tracing is theory-dependent: The same data can support different conclusions depending on your model and priors
  • The posterior depends on both the data and your prior beliefs about how the world works
  • Restrictions allow you to incorporate theoretical knowledge about which causal mechanisms are possible
  • The calculation is transparent: you can see exactly which types contribute to the numerator and denominator