Understanding Interaction Effects in Clinical Trials

Clinical Trial Scenario: AntiHyp Drug Study

A clinical trial is studying a new antihypertensive medication (AntiHyp) across different patient groups. Researchers want to know whether the drug works differently for male and female patients.

Variables:

Treatment: Drug dosage level (standardized units, ranging from -2 to 2)
Group: Patient biological sex
Response (Y): Reduction in blood pressure (mm Hg)

How We Encode the Categorical Variable

We code patient group as a number: Female = -1, Male = +1.

With equal group sizes, this ±1 encoding centers the variable at zero. The benefit: β₀ becomes the average of the two groups' expected responses at Treatment = 0 (not the response for one group alone), and the math for interpreting interactions stays clean. You'll see two clusters of points in the plots below, one for each group.
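To make the role of β₀ concrete, here is a minimal Python sketch. It assumes NumPy is available, and the six responses are made-up numbers, not trial data. With only the Group term in the model, the fitted β₀ lands midway between the two group means and β₂ is half the gap between them.

```python
import numpy as np

# Hypothetical patient labels; the mapping follows the text (Female = -1, Male = +1).
sex = np.array(["Female", "Male", "Male", "Female", "Female", "Male"])
group = np.where(sex == "Male", 1.0, -1.0)

# Made-up blood pressure reductions (mm Hg), for illustration only.
y = np.array([4.0, 9.0, 11.0, 6.0, 5.0, 10.0])

# Least-squares fit of Y = b0 + b2 * Group.
X = np.column_stack([np.ones_like(group), group])
b0, b2 = np.linalg.lstsq(X, y, rcond=None)[0]

female_mean = y[group == -1].mean()
male_mean = y[group == 1].mean()

print("b0:", b0, "midpoint of group means:", (female_mean + male_mean) / 2)
print("b2:", b2, "half the gap:", (male_mean - female_mean) / 2)
```

Under 0/1 coding the intercept would instead equal the mean of whichever group is coded 0.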

Let's look at the raw data. Do these groups seem to respond differently to treatment?

We want to build a model to predict blood pressure reduction:

$$Y = \beta_0 + \beta_1 \text{Treatment} + \beta_2 \text{Group} + \text{???}$$

What should go in place of ??? to capture how treatment effect might differ by group?

First Attempt: Ignoring Interactions

$$Y = \beta_0 + \beta_1 \text{Treatment} + \beta_2 \text{Group}$$

This model assumes treatment works equally well for both groups: both lines share the slope β₁. The only difference between groups is a vertical shift. With the ±1 coding, the Male line sits at β₀ + β₂ and the Female line at β₀ − β₂, a gap of 2β₂.

Look at the residuals. If the model fit well, they would scatter randomly around zero. Instead, notice how the colors separate—Female residuals trend one direction while Male residuals trend the other. The model is systematically wrong. It's missing something.
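The residual pattern can be reproduced on simulated data. The sketch below assumes NumPy and invents its own coefficients (10, 3, 2, 1.5) and noise level; none of these come from the AntiHyp study. It generates data in which the treatment slope differs by group, fits the additive model, and measures how the residuals trend with Treatment inside each group.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
treatment = rng.uniform(-2, 2, n)                    # standardized dose
group = np.where(np.arange(n) % 2 == 0, -1.0, 1.0)   # Female = -1, Male = +1

# Assumed data-generating process (illustrative values only).
y = 10 + 3 * treatment + 2 * group + 1.5 * treatment * group + rng.normal(0, 0.5, n)

# Additive model: intercept, Treatment, Group.
X = np.column_stack([np.ones(n), treatment, group])
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
resid = y - X @ beta

# Slope of the residuals against Treatment, computed separately per group.
slope_female = np.polyfit(treatment[group == -1], resid[group == -1], 1)[0]
slope_male = np.polyfit(treatment[group == 1], resid[group == 1], 1)[0]

print("residual slope, Female:", slope_female)
print("residual slope, Male:  ", slope_male)
```

If the model were adequate, both residual slopes would be near zero. Slopes with opposite signs are the numerical counterpart of the color separation in the residual plot.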

Second Attempt: Adding an "Interaction" Term

A colleague suggests: "We need to account for how Treatment and Group interact. Let's add a term that combines them: (Treatment + Group)."

What do you think—will this capture the interaction effect?

$$Y = \beta_0 + \beta_1 \text{Treatment} + \beta_2 \text{Group} + \beta_3(\text{Treatment} + \text{Group})$$

The lines are still parallel. The residuals still show the same pattern. Adding the variables together didn't help. Why?

The Problem: Linear Dependence

We can rearrange the additive model:

$$\begin{aligned}
Y &= \beta_0 + \beta_1 \text{Treatment} + \beta_2 \text{Group} + \beta_3(\text{Treatment} + \text{Group}) \\
&= \beta_0 + (\beta_1 + \beta_3)\text{Treatment} + (\beta_2 + \beta_3)\text{Group}
\end{aligned}$$

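The "interaction" term just gets absorbed into the existing coefficients! The data can only determine the sums β₁ + β₃ and β₂ + β₃, never β₃ on its own, so this model is mathematically equivalent to the first one. No new information is added.

In matrix terms: the column for (Treatment + Group) is exactly the sum of the Treatment and Group columns, so the design matrix is rank-deficient and the least-squares coefficients are not unique.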
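A quick numerical check of the rank argument, as a sketch assuming NumPy and arbitrary simulated inputs (the coefficient vectors are hypothetical, chosen so that β₁ + β₃ and β₂ + β₃ match):

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
ones = np.ones(n)
treatment = rng.uniform(-2, 2, n)
group = np.where(np.arange(n) % 2 == 0, -1.0, 1.0)

# Design matrix for the "sum" model: four columns, but the last one
# is the sum of columns two and three.
X_sum = np.column_stack([ones, treatment, group, treatment + group])
rank_sum = np.linalg.matrix_rank(X_sum)
print("columns:", X_sum.shape[1], "rank:", rank_sum)  # rank is 3, not 4

# Two different coefficient vectors with the same b1 + b3 and b2 + b3.
beta_a = np.array([10.0, 3.0, 2.0, 0.0])
beta_b = np.array([10.0, 2.0, 1.0, 1.0])
same_predictions = np.allclose(X_sum @ beta_a, X_sum @ beta_b)
print("identical predictions:", same_predictions)
```

Because different coefficient vectors give the same predictions, no amount of data can tell them apart.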

The Key Insight: Multiplication

$$Y = \beta_0 + \beta_1 \text{Treatment} + \beta_2 \text{Group} + \beta_3(\text{Treatment} \times \text{Group})$$

Now the lines diverge! The residuals scatter randomly—no more systematic pattern by group. The model finally captures the different treatment effects.
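The improvement can be checked numerically. The following sketch assumes NumPy and a simulated data set whose coefficients (10, 3, 2, 1.5) are invented for illustration; `fit` is a hypothetical helper, not part of any library. It fits the additive and multiplicative models to the same data and compares their residual sums of squares.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
treatment = rng.uniform(-2, 2, n)
group = np.where(np.arange(n) % 2 == 0, -1.0, 1.0)

# Assumed data-generating process with a true interaction of 1.5.
y = 10 + 3 * treatment + 2 * group + 1.5 * treatment * group + rng.normal(0, 0.5, n)

X_add = np.column_stack([np.ones(n), treatment, group])
X_int = np.column_stack([X_add, treatment * group])  # adds the product column


def fit(X, y):
    """Ordinary least squares; returns coefficients and residuals."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return beta, y - X @ beta


beta_add, resid_add = fit(X_add, y)
beta_int, resid_int = fit(X_int, y)

rss_add = resid_add @ resid_add
rss_int = resid_int @ resid_int

print("RSS additive:      ", rss_add)
print("RSS multiplicative:", rss_int)
print("estimated b3:      ", beta_int[3])
```

The estimated β₃ should land close to the value used to generate the data, and the residual sum of squares should drop sharply once the product column is included.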

Why Multiplication Works

The slope of Y with respect to Treatment now depends on the group:

$$\frac{\partial Y}{\partial \text{Treatment}} = \beta_1 + \beta_3 \cdot \text{Group}$$

For Females (Group = -1): slope = β₁ - β₃
For Males (Group = +1): slope = β₁ + β₃
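
Substituting the two group codes into the full model makes both fitted lines explicit (a worked restatement of the equations above; nothing new is assumed):

$$\text{Female: } Y = (\beta_0 - \beta_2) + (\beta_1 - \beta_3)\,\text{Treatment}$$

$$\text{Male: } Y = (\beta_0 + \beta_2) + (\beta_1 + \beta_3)\,\text{Treatment}$$

The slopes differ by 2β₃, so β₃ = 0 recovers the parallel lines of the first model.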

Multiplication creates a new piece of information that cannot be replicated by adjusting other coefficients. As long as Treatment varies within both groups, the product column is linearly independent of the intercept, Treatment, and Group columns.
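The independence claim can be tested directly. This sketch assumes NumPy and arbitrary simulated inputs. It tries to rebuild the product column from the intercept, Treatment, and Group columns and reports how much is left over.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50
ones = np.ones(n)
treatment = rng.uniform(-2, 2, n)
group = np.where(np.arange(n) % 2 == 0, -1.0, 1.0)
product = treatment * group

X_base = np.column_stack([ones, treatment, group])
X_prod = np.column_stack([X_base, product])

# Best least-squares reconstruction of the product column from the other three.
coef, *_ = np.linalg.lstsq(X_base, product, rcond=None)
leftover = product - X_base @ coef

rank_prod = np.linalg.matrix_rank(X_prod)
leftover_norm = np.linalg.norm(leftover)

print("rank with product column:", rank_prod)  # full rank: 4
print("norm of what the other columns cannot explain:", leftover_norm)
```

A leftover norm well above zero means the product column is not a linear combination of the others, so β₃ is identifiable.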

Explore: Build Your Own Interaction

Now it's your turn. Adjust the sliders to set the true interaction strength and noise level, then generate data to see how well the multiplicative model captures it.
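
For readers without the interactive sliders, the same experiment can be approximated offline. This is a sketch under stated assumptions: `simulate_and_fit`, its parameter names, the baseline coefficients (10, 3, 2), and the default sample size are all hypothetical and are not taken from the page's own implementation.

```python
import numpy as np


def simulate_and_fit(interaction, noise, n=200, seed=0):
    """Simulate data with a chosen interaction strength and noise level,
    then fit the multiplicative model by ordinary least squares.

    Returns the coefficients [b0, b1, b2, b3] and the R-squared of the fit.
    """
    rng = np.random.default_rng(seed)
    treatment = rng.uniform(-2, 2, n)
    group = np.where(np.arange(n) % 2 == 0, -1.0, 1.0)
    y = (10 + 3 * treatment + 2 * group
         + interaction * treatment * group
         + rng.normal(0, noise, n))

    X = np.column_stack([np.ones(n), treatment, group, treatment * group])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    r2 = 1 - (resid @ resid) / np.sum((y - y.mean()) ** 2)
    return beta, r2


# Same interaction strength, two noise levels.
beta_low, r2_low = simulate_and_fit(interaction=1.5, noise=0.5)
beta_high, r2_high = simulate_and_fit(interaction=1.5, noise=5.0)

print("low noise:  b3 =", beta_low[3], " R^2 =", r2_low)
print("high noise: b3 =", beta_high[3], " R^2 =", r2_high)
```

Raising the noise should lower R² and make the β₃ estimate less precise, while the model structure stays correct.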
