A clinical trial is studying a new antihypertensive medication (AntiHyp) across different patient groups. Researchers want to understand whether the drug works differently for male versus female patients.
Variables:
We code patient group as a number: Female = -1, Male = +1.
This ±1 encoding centers the variable at zero when the two groups are the same size. The benefit: β₀ becomes the average across the two groups (the midpoint of the female and male intercepts) rather than the value for one reference group, and the math for interpreting interactions stays clean. You'll see two clusters of points in the plots below, one for each group.
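To make the intercept claim concrete, here is a minimal sketch that compares ±1 coding with 0/1 coding on a tiny balanced dataset. The six blood pressure reductions are made up for illustration and are not trial data; the helper `fit_group_only` is likewise a hypothetical name.

```python
import numpy as np

# Made-up blood pressure reductions (mmHg) for a balanced design; not trial data.
y_female = np.array([7.0, 8.0, 9.0])     # group mean 8
y_male = np.array([11.0, 12.0, 13.0])    # group mean 12
y = np.concatenate([y_female, y_male])

group_pm1 = np.array([-1, -1, -1, 1, 1, 1], dtype=float)  # Female = -1, Male = +1
group_01 = (group_pm1 + 1) / 2                             # Female = 0, Male = 1

def fit_group_only(group):
    """Least-squares fit of y = b0 + b2 * group (hypothetical helper)."""
    X = np.column_stack([np.ones_like(group), group])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

b_pm1 = fit_group_only(group_pm1)  # intercept = midpoint of the two group means
b_01 = fit_group_only(group_01)    # intercept = mean of the group coded 0 (Female)

print("±1 coding:  b0 = %.1f, b2 = %.1f" % tuple(b_pm1))
print("0/1 coding: b0 = %.1f, b2 = %.1f" % tuple(b_01))
```

With ±1 coding the intercept lands at 10, halfway between the group means, and the group coefficient is half the gap between them. With 0/1 coding the intercept is the female mean and the coefficient is the full gap.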
Let's look at the raw data. Do these groups seem to respond differently to treatment?
We want to build a model to predict blood pressure reduction:
$$Y = \beta_0 + \beta_1 \text{Treatment} + \beta_2 \text{Group} + \text{???}$$

What should go in place of ??? to capture how treatment effect might differ by group?
Let's start simple and fit a model with just the main effects:
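This model assumes treatment works equally well for both groups: both fitted lines share the slope β₁. The only difference between groups is a vertical shift. With ±1 coding, β₂ moves the male line up by β₂ and the female line down by β₂, so the two parallel lines sit 2β₂ apart.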
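As a rough sketch of what the main-effects fit looks like in code, the example below simulates data and fits the model by least squares. Everything about the simulation is an assumption made for illustration: Treatment is treated as a continuous dose between 0 and 10, and the "true" coefficients (including a group-dependent slope) are invented.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200
treatment = rng.uniform(0, 10, n)          # assumption: continuous dose, 0 to 10
group = np.repeat([-1.0, 1.0], n // 2)     # Female = -1, Male = +1

# Invented ground truth in which the treatment slope differs by group
# (0.5 for females, 1.5 for males), plus Gaussian noise.
y = 5 + 1.0 * treatment + 2.0 * group + 0.5 * treatment * group + rng.normal(0, 1, n)

# Main-effects model: intercept + Treatment + Group.
X_main = np.column_stack([np.ones(n), treatment, group])
beta_main, *_ = np.linalg.lstsq(X_main, y, rcond=None)
resid = y - X_main @ beta_main

# Trend of the residuals against Treatment, separately for each group.
slope_f = np.polyfit(treatment[group == -1], resid[group == -1], 1)[0]
slope_m = np.polyfit(treatment[group == 1], resid[group == 1], 1)[0]

print("fitted coefficients:", np.round(beta_main, 2))
print("residual trend, female:", round(slope_f, 2))
print("residual trend, male:  ", round(slope_m, 2))
```

Because the model forces one shared slope, the residuals trend downward with Treatment in one group and upward in the other. That opposite-signed pattern is the signature of a missing group-specific slope.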
Suppose we fill the gap with the sum, β₃(Treatment + Group). The fitted lines are still parallel, and the residuals show the same pattern as before. Adding the variables together didn't help. Why?
We can rearrange the additive model:
$$Y = \beta_0 + \beta_1 \text{Treatment} + \beta_2 \text{Group} + \beta_3(\text{Treatment} + \text{Group})$$

$$= \beta_0 + (\beta_1 + \beta_3)\text{Treatment} + (\beta_2 + \beta_3)\text{Group}$$

The "interaction" term just gets absorbed into the existing coefficients! It's mathematically equivalent to the first model: no new information is added.
In matrix terms: the column for (Treatment + Group) is a linear combination of existing columns, making the design matrix rank-deficient.
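The rank argument can be checked numerically. The sketch below builds the design matrix with a (Treatment + Group) column and asks NumPy for its rank; the simulated Treatment values are an assumption (a continuous dose between 0 and 10), not the trial's data.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 50
treatment = rng.uniform(0, 10, n)        # assumption: continuous dose, 0 to 10
group = np.repeat([-1.0, 1.0], n // 2)   # Female = -1, Male = +1

# Main-effects design matrix: intercept, Treatment, Group.
X_main = np.column_stack([np.ones(n), treatment, group])

# Same matrix with the additive "interaction" column appended.
sum_column = treatment + group
X_sum = np.column_stack([X_main, sum_column])

rank_main = np.linalg.matrix_rank(X_main)  # 3 independent columns
rank_sum = np.linalg.matrix_rank(X_sum)    # still 3: the fourth column adds nothing

# Express the new column in terms of the existing ones.
combo, *_ = np.linalg.lstsq(X_main, sum_column, rcond=None)

print("rank without sum column:", rank_main)
print("rank with sum column:   ", rank_sum)
print("sum column = %.1f*1 + %.1f*Treatment + %.1f*Group" % tuple(combo))
```

Four columns but rank 3 means the coefficients are not uniquely determined: any change in β₃ can be offset by opposite changes in β₁ and β₂ without altering a single fitted value.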
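What if, instead of adding Treatment and Group, we multiply them?

$$Y = \beta_0 + \beta_1 \text{Treatment} + \beta_2 \text{Group} + \beta_3(\text{Treatment} \times \text{Group})$$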
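The slope of Y with respect to Treatment now depends on the group:

$$\frac{\partial Y}{\partial \text{Treatment}} = \beta_1 + \beta_3 \cdot \text{Group}$$

With the ±1 coding, the slope is β₁ − β₃ for female patients (Group = −1) and β₁ + β₃ for male patients (Group = +1), so β₃ is half the difference between the two slopes.

Multiplication creates a new piece of information that cannot be replicated by adjusting other coefficients. The product column is linearly independent of the intercept, Treatment, and Group columns, provided Treatment takes at least two distinct values within each group.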
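A minimal sketch of fitting the multiplicative model and reading off group-specific slopes follows. The data are simulated under assumed values: a continuous Treatment dose between 0 and 10 and invented "true" coefficients, chosen only so the recovered slopes can be compared against known targets.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200
treatment = rng.uniform(0, 10, n)        # assumption: continuous dose, 0 to 10
group = np.repeat([-1.0, 1.0], n // 2)   # Female = -1, Male = +1

# Design matrix with the product column: intercept, Treatment, Group, Treatment*Group.
X = np.column_stack([np.ones(n), treatment, group, treatment * group])

# Invented true coefficients (b0, b1, b2, b3); true slopes are 0.5 and 1.5.
true_beta = np.array([5.0, 1.0, 2.0, 0.5])
y = X @ true_beta + rng.normal(0, 1.0, n)

beta_hat, *_ = np.linalg.lstsq(X, y, rcond=None)
b0, b1, b2, b3 = beta_hat

slope_female = b1 - b3   # slope at Group = -1
slope_male = b1 + b3     # slope at Group = +1
rank = np.linalg.matrix_rank(X)

print("design matrix rank:", rank)
print("estimated slope, female:", round(slope_female, 2))
print("estimated slope, male:  ", round(slope_male, 2))
```

Unlike the sum column, the product column raises the rank to 4, so β₃ is identifiable and the two fitted lines are free to have different slopes.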
Now it's your turn. Adjust the sliders to set the true interaction strength and noise level, then generate data to see how well the multiplicative model captures it.