Below, I’ve posted some code that
- generates an artificial data set
- creates both treatment (a.k.a. dummy) and sum (a.k.a. contrast or ANOVA-style) coding for the data set
- compares the lmer output for the two coding systems
- suggests a way to test simple effects in a linear mixed model
Mostly, though, the code is just meant as a starting point for people who want to play with a balanced (non-Latin-square) data set in order to understand the consequences of coding predictor variables in different ways.
# number of items and subjects
ni = 32
ns = 40

# predictors
# s for subject, i for item
# x for predictor 1, z for predictor 2, xz for their interaction
s = sort(rep(1:ns, ni))
l = rep(c(rep(1, ni), rep(2, ni), rep(3, ni), rep(4, ni)), ns/4)
i = rep(1:ni, ns)
x = rep(c(1, 0), (ni/2) * ns)
z = rep(c(1, 1, 0, 0), (ni/4) * ns)
xz = x * z

# noise
noise = rnorm(ns * ni, sd = 1)
snoise = rnorm(ns, sd = 2)
inoise = rnorm(ni, sd = 0.4)

# underlying group means of dependent variable y
# let's imagine the following situation:
# a main effect of z of 3.75,
# no effect of x,
# a mild interaction of -0.5:
#        z=0     z=1
# x=0    10      14
# x=1    10.25   13.75
# using the traditional way of describing the formula,
# where the above group means are described as follows:
#        z=0          z=1
# x=0    int          int+zcoef
# x=1    int+xcoef    int+xcoef+zcoef+xzcoef
#
# we set:
int = 10
xcoef = .25
zcoef = 4
xzcoef = -.5
y = int + xcoef * x + zcoef * z + xzcoef * xz + noise + snoise[s] + inoise[i]

# test: look at the group means (keep in mind that there is noise,
# so they won't match perfectly to the table given above).
lapply(split(y, paste("x=", x, ", z=", z)), mean)

# dataframe
d <- as.data.frame(cbind(s, i, x, z, y))
d$s = as.factor(d$s)
d$i = as.factor(d$i)
d$x = as.factor(d$x)
d$z = as.factor(d$z)

# analysis
library(lme4)

# treatment coding (which is also the default in R)
d$x.treat <- ifelse(d$x == "1", 1, 0)
d$z.treat <- ifelse(d$z == "1", 1, 0)
l.treat <- lmer(y ~ x.treat * z.treat + (1|s) + (1|i), d)
summary(l.treat)

# sum coding (which gives us ANOVA-style main effects)
d$x.sum <- ifelse(d$x == "1", .5, -.5)
d$z.sum <- ifelse(d$z == "1", .5, -.5)
l.sum <- lmer(y ~ x.sum * z.sum + (1|s) + (1|i), d)
summary(l.sum)
For treatment coding, the coefficients correspond to simple effects. For example, the coefficient for x.treat is the estimated effect of x at z=0. For sum coding, on the other hand, the coefficients correspond to main effects. For example, the coefficient for x.sum is the estimated effect of x marginalizing over z.
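As an aside, the same two models can, I believe, also be obtained without hand-coding the predictors, by setting contrasts on the factors directly. A minimal sketch, assuming the data frame d built above (note that contr.sum codes the levels as +1/-1 rather than +.5/-.5, and takes the first factor level as the positive one, so the coefficients differ in sign and scale from the x.sum/z.sum version, while the tests are equivalent):

```r
# sum coding via factor contrasts instead of manual recoding
contrasts(d$x) <- contr.sum(2)
contrasts(d$z) <- contr.sum(2)
l.sum2 <- lmer(y ~ x * z + (1|s) + (1|i), d)
summary(l.sum2)
```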
So how would one test simple effects in such a framework? It would be nice if one could just assess the coefficients in the treatment-coded model, but most of the time that won't work, since treatment coding inevitably leads to collinearity between the simple predictors and their interactions (see for yourself by running the above code). I think the simplest way is to do model comparison, comparing models with and without the simple effects.
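The collinearity can also be seen directly by correlating the predictor columns with their interaction column; a minimal sketch, again assuming the data frame d from above:

```r
# under treatment coding, the interaction column is substantially
# correlated with the simple predictor (about .58 in this balanced design)
with(d, cor(x.treat, x.treat * z.treat))

# under sum coding, the columns are exactly orthogonal
# in this balanced design
with(d, cor(x.sum, x.sum * z.sum))
```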
l.treat.woX  <- lmer(y ~ z.treat + x.treat:z.treat + (1|s) + (1|i), d)
l.treat.woZ  <- lmer(y ~ x.treat + x.treat:z.treat + (1|s) + (1|i), d)
l.treat.woXZ <- lmer(y ~ x.treat + z.treat + (1|s) + (1|i), d)

anova(l.treat, l.treat.woX)
anova(l.treat, l.treat.woZ)
anova(l.treat, l.treat.woXZ)
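A possible alternative, at least for reading off the other simple effect of x (at z=1) directly, is to flip which level of z serves as the reference; a sketch under the same setup:

```r
# recode z so that z=1 is the reference level;
# the x.treat coefficient is then the simple effect of x at z=1
d$z.treat2 <- 1 - d$z.treat
l.treat2 <- lmer(y ~ x.treat * z.treat2 + (1|s) + (1|i), d)
summary(l.treat2)
```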
Comments? What do you think? I wasn't (and still am not) entirely sure about this, so I have tested it a bit, and it seems to work.