Plotting effects for glmer(, family=”binomial”) models

Posted on January 19, 2009 Updated on December 16, 2010

UPDATE 12/15/10: Bug fix. Thanks to Christian Pietsch.

UPDATE 10/31/10: Some further updates and bug fixes. The code below is the updated one.

UPDATE 05/20/10: I’ve updated the code with a couple of extensions (both linear and binomial models should now work; the plot now uses ggplot2) and minor fixes (the code didn’t work if the model only had one fixed effect predictor). I also wanted to be clear that the dashed lines in the plots aren’t confidence intervals. They are multiples of the standard error of the effect.

Here’s a new function for plotting the effect of predictors in multilevel logit models fitted in R using lmer() from the lme4 package. It’s based on code by Austin Frank and I also borrowed from Harald Baayen’s plotLMER.fnc() (package languageR). First a cool pic:

Predicted effect of speechrate on complementizer-mentioning

These plots show the distribution of the predictor (x-axis) against the predicted values (based on the entire model, y-axis) using hexagonal binning (originally hexbinplot() from the package hexbin; the updated code below uses geom_hex() from ggplot2). On top of that, you see the model prediction for the selected predictor along with a band of ±1.96 standard errors of the effect (not a true confidence interval, see the 05/20/10 update above). Note that the predictor is given in its original form (here speech rate) although it was entered into the model as the centered log-transformed speechrate. The plot takes that into account. Of course, you can configure things.

For example, you could plot the effect in probability space:

my.glmerplot <- function(
# version 0.43
# written by tiflo@csli.stanford.edu
# code contributions from Austin Frank, Ting Qian, and Harald Baayen
# remaining errors are mine (tiflo@csli.stanford.edu)
#
# last modified 12/15/10
#
# now also supports linear models
# backtransforms centering and standardization
#
# known bugs:
#   too simple treatment of random effects
#
	model,
	name.predictor,
	name.outcome= "outcome",
	predictor= NULL,

	# is the predictor centered IN THE MODEL?
	# is the predictor transformed before
	# (centered and) ENTERED INTO THE MODEL?
	predictor.centered= if(!is.null(predictor)) { T } else { F },
	predictor.standardized= F,
	predictor.transform= NULL,
	fun= NULL,

	type= "hex",
	main= NA,
	xlab= NA,
	ylab= NA,
	xlim= NA,
	ylim= NA,
	legend.position="right",
	fontsize=16,
	col.line= "#333333",
	col.ci= col.line,
	lwd.line= 1.2,
	lty.line= "solid",
	alpha.ci= 3/10,
	hex.mincnt= 1,
	hex.maxcnt= nrow(model@frame) / 2,
	hex.limits = c(round(hex.mincnt), round(hex.maxcnt)),
	hex.limits2 = c(round(match.fun(hex.trans)(hex.mincnt)), round(match.fun(hex.trans)(hex.maxcnt))),
	hex.midpoint = (max(hex.limits) - (min(hex.limits) - 1)) / 2,
	hex.nbreaks = min(5, round(match.fun(hex.trans)(max(hex.limits)) - match.fun(hex.trans)(min(hex.limits))) + 1),
	hex.breaks = round(seq(min(hex.limits), max(hex.limits), length.out=hex.nbreaks)),
	hex.trans = "log10",
	...
)
{
    	if (!is(model, "mer")) {
     		stop("argument should be a mer model object")
    	}
    	# default, so that the check below fails with a clear message
    	model.type = "unsupported"
    	if ((length(grep("^glmer", as.character(model@call))) == 1) &&
          (length(grep("binomial", as.character(model@call))) == 1)) {
		model.type = "binomial"
	} else {
		if (length(grep("^lmer", as.character(model@call))) == 1) {
			model.type = "gaussian"
		}
	}
	if (!(model.type %in% c("binomial","gaussian"))) {
		stop("argument should be a glmer binomial or gaussian model object")
	}
    	if (!is.na(name.outcome)) {
     	   	if (!is.character(name.outcome))
            	stop("name.outcome should be a string\n")
    	}
	if (!is.na(xlab[1])) {
        	if (!is.character(xlab))
            	stop("xlab should be a string\n")
    	}
    	if (!is.na(ylab)) {
     	   	if (!is.character(ylab))
            	stop("ylab should be a string\n")
    	}
	# load libraries
	require(lme4)
	require(Design)
	require(ggplot2)

	if (predictor.standardized) { predictor.centered = T }
    	if (is.null(fun)) {
		if (is.na(ylab)) {
		 	if (model.type == "binomial") { ylab= paste("Predicted log-odds of", name.outcome) }
			if (model.type == "gaussian") { ylab= paste("Predicted ", name.outcome) }
		}
		fun= I
	} else {
		if (is.na(ylab)) {
		 	if (model.type == "binomial") { ylab= paste("Predicted probability of", name.outcome) }
			if (model.type == "gaussian") { ylab= paste("Predicted ", name.outcome) }
		}
		fun= match.fun(fun)
	}
    	if (!is.null(predictor.transform)) {
		predictor.transform= match.fun(predictor.transform)
    	} else { predictor.transform= I }

	indexOfPredictor= which(names(model@fixef) == name.predictor)

	# get predictor
	if (is.null(predictor)) {
		# simply use values from model matrix X
 		predictor= model@X[,indexOfPredictor]

		# function for predictor transform
		fun.predictor= I

		if (is.na(xlab)) { xlab= name.predictor }
    	} else {
                # make sure that only defined cases are included
                predictor = predictor[-na.action(model@frame)]

                # function for predictor transform
		trans.pred = predictor.transform(predictor)
		m= mean(trans.pred, na.rm=T)
		rms = sqrt(var(trans.pred, na.rm=T) / (sum(ifelse(is.na(trans.pred),0,1)) - 1))
		fun.predictor <- function(x) {
                          x= predictor.transform(x)
                          if (predictor.centered == T) { x= x - m }
                          if (predictor.standardized == T) { x= x / rms }
                          return(x)
               }
               if ((is.na(xlab)) & (label(predictor) != "")) {
                          xlab= label(predictor)
               }
    	}
        # get outcome for binomial or gaussian model
        if (model.type == "binomial") {
               outcome= fun(qlogis(fitted(model)))
        } else {
               outcome= fun(fitted(model))
        }
        ## calculate grand average but exclude effect to be modeled
        ## (otherwise it will be added in twice!)
        ## random effects are all included, even those for predictor (if any).
        ## should random slope terms for the predictor be excluded?
        ## prediction from fixed effects
        if (ncol(model@X) > 2) {
	     Xbeta.hat = model@X[, -indexOfPredictor] %*% model@fixef[-indexOfPredictor]
	} else {
		Xbeta.hat = model@X[, -indexOfPredictor] %*% t(model@fixef[-indexOfPredictor])
	}

     ## adjustment from random effects
     Zb = crossprod(model@Zt, model@ranef)@x

     ## predicted value using fixed and random effects
     Y.hat = Xbeta.hat + Zb

     ## intercept is grand mean of predicted values
     ## (excluding fixed effect of predictor)
     ## (including random effects of predictor, if any)
     int = mean(Y.hat)

	# slope
	slope <- fixef(model)[name.predictor]

	## error and confidence intervals
	stderr <- sqrt(diag(vcov(model)))
	names(stderr) <- names(fixef(model))
	slope.se <- stderr[name.predictor]
	lower <- -1.96 * slope.se
	upper <- 1.96 * slope.se

	# setting graphical parameters
	if (is.na(ylim)) { ylim= c(min(outcome) - 0.05 * (max(outcome) - min(outcome)), max(outcome) + 0.05 * (max(outcome) - min(outcome)) ) }
   	if (is.na(xlim)) { xlim= c(min(predictor) - 0.05 * (max(predictor) - min(predictor)), max(predictor) + 0.05 * (max(predictor) - min(predictor))) }

	print("Printing with ...")
	print(paste("   int=", int))
	print(paste("   slope=", slope))
	print(paste("   centered=", predictor.centered))
	print("   fun:")
	print(fun.predictor)

	pdata= data.frame( 	predictor=predictor, outcome=outcome	)
	x= seq(xlim[1], xlim[2], length=1000)
	fit= int + slope * fun.predictor(x)
	ldata= data.frame(
				predictor= x,
				outcome= fun(fit),
				transformed.lower= fun(fit + lower),
				transformed.upper= fun(fit + upper)
	)
	theme_set(theme_grey(base_size=fontsize))
	theme_update(axis.title.y=theme_text(angle=90, face="bold", size=fontsize, hjust=.5, vjust=.5))
	theme_update(axis.title.x=theme_text(angle=0, face="bold", size=fontsize, hjust=.5, vjust=.5))
	p <- ggplot(data=pdata, aes(x=predictor, y=outcome)) +
		xlab(xlab) +
		ylab(ylab) +
		xlim(xlim) +
		ylim(ylim) +
		opts(legend.position=legend.position, aspect.ratio=1)

	     	# for debugging:
		# panel.lines(rep(mean(x),2), c(min(y),max(y)))
		# panel.lines(c(min(x),max(x)), c(mean(y),mean(y)))

	if (type == "points") {
		p <- p + geom_point(alpha=3/10)
	} else if (type == "hex") {
		p <- p + geom_hex(bins = 30) +
			scale_fill_gradient2(low= "lightyellow",
							mid="orange",
							high=muted("red"),
							midpoint= hex.midpoint,
							space="rgb",
							name= "Count",
							limits= hex.limits,
							breaks= hex.breaks,
							trans = hex.trans
							)
	}
	p + 	geom_ribbon(data=ldata,
			aes(	x= predictor,
				ymin=transformed.lower,
				ymax=transformed.upper
			),
			fill= col.ci,
			alpha= alpha.ci
		) +
		geom_line(data=ldata,
			aes(x= predictor,
				y=outcome
			),
			colour= col.line,
			size= lwd.line,
			linetype= lty.line,
			alpha=1
		)
}

The only thing this function really needs is a model and the NAME of a predictor. For example, the first call below plots the distribution of centered log-transformed speechrate against the predicted values for a complementizer (the outcome of my model). On top of the distribution, it plots the prediction as a function of cLSPEECHRATE and the standard-error band around it. By default this plots in log-odds space.

I think this is a great plot for detecting outliers: everything is in the right space. The y-axis is the predicted log-odds (qlogis(fitted(model)) in the code above) and the x-axis is on the scale actually used in the model.

my.glmerplot(lmer.speaker, "cLSPEECHRATE")

# but you can go to probability space (now debugged)
my.glmerplot(lmer.speaker, "cLSPEECHRATE", fun=plogis)
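
To make the difference between the two spaces concrete, here is a minimal base-R sketch of what fun=plogis does to the line and its band. It is not part of my.glmerplot; the values of int, slope, and slope.se are made up for illustration. As in the function above, the band is a constant ±1.96 standard errors around the line on the log-odds scale.

```r
# Hypothetical values, for illustration only (not from lmer.speaker)
int      <- -0.5   # intercept on the log-odds scale
slope    <- 0.8    # effect of the predictor on the log-odds scale
slope.se <- 0.2    # standard error of that effect

x   <- seq(-2, 2, length.out = 5)
fit <- int + slope * x   # prediction in log-odds space

# Transform the line and the band to probability space
band <- data.frame(
  x     = x,
  fit   = plogis(fit),
  lower = plogis(fit - 1.96 * slope.se),
  upper = plogis(fit + 1.96 * slope.se)
)
print(band)
```

The band has constant width in log-odds space, but after plogis() it is widest near a probability of 0.5 and narrows toward 0 and 1.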

The above plots are nice since they plot against the actual scale of the predictor, but they are often hard to interpret.

What if you want the nice scale from your original variable on the x-axis? That is, what if you don’t want centered log-transformed speechrate, but rather speechrate? The distribution would then be plotted in a space you can understand (but beware: to detect relevant outliers, use the model-scale plot described above). The predicted effect should also be plotted correctly, so we need to tell the function what was done to the original input variable and we need to give it the original variable:

my.glmerplot(lmer.speaker, "cLSPEECHRATE", predictor= d$SPEECHRATE, predictor.transform = log, predictor.centered=T)

# or shorter (predictor.centered defaults to T)
my.glmerplot(lmer.speaker, "cLSPEECHRATE", predictor= d$SPEECHRATE, predictor.transform = log)

# or in probability space
my.glmerplot(lmer.speaker, "cLSPEECHRATE", predictor= d$SPEECHRATE, predictor.transform = log, fun= plogis)
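
What predictor.transform and predictor.centered encode can be sketched in a few lines of base R. This is only an illustration with made-up raw values; the helper name to.model.scale is hypothetical and does not appear in the function above.

```r
# Hypothetical raw predictor (e.g. speech rate in syllables/second)
raw <- c(2.5, 3.1, 4.0, 5.2, 6.8)

# What was done before the model was fitted: log, then center
m <- mean(log(raw))
to.model.scale <- function(x) log(x) - m

# On the model scale the predictor has mean zero
centered <- to.model.scale(raw)

# A grid on the raw scale (the x-axis of the plot) is mapped to the
# model scale before the slope is applied
grid <- seq(min(raw), max(raw), length.out = 4)
print(data.frame(raw = grid, model.scale = to.model.scale(grid)))
```

If the transform or the centering is left out, the slope gets applied to values on the wrong scale, which is why the calls shown next give a wrong line.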

The following, however, would be wrong (since the variable IS actually centered and logged in the model). Usually that becomes apparent when you plot, but be cautious:

my.glmerplot(lmer.speaker, "cLSPEECHRATE", predictor= d$SPEECHRATE, predictor.centered=F)
my.glmerplot(lmer.speaker, "cLSPEECHRATE", predictor= d$SPEECHRATE)

Of course, you can modify all the color settings and other parameters (see the argument list above). Here is an example:

my.glmerplot(lmer.speaker, "cLSPEECHRATE", fun=plogis,
    xlab="Centered log-transformed speechrate (syllables/second)",
    name.outcome="complementizer",
    col.line = "black",
    # the band color argument is col.ci in the version above;
    # that version has no colramp argument
    col.ci = "gray")

Which, in all ugliness, would look like:

I am sure there are plenty of bugs in there or room for improvement and extension. Please comment on this post, or send us your updated improved version =).

Hhhmm, probably it would be better to include an option to plot the “real” distribution, i.e. not against the predicted (fitted) values, but against true averages in the “bin”. If you know what I mean …
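
A rough sketch of what I mean, on simulated data rather than on a fitted model (everything here is an assumption for illustration: the simulated x and y, the choice of 10 quantile bins, and the column names):

```r
set.seed(1)
# Simulated data, for illustration only
x <- rnorm(500)
y <- rbinom(500, size = 1, prob = plogis(-0.5 + 0.8 * x))

# Cut the predictor into 10 quantile bins
breaks <- quantile(x, probs = seq(0, 1, length.out = 11))
bin    <- cut(x, breaks = breaks, include.lowest = TRUE)

# Observed proportion of y == 1 in each bin
binned <- data.frame(
  x.mid = as.numeric(tapply(x, bin, mean)),
  prop  = as.numeric(tapply(y, bin, mean)),
  n     = as.numeric(table(bin))
)
# Empirical log-odds; bins with a proportion of 0 or 1 give -Inf or Inf
binned$logodds <- qlogis(binned$prop)
print(binned)
```

These binned proportions (or their log-odds) could then be drawn as points underneath the predicted line instead of the fitted values.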


36 thoughts on “Plotting effects for glmer(, family=”binomial”) models”

Jianghua Liu said:
November 18, 2009 at 4:50 pm

Hi,
I used your new code to plot standard error lines from the logistic regression by lmer. However, it seems that predicted lines (with standard error lines) depart significantly from predicted values. If needed, I can send my picture to you, so you can point out where my mistake is.
Thank you very much!
Best wishes,
Jianghua
