### Visualizing the quality of a glmer(family="binomial") model

Ah, while I am at it, I may as well put this plot up, too. The code needs to be updated, but let me know if you think this could be useful. It’s very similar to the *calibrate()* plots from Harrell’s *Design* library, except that it works for *lmer()* models from Doug Bates’ *lme4* library.

The plot below is from a model of complementizer *that*-mentioning (a type of syntactic reduction as in *I believe (that) it is time to go to bed*). The model uses 26 parameters to predict speakers’ choice between complement clauses with and without *that*. This includes predictors modeling the accessibility, fluency, etc. at the complement clause onset, overall domain complexity, the potential for ambiguity avoidance, predictability of the complement clause, syntactic persistence effects, social effects, individual speaker differences, etc.

Mean predicted probabilities vs. observed proportions of *that*. The data are divided into 20 bins based on 0.05 intervals of predicted values from 0 to 1. The number of observed data points in each bin is expressed as a multiple of the minimum bin size. The data rug at the top of the plot visualizes the distribution of the predicted values. See Jaeger (almost-submitted, Figure 2).
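The binning described in the caption can be sketched in base R roughly as follows. This is only an illustration under stated assumptions: `bin_calibration` and its arguments are hypothetical names that do not appear in the original code, and the data are simulated rather than taken from the *that*-mentioning model.

```r
# Hypothetical helper (not from the original post): summarize calibration by bin.
# probs: predicted probabilities in [0, 1]; obs: observed 0/1 outcomes.
bin_calibration <- function(probs, obs, n_bins = 20) {
  # n_bins = 20 gives bins of width 0.05 from 0 to 1
  breaks <- seq(0, 1, length.out = n_bins + 1)
  bins <- cut(probs, breaks = breaks, include.lowest = TRUE)
  n <- as.vector(table(bins))
  keep <- n > 0  # drop empty bins, whose means would be NA
  data.frame(
    mean_pred = as.vector(tapply(probs, bins, mean))[keep],
    obs_prop  = as.vector(tapply(obs, bins, mean))[keep],
    n         = n[keep]
  )
}

# Simulated, perfectly calibrated data (an assumption for illustration only)
set.seed(1)
probs <- runif(2000)
obs <- rbinom(2000, size = 1, prob = probs)

cal <- bin_calibration(probs, obs)
# Bin sizes expressed as multiples of the smallest bin, as in the caption
cal$rel_size <- cal$n / min(cal$n)
```

For a well-calibrated model, `obs_prop` should track `mean_pred` closely, so the points fall near the diagonal.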

October 31, 2014 at 3:26 pm

As I was recently asked for the code to make the above plot, I’m pasting it below. It’s probably outdated (I found it in an old code file), but perhaps it contains some useful pointers. Good luck.

    my.plot.logistic.fit levels(sh)[[i]][1] &
                probs levels(sh)[[i]][1] &
                probs < levels(sh)[[i]][2]])
              }
              names(means) = as.character(midpoints)
            }
          }

      plot(as.numeric(names(means)), means, xlab = "Mean predicted probabilities",
           ylab = "Observed proportions", type = "n", ...)
      if (rug == TRUE) {
        myRug(x = probs, y = rug.y, ticksize = 0.02, col = rug.col, stacked = rug.stacked)
      }
      abline(0, 1, col = line.col, lwd = line.width, lty = line.lty)
      if (is.na(MinBin)) {
        minbin = min(lengths)
      } else {
        minbin = MinBin
      }
      if (type == "sunflower") {
        sunflowerplot(as.numeric(names(means)), means, number = (lengths / minbin), add = TRUE, ...)
      } else if (type == "squares") {
        # to print squares sized by amount of data
        points(as.numeric(names(means)), means, pch = 46, cex = lengths / minbin * 2, ...)
      } else {
        # to print standard points
        points(as.numeric(names(means)), means, pch = 19, cex = 1, ...)
      }
      # lines(smooth(means ~ as.numeric(names(means))), lty = 3)
      # lines(loess(means ~ as.numeric(names(means)), weights=lengths, span=0.05), lty = 3)
      if (print.minbinsize == TRUE) {
        text(locator(1), pos = 4, paste(
          paste("R-squared: ", round(cor(as.numeric(names(means)), means)^2, 2), sep = ""),
          paste("Min. bin size: ", minbin, sep = ""), sep = "\n"),
          cex = legend.cex, font = legend.font, col = legend.col)
      } else {
        text(locator(1), pos = 4,
          paste("R-squared: ", round(cor(as.numeric(names(means)), means)^2, 2), sep = ""),
          cex = legend.cex, font = legend.font, col = legend.col)
      }
    }
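For anyone who just wants a plot of this kind, here is a minimal self-contained sketch in base R. It is not the original function: `plot_calibration` and its arguments are hypothetical, it uses base `rug()` instead of the custom `myRug()` helper, it places the label at fixed coordinates rather than via `locator(1)` so that it also runs non-interactively, and it scales point size by the square root of the relative bin size (my choice, so that point area rather than width tracks the amount of data).

```r
# Hypothetical stand-in (not the original my.plot.logistic.fit).
# probs: predicted probabilities in [0, 1]; obs: observed 0/1 outcomes.
plot_calibration <- function(probs, obs, n_bins = 20, ...) {
  breaks <- seq(0, 1, length.out = n_bins + 1)
  bins <- cut(probs, breaks = breaks, include.lowest = TRUE)
  midpoints <- breaks[-1] - diff(breaks) / 2
  lengths <- as.vector(table(bins))
  means <- as.vector(tapply(obs, bins, mean))

  keep <- lengths > 0  # skip empty bins
  x <- midpoints[keep]
  y <- means[keep]
  n <- lengths[keep]
  minbin <- min(n)

  plot(x, y, type = "n", xlim = c(0, 1), ylim = c(0, 1),
       xlab = "Predicted probability (bin midpoint)",
       ylab = "Observed proportions", ...)
  rug(probs, side = 3)    # distribution of predicted values along the top
  abline(0, 1, lty = 2)   # perfect calibration
  points(x, y, pch = 15, cex = sqrt(n / minbin))

  # Same summary as in the code above: squared correlation between
  # bin midpoints and observed proportions
  r2 <- round(cor(x, y)^2, 2)
  text(0, 0.95, pos = 4,
       labels = paste0("R-squared: ", r2, "\nMin. bin size: ", minbin))
  invisible(list(r2 = r2, minbin = minbin, n_bins = length(x)))
}

# Simulated, well-calibrated data (an assumption for illustration only)
set.seed(42)
probs <- runif(1000)
obs <- rbinom(1000, size = 1, prob = probs)

pdf(NULL)  # null device, so no file is written and no window is needed
res <- plot_calibration(probs, obs)
invisible(dev.off())
```

With a fitted *lme4* model, `probs` would be the fitted probabilities and `obs` the observed binary outcome; drop the `pdf(NULL)` and `dev.off()` lines to draw on screen.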
