# multilevel logit model

### Updated slides on GLM, GLMM, plyr, etc. available

Some of you asked for the slides to the **Mixed effect regression class** I taught at the **2013 LSA Summer Institute in Ann Arbor, MI**. The class covered Generalized Linear Models, Generalized Linear Mixed Models, extensions beyond the linear model, simulation-based approaches to assessing the validity (or power) of your analysis, data summarization and visualization, and reporting of results. The class included slides from Maureen Gillespie, Dave Kleinschmidt, and Judith Degen (see above link). Dave even came by Ann Arbor and gave his lecture on the awesome power of plyr (and reshape etc.), which I recommend. You might also just browse through the slides to get an idea of some newer libraries (such as stargazer for quick and nice-looking LaTeX tables). There’s also a small example to work through for time series analysis (for beginners).

Almost all slides were created in knitr and LaTeX (very conveniently integrated into RStudio; I know some purists hate it, but come on), so that the code on the slides is the code that generated the output on the slides. Feedback welcome.

### Some thoughts on the sensitivity of mixed models to (two types of) outliers

Outliers are a nasty problem for both ANOVA and regression analyses, though at least some folks consider them more of a problem for the latter type of analysis. So, I thought I’d post some of my thoughts on a recent discussion about outliers that took place at CUNY 2009. Hopefully, some of you will react and enlighten me/us (maybe there are some data or simulations out there that speak to the issues I mention below?). I first summarize a case where one outlier out of 400 apparently drove the result of a regression analysis, but wouldn’t have done so if the researchers had used ANO(C)OVA. After that, I’ll have some simulation data for you on another type of “outlier” (I am not even sure whether outlier is the right word): the case where a few levels of a group-level predictor may be driving the result. That is, how do we make sure that our results aren’t just due to item-specific or subject-specific properties? Read the rest of this entry »
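To make the first scenario concrete, here is a minimal sketch (simulated data, not the study discussed at CUNY 2009) of how a single high-leverage observation among 400 can substantially shift a regression slope, and how an influence diagnostic like Cook’s distance flags it:

```r
# Illustrative simulation: one extreme, high-leverage point among 400
# observations pulls the OLS slope well away from the true value.
set.seed(1)
n <- 400
x <- rnorm(n)
y <- 0.2 * x + rnorm(n)       # true slope: 0.2
x[1] <- 8; y[1] <- 40         # plant a single outlier

slope_with    <- coef(lm(y ~ x))["x"]
slope_without <- coef(lm(y[-1] ~ x[-1]))["x[-1]"]

# Cook's distance identifies the planted point as by far the most influential
cooks <- cooks.distance(lm(y ~ x))
which.max(cooks)
```

With the outlier included, the estimated slope is several times the true value; dropping that one point recovers an estimate close to 0.2.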

### Nagelkerke and CoxSnell Pseudo R2 for Mixed Logit Models

What to do when you need an intuitive measure of model quality for your logit (logistic) model? The problem is that logit models don’t have a nice measure such as the R-squared of linear models, which has a super intuitive interpretation. However, several pseudo R-squared measures have been suggested, and some are more commonly used than others (e.g. Nagelkerke R2). In R, some model-fitting procedures for ordinary logistic regression provide the Nagelkerke R-squared as part of the standard output (e.g. *lrm* in Harrell’s *Design* package). However, no such measure is provided for the most widely used mixed logit model-fitting procedure (*lmer* in Bates’ *lme4* library). Below I provide some code that computes Nagelkerke and Cox-Snell pseudo R-squares for mixed logit models. Read the rest of this entry »
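For a sense of what such code involves, here is a minimal sketch (my own, not the code from the post) of the two measures. Both are computed from the log-likelihoods of the fitted model and of an intercept-only null model; the demo below uses an ordinary `glm()` so it is self-contained, but the same formulas apply to the log-likelihood of a mixed logit fit:

```r
# Cox & Snell and Nagelkerke pseudo R-squares from two log-likelihoods.
# n is the number of observations.
pseudo.R2 <- function(logLik.full, logLik.null, n) {
  coxsnell   <- 1 - exp((2 / n) * (logLik.null - logLik.full))
  nagelkerke <- coxsnell / (1 - exp((2 / n) * logLik.null))
  c(CoxSnell = coxsnell, Nagelkerke = nagelkerke)
}

# Demo with ordinary logistic regression; for a mixed logit model you
# would plug in logLik() of the full and null model fits instead.
m.full <- glm(am ~ wt, data = mtcars, family = binomial)
m.null <- glm(am ~ 1,  data = mtcars, family = binomial)
r2 <- pseudo.R2(as.numeric(logLik(m.full)),
                as.numeric(logLik(m.null)),
                n = nrow(mtcars))
round(r2, 3)
```

Note that the Nagelkerke measure simply rescales Cox & Snell so that its maximum attainable value is 1; what counts as the appropriate null model for a mixed logit fit (e.g. whether it retains the random effects) is a substantive choice.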

### Multinomial random effects models in R

This post is partly a response to this message. The author of that question is working on ordered categorical data. For that specific case, there are several packages in R that might work, none of which I’ve tried. The most promising is the function `DPolmm()` from DPpackage. It’s worth noting, though, that in that package you are committed to a Dirichlet Process prior for the random effects (instead of the more standard Gaussian). A different package, mprobit, allows one clustering factor. This could be suitable, depending on the data set. MNP, mlogit, multinomRob, vbmp, `nnet`, and msm all offer some capability of modeling ordered categorical data, and it’s possible that one of them allows for random effects (though I haven’t discovered any yet). MCMCpack may also be useful, as it provides MCMC implementations for a large class of regression models. `lrm()` from the `Design` package handles ordered categorical data, and clustered bootstrap sampling can be used for a single cluster effect.

I’ve recently had some success using MCMCglmm for the analysis of unordered multinomial data, and want to post a quick annotated example here. It should be noted that the tutorial on the CRAN page is extremely useful, and I encourage anyone using the package to work through it.

I’m going to cheat a bit in my choice of data sets, in that I won’t be using data from a real experiment with a multinomial (or polychotomous) outcome. Instead, I want to use a publicly available data set with some relevance to language research. I also need a categorical dependent variable with more than two levels for this demo to be interesting. Looking through the data sets provided in the `languageR` package, I noticed that the `dative` data set has a column `SemanticClass` which has five levels. We’ll use this as our dependent variable for this example. We’ll investigate whether the semantic class of a ditransitive event is influenced by the modality in which it is produced (spoken or written).

```r
library(MCMCglmm)
data("dative", package = "languageR")

k <- length(levels(dative$SemanticClass))
I <- diag(k - 1)
J <- matrix(rep(1, (k - 1)^2), c(k - 1, k - 1))

m <- MCMCglmm(SemanticClass ~ -1 + trait + Modality,
              random = ~ us(trait):Verb + us(Modality):Verb,
              rcov   = ~ us(trait):units,
              prior  = list(
                R = list(fix = 1, V = 0.5 * (I + J), n = 4),
                G = list(G1 = list(V = diag(4), n = 4),
                         G2 = list(V = diag(2), n = 2))),
              burnin = 15000, nitt = 40000,
              family = "categorical",
              data   = dative)
```

Read on for an explanation of this model specification, along with some functions for evaluating the model fit.

### Jaeger (2008), J Memory Language, 59, 434-446 (ANOVA)

Since I get asked for the R code I promised in my 2008 JML paper on mixed logit models every now and then, I have posted it here. **If you find this code useful, please consider citing the Jaeger (2008) paper:**

- Jaeger, T. Florian (2008). Categorical Data Analysis: Away from ANOVAs (transformation or not) and towards Logit Mixed Models. *Journal of Memory and Language 59*, 434-446.

Please note, however, that **the data analyzed in that paper is not mine and you need to acquire it from Inbal Arnon**, who conducted the study. With Inbal’s permission, here’s the data file I used:

- Data from the comprehension component of Study 2 from Arnon, Inbal (2010). Rethinking child difficulty: The effect of NP type on children’s processing of relative clauses in Hebrew. *Journal of Child Language 37*(1), 27-57.

If you try to work your way through my paper, you may also find the following wiki pages from our lab with readings and more code helpful:

http://wiki.bcs.rochester.edu/HlpLab/StatsCourses/

As a quick intro, you may find useful the talks from a recent workshop on the conceptual background, common issues, and solutions for ordinary and multilevel regression models, which some colleagues (Dale Barr, Roger Levy, Harald Baayen, Victor Kuperman, Austin Frank) and I gave at the 2009 CUNY sentence processing conference. The talk slides are all linked to the schedule on that page. You’ll find detailed walk-throughs, R code, and conceptual overviews.

I’d appreciate it if you leave a comment here in case this was useful. It helps us see what we should be posting. Cheers.

### Visualizing the quality of a glmer(family=”binomial”) model

Ah, while I am at it, I may as well put this plot up, too. The code needs to be updated, but let me know if you think this could be useful. It’s very similar to the *calibrate()* plots from Harrell’s *Design* library, just that it works for *lmer()* models from Doug Bates’ *lme4* library.

The plot below is from a model of complementizer *that*-mentioning (a type of syntactic reduction as in *I believe (that) it is time to go to bed*). The model uses 26 parameters to predict speakers’ choice between complement clauses with and without *that*. This includes predictors modeling the accessibility, fluency, etc. at the complement clause onset, overall domain complexity, the potential for ambiguity avoidance, predictability of the complement clause, syntactic persistence effects, social effects, individual speaker differences, etc.

Mean predicted probabilities vs. observed proportions of *that*. The data is divided into 20 bins based on 0.05 intervals of predicted values from 0 to 1. The number of observed data points in each bin is expressed as multiples of the minimum bin size. The data rug at the top of the plot visualizes the distribution of the predicted values. See Jaeger (almost-submitted, Figure 2).
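The idea behind the plot can be sketched in a few lines (my own illustration, not the post’s code, using an ordinary `glm()` on a built-in data set as a stand-in; for a mixed logit model you would use `fitted()` on the *lmer()*/*glmer()* object instead):

```r
# Calibration sketch: bin predicted probabilities into 0.05-wide intervals
# and compare the mean prediction in each bin to the observed proportion.
m    <- glm(am ~ wt, data = mtcars, family = binomial)  # stand-in model
pred <- fitted(m)
obs  <- mtcars$am

bins      <- cut(pred, breaks = seq(0, 1, by = 0.05), include.lowest = TRUE)
mean.pred <- tapply(pred, bins, mean)
prop.obs  <- tapply(obs,  bins, mean)

# A well-calibrated model lies near the diagonal:
plot(mean.pred, prop.obs, xlim = c(0, 1), ylim = c(0, 1),
     xlab = "Mean predicted probability", ylab = "Observed proportion")
abline(0, 1, lty = 2)
```

The full version adds the data rug and scales the plotting symbols by bin size, but the binned comparison above is the core of the visualization.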

### Plotting effects for glmer(, family=”binomial”) models

**UPDATE 12/15/10: **Bug fix. Thanks to Christian Pietsch.

**UPDATE 10/31/10:** Some further updates and bug fixes. The code below is the updated one.

**UPDATE 05/20/10: I’ve updated the code with a couple of extensions (both linear and binomial models should now work; the plot now uses ggplot2) and minor fixes (the code didn’t work if the model only had one fixed effect predictor). I also wanted to be clear that the dashed lines in the plots aren’t confidence intervals. They are multiples of the standard error of the effect.**

Here’s a new function for plotting the effect of predictors in multilevel logit models fitted in R using *lmer()* from the *lme4* package. It’s based on code by Austin Frank, and I also borrowed from Harald Baayen’s *plotLMER.fnc()* (package *languageR*). First, a cool pic:

These plots show the distribution of the predictor (x-axis) against the predicted values (based on the entire model; y-axis) using *hexbinplot()* from the package *hexbin*. On top of that, you see the model prediction for the selected predictor along with confidence intervals. Note that the predictor is given in its original form (here, speech rate), although it was entered into the model as the centered log-transformed speech rate. The plot takes that into account. Of course, you can configure things.
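The back-transformation step is easy to get wrong, so here is a minimal sketch of it (all coefficient values below are made up for illustration; they are not from the model in the post). The model sees the centered log of the predictor, but the x-axis shows the original scale:

```r
# Back-transforming a fitted effect onto the original predictor scale.
# Hypothetical values standing in for a fitted mixed logit model:
b0     <- -0.5   # intercept (made up)
b1     <-  1.2   # slope on centered log speech rate (made up)
se1    <-  0.3   # standard error of that slope (made up)
center <-  1.6   # mean of log(speech rate) used for centering (made up)

rate  <- seq(2, 8, length.out = 50)   # original speech-rate scale
x     <- log(rate) - center           # what the model actually saw
eta   <- b0 + b1 * x                  # linear predictor (log-odds)
p     <- plogis(eta)                  # back to the probability scale

# Rough band from the slope's SE alone (ignores intercept uncertainty
# and covariances; a full version would use the model's vcov matrix):
upper <- plogis(eta + 1.96 * se1 * abs(x))
lower <- plogis(eta - 1.96 * se1 * abs(x))

plot(rate, p, type = "l", ylim = c(0, 1),
     xlab = "Speech rate (original scale)", ylab = "Predicted probability")
lines(rate, upper, lty = 2); lines(rate, lower, lty = 2)
```

The key point is that the transformation (`log`, then centering) is applied to the x-values before evaluating the linear predictor, while the axis labels keep the untransformed scale.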