Unfortunately, I was only able to attend the first day of the LSA meeting this year, but it was good to be there (I ran into lots of interesting folks and saw a couple of good talks). Ting gave his presentation on Constant Entropy Rate in Mandarin Chinese and he was a real pro 😉. Ting presented evidence that out-of-context per-word entropy increases over the course of a discourse, even after sentence length and out-of-vocabulary effects are controlled for using linear mixed models. Crucially, normalizing by sentence length is shown to be insufficient, since there are also non-linear relations between sentence length and entropy, which may be linked to out-of-vocabulary words. The effect is also clearly observed when further problems with Chinese word boundaries are removed by analysing per-syllable entropy (Pinyin).
Ting got a couple of good questions and handled them really well (I thought). Several folks in the audience were interested in how genre differences would affect increases in out-of-context entropy rates. The only work that comes to mind that directly bears on this issue is by Finegan and Biber (1999? or 2001?), who showed that all kinds of reduction phenomena (phonetic reduction, phonological reduction, that-omission, etc.) are more frequent the more common ground the interlocutors seem to share. This is definitely an area that deserves further attention. In particular, it’s one thing to show that overall out-of-context entropy rates are higher if speakers share more common ground (since that increases the effect of ‘context’), but it’s another thing to test whether the slope of the entropy rate over the course of the discourse also changes.
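For readers who haven't seen this kind of measure before, here is a minimal sketch of what "out-of-context per-word entropy" amounts to: the average surprisal of the words in a sentence under a language model that ignores the preceding discourse. The add-one-smoothed unigram model, the toy data, and the names `train_unigram` and `per_word_entropy` are my own assumptions for illustration. This is not the model or the data used in Ting's talk, and the mixed-model analysis is not shown.

```python
import math
from collections import Counter


def train_unigram(sentences):
    """Return an add-one-smoothed unigram probability function (illustrative only)."""
    counts = Counter(w for s in sentences for w in s)
    total = sum(counts.values())
    vocab_size = len(counts) + 1  # one extra slot shared by all unseen words

    def prob(word):
        # Counter returns 0 for unseen words, so they get the smoothed minimum.
        return (counts[word] + 1) / (total + vocab_size)

    return prob


def per_word_entropy(sentence, prob):
    """Mean surprisal in bits per word, ignoring the preceding discourse."""
    return sum(-math.log2(prob(w)) for w in sentence) / len(sentence)


# Hypothetical, pre-tokenized toy training data.
training = [["the", "dog", "barked"], ["the", "cat", "slept"]]
prob = train_unigram(training)

# Hypothetical toy discourse: one estimate per sentence position.
discourse = [["the", "dog", "slept"], ["the", "aardvark", "barked"]]
for position, sentence in enumerate(discourse, start=1):
    print(position, round(per_word_entropy(sentence, prob), 3))
```

With real data, the per-sentence values would then be regressed against sentence position. Note that the out-of-vocabulary word in the second toy sentence pushes its estimate up, which is exactly the kind of effect that needs to be controlled for.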
I also really liked Roger Levy’s and Klinton Bicknell’s talks in the same session. It’ll be interesting to compare their approaches to accounting for local coherence effects with the one presented in Tabor et al. (2008, CUNY), though it seems that the Levy approach has the immediate advantage of potentially accounting for other effects as well. One thing that came to mind while listening to those talks is that I find the term “noise” somewhat misleading as a label for the uncertainty that a parsing model inherits from word recognition. I think Josh Tenenbaum pointed that out to me. ‘Uncertainty’ seems more appropriate. Anyway, just something to keep in mind. It’s great to see these far more sophisticated models of rational comprehension coming along so fast. It seems that we will soon see the same development that has been taking place in machine learning: rather than trying to solve individual research problems in isolation, modeling several interacting processes in one (Bayesian) model may turn out to be not only more insightful, but even easier, since it keeps us from trying to explain strange findings at one level of processing that may ultimately follow straightforwardly from the uncertainty that this level inherits from lower levels of processing.