uniform information density
A few days ago, I posted a summary of some recent work on syntactic alignment with Kodi Weatherholtz and Kathryn Campbell-Kibler (both at The Ohio State University), in which we used the WAMI interface to collect speech data for research on language production over Amazon’s Mechanical Turk.
The Human Language Processing (HLP/Jaeger) Lab in the Department of Brain and Cognitive Sciences at the University of Rochester is looking for PhD researchers to join the lab. Admission is through the PhD program in Brain and Cognitive Sciences, which offers a full five-year scholarship. International applications are welcome.
And while I am at it, let me post three more papers that are relevant for anyone interested in uniform information density and, more generally, theories of communicatively efficient language production (though most of you may already know these papers):
- They call it speech information rate, but it’s essentially the same: Pellegrino, F., Coupé, C., and Marsico, E. 2011. A cross-linguistic perspective on speech information rate. Language 87(3), 539-558.
- Maurits, L., Perfors, A., and Navarro, D. 2010. Why are some word orders more common than others? A uniform information density account. NIPS.
- S.T. Piantadosi, H. Tily, and E. Gibson. 2011. Word lengths are optimized for efficient communication. Proceedings of the National Academy of Sciences, 108(9):3526.
Ah, just when I thought it couldn’t get any better: Uniform Information Density has been applied to text generation ;). Have a look at this paper (thanks, Raja, for forwarding it):
- Rajakrishnan Rajkumar and Michael White. 2011. Linguistically Motivated Complementizer Choice in Surface Realization. In Proc. of the EMNLP-11 Workshop on Using Corpora in NLG. (bib)
At this year’s CUNY Sentence Processing Conference, Emily Bender and Jennifer Arnold presented a Festschrift celebrating Thomas Wasow. Here’s what the publisher’s site (CSLI) says (picture taken from the publisher’s website, which is hopefully ok; see the book for copyrights):
This book is a collection of papers on language processing, usage, and grammar, written in honor of Tom Wasow to commemorate his career on the occasion of his 65th birthday. Tom is a professor of linguistics and philosophy. But more accurately, he is a renaissance academic, having done work that connects with many different disciplines, including formal linguistics, sociolinguistics, historical linguistics, psycholinguistics, computational linguistics, and philosophy. Appropriately, this book reflects the diversity of Tom’s research and interests, including topics from multiple branches of linguistics and human information processing. These papers are written with minimal background assumed, so they can be used as teaching materials for beginning scholars. As such, this volume is a tribute to what is perhaps Tom’s most lasting contribution to the field—the mentorship and inspiration he provided to his students and collaborators, many of whom have contributed to this volume.
The book contains introductory and overview articles on a variety of topics in cognitive science from Emily M. Bender, Dan Flickinger, Stephan Oepen, Ash Asudeh, Peter Sells, Amy Perfors, James Paul Gee, John R. Rickford, T. Florian Jaeger, Jennifer E. Arnold, Harry J. Tily, Neal Snider, John A. Hawkins, and Susanne Riehemann.
- Degen, J. and Jaeger, T. F. 2011. Speakers sacrifice some (of the) precision in conveyed meaning to accommodate robust communication. Talk to be presented at the 2011 Meeting of the LSA.
- Session: Pragmatics II 31
- Room: Le Batea
- Time: Friday 2pm
The process of encoding an intended meaning into a linguistic utterance is well known to be affected by production pressures. We present corpus data suggesting that even the choice between two seemingly non-meaning-equivalent forms, as in (1a) and (1b), can be affected by speakers’ preference to distribute information uniformly across the linguistic signal (Uniform Information Density, UID; Jaeger 2006). This suggests that even when two forms encode similar but not identical messages, speakers may sacrifice precision in meaning for increased processing efficiency.
(1a) Alex ate some chard.
(1b) Alex ate some of the chard.
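To make the notion of "distributing information uniformly" concrete, here is a minimal sketch of how one might compare the information profiles of (1a) and (1b). It is not the analysis from the talk: the per-word probabilities are invented for illustration, the helper names `surprisal` and `uid_profile` are hypothetical, and the toy numbers are chosen so that both forms carry the same total information, which isolates how that information is spread across words.

```python
import math
from statistics import pvariance


def surprisal(p):
    """Information content, in bits, of a word with in-context probability p."""
    return -math.log2(p)


def uid_profile(probs):
    """Per-word surprisal of an utterance, with its total, peak, and variance."""
    bits = [surprisal(p) for p in probs]
    return {
        "bits": bits,
        "total": sum(bits),
        "peak": max(bits),
        "variance": pvariance(bits),
    }


# Hypothetical in-context word probabilities (invented, NOT corpus estimates).
short_form = [0.5, 0.25, 0.25, 1 / 64]            # Alex ate some chard
long_form = [0.5, 0.25, 0.25, 0.25, 0.5, 0.125]   # Alex ate some of the chard

for label, probs in [("(1a)", short_form), ("(1b)", long_form)]:
    profile = uid_profile(probs)
    print(label, profile["bits"], "peak:", profile["peak"],
          "variance:", round(profile["variance"], 3))
```

Under these made-up numbers, the longer form spreads the same total information over more words, so its peak and variance in per-word surprisal are lower. That is the kind of profile a UID preference would favor when the noun is unexpected in context.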
- Fedzechkina, M., Jaeger, T. F., and Newport, E. 2011. Word order and case marking in language acquisition and processing. Poster to be presented at the 2011 Meeting of the LSA.
- Session: Language Acquisition/Psycholinguistics/Syntax
- Room: Grand Ballroom Foyer
- Time: 9:00 – 10:30 AM.
To understand a sentence, comprehenders must identify its actor and patient. In principle, these relationships can be signaled using a single cue, but most languages employ several redundant cues, including word order and case marking. In artificial language learning experiments we investigate word order and case as cues in processing and learning. In languages without case marking, learners regularize word order; but when case marking is present, it is favored and limits word order regularization. Case-marking comes with a disadvantage: it is more complex to acquire. But the present results suggest that this may be outweighed by clarity for processing.
And here is one more poster on Yucatec, following Lindsay’s example. This is work by Lis Norcliffe, who just graduated from Stanford and joined the MPI in Nijmegen. Her thesis work is on the (possibly resumptive) morphology discussed in this poster, and the experiments were part of that thesis, too. You’ll find effects of definiteness and dependency length, which we investigated because they (in our view) provide evidence that this morphological reduction alternation is affected by both a preference for uniform information density and a preference for dependency minimization. Feedback welcome.