
HLP Lab and collaborators at CMCL, ACL, and CogSci


The summer conference season is coming up, and HLP Lab, friends, and collaborators will be presenting their work at CMCL (Baltimore, joint with ACL), ACL (Baltimore), CogSci (Quebec City), and IWOLP (Geneva). I wanted to take this opportunity to give an update on some of the projects we’ll have a chance to present at these venues. I’ll start with three semi-randomly selected papers.

Grinking #2


My sabbatical is nearing its end (shiver). So, there’s much to catch up on. HLP Lab has once again grown and shrunk, leading to grinking report #2 (cf. #1):

First a farewell to the lost ones:

  • Austin Frank has graduated with an absolutely wonderful thesis (work with Mike Tanenhaus and Dick Aslin) on perturbation. In his studies, Austin manipulated what participants thought they were saying: within about 14 ms, the first formant (F1) of the acoustic signal they produced was shifted up or down and played back to them over headphones, creating the misleading perception of having mispronounced the word (the ‘perturbation’; a rough offline sketch of the signal manipulation follows right after this item). I won’t go into the gory technical challenges Austin had to overcome to run these studies. His thesis work provides evidence that (a) speakers adapt their pronunciation partly based on auditory feedback about their own productions, (b) these adaptations are pretty rapid, and (c) they are sensitive to the structure of the phonological lexicon. For example, speakers are less likely to shift their production into a corner of the phonological space that is already occupied by other words in the language (yeah, cool, right?). He is now a post-doc at Haskins Laboratories and UConn, working with Jim Magnuson.
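For those curious about the manipulation itself, here is a minimal offline sketch of the general idea. To be clear: Austin’s actual setup did this in near real time with dedicated software, and nothing below is his implementation. The sketch uses the WORLD vocoder (via the pyworld package) to warp the spectral envelope, which roughly approximates a formant shift; the file names and the shift factor are made up.

```python
# Minimal offline sketch of a formant-style perturbation (illustrative only;
# the real paradigm shifts F1 in near real time, within ~14 ms).
# Assumes the pyworld and soundfile packages; 'input.wav' is hypothetical.
import numpy as np
import pyworld
import soundfile as sf

x, fs = sf.read("input.wav")               # mono recording of the speaker
x = np.ascontiguousarray(x, dtype=np.float64)

# WORLD analysis: pitch (f0), spectral envelope (sp), aperiodicity (ap)
f0, sp, ap = pyworld.wav2world(x, fs)

# Crude "formant shift": warp the spectral envelope along the frequency
# axis by a constant factor (>1 shifts formants up, <1 shifts them down).
shift = 1.2
n_bins = sp.shape[1]
src_bins = np.minimum((np.arange(n_bins) / shift).astype(int), n_bins - 1)
sp_shifted = sp[:, src_bins]

# Resynthesize with the original pitch and aperiodicity
y = pyworld.synthesize(f0, sp_shifted, ap, fs)
sf.write("perturbed.wav", y, fs)           # what the speaker would hear
```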

and a welcome to the newbies:

  • Esteban Buz has joined us from Johns Hopkins, where he worked with Robert Frank and Kerry LeDoux. It seems he has chosen questions about functional explanations of language change as his first research topic, which he will explore using iterated learning studies (see the toy sketch after this list). In particular, he’s interested in how changes over time are, in part, a reflection of acquisition and processing biases.
  • David Kleinschmidt has joined the lab after a year at Maryland. He did his undergraduate work at Williams College, with stints at Emory and the University of Maine. He’s interested in computational modeling and speech perception, specifically in developing models of how phonetic categories are learned and deployed that are plausible from linguistic, computational, neural, and developmental perspectives. Dave’s also working with Dick Aslin and Alex Pouget.
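For readers unfamiliar with iterated learning: the paradigm passes data through a chain of learners, each of whom learns from the previous learner’s output. Here is a toy sketch of that logic (not anything Esteban is actually running; all parameters are hypothetical), using a simple beta-binomial learner with a weak prior bias:

```python
# Toy iterated-learning chain: each "generation" estimates the probability
# of variant A from n utterances produced by the previous generation.
# Purely illustrative; all parameters are hypothetical.
import random

def learn(data, alpha=2.0, beta=1.0):
    """Posterior mean of P(A) for a beta-binomial learner with a prior bias."""
    return (sum(data) + alpha) / (len(data) + alpha + beta)

def produce(p, n=10):
    """Produce n utterances, each being variant A with probability p."""
    return [1 if random.random() < p else 0 for _ in range(n)]

p = 0.5                      # initial probability of variant A
for generation in range(20):
    data = produce(p)        # previous generation's output
    p = learn(data)          # next generation's inferred grammar
    print(f"generation {generation + 1:2d}: P(A) = {p:.2f}")
```

Over generations, P(A) drifts toward the learners’ prior, which is the basic sense in which weak acquisition and processing biases can accumulate into language change.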

How good is the web as an approximation of language experience?


Benjamin Van Durme and Austin Frank (who still doesn’t have a webpage) have been doing some neat comparisons of web-based estimates of language experience vs. traditional data sources. This work is part of a project funded by the University of Rochester’s Provost Award for Multidisciplinary Research. Since I really like the results, I am gonna use some lazy time to blog about my favorites.

We found that web-based probability estimates can be used to investigate probability-sensitive human behavior. We used databases of word naming, picture naming, and lexical decision tasks, as well as a database of word durations derived from the Switchboard corpus of spontaneous speech. We estimated word frequencies from Google Web 1T 5-gram counts, CELEX (spoken and written), the BNC (spoken and written), and Switchboard, and compared models using these different frequency estimates against the different types of probability-sensitive language behaviors mentioned above (word naming RTs, etc.).
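To make the logic of that comparison concrete, here is a sketch of the general kind of analysis (not our actual models; the file and column names are hypothetical): regress RTs on log frequency from each source and compare the fits.

```python
# Sketch of the model comparison described above: regress lexical decision
# RTs on log frequency from different sources and compare variance explained.
# Hypothetical file and column names; not the actual analysis.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

# One row per word: RTs plus raw counts from each corpus
df = pd.read_csv("word_data.csv")   # columns: rt, web1t, celex, bnc, swbd

results = {}
for source in ["web1t", "celex", "bnc", "swbd"]:
    df[f"log_{source}"] = np.log(df[source] + 1)   # smoothed log frequency
    fit = smf.ols(f"rt ~ log_{source}", data=df).fit()
    results[source] = fit.rsquared

for source, r2 in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{source}: R^2 = {r2:.3f}")
```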

I find this encouraging, as web data, unlike traditionally used data sources, is cheap and readily available for many languages, thereby facilitating cross-linguistic work on probability-sensitive human language processing. Additionally, we found at least preliminary evidence that a simple principal component analysis over the various frequency estimates leads to better correlations with human language behavior.
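The PCA idea can be sketched the same way (again with hypothetical data and column names): standardize the log-frequency estimates, extract their first principal component, and compare its correlation with behavior against the individual sources.

```python
# Sketch of the PCA idea: combine several log-frequency estimates into
# their first principal component and use it as a single predictor.
# Hypothetical data and column names, as in the previous sketch.
import numpy as np
import pandas as pd
from sklearn.decomposition import PCA

df = pd.read_csv("word_data.csv")   # columns: rt, web1t, celex, bnc, swbd
logf = np.log(df[["web1t", "celex", "bnc", "swbd"]] + 1)

# Standardize each source, then take the first principal component
z = (logf - logf.mean()) / logf.std()
pc1 = PCA(n_components=1).fit_transform(z).ravel()

# Correlate the combined estimate with behavior; compare to single sources
print("PC1 vs RT:", np.corrcoef(pc1, df["rt"])[0, 1])
for source in logf.columns:
    print(f"{source} vs RT:", np.corrcoef(z[source], df["rt"])[0, 1])
```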

[Figure: CELEX (written), Google Ngram, and their first principal component fitted against probability-sensitive human language behavior]
