In my earlier post I provided a summary of last week’s workshop on Gradience in Grammar at Stanford. The workshop prompted many interesting discussions, but here I want to talk about an (admittedly long-running) discussion it didn’t prompt. Several of the presentations at the workshop talked about prediction/expectation and how they are a critical part of language understanding. One implication of these talks is that understanding the nature and structure of our implicit knowledge of linguistic distributions (linguistic statistics) is crucial to advancing linguistics. As I was told later, there were, however, a number of people in the audience who thought that this type of data doesn’t tell us anything about linguistics and, in particular, grammar (unfortunately, this opinion was expressed outside the Q&A session and not to the people giving the talks, so it didn’t contribute to the discussion).
I guess this perspective isn’t surprising and, under some definitions of grammar, it’s also a tautology. One might add that it’s well known that there are opposing views of what linguistic research ought to care about (for example, the contrast between what’s often referred to as ‘usage-based’ vs. ‘generative’ approaches to grammatical theory, though there are accounts that don’t fall neatly into either group). Still, what struck me on this particular occasion was that at least a few of the workshop talks on prediction took a perspective and formal framing that is, in one critical sense, closely related to the original development of generative grammar. I’ll illustrate that point with my own talk, so as not to put words into anybody else’s mouth.
In my presentation, I outlined how implicit knowledge of linguistic distributions allows comprehenders to robustly infer the intended message even though the linguistic signal is perturbed by noise from the environment, production planning and execution, etc. (this idea is, of course, not new, and I’m by no means the first to propose it). Even when the same speaker produces the same word in the same phonological context, the acoustic realizations of the same phoneme form a distribution over acoustic dimensions (rather than a single point). Similarly, the same speaker describing the same picture on multiple occasions might use different syntactic realizations (e.g., when describing an event compatible with both an active and a passive, or with either variant of the ditransitive alternation).
Like an increasing number of researchers, I was trying to highlight that one can think of this in terms of generative models (e.g., Bayesian models; for phoneme perception, cf. Bejjanki et al., 2010; Clayards et al., 2008; for word recognition, Norris & McQueen, 2008; for sentence processing, Levy, 2008). These models are generative in that they support inference by generating the data (under different hypotheses), a property of Bayesian approaches that, I think, hasn’t received that much attention in research on language processing. From there, I went on to talk about systematic variability in the speech signal: as we know from much research in phonetics, sociolinguistics, and variationist linguistics, the statistics of the speech signal depend on the context.
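To make the idea concrete, here is a minimal sketch of that kind of generative model for phoneme perception, in the spirit of the cue-integration models cited above. Everything in it — the two categories, the acoustic dimension, the means, variances, and priors — is invented for illustration; real models of this kind are fit to production data rather than stipulated.

```python
import math

# Toy generative model of phoneme perception (illustrative only).
# Each phoneme category generates acoustic values (here: a single
# dimension, e.g., voice onset time in ms) from a Gaussian. The
# means, SDs, and priors below are made-up parameters.
CATEGORIES = {
    "/b/": {"mean": 0.0,  "sd": 12.0, "prior": 0.5},
    "/p/": {"mean": 50.0, "sd": 12.0, "prior": 0.5},
}

def gaussian_pdf(x, mean, sd):
    """Likelihood of an acoustic value under one category."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def posterior(vot):
    """Bayesian inference: invert the generative model to infer the
    intended category from a noisy acoustic observation."""
    joint = {c: p["prior"] * gaussian_pdf(vot, p["mean"], p["sd"])
             for c, p in CATEGORIES.items()}
    total = sum(joint.values())
    return {c: j / total for c, j in joint.items()}

# Tokens near the category boundary receive graded posteriors rather
# than an all-or-nothing classification:
print(posterior(20.0))
```

Because the model specifies how each category generates acoustic tokens (a distribution, not a point), Bayes’ rule lets the listener run it “backwards”: clear tokens yield near-categorical posteriors, while ambiguous tokens yield graded ones.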
So why should any of this be relevant (in the broad sense) to “theoretical linguistics” or “formal linguistics”? For what it’s worth, there’s a certain historical irony in the fact that some proponents of generative grammar do not consider generative models of language to be something they should know about. First, generative models are formal and mathematically well defined, something that researchers in generative grammar claim distinguishes their work from, say, certain functional approaches (though I’d say that claim is debatable, both because not all generative grammars are actually well defined and because there are formalizations of functional approaches). Second, generative models are, well, generative. One of the things Chomsky is well known for is developing and making broadly accessible the notion of a generative grammar. A generative grammar generates an (infinite) set of sentences. Sentences generated by the grammar are part of the grammar’s language; sentences the grammar cannot generate are not, i.e., they are ‘ungrammatical’ (Roger Levy discussed some of these points in his introduction to the workshop). In this world view, the object we seek to explain is ‘grammaticality’.
The whole concept of generative models is rather similar, just arguably more ambitious in what it seeks to achieve. Rather than only specifying what a grammar can generate, generative models of the type cited above can say how likely different outputs are to be generated, and thereby how often those outputs occur in the observable data. This would seem to be a desirable upgrade, since a) this type of probabilistic information is part of speakers’/listeners’ implicit knowledge (as evidenced by decades of research in psycholinguistics, computational linguistics, and sociolinguistics) and b) we don’t lose any of the original power of the generative grammar approach.
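One standard way to formalize this relationship is a probabilistic context-free grammar (PCFG). The toy grammar below — rules, lexicon, and probabilities all made up for the example — generates a set of sentences just like a classical generative grammar (strings it cannot derive get probability zero, i.e., they are ‘ungrammatical’), but it additionally assigns each derivable sentence a probability:

```python
import random

# Toy PCFG: each nonterminal maps to a list of (expansion, probability)
# pairs. The rules and probabilities are invented for illustration.
PCFG = {
    "S":  [(("NP", "VP"), 1.0)],
    "NP": [(("the", "N"), 1.0)],
    "VP": [(("V",), 0.7), (("V", "NP"), 0.3)],
    "N":  [(("dog",), 0.6), (("cat",), 0.4)],
    "V":  [(("barks",), 0.5), (("sees",), 0.5)],
}

def generate(symbol="S", rng=random):
    """Sample a sentence top-down, i.e., by literally *generating*
    data from the model. Symbols without rules are terminals."""
    if symbol not in PCFG:
        return [symbol]
    expansions, probs = zip(*PCFG[symbol])
    choice = rng.choices(expansions, weights=probs)[0]
    return [word for sym in choice for word in generate(sym, rng)]

print(" ".join(generate()))
```

Asking only whether a string is generable at all recovers the classical binary notion of grammaticality; sampling repeatedly yields sentence frequencies determined by the rule probabilities, which is exactly the kind of prediction that can be compared against observable distributions.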
(I am simplifying, of course. Some might point out that generative grammar was meant to capture ‘competence’, not ‘performance’. Others have replied to this in more detail and more eloquently than I could. While the competence/performance distinction is a productive simplification for certain research questions, it can also be quite problematic and too limiting if we want to understand the biological systems that underlie our ability to learn, produce, and comprehend language. At the very least, having generative models that make predictions testable against actually observable behavior, neural correlates, etc. can’t possibly be a bad thing for scientific progress, can it?)
So, in short, I was (mildly) baffled as to why understanding the generative structure of the architecture and mechanisms of language processing is perceived to be uninformative about the nature of linguistic representations. If we’re interested in how the mind/brain represents linguistic knowledge, this type of work would seem to be rather relevant to that question. And it would seem sensible, at the least, to consider the study of these mental/neural representations part of the linguistic sciences. Anyway, I know this is hardly a new topic. Comments are welcome though ;).