We recently submitted a research review on “Speech perception and generalization across talkers and accents”, which provides an overview of the critical concepts and debates in this domain of research. This manuscript is still under review, but we wanted to share the current version. Of course, feedback is always welcome.
In this paper, we review the mixture of processes that enable robust understanding of speech across talkers despite the lack of invariance. These processes include (i) automatic pre-speech adjustments of the distribution of energy over acoustic frequencies (normalization); (ii) sensitivity to category-relevant acoustic cues that are invariant across talkers (acoustic invariance); (iii) sensitivity to articulatory/gestural cues, which can be perceived directly (audio-visual integration) or recovered from the acoustic signal (articulatory recovery); (iv) implicit statistical learning of talker-specific properties (adaptation, perceptual recalibration); and (v) the use of past experiences (e.g., specific exemplars) and structured knowledge about pronunciation variation (e.g., patterns of variation that exist across talkers with the same accent) to guide speech perception (exemplar-based recognition, generalization).