Phonetic convergence is defined as an increase in the similarity of acoustic-phonetic form between talkers. Previous research has demonstrated phonetic convergence both when a talker listens passively to speech and while talkers engage in social interaction. Much of this research has focused on a diverse array of acoustic-phonetic attributes, with fewer studies incorporating perceptual measures of phonetic convergence. The current paper reviews research on phonetic convergence in both non-interactive and conversational settings, and attempts to consolidate the diverse array of findings by proposing a paradigm that models perceptual and acoustic measures together. By modeling acoustic measures as predictors of perceived phonetic convergence, this paradigm has the potential to reconcile some of the diverse and inconsistent findings currently reported in the literature.

The acoustic-phonetic form of an utterance is highly variable due to the influence of multiple factors in speech production. Some of these factors relate to relatively stable aspects of talker physiology, but other factors also play a role in shaping an utterance. These factors include talker dialect and idiolect, as well as relatively transient aspects of a conversational setting. Recent research on phonetic convergence has demonstrated that talkers converge to an interlocutor or to an auditory model on multiple acoustic-phonetic dimensions. Across numerous studies, investigators have assessed convergence in many acoustic attributes, with differing degrees of success and consistency. Ultimately, a more comprehensive model of speech communication must integrate social and cognitive approaches to speech production and perception.
Keywords
- Phonetic convergence
- Speech perception
- Speech production