Speech dynamics
| Authors | |
|---|---|
| Publication date | 2011 |
| Book title | The 17th International Congress of Phonetic Sciences: Hong Kong, China, August 17-21, 2011: congress proceedings |
| Event | 17th International Congress of Phonetic Sciences (ICPhS XVII) |
| Pages (from-to) | 35-43 |
| Publisher | Hong Kong: Department of Chinese, Translation & Linguistics, City University of Hong Kong |
| Abstract |
In order for speech to be informative and communicative, segmental and suprasegmental variation is mandatory. Only this leads to meaningful words and sentences. The building blocks are not stable entities placed next to each other (like beads on a string or like printed text); instead, there are gradual transitions from one to the next. This is what I call speech dynamics (energy envelope, spectral variation, voicing and pitch variation, speaking style and pronunciation variation, and the influence of the communication channel).
In human speech production and perception, interesting questions arise. Do we run up against the limitations of the articulatory system, for instance in fast speech, and are we able to make good use of local context, such as formant transitions, to optimize speech understanding? For automatic speech recognition (ASR) it is hard to interpret this variation properly under variable conditions, just as it is hard for text-to-speech synthesis (TTS) to generate these speech dynamics properly in order to reach high intelligibility and greater naturalness. Speakers with a speech handicap can get effective feedback about their speech quality with appropriate acoustic measures. This leads to interesting applications in rehabilitation programs. One also wonders how many of the cochlear implant users can process speech dynamics so well, despite deviant speech processing. |
| Document type | Conference contribution |
| Language | English |
| Published at | http://www.icphs2011.hk/resources/OnlineProceedings/PlenaryLecture/Pols/Pols.pdf |
| Downloads | Keynote_Pols_ICPhS_2011.pdf (Final published version) |