Cantonese turn-initial minimal particles: annotation of discourse-interactional functions in dialog corpora


This interactional linguistic study examines the annotation of discourse-interactional functions of turn-initial particles in Cantonese conversation. These particles (or interjections) are commonly transcribed as ngo (哦), ng (嗯), aa (啊), aak (呃) and can format a range of functions both as turn-initial utterances and as stand-alone turns. Based on the analysis of 20 hours of naturally occurring video corpus data, the study identifies five discourse-interactional functions that the most ‘minimal’ (i.e. shortest and mostly monophthongic) of these utterances can format: continuers, positive response tokens, change-of-state tokens, turn management tokens and repair initiators. I then show that three dimensions have to be taken into account to annotate those functions: sequential position, pitch contour and phonetic production format. In contrast to existing annotation taxonomies that directly map production format to function, I argue that discourse-interactional functions of these particles can only be annotated with reasonable accuracy if at least these three structural dimensions are taken into account. I conclude by discussing the relation between sequential position, sound and pitch format for each function.

In Proceedings of the 33rd Pacific Asia Conference on Language, Information and Computation
Andreas Liesenfeld
Postdoc in language technology

I am a social scientist working in Conversational AI. My work focuses on understanding how humans interact with voice technologies in the real world. I aspire to produce useful insights for anyone who designs, builds, uses or regulates talking machines. Late naturalist under pressure. “You must collect things for reasons you don’t yet understand.” — Daniel J. Boorstin