For more information, references and a full transcript, please
visit wordsandactions.blog
In this episode we start our discussion of language and
technology with voice recognition. Bernard mentions a general bias
towards female voices, as discussed in this paper:
Edworthy J., Hellier E., & Rivers J. (2003). The use of male
or female voices in warnings systems: a question of acoustics.
Noise and Health, 6(21): 39-50.
Pitch range is also important, as demonstrated in the
experiment on using different voices for sat navs that Erika
mentions:
Niebuhr, O., & Michalsky, J. (2019). Computer-generated
speaker charisma and its effects on human actions in a
car-navigation system experiment: or how Steve Jobs’ tone of voice
can take you anywhere. In Misra S. et al. (eds) Computational
Science and Its Applications – ICCSA 2019. Lecture Notes in
Computer Science, vol. 11620: 375-390. Springer, Cham.
https://doi.org/10.1007/978-3-030-24296-1_31
Moving from acoustics to culture, the following paper
discusses how male voices are perceived as more
authoritative:
It is worth sharing a few more auto-captioning gems from the
lectures of Veronika and her colleagues at Lancaster
University:
“my grammar is leaving me” → “my grandma is leading me”
“n-sizes” → “incisors”
“Hardaker and McGlashan” → “heartache and regression”
“institutional” → “it’s too slow” (truth!)
“masculine” → “mass killer” (bit harsh)
On readability, Bernard mentions an example from accounting,
namely the obfuscation hypothesis. The following paper on the topic
is considered the first accounting study that uses automated
textual analysis with a very large sample to address
readability:
Li, F. (2008). Annual report readability, current earnings,
and earnings persistence. Journal of Accounting & Economics, 45:
221–247. doi:10.1016/j.jacceco.2008.02.003
We then go on to talk about sentiment analysis, which is used
to find out about, for example, brand perceptions or patient
satisfaction. Here is an example of the latter:
Hopper, A. M., & Uriyo, M. (2015). Using sentiment analysis to
review patient satisfaction data located on the internet. Journal
of Health Organization and Management, 29(2): 221-233.
doi:10.1108/JHOM-12-2011-0129
In the context of this episode, we want to distinguish between
corpus linguistics and computational linguistics. Although language
corpora are used to train systems in machine learning, corpus
linguists engage in the computer-assisted analysis of large text
collections, often combining automated statistical analysis with
manual qualitative analysis. A company using such mixed corpus
linguistic methods to provide their customers with insights about
their products and services is
Relative Insight. (We did not receive any funding from them for
this episode, but they are a spin-off company that started at
Lancaster University.)
A critical evaluation of another area of computational
linguistics, topic modelling, written by two corpus linguists
is:
Brookes, G., & McEnery, T. (2018). The utility
of topic modelling for discourse studies: A critical evaluation.
Discourse Studies, 21(1): 3-21.
https://doi.org/10.1177/1461445618814032
(Incidentally, the above paper is also based on data about
patient satisfaction.)
The PhD thesis on automatic irony detection that Bernard
mentions was written by Cynthia Van Hee and is available
here.
The second interview guest is another one of Bernard’s
colleagues from Ghent University, Orphée De Clercq. Her recent
publications include:
De Bruyne, L., De Clercq, O., & Hoste, V. (2021). Annotating
affective dimensions in user-generated content. Language Resources
and Evaluation, 55(4): 1017-1045.
De Clercq, O., De Sutter, G., Loock, R., Cappelle, B., &
Plevoets, K. (2021). Uncovering machine translationese using corpus
analysis techniques to distinguish between original and
machine-translated French. Translation Quarterly, 101: 21-45.
And finally, we talk to Doris Dippold from the University of
Surrey in the UK. Her work on chatbots can be found in:
Dippold, D., Lynden, J., Shrubsall, R., & Ingram, R. (2020). A turn
to language: How interactional sociolinguistics informs the
redesign of prompt: response chatbot turns. Discourse, Context &
Media, 37.
https://doi.org/10.1016/j.dcm.2020.100432