Well there has certainly been a lot of publicity for the laurel/yanny clip recently. It is great to have so many people discussing speech and speech perception – but also a little disheartening that so much misinformation gets accepted as valid phonetics.
Randy Newman: Can he or can’t he?
So how can questions like this be decided?
Do questions like this even have a definitive answer? Or is it just up to whatever each individual thinks?
Let’s take a look
This audio was discussed extensively on an email list of language professionals. Opinion was divided. More importantly, reasons for the opinions were divided.
I am grateful to my friend and colleague Richard Cauldwell for bringing this example (and many others) to the attention of the list, and for discussing it in detail with me. You can find Richard’s account of the discussion at this link.
The discussants on the list ended up agreeing to disagree. However, I think from a forensic phonetics perspective, this issue is fairly straightforward to resolve. The interesting thing is, you can’t fully resolve it just by listening more and more carefully to the sample. Or by debating whose ears are better than whose.
You have to look further afield in the audio, and bring in other kinds of evidence
Here’s a file containing a series of snippets from the same recording in which the speaker says either ‘I can’ or ‘I can’t’. Here’s a spectrogram of all those cans and can’ts.
For a phonetician who can both listen and look at the same time, it is quite easy to both see and hear that there is ‘glottalisation’ associated with the ‘can’t’ pronunciation but not the ‘can’ pronunciations. Glottalisation is a kind of creakiness to the voice that often occurs when a ‘t’ has been omitted or elided – almost like a ‘remnant’ of the ‘t’. If you listen carefully to the audio, you can probably notice the glottalisation yourself. Importantly, there’s no glottalisation in our section of interest.
Here’s another thing – maybe a little harder to hear with untrained ears, but give it a go. The ‘n’ at the end of our ‘can’ has been influenced by the ‘g’ of ‘give’, to create a sound that is more like ‘ng’ than ‘n’. That kind of assimilation would be rather unlikely to happen if there was a ‘t’ between the ‘n’ and the ‘g’, even if that ‘t’ was ‘deleted’. However it is fully expected when you get an ‘n’ at the end of one word followed by a ‘g’ at the start of the next word.
Finally, it’s also useful to look at the context in which the snippet occurs. The speaker has just declined to give advice about another topic, and is now agreeing to give advice about how to become a writer – as evidenced by the fact that he immediately goes on to give his best advice (which, for the record, is ‘show up and write every day’: fine advice indeed!).
All in all, there is quite a lot of scientific evidence to support a ‘can’ interpretation for this audio – and to override those who might say, for whatever reason, ‘but I hear it differently’.
So what is this audio?
It comes from a wonderful BBC interview with American singer-songwriter Randy Newman. It is well worth listening to the whole thing – even though it has absolutely nothing to do with forensics or crimes (maybe because it has nothing to do with forensics or crimes!). You can find the whole interview here (or click the image to the right). Our ‘section of interest’ is at 23min 40sec (but go back to 22min 40sec for the relevant context).
So where is the connection with forensic phonetics?
Sometimes, phoneticians can help resolve disputed utterances with some certainty. Sometimes, even phoneticians can’t determine for sure what is said. Hey – make sure you get that ‘can’ and ‘can’t’ the right way round ;-). It’s very important to distinguish between these two kinds of cases, and doing so reliably requires the insight of a real expert in phonetics.
Current law allows interpretation of indistinct audio to be ‘a matter for the jury’, but that is known to risk serious injustice. The reasons are discussed extensively on this site, but here it is in one sentence: the hallmark of indistinct audio isn’t that it is indecipherable, but that it is open to different interpretations by different listeners in different contexts. What that means is: the poorer the quality of the audio, the more likely it is that non-expert listeners can be unwittingly influenced to form confident but inaccurate opinions as to what is said and who is saying it.
It is interesting to ask what would happen if this were a covert recording and the question of whether the speaker said ‘can’ or ‘can’t’ was crucial to the verdict (not an unlikely scenario in fact).
It’s the mark of an expert to know when to say ‘I don’t know’ – and the courts should respect the opinion of phoneticians who give that conclusion.
So – I wonder how Randy would feel about his words being dissected in this way!?
Do you think he’d be pleased or displeased? Maybe he’d write a song about it. That would certainly help put forensic phonetics on the map for us… Don’t know yet why putting forensic phonetics on the map is important?
Check out our 90-second video and get the background here
An explosive murder confession – or a dodgy transcription?
17 March 2015
Listen to these two snippets of muttered self-talk, then read on to see how a transcript can prime journalists’ perception.
If you are among the few who have not already heard the media’s interpretation of this audio, you’ll find it useful if you write down what you hear now, before reading on – and if you have a moment, I would love to be told your perception – you can send a message here.
Why it is not safe to ask the jury to evaluate indistinct recordings
Here is a truly hilarious act by Peter Kay – which also carries a deeper message in relation to forensic recordings. As you are ‘begging for birdseed’, note how real the suggested interpretations sound, even when you know they cannot possibly be correct (or do you?).
Christopher Pyne: the c-word or the g-word?
16 May 2014
Social media claims Christopher Pyne dropped the ‘C’ word in parliament on Wednesday, but he says the word was ‘grub’. (SMH)
Huge interest the last day or two here in Oz as to whether Christopher Pyne, a right-wing politician, swore at a fellow politician in parliament.
What did Oscar Pistorius really say?
10 April 2014
With so many responding to media invitations to form subjective opinions as to whether Oscar Pistorius’ emotion is genuine, are we missing factual errors in the reporting of what he is actually saying? Could scientific analysis help here?
Why comparing alternative transcripts doesn’t necessarily yield the truth of what was said
You may have seen some rather hilarious ‘alternative lyrics’ for Carl Orff’s famous O Fortuna that are circulating on the internet. As with many forms of word play, these are not only entertaining but also give some important insights regarding language and speech – and, in this case, into the effect of priming on evaluation of forensic transcripts.
A new take on satanic messages
You’ve probably heard of the concerns voiced in the 1980s that rock bands could corrupt youth by recording their songs so that, if you played them backwards, the words would turn into a message from Satan.
The crisis call experiment
Watch the video here (if you haven’t already)
Hear the full audio here (warning: potentially distressing)
Quick summary here:
- Fraser, H. 2013. Covert recordings as evidence in court: the return of police verballing? The Conversation
More detail here:
- Fraser, H., & Kinoshita, Y. 2021. Injustice arising from the unnoticed power of priming: How lawyers and even judges can be misled by unreliable transcripts of indistinct forensic audio. Criminal Law Journal, 45(3), 142-152.
Current law allows police transcripts to assist juries in understanding the content of indistinct forensic audio – with a number of legal safeguards intended to mitigate any risk that an inaccurate transcript might mislead the jury. The problem is that the safeguards rely on lawyers and judges gaining a sense of personal confidence that they hear words suggested by the transcript. The present article describes a new experiment showing that personal confidence is a poor indicator of perceptual accuracy, since listeners can be easily and unwittingly “primed” to hear words suggested by an inaccurate transcript. This confirms previous research suggesting current safeguards are inadequate, adds new findings regarding the effect of an alternative suggestion, and supports the need for an evidence-based process ensuring all indistinct forensic audio used in court is accompanied by a reliable transcript. It also indicates there is an urgent need to change legal procedures for admission of transcripts of indistinct forensic audio used as evidence in criminal trials.