Here’s a short snippet of audio which people hear in different ways.
What do you think – does he say ‘I can’ or ‘I can’t’?
Why not take a straw poll of your friends and colleagues to see what they think – then read on for a scientific perspective!
So how can questions like this be decided?
Do questions like this even have a definitive answer? Or is it just up to whatever each individual thinks?
Let’s take a look.
This audio was discussed extensively on an email list of language professionals. Opinion was divided. More importantly, reasons for the opinions were divided.
I am grateful to my friend and colleague Richard Cauldwell for bringing this example (and many others) to the attention of the list, and for discussing it in detail with me. You can find Richard’s account of the discussion at this link.
The discussants on the list ended up agreeing to disagree. However, I think from a forensic phonetics perspective, this issue is fairly straightforward to resolve. The interesting thing is, you can’t fully resolve it just by listening more and more carefully to the sample, or by debating whose ears are better than whose.
You have to look further afield in the audio, and bring in other kinds of evidence.
Here’s a file containing a series of snippets from the same recording in which the speaker says either ‘I can’ or ‘I can’t’.
Here’s a spectrogram of all those cans and can’ts.
For a phonetician who can listen and look at the same time, it is quite easy to both see and hear that there is ‘glottalisation’ associated with the ‘can’t’ pronunciations but not the ‘can’ pronunciations. Glottalisation is a kind of creakiness in the voice that often occurs when a ‘t’ has been omitted or elided – almost like a ‘remnant’ of the ‘t’. If you listen carefully to the audio, you can probably notice the glottalisation yourself.
Importantly, there’s no glottalisation in our section of interest.
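For readers who like to see the idea in numbers, here is a minimal sketch of one rough way to quantify creakiness. Creaky voice tends to show irregular spacing between successive vocal fold pulses, and a common summary of that irregularity is ‘jitter’: the average change from one pulse period to the next, relative to the average period. To be clear, this is not the analysis used for the Randy Newman clip, and the period values below are invented purely for illustration. The function name `local_jitter` is also just a label chosen for this sketch.

```python
from statistics import mean


def local_jitter(periods_ms):
    """Mean absolute difference between consecutive periods,
    divided by the mean period (a unitless ratio)."""
    if len(periods_ms) < 2:
        raise ValueError("need at least two periods")
    diffs = [abs(b - a) for a, b in zip(periods_ms, periods_ms[1:])]
    return mean(diffs) / mean(periods_ms)


# Hypothetical pulse periods in milliseconds (NOT measured from the
# interview): one fairly regular stretch, one irregular stretch.
regular_periods = [8.0, 8.1, 7.9, 8.0, 8.1, 8.0]
irregular_periods = [12.0, 18.5, 11.0, 21.0, 13.5, 19.0]

regular = local_jitter(regular_periods)
irregular = local_jitter(irregular_periods)
print(f"regular: {regular:.3f}, irregular: {irregular:.3f}")
```

The irregular stretch gives a much larger value than the regular one, which is the numerical counterpart of the creakiness you can see and hear. In real casework the pulse periods would have to be measured from the recording itself, and no single number would replace careful listening and looking.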
Here’s another thing – maybe a little harder to hear with untrained ears, but give it a go. The ‘n’ at the end of our ‘can’ has been influenced by the ‘g’ of ‘give’, to create a sound that is more like ‘ng’ than ‘n’. That kind of assimilation would be rather unlikely to happen if there were a ‘t’ between the ‘n’ and the ‘g’, even if that ‘t’ was ‘deleted’. However, it is fully expected when you get an ‘n’ at the end of one word followed by a ‘g’ at the start of the next word.
Finally, it’s also useful to look at the context in which the snippet occurs. The speaker has just declined to give advice about another topic, and is now agreeing to give advice about how to become a writer – as evidenced by the fact that he immediately goes on to give his best advice (which, for the record, is ‘show up and write every day’: fine advice indeed!).
All in all, there is quite a lot of scientific evidence to support a ‘can’ interpretation for this audio – and to override those who might say, for whatever reason, ‘but I hear it differently’.
So what is this audio?
It comes from a wonderful BBC interview with American singer-songwriter Randy Newman. It is well worth listening to the whole thing – even though it has absolutely nothing to do with forensics or crimes (maybe because it has nothing to do with forensics or crimes!).
You can find the whole interview here.
Our ‘section of interest’ is at 23min 40sec (but go back to 22min 40sec for the relevant context).
So where is the connection with forensic phonetics?
It is interesting to ask what would happen if this were a covert recording and the question of whether the speaker said ‘can’ or ‘can’t’ was crucial to the verdict (not an unlikely scenario, in fact).
Sometimes, phoneticians can help resolve disputed utterances with some certainty. Sometimes, even phoneticians can’t determine for sure what is said. Hey – make sure you get that ‘can’ and ‘can’t’ the right way round ;-).
It’s very important to distinguish between these two kinds of cases, and doing so reliably requires the insight of a real expert in phonetics. Current law allows interpretation of indistinct audio to be ‘a matter for the jury’, but that is known to risk serious injustice. The reasons are discussed extensively on this site, but here it is in one sentence: the hallmark of indistinct audio isn’t that it is indecipherable, but that it is open to different interpretations by different listeners in different contexts.
What that means is: the poorer the quality of the audio, the more likely it is that non-expert listeners can be unwittingly influenced to form confident but inaccurate opinions as to what is said and who is saying it.
It’s the mark of an expert to know when to say ‘I don’t know’ – and the courts should respect the opinion of phoneticians who give that conclusion.
So – I wonder how Randy would feel about his words being dissected in this way!?
Do you think he’d be pleased or displeased? Maybe he’d write a song about it. That would certainly help put forensic phonetics on the map for us…
Don’t know yet why putting forensic phonetics on the map is important? Check out our 90-second video and get the background here.
Or check out more fun audio examples like this one via Fun Audio Examples!