When it comes to speech recognition software, we’ve come a long way.
Machines are becoming increasingly adept at receiving, recognising, and understanding the spoken word.
Let’s look at the history…
Introducing Audrey!
Who is Audrey?!
Not who, but what! Audrey is recognised as the first official example of modern speech recognition technology, designed by Bell Laboratories in the 1950s. Audrey was very big – taking up a whole room to itself – and could only recognise 9 digits (numbers 1-9) spoken by its developer. Even so, Audrey achieved this with an impressive 90% accuracy.
In those early days, progress was slow. For example, it was another 12 years until IBM’s Shoebox was launched, and even then, the Shoebox was only able to recognise and differentiate between 16 words. It’s only since 2010 that we’ve had the Google Voice Search app, which could be personalised to an individual voice and was able to ‘learn’ speech patterns for higher accuracy.
The pros of speech recognition software
The technology can be hugely beneficial to a number of different users.
Visually- and Hearing-Impaired
Many people with visual impairments rely on text-to-speech systems, dictation software and screen readers in their daily lives. And for people with hearing impairments, converting audio into text can be a critical communication tool.
Specific learning difficulties
For people with specific learning difficulties such as dyslexia, speech recognition software can provide very useful support. The software allows the user not only to dictate into documents, but also to control the computer with their voice. This is ideal for someone with dyslexia who has difficulty with spelling or is better at communicating verbally, and for someone with physical difficulties who is unable to use a keyboard or mouse.
Hands-free driving
There are strict rules in place for using your phone when you’re driving, and breaking them can result in a hefty fine and points on your licence. With this in mind, being able to ask Apple’s Siri or Google Maps for directions while you’re driving reduces your chances of getting lost and means you don’t need to pull over to navigate a phone or read a map.
Specific voice recognition for security
One emerging use of this type of technology is in the world of financial services. Most of us are familiar with online banking and how face recognition software can be used to access our bank accounts on our smartphones. But some organisations are now beginning to use voice identification as part of their user identification process. This works by treating your voice as an authentication factor: a unique ‘password’ that unlocks protected accounts only when your voice is recognised, very similar to how facial recognition software is used. And the principle is the same: everyone’s voice sounds different, so access is protected.
But speech recognition for transcription lags behind…
There’s still a way to go before speech recognition software is advanced enough to provide the levels of accuracy that transcription services require. Currently the technology simply isn’t advanced enough to learn the nuances of language, discern accents and understand context in the same way as humans can – all of which are vital to accurately convey what has been said. This is why many people find that AI is falling short in the transcription arena.
While there have been significant advancements in AI capabilities in recent years, humans still stand out as being the best!
Fiona Shipley has been providing clients with quality transcripts for more than 32 years.
We work with clients for whom nothing short of an accurate, detailed record will do. Indeed, some clients have come to us to rework content that’s been created using AI, having experienced first-hand that AI for transcription can be a false economy.
To find out more about how Fiona Shipley provides outstanding transcription services to our clients, please get in touch via alex@fionashipley.com.