Writer Robert Fortner doled out harsh criticisms of speech recognition yesterday, and today his faulty logic and false assertions are reverberating across the blogosphere. He argues that speech recognition "flat-lined in 2001," falling far short of human levels.
I have to respectfully and completely disagree. Using Nuance NaturallySpeaking 10.1 software on a fast PC, I achieve accuracy that's nothing short of astounding. Unless I'm drunk, it recognizes my speech with at least 98% accuracy, which happens to be the recognition rate of the human ear. And it has improved dramatically since 2001.
I must admit I've been working with NaturallySpeaking for the better part of a decade, and it's intimately familiar with the technology I write about and the words I use. It allows me to dictate thousands of words of text in the time it would've taken me to type hundreds. Fortner's claim that it doesn't work and is a failure is completely misinformed.
I did a quick test of some of the words he said sound so much alike that speech recognition can't handle them. Here's his flagship sentence:
Saying "recognize speech" makes a sound that can be indistinguishable from "wreck a nice beach."
I dictated that sentence into NaturallySpeaking and didn't correct anything. In fact, I dictated this entire post into NaturallySpeaking, and didn't correct a single thing. It even spelled Fortner's name right, and capitalized it. So there. 100% accuracy.
So as far as I'm concerned, his idea that speech recognition is dead can itself rest in peace. And yes, I did dictate that, and it didn't type "rest in peas." But it did substitute the word "bad" for "dead" and oops, just then it misunderstood the word "for," typing "of" — so that's about 99% accurate. Close enough.
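For anyone who wants to check that kind of arithmetic on their own dictation, here is a minimal sketch. It assumes "accuracy" means one minus the word error rate, where errors are counted as the fewest word substitutions, insertions, and deletions needed to turn what was typed into what was said. The function names are my own illustration, not anything NaturallySpeaking exposes.

```python
def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    # Classic Levenshtein distance, computed over words instead of characters.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # word dropped by the recognizer
                            curr[j - 1] + 1,      # extra word typed by the recognizer
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1], len(ref)


def word_accuracy(reference: str, hypothesis: str) -> float:
    """Accuracy as 1 minus word error rate (assumed definition)."""
    errors, total = word_errors(reference, hypothesis)
    if total == 0:
        raise ValueError("reference must contain at least one word")
    return 1 - errors / total


# What I said versus what got typed, using the "bad" for "dead" slip above.
spoken = "his idea that speech recognition is dead can itself rest in peace"
typed = "his idea that speech recognition is bad can itself rest in peace"
print(f"{word_accuracy(spoken, typed):.1%}")  # prints 91.7%
```

One wrong word in a twelve-word clause looks bad on its own; the same slip spread over a whole post of several hundred words is what produces a figure in the high nineties.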
Read his post, and keep in mind what I've just dictated to you: