Here at DVICE, you have our attention anytime someone talks about a universal translator, so we're definitely interested in the latest development from Microsoft: chief research officer Rick Rashid demonstrated software that converted his speech into Chinese, in real time and in his own voice.
Yes, Star Trek fans, you may pause to swoon, but even non-Trekkies are likely to find this latest step in the quest for a viable universal translator interesting.
The system works by recognizing a person's words, and then converting that text into properly organized sentences in the target language, in this case Chinese. The translated text is then picked up by speech synthesizing software trained to replicate the speaker's voice and unique cadence.
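For the programmers in the audience, the three stages described above can be sketched as a simple pipeline. This is only a toy illustration, not Microsoft's code: every name here (`recognize`, `translate`, `synthesize`, `VoiceProfile`, `TOY_LEXICON`) is made up for the example, and each stage is a trivial stand-in for what would be a trained model in the real system.

```python
from dataclasses import dataclass

# Hypothetical toy lexicon; a real system would use a trained translation model.
TOY_LEXICON = {"hello": "你好", "world": "世界"}


@dataclass
class VoiceProfile:
    """Placeholder for a model trained on one speaker's voice and cadence."""
    speaker: str


def recognize(audio_words):
    """Stage 1 stand-in: pretend the audio is already split into spoken words."""
    return [word.lower() for word in audio_words]


def translate(words):
    """Stage 2 stand-in: word-by-word lookup; unknown words pass through unchanged."""
    return [TOY_LEXICON.get(word, word) for word in words]


def synthesize(words, profile):
    """Stage 3 stand-in: return a text label instead of real audio."""
    return f"[{profile.speaker}'s voice] " + "".join(words)


def speech_to_speech(audio_words, profile):
    """Chain the three stages: recognition, translation, synthesis."""
    return synthesize(translate(recognize(audio_words)), profile)


if __name__ == "__main__":
    print(speech_to_speech(["Hello", "World"], VoiceProfile("Rick")))
```

The point of the sketch is the shape of the chain: each stage only sees the output of the one before it, so a recognition mistake in stage one carries straight through to the translated speech.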
It's an upgrade from Microsoft's earlier technology that modified synthesized speech to match a person's voice, but could only speak typed text. The new software is modeled on how networks of brain cells operate, and takes an hour or so to train itself to process a particular person's speech patterns.
In a Microsoft blog post about the new system, Rashid says the new neural-network-inspired approach is responsible for a significant jump in the company's software capabilities, with the error rate dropping from one word in four or five being incorrect to just one in seven or eight.
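To put those fractions in perspective, here is a quick back-of-the-envelope calculation. It uses only the "one in four or five" and "one in seven or eight" figures quoted above; the helper name `rate_range` is ours, and the result is a rough bracket, not an official Microsoft number.

```python
def rate_range(low_n, high_n):
    """Error-rate range (min, max) for 'one word in low_n to high_n' being wrong."""
    return 1 / high_n, 1 / low_n


before_min, before_max = rate_range(4, 5)  # 20% to 25% of words wrong
after_min, after_max = rate_range(7, 8)    # 12.5% to about 14.3% of words wrong

# Smallest implied improvement: best "before" case against worst "after" case.
smallest_drop = 1 - after_max / before_min
# Largest implied improvement: worst "before" case against best "after" case.
largest_drop = 1 - after_min / before_max

if __name__ == "__main__":
    print(f"before: {before_min:.1%} to {before_max:.1%}")
    print(f"after:  {after_min:.1%} to {after_max:.1%}")
    print(f"relative reduction: {smallest_drop:.0%} to {largest_drop:.0%}")
```

In other words, the quoted figures imply that somewhere between roughly 29% and 50% of the old errors have gone away, depending on which ends of the ranges you compare.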
Fortunately, this is a case where we can see it to believe it. Microsoft's chief research officer demoed the technology to an audience in China late last month, as the video below shows.
In the blog post, Rashid wrote about where the technology could lead:
"In other words, we may not have to wait until the 22nd century for a usable equivalent of Star Trek's universal translator. And we can also hope that as barriers to understanding language are removed, barriers to understanding each other might also be removed."
Before we get ahead of ourselves and start thinking about all the alien life forms we might be able to sit down and chat with in the future, it's important to note that the software could have a significant effect on how we communicate right now with people from other countries, and even on how we learn to speak languages.
It isn't the first or only software out there that attempts to translate the human voice (see Apple's Siri or AT&T's English-to-Spanish translator), but the big difference between these and Microsoft's breakthrough is the new software's ability to learn, process and pass along a human's voice and cadence.
So when do we get to start our own real-time translating? Hold on to your dreams for a little while. Although this is a significant advance toward a universal translator, researchers outside Microsoft Research Asia have not yet finished their work on studying the brain's neural networks to find ways to improve the software, and it has not yet been released in the real world outside Microsoft Labs.
Though it is a major step toward a universal translator, the team believes it can refine the software further, beyond the communication it can achieve now. Additional training of the software and adding significantly more computing power are first on the list.
So those of us who marveled at Uhura "hailing all frequencies" and the universal translator coming instantly online to tackle Klingon and Romulan will just have to wait a bit longer… but not that much longer, it seems!