Baidu Unveils Deep Voice that can Synthesize Human Speech Quickly
Ellie Froilan | Mar 10, 2017 04:39 AM EST
(Photo : Getty Images) Baidu Deep Voice can synthesize speech that will sound natural and realistic by itself.
Baidu has been developing its own AI system for four years and recently unveiled the Deep Voice system, which is faster and more efficient than Google's WaveNet.
According to Baidu, often called the “Google of China,” Deep Voice can be trained to speak in just a few hours with little to no human interaction. The company can also control how it speaks to convey different emotions, resulting in synthesized speech that sounds natural and realistic.
Baidu’s team of researchers at the Chinese giant’s AI Lab in Silicon Valley said that Deep Voice may require some initial human fine-tuning during the training period, but eventually it can synthesize speech that will sound natural and realistic by itself.
The system first breaks the text into graphemes, the smallest units of written language. It then translates them into phonemes, the smallest units of speech, and renders that information as sound. Each of the steps is managed by machine-learning algorithms, which need to run at an incredible rate for the result to sound realistic.
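As a rough illustration of the grapheme and phoneme stages described above, here is a minimal Python sketch. It is not Baidu's method: Deep Voice uses learned models for each stage, while this toy version swaps in a tiny hand-written lookup table. The names `TOY_LEXICON`, `to_graphemes`, `to_phonemes` and `text_to_phonemes` are hypothetical, and the phoneme symbols are ARPAbet-style labels chosen for the example.

```python
# Toy illustration only. A hypothetical hand-written lexicon stands in for
# the learned grapheme-to-phoneme model; it covers just two words.
TOY_LEXICON = {
    "deep": ["D", "IY", "P"],
    "voice": ["V", "OY", "S"],
}


def to_graphemes(word):
    """Split a word into lowercase characters (a simplification of graphemes)."""
    return list(word.lower())


def to_phonemes(word):
    """Look up the phonemes for one word; unknown words raise KeyError."""
    return TOY_LEXICON[word.lower()]


def text_to_phonemes(text):
    """Convert whitespace-separated text into one flat phoneme sequence."""
    return [phoneme for word in text.split() for phoneme in to_phonemes(word)]


print(to_graphemes("Deep"))
print(text_to_phonemes("Deep Voice"))
```

A real system would then pass the phoneme sequence to an audio synthesis model. That stage is left out here because it cannot be sketched meaningfully in a few lines.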
Moreover, the researchers explained that although they have improved on an existing system, the approach still requires a great deal of computational power. To achieve realistic human speech synthesis, the system has to maintain a sampling rate in the region of 48 kHz, which leaves about 20 microseconds to produce each audio sample. The company has already tested the model and reports ‘high quality’ results based on crowdsourced evaluations.
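One way to read the 48 kHz and 20 microsecond figures together is as a per-sample time budget: real-time audio at a given sampling rate leaves one second divided by that rate for each sample. The short Python sketch below works through that arithmetic. The function name `per_sample_budget_us` is hypothetical and used only for illustration.

```python
def per_sample_budget_us(sample_rate_hz):
    """Return the time available per audio sample, in microseconds,
    if audio is to be generated in real time at the given rate."""
    return 1_000_000 / sample_rate_hz


budget = per_sample_budget_us(48_000)
print(f"{budget:.1f} microseconds per sample")  # prints "20.8 microseconds per sample"
```

At lower sampling rates the budget is looser. For example, 16 kHz audio allows 62.5 microseconds per sample.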
“To perform inference at real-time, we must take great care to never recompute any results, store the entire model in the processor cache (as opposed to main memory), and optimally utilize the available computational units.
We optimize inference to faster-than-real-time speeds, showing that these techniques can be applied to generate audio in real-time in a streaming fashion,” according to the statement of Baidu AI researchers.
Baidu’s AI researchers believe that real-time speech synthesis is possible. They uploaded audio samples to Mechanical Turk, Amazon’s crowdsourcing site, to ask a large number of people to judge the quality of the samples.
Tags: Baidu, Deep Voice, Baidu Deep Voice, Google, AI, Artificial Intelligence, Google WaveNet
©2015 Chinatopix All rights reserved. Do not reproduce without permission