Microsoft Speech Recognition With New Tech

Researchers have made remarkable progress in getting computers to understand everyday speech, and computers keep getting better at recognizing the words we speak. The share of words recognized incorrectly has fallen from 43% to 6.3% within two decades. A variety of players contributed to that decline, and Microsoft's latest speech recognition innovation has narrowed the gap further.

Neural Networks Hold The Key To Speech Recognition

Both IBM and Microsoft credit the advent of deep neural networks for the recent advances in speech recognition technology. Deep neural networks are inspired by the biological processes of the human brain and are implemented in software to help computers understand speech better.

Xuedong Huang, Microsoft's chief speech research scientist, said that with neural networks the team achieved a word error rate (WER) of 6.3%. The result was measured on the industry-standard Switchboard speech recognition task, where Microsoft's WER was the lowest among the speech recognition systems compared.
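To make the WER figure concrete, here is a minimal sketch of how a word error rate is commonly computed: the word-level edit distance (substitutions, deletions, and insertions) between a reference transcript and the recognizer's output, divided by the number of words in the reference. The function name `word_error_rate` and the sample sentences are illustrative assumptions, not part of the Switchboard evaluation tooling.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the reference length.

    Illustrative helper (hypothetical name); splits on whitespace only
    and does no text normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen
    # so far (excluding the current one) and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution_cost = 0 if ref_word == hyp_word else 1
            curr[j] = min(
                prev[j] + 1,                      # deletion
                curr[j - 1] + 1,                  # insertion
                prev[j - 1] + substitution_cost,  # match or substitution
            )
        prev = curr
    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substitution ("sat" -> "sit") and one deletion ("the"):
    # 2 errors over 6 reference words.
    print(word_error_rate("the cat sat on the mat", "the cat sit on mat"))
```

On this reading, a WER of 6.3% means roughly 6 word-level errors for every 100 words in the reference transcript.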

At Interspeech, the international conference on speech communication and technology held in San Francisco, IBM said it had achieved a WER of 6.9%. Two decades ago, the figure was as high as 43%.

How Microsoft Managed to Achieve This

Neural networks are built from multiple layers. A Microsoft research team recently won the ImageNet computer vision challenge with a deep residual neural network that uses a new cross-network layering system. That approach was combined with the Computational Network Toolkit (CNTK) to advance Microsoft's speech recognition systems. CNTK allows neural network algorithms to run orders of magnitude faster than they otherwise would. The other main factor is the use of GPUs (graphics processing units, or graphics cards in layman's terms).
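The general idea behind a residual network can be sketched in a few lines: each block adds its input back to the output of its layers, so information can skip across layers. The sketch below is a generic illustration in plain Python, not Microsoft's speech model and not CNTK code; the names `linear`, `relu`, and `residual_block`, the layer sizes, and the weights are all assumptions made for the example.

```python
def relu(values):
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in values]


def linear(weights, bias, values):
    """Fully connected layer: one output per row of weights."""
    return [
        sum(w * v for w, v in zip(row, values)) + b
        for row, b in zip(weights, bias)
    ]


def residual_block(values, w1, b1, w2, b2):
    """Two layers plus a skip connection: output = relu(F(x) + x).

    The input and output must have the same length so they can be added.
    """
    hidden = relu(linear(w1, b1, values))
    transformed = linear(w2, b2, hidden)
    return relu([t + v for t, v in zip(transformed, values)])


if __name__ == "__main__":
    size = 3
    zeros = [[0.0] * size for _ in range(size)]
    no_bias = [0.0] * size
    # With all-zero weights the layers contribute nothing, so the skip
    # connection passes the (non-negative) input through unchanged.
    print(residual_block([1.0, 2.0, 3.0], zeros, no_bias, zeros, no_bias))
```

Because the input is carried forward unchanged by the skip connection, a block only needs to learn a correction to its input, which is the property usually credited with making very deep networks trainable.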

GPUs excel at parallel processing, which lets deep neural network algorithms run more efficiently. Cortana, Microsoft's voice assistant, can reportedly ingest 10 times more speech data with the use of CNTK and GPUs.

 
