In this blog post, we will take a look at how deep learning-based GNMT technology has revolutionized the accuracy and naturalness of Google Translate.
Most people who have used Google Translate have run into a mistranslation at least once. Before trying it, you may have expected the service to solve everything even if you did not know the foreign language, and that expectation probably faded after a few attempts. Yet Google Translate, which was producing absurd results just a few years ago, has recently been transformed. Its translations read far more naturally in context, and it can even make a reasonable attempt at poetic expressions.
Google Translate used to have a particularly hard time with the subtle nuances that separate one language from another. Many users had to re-edit or reinterpret its output before they could use it. That hassle has shrunk considerably. Thanks to new technology, Google Translate now produces noticeably more sophisticated and accurate translations than before, which is a real step forward for global communication.
Google Translate, which had been improving only slowly, was able to make such a rapid leap because the Google Neural Machine Translation (GNMT) system was introduced to the service. Before GNMT, Google Translate ran on a phrase-based machine translation system. Such a system relies on a large table of word and phrase correspondences, typically learned statistically from existing translations, and it assembles each translation like a puzzle: it translates the words or phrases in a sentence one piece at a time and then combines the pieces into a sentence. As a result, the parts of the sentence felt disconnected from one another, and the word order and context of the output were very unnatural to read. Naturally, the author’s underlying meaning and intention were often lost as well. Google provided its translation service on this basis for years, which is why so many users were confused by the results.
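To make the "puzzle" idea concrete, here is a minimal sketch of piece-by-piece translation. It is not Google's former system: the phrase table, the function name `translate_piecewise`, and the Spanish examples are all hypothetical, and real phrase-based systems also score candidates with probabilities and reordering models that this toy omits.

```python
# Hypothetical toy phrase table. Real systems hold millions of entries
# with probabilities attached; this one is hand-written for illustration.
PHRASE_TABLE = {
    ("el",): "the",
    ("gato",): "cat",
    ("negro",): "black",
    ("buenos", "dias"): "good morning",
}


def translate_piecewise(sentence, table=PHRASE_TABLE, max_len=3):
    """Translate left to right, one phrase at a time, with no reordering."""
    words = sentence.lower().split()
    pieces = []
    i = 0
    while i < len(words):
        # Try the longest phrase starting at position i first.
        for n in range(min(max_len, len(words) - i), 0, -1):
            chunk = tuple(words[i:i + n])
            if chunk in table:
                pieces.append(table[chunk])
                i += n
                break
        else:
            # Unknown word: copy it through unchanged.
            pieces.append(words[i])
            i += 1
    return " ".join(pieces)


print(translate_piecewise("buenos dias"))    # good morning
print(translate_piecewise("el gato negro"))  # the cat black
```

The second result shows the weakness: every piece is translated correctly, yet the output keeps the Spanish word order ("the cat black") because nothing looks at the sentence as a whole.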
The newly introduced GNMT, by contrast, is a translation technology built on deep learning, a core technology of artificial intelligence. GNMT takes in the flow of the entire sentence rather than isolated pieces, so it captures more of what the author meant and produces noticeably smoother translations. The introduction of GNMT is a major innovation in itself, but to understand the changes it brought, it helps to understand the basic concepts of deep learning.
So, what is deep learning, the core technology of artificial intelligence that has brought us to this stage? Deep learning is a general-purpose artificial intelligence technique that was also used in AlphaGo, the computer Go program that competed against Lee Sedol. The term has come up so often in public discussion that you have probably heard it at least once. You may also have heard that AlphaGo consumed an enormous amount of power during its match against Lee Sedol. It is said that 1,200 CPUs were used, which is roughly equivalent to 300 computers. Before a match, this enormous machine studies much as a human would. It studies which lines of play tend to lead to victory and which tend to lead to defeat. AlphaGo analyzes existing games and generates new games of its own, learning the patterns as it goes. Only after this learning process did it sit down to play Lee Sedol. Put simply, in the middle of a game AlphaGo draws on the patterns it learned from the games it studied, finds those closest to the current position, and steers its play toward the ones that led to wins, much as a person would. Programs trained in this way can sometimes outperform humans because they can carry out hundreds of millions of calculations far faster than a person can.
Deep learning is a form of artificial intelligence that evolved from artificial neural networks, which learn from data by passing information through input and output layers of units that behave somewhat like neurons in the brain. Artificial neural networks, the predecessor of deep learning, are algorithms modeled on the human brain that process information in a similar way. The human brain is composed of structural units called neurons, and through experience it learns functions such as pattern recognition and cognition. Artificial neural networks became practical as computer performance improved. However, the earlier neural network technology had some problems. For example, training was slow, and as the number of layers increased, the signal that tells each layer in which direction to adjust itself faded away (the vanishing gradient problem). Deep learning techniques have made up for many of these shortcomings.
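The problem of the learning signal fading as layers are added can be sketched numerically. The example below is a deliberately simplified illustration, not a real network: it assumes every layer uses a sigmoid activation, a single connection weight of 1.0, and an input of 0.0, and the function name `gradient_scale` is made up for this post. Under those assumptions each layer multiplies the signal by at most 0.25, so the product shrinks quickly with depth.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def sigmoid_grad(x):
    """Derivative of the sigmoid; its largest value is 0.25, at x = 0."""
    s = sigmoid(x)
    return s * (1.0 - s)


def gradient_scale(num_layers, pre_activation=0.0, weight=1.0):
    """Factor by which a learning signal is scaled after passing
    backward through `num_layers` identical sigmoid layers."""
    scale = 1.0
    for _ in range(num_layers):
        scale *= weight * sigmoid_grad(pre_activation)
    return scale


for depth in (1, 5, 10, 20):
    print(depth, gradient_scale(depth))
```

With one layer the factor is 0.25; with ten layers it is already below one in a million, which is why the earliest layers of a deep network used to learn so slowly.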
Deep learning has not only improved the performance of Google Translate but also opened the door to applications in many other fields. In medicine, for example, a diagnostic system built on deep learning can play an important role in detecting certain diseases early and suggesting treatments. Deep learning is steadily reaching into more aspects of our lives, and its potential is vast.
GNMT, which applies the deep learning technology described above, does not match words or phrases one-to-one. Instead, much as a human translator would, it treats the entire sentence as the unit of translation, so it can grasp the context and reflect it in the result. GNMT also analyzes and learns from existing translations. In the process, it improves its performance by adjusting the strengths of the connections within its artificial neural network.
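What "learning by adjusting connections" means can be shown on the smallest possible case. The sketch below is not GNMT: it trains one hypothetical connection weight on invented number pairs using plain gradient descent, and the function name `train_single_connection` and the learning rate are assumptions chosen for illustration. A real translation network does the same kind of adjustment across a very large number of connections at once.

```python
def train_single_connection(examples, learning_rate=0.1, epochs=50):
    """Fit prediction = weight * x to (x, target) pairs by nudging the
    weight a little after every example."""
    weight = 0.0
    for _ in range(epochs):
        for x, target in examples:
            prediction = weight * x
            error = prediction - target
            # The gradient of 0.5 * error**2 with respect to weight is error * x,
            # so stepping against it reduces the error on this example.
            weight -= learning_rate * error * x
    return weight


# Invented training data in which every target is exactly twice the input.
examples = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]
learned = train_single_connection(examples)
print(learned)
```

The weight starts at 0.0 and settles very close to 2.0, the relationship hidden in the examples, without anyone having written that rule down.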
To evaluate the performance of GNMT, Google researchers selected sentences from Wikipedia and news articles and had them translated into several languages. They compared GNMT’s translations with those of Google’s existing system and of human translators by asking human evaluators to rate the quality of each. The evaluation showed that translations into English from Chinese, a notoriously difficult pair, scored significantly better than under the existing system. For some language pairs, the translations also scored close to human translation in accuracy. However, the Chinese results still trailed those for translations between Indo-European languages. The authors of the paper also emphasized that “the selected sentences were well-crafted short sentences.”
In short, Google Translate now runs on deep learning, a core technology of artificial intelligence. By converting material on the Internet into data, Google has accumulated big data to learn from, and the system translates each sentence as a single unit. Because the unit of translation has expanded from words and phrases to whole sentences, the results have improved significantly compared to the past. On this point, Google Translate’s Head of Product Management, Barak Turovsky, said, “Neural machine translation technology reduces the possibility of errors by up to 85%. This is a major evolution over the achievements of the past decade.”
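A figure like "up to 85%" is a relative reduction, which is easy to misread as "85% of sentences are now correct." The short calculation below shows the arithmetic under one straightforward reading, in which errors are simply counted before and after. The error counts are invented for illustration and are not Google's data, the function name `relative_error_reduction` is hypothetical, and Google's own evaluation may define the figure differently.

```python
def relative_error_reduction(old_errors, new_errors):
    """Fraction of the old system's errors that the new system removes."""
    if old_errors <= 0:
        raise ValueError("old_errors must be positive")
    return (old_errors - new_errors) / old_errors


# Invented example: 100 errors with the old system, 15 with the new one.
reduction = relative_error_reduction(100, 15)
print(f"{reduction:.0%}")  # 85%
```

Under this reading, an 85% reduction still leaves 15 of every 100 former errors in place, which is consistent with the caveats that follow.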
Translation, however, involves a great many variables. Even within a single language there are individual differences and dialects, and languages change over time. Translations of wordplay and poetic expressions are especially likely to come out inadequate and clumsy. Some even argue that neural machine translation is less accurate than phrase-based machine translation. Still, because the core technology behind Google Translate is deep learning, translation data is accumulating in the machine’s "brain" even as you read this. In the future, the machine may be able to collect and learn from newly accumulated data on its own. As that data builds up, it should eventually be possible to translate in a way that addresses the shortcomings that exist today. Google itself has admitted that its translations still fall short of human translation and contain many errors. However, it has emphasized that the system will move closer to perfection as deep learning-based artificial intelligence gains experience and related technologies develop. We can expect Google Translate, powered by neural machine translation, to go a long way toward breaking down language barriers in the near future.