IT Techniques: Machine Translation

Machine translation (MT) is the branch of computational linguistics that studies the use of software to translate speech or text from one language to another. At its most basic, MT simply substitutes the words of one language with words of another. Substitution alone does not recognize complete phrases, so the meaning carried by a phrase as a whole is often lost.
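To make the limitation concrete, here is a minimal sketch of word-for-word substitution. The tiny English-to-Spanish lexicon and the function name are made up for illustration; they are not taken from any real MT system.

```python
# Hypothetical toy lexicon (English -> Spanish), for illustration only.
LEXICON = {
    "he": "él",
    "kicked": "pateó",
    "the": "el",
    "bucket": "cubo",
}

def translate_word_by_word(sentence, lexicon):
    """Replace each word independently; unknown words pass through unchanged."""
    words = sentence.lower().split()
    return " ".join(lexicon.get(word, word) for word in words)

print(translate_word_by_word("He kicked the bucket", LEXICON))
```

Every word is translated correctly in isolation, yet the output only describes someone literally kicking a bucket. The idiomatic meaning ("he died") belongs to the phrase as a whole, and the substitution never looks at the phrase.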

However, newer statistical and corpus-based techniques are addressing this problem: they are better at handling differences in linguistic typology, isolating anomalies, and translating idioms. This type of translation is most effective for formulaic language, such as legal and government documents. MT output can be improved further through human intervention, and MT is a very useful tool for assisting human translators. Weather reports are a good example of this scenario.

The Translation Process

The process of translation can be described in two simple steps:

  • Decoding the meaning of the source language text.
  • Re-encoding that meaning in the target language.

In MT, the software is programmed to interpret the source text much as a human translator would, which requires in-depth knowledge of grammar, syntax, semantics, and idioms. Automated translation produces its output in seconds, and for well-suited text the result can come close to the quality of human translation.

Rule-based machine translation applies linguistic rules: each word in the source text is replaced with the most suitable word in the target language. Usually, the rule-based method first builds a symbolic representation of the source text, known as the intermediary representation, from which the text in the target language is generated. Depending on the nature of this intermediary representation, the approach is known as transfer-based machine translation or interlingual machine translation.

Rule-based MT is often used for translating between closely related languages. It depends mostly on the creation of dictionaries and grammar programs, and it draws on linguistic information about both the source language and the target language, applying syntactic and morphological rules along with semantic analysis.
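The rule-based pipeline described above can be sketched in three stages: analysis, transfer, and generation. The lexicon, the part-of-speech tags, and the single reordering rule below are illustrative assumptions for an English-to-Spanish toy, not a real grammar.

```python
# Toy bilingual lexicon: source word -> (target word, part of speech).
# All entries are hypothetical and cover only this example.
LEXICON = {
    "the": ("el", "DET"),
    "red": ("rojo", "ADJ"),
    "car": ("coche", "NOUN"),
    "runs": ("corre", "VERB"),
}

def analyze(sentence):
    """Analysis: tag each source word (raises KeyError on unknown words)."""
    return [(word, LEXICON[word][1]) for word in sentence.lower().split()]

def transfer(tagged):
    """Transfer: English ADJ NOUN order becomes Spanish NOUN ADJ order."""
    out = list(tagged)
    i = 0
    while i < len(out) - 1:
        if out[i][1] == "ADJ" and out[i + 1][1] == "NOUN":
            out[i], out[i + 1] = out[i + 1], out[i]
            i += 2  # skip past the pair that was just swapped
        else:
            i += 1
    return out

def generate(tagged):
    """Generation: look up the target-language word for each source word."""
    return " ".join(LEXICON[word][0] for word, _ in tagged)

def translate(sentence):
    return generate(transfer(analyze(sentence)))

print(translate("The red car runs"))
```

The syntactic rule lives in `transfer` and the vocabulary lives in the lexicon, so each can be extended separately. A realistic system would also need morphological rules, for example gender and number agreement, which this sketch leaves out.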

Like interlingual machine translation, transfer-based MT produces the translation from an intermediate representation that captures the meaning of the original sentence. Interlingual MT is the variant of rule-based MT in which the source text is first converted into a neutral representation that is independent of any particular language; transfer-based MT, by contrast, uses a representation that depends partly on the language pair involved.
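As a rough sketch of the interlingual idea, the source sentence can be mapped once into a language-neutral frame, and a separate generator then produces each target language from that frame. The concept names, the frame layout, and the tiny vocabularies here are hypothetical, and the analyzer only handles sentences of the form "the <noun> <verb>".

```python
# Hypothetical concept inventory shared by all languages.
ENGLISH_TO_CONCEPT = {"cat": "CAT", "dog": "DOG", "sleeps": "SLEEP", "eats": "EAT"}

# One generation table per target language, keyed by concept.
GENERATORS = {
    "es": {"CAT": "el gato", "DOG": "el perro", "SLEEP": "duerme", "EAT": "come"},
    "fr": {"CAT": "le chat", "DOG": "le chien", "SLEEP": "dort", "EAT": "mange"},
}

def to_interlingua(sentence):
    """Decode 'the <noun> <verb>' into a language-neutral frame."""
    words = [w for w in sentence.lower().split() if w != "the"]
    agent, action = words
    return {"agent": ENGLISH_TO_CONCEPT[agent], "action": ENGLISH_TO_CONCEPT[action]}

def from_interlingua(frame, lang):
    """Re-encode the frame in the requested target language."""
    table = GENERATORS[lang]
    return f"{table[frame['agent']]} {table[frame['action']]}"

frame = to_interlingua("The cat sleeps")
print(frame)
print(from_interlingua(frame, "es"))
print(from_interlingua(frame, "fr"))
```

Because the frame contains no English or Spanish words, supporting another target language only means adding another generation table; the analysis step is reused unchanged.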

MT may also translate using the words available in a dictionary; this method is known as dictionary-based machine translation.

Statistical MT bases its translations on bilingual text, and it gives good results when enough parallel text is available. CANDIDE, developed by IBM, is an early example of a statistical MT system. Google previously used SYSTRAN, which is traditionally a rule-based system, before moving to its own statistical engine.

Another approach is example-based MT, which translates by analogy with text that has already been translated. Fragments of similar, previously translated sentences are put together to form the complete translation.

A more refined approach, known as hybrid MT, leverages the strengths of both rule-based and statistical translation. It comes in two forms: in the first, the rule-based technique is applied before the statistical approach, which then adjusts the output; in the second, rules are used to pre-process the data for the statistical approach in order to better guide the engine.
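To illustrate how a statistical system can learn from bilingual text alone, the sketch below estimates word translation probabilities with expectation-maximization in the style of IBM Model 1. It is a simplified teaching version (no NULL word, a three-sentence toy corpus chosen for this example) and is not the implementation used in CANDIDE or any other product.

```python
from collections import defaultdict

def train_ibm_model1(corpus, iterations=10):
    """Estimate t(f | e) from (source_tokens, target_tokens) sentence pairs."""
    src_vocab = {e for src, _ in corpus for e in src}
    tgt_vocab = {f for _, tgt in corpus for f in tgt}
    # Start from a uniform distribution over target words.
    t = {(f, e): 1.0 / len(tgt_vocab) for f in tgt_vocab for e in src_vocab}
    for _ in range(iterations):
        count = defaultdict(float)
        total = defaultdict(float)
        # E-step: collect expected co-occurrence counts.
        for src, tgt in corpus:
            for f in tgt:
                norm = sum(t[(f, e)] for e in src)
                for e in src:
                    delta = t[(f, e)] / norm
                    count[(f, e)] += delta
                    total[e] += delta
        # M-step: renormalize the counts into probabilities.
        for (f, e) in t:
            t[(f, e)] = count[(f, e)] / total[e]
    return t

def best_translation(t, e):
    """Return the target word with the highest probability for source word e."""
    candidates = [f for (f, e2) in t if e2 == e]
    return max(candidates, key=lambda f: t[(f, e)])

# Toy English-German parallel corpus (an assumption for this example).
corpus = [
    ("the house".split(), "das haus".split()),
    ("the book".split(), "das buch".split()),
    ("a book".split(), "ein buch".split()),
]
t = train_ibm_model1(corpus)
for word in ["the", "house", "book", "a"]:
    print(word, "->", best_translation(t, word))
```

No dictionary is supplied here. The model works out that "book" pairs with "buch" because the two words co-occur across different sentence pairs, which is why the quality of statistical MT depends so heavily on the amount of bilingual text available.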

Conclusion

Although machine translation still has some major open problems, such as disambiguation, named entities, and non-standard speech, MT programs are used all around the world. The approach is particularly useful in fields such as medicine and law, where relatively few qualified human translators are available. Despite its limitations, MT can deliver nearly accurate results, which saves companies a lot of time and money.

 
