Littera Deusto

Modern Languages, Basque Studies and Humanities

Comparing Machine Translators: Japanese to English

April 16th, 2011

Machine translation

Machine translators (MTs) are computer programs that automatically translate from one natural language to another, more or less efficiently. They are typically built on grammar rules written by linguists, and draw on large lexicons and corpora for statistics on how vocabulary and expressions are used in each language. MTs are, therefore, usually capable of dealing with basic sentences appropriately, but generate less accurate results when given complex texts.

Machine translation can be helpful when we want to understand a foreign text approximately but, at least as of today, computers are nowhere near as reliable as human translators, or even nonprofessional bilinguals. Because MTs do not have a mind of their own, they cannot recognize or capture context-specific connotations, cultural items, puns, etc., nor can they always choose the right meaning when rendering polysemic words.

Common errors in different MTs

To show some typical problems of machine translation, we will check the results produced by a few free online MTs (that is to say, the ones most accessible to everyone) in an attempt to evaluate them informally. We will take a short Japanese text and compare the English translations that each MT comes up with.

Original text and human translation:

アイザック・アシモフ(1920年1月2日 – 1992年4月6日)はアメリカの作家、生化学者。非常に成功した多作の作家であり、その著作は500冊以上を数える。 彼の扱うテーマは科学、言語、歴史、聖書等々非常に多岐にわたる が、特にSF、一般向け科学解説書、推理小説によってよく知られている。

「『夜来たる』は、わたしのプロ作家としての経歴の中で、一つの転換点となった作品である(中略)突然、私は重要な作家と見なされ、SF界が私の存在に注目するようになった。何年か後には、わたしはいわゆる”古典”を書いたことがはっきりした」

Isaac Asimov (January 2, 1920 – April 6, 1992) was an American author and biochemist. He was an extremely successful prolific writer, whose works amount to more than 500 books. Although he dealt with wide-ranging themes such as science, language, history, the Bible and more, he is mostly known for his science fiction, handbooks of science for the general public and mystery novels.

“‘Nightfall’ is the work that became a turning point in my career as a professional writer (omission) Suddenly, I was regarded as a major writer, and the world of science fiction was made aware of my existence. Some years later, it became clear that I had written a so-called ‘classic’”[1]

(Retrieved April 6, 2011)


Bing Translator

Isaac Asimov (January 2, 1920 – April 6, 1992) is American writer and biochemist. Successful very prolific writer, counting more than 500 books and writings. Themes dealing with his science, language, history, Bible, etc. very spanning is especially science fiction, for general scientific manuals, mystery novel by well known.

“‘Nightfall’ was a piece was one turning point in his career as a professional writer I (omission), suddenly I important writer is considered the science fiction world in my presence featured to like was. Several years after I wrote the so-called “classical” it is clear the

(Retrieved April 12, 2011)

Highlighted errors
by well known: Although the original “によって” can sometimes mean “by”, the MT has not been able to infer that, in this context, the translation should be “for”. In addition, instead of rearranging the elements of the sentence so that they follow the logical order in English (to be known + FOR + something), the MT has left each of them where they were in Japanese (something + FOR + to be known).

his (…) I: It is strange for the resulting translation to include a third person pronoun (“his”), since the original quote marks the presence of a first person very clearly. Perhaps the reason “I” only appears later on is that the machine has not understood the structure of the sentence properly.

the science fiction world in: Although preposition-like elements appear after their complements in Japanese, postpositions are not common in English. The MT should have placed “in” before the noun phrase (“the science fiction world”) it accompanies.

it is clear the: Again, the order of the elements does not make sense in English. The MT has produced a sequence of words that seems to be unconnected to the previous part of the sentence, as “it” and “the” do not have a referent.

Overall
Bing Translator has been quite accurate in translating the vocabulary and each individual expression, even proving able to interpret “SF” as “science fiction” (none of the other MTs in this article has managed to). Structure-wise, however, some of the sentences are difficult to follow. The part with the quote (the second paragraph) becomes particularly tangled, first because it begins by introducing a third person pronoun that should not be there, and second because the elements of each sentence do not appear in the order expected in English, which confuses the reader and slows down the reading process.

SDL FreeTranslation.com

) Is (be on January 2, 1920-on April 6, 1992 ISAC Asimov be the writer, biochemist of America. 《主語なし》I am the writer of the polygraph who succeeded very much and the creation that count 500 or more copies. The theme that he treats is well-known by the science explanation book, detective stories for SF, general although it spans very variously the etc. such as science, language, history, the Bible.

“‘(The omission) that at night coming’ is the work that became one conversion point in the history as my professional writer, suddenly, I am regarded with an important writer and the SF world has come to pay attention to my existence. It became clear that I wrote so-called’ the classics‘ at the later time, several years”

(Retrieved April 6, 2011)

Highlighted errors
) Is: Apparently, the MT has rearranged the elements of the sentence and moved the verb to the front, somehow also dragging the closing parenthesis with it. Verbs do not go right at the beginning of sentences in English (which is SVO, so a subject comes first), but they do appear earlier than in Japanese (which is SOV, so verbs go at the very end), and that might explain the machine’s action.

the: アメリカの作家 could either mean “an American writer” or “the American writer”. The MT has chosen the wrong option.

《主語なし》I: For some reason, the machine has added the annotation that no subject is indicated in Japanese. Then, it has introduced a first person pronoun immediately afterwards, probably because, since sentences in English always need a subject, the machine has found the need to create one by itself.

polygraph: “Polygraph” refers to an author who can write about many different topics, but 多作 usually means writing a lot and not necessarily in more than one field.

s (plural marker): Japanese nouns have no grammatical number, so whether 古典 refers to a single classic or more is left ambiguous if there is no context. Within this context, however, the noun is singular, so the MT has made the wrong choice.

Overall
The result produced by FreeTranslation is at first glance not very accurate, either lexically or grammatically. Nevertheless, lines such as “the SF world has come to pay attention to my existence” are impressively precise. This time, the rendition of the second paragraph is clearer than that of the first, while for every other MT the opposite is true. In fact, when it comes to the quote alone, FreeTranslation’s translation is probably the one that lets us figure out most easily what the message of the Japanese original was.

Reverso

Isaac Asimov (from January 2, 1920 to April 6, 1992) is an American writer, a biochemist. it is a writer of very successful many products, and to count more than the writing 500 books. The theme that he treats scientizes it, and a language, the history, the Bible diverge into many branches very much, but are particularly well known by SF, an article for public science commentary book, a whodunit.

In “as “since night-time“, my career as a professional writer, watched it with the writer that I was important suddenly (omission) who was the work which became one turning point, and SF world came to pay attention to my existence”. It became clear later in how many years that I wrote so-called “classic”

(Retrieved April 6, 2011)

Highlighted errors
scientizes it: The MT has interpreted that the noun 科学 (science) was actually working as a verb.

are: The MT has lost the reference of Asimov/he as a subject (singular), and probably thinks that “many branches” (plural) is what should be connected to and match in person and number with the verb “[to be] well known”.

an…a: As stated, Japanese nouns indicate no grammatical number by themselves. In this context, both 一般向け科学解説書 and 推理小説 should be translated as plural, but the MT has not been able to infer as much.

: This particle has no specific translation in English, as its function is simply structural. During the translation, instead of being blended together with the rest of the sentence, it has somehow been left behind untranslated due to its unspecific meaning.

since night-time: The MT has only been able to give a literal translation of 夜来たる, instead of looking for the actual English title of the story.

watched it with the writer: Although 見なされ comes from 見なす (“to consider”), 見る (“to see”, “to watch”) shares the same kanji, so the MT seems to have confused the two verbs. By itself, 作家と would mean “with the writer”, but because of the verb afterwards, と should be understood as “as” instead of “with”.

how many years: 何年 can only be rendered as “how many years” when the sentence is a direct question. In this case, 何年か後 is simply “after a few (indeterminate quantity) years”.

Overall
While there are several mistakes, as a whole, the translation by Reverso is not difficult to follow. Perhaps the second part of the first paragraph is where this MT’s rendition seems to be the weakest, as we get the impression that “the themes that he treats (…) are particularly well known”, rather than that Asimov deals with many themes but is particularly well known for others.

WorldLingo

As for Issac [ashimohu] (1920 January 2nd – 1992 April 6th) the American writer, raw chemist.It is the writer of the multi works which succeed very, the literary work counts 500 volumes or more. That science, language, history and Bible etc. it diverges the theme which is handled very, but especially SF, for the general scientific explanation book, by the detective novel it is well informed.

As for “’the night coming’, in personal history as my professional writer, it is the work which has become one commutation point (omission) suddenly, as for me to be considered the important writer, it reached the point where the SF boundary observes to my existence.Several years later, as for me that generally known ” classic ” is written, it was clear”

(Retrieved April 6, 2011)

Highlighted errors
[ashimohu]: Although “Asimov” is not a word to be found inside a dictionary, other MTs have been able to recognize the famous surname and spell it according to the standard form in English. We can see that this requires cultural, and not just linguistic, knowledge.

raw chemist: The MT has separated the first kanji (生, meaning “raw”) in 生化学者 from the rest of the word (化学者, meaning “chemist”), instead of understanding everything as a single unit.

boundary: By itself, 界 does mean “boundary”, but following another noun it is usually understood as “world”. The MT did not know this.
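
The “raw chemist” error above can be illustrated with a toy example. The following is only a sketch under loud assumptions: it does not claim to show how WorldLingo actually works, and the function names (`segment`, `gloss`) and the two tiny lexicons are hypothetical. It uses greedy longest-match segmentation, a simple dictionary technique, to show how a lexicon that lacks the compound 生化学者 would fall back on its parts.

```python
def segment(text, lexicon):
    """Greedy longest-match: at each position, take the longest entry found in the lexicon."""
    pieces = []
    i = 0
    while i < len(text):
        for j in range(len(text), i, -1):  # try the longest candidate first
            if text[i:j] in lexicon:
                pieces.append(text[i:j])
                i = j
                break
        else:
            # No entry starts here: keep the single character as it is.
            pieces.append(text[i])
            i += 1
    return pieces


def gloss(text, lexicon):
    """Replace each segment with its dictionary gloss (unknown segments stay unchanged)."""
    return " ".join(lexicon.get(piece, piece) for piece in segment(text, lexicon))


# Hypothetical lexicons, invented for this illustration only.
small_lexicon = {"生": "raw", "化学者": "chemist"}
full_lexicon = {**small_lexicon, "生化学者": "biochemist"}

print(gloss("生化学者", small_lexicon))  # compound missing from the lexicon
print(gloss("生化学者", full_lexicon))   # compound present as a single unit
```

With the smaller lexicon the word is split into 生 and 化学者, which yields the same “raw chemist” we saw in the translation; once the whole compound is listed, it is matched as a single unit.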

Overall
WorldLingo is the only MT in this comparison not to have rendered アシモフ as “Asimov”, which suggests that its cultural knowledge is more limited than that of the other machines. In addition, some of the sentences it constructs (e.g. “but especially SF, for the general scientific explanation book, by the detective novel it is well informed”) are too tangled to decipher. We can easily tell that, at least in this case, the result produced by this MT is not very reliable.

Google Translate

Isaac Asimov (January 2, 1920 – April 6, 1992) was an American author, biochemist. Very successful and prolific writer, his work counts more than 500 books. His deal with scientific themes, language, history, the Bible and so very wide-ranging, especially SF, handbook for science in general, it is well known by the mystery.

“[Nightfall] is, in my professional career as a writer, which marked a turning point in one piece (omission), suddenly I was considered an important writer, SF my attention the existence of the world Now. in a few years, my so-called “classic” that was clearly written “

(Retrieved April 13, 2011)

Highlighted errors
the mystery: For some reason, the MT has omitted the 小説 (novel) part in 推理小説 (mystery novels).

a turning point in: Although preposition-like elements appear after their complements in Japanese, postpositions are not common in English. The MT should have placed “in” before the noun phrase (“a turning point”) it accompanies.

SF my attention the existence of the world: Elements that should belong to the same phrase have been separated and moved elsewhere. For example, 私の存在 means “my existence”, and yet the machine has connected 存在 (existence) with 界 (world), which in turn should depend on SF (SF界 means “world of science fiction”).

Overall
While Google’s translation of the first paragraph is handled quite well, the part with the quote is harder to understand. It is to be praised that, semantically, almost every word, including the title of Asimov’s story, has been given the proper equivalent in English; however, the construction of each sentence still fails in terms of the arrangement of its elements.

Conclusion

As we can see, machine translation is not perfect, but it is not completely inaccurate either. Each of the MTs we have checked has its own strengths and weaknesses, yet they complement one another. Whenever we come across a text written in a language that we do not understand and we cannot contact a human bilingual, the best suggestion would be to pass the text through as many machine translators as possible. By comparing the points where their results differ, we will also find out what they have in common, which is more likely to reflect the real meaning. A combination of all of the results we can get will provide us with a general idea that should not be too far off from the original message.
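
The comparison suggested above can be done by eye, but a rough version of it is easy to sketch in code. The snippet below is a minimal illustration rather than an evaluation method: the function names (`tokenize`, `consensus_words`) and the `min_share` parameter are made up for this example, and the three inputs are shortened excerpts (dates omitted) of the first sentences produced by Bing, Reverso and Google earlier in this article. It simply collects the words on which a given share of the translations agree.

```python
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and return the set of distinct word tokens it contains."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def consensus_words(translations, min_share=0.6):
    """Return the words that appear in at least min_share of the translations."""
    counts = Counter()
    for text in translations:
        counts.update(tokenize(text))  # each translation votes once per word
    threshold = min_share * len(translations)
    return {word for word, votes in counts.items() if votes >= threshold}


# Shortened excerpts of the MT outputs quoted in this article (dates omitted).
outputs = [
    "Isaac Asimov is American writer and biochemist.",    # Bing Translator
    "Isaac Asimov is an American writer, a biochemist.",  # Reverso
    "Isaac Asimov was an American author, biochemist.",   # Google
]

print(sorted(consensus_words(outputs, min_share=1.0)))
```

With `min_share=1.0`, only the words found in all three excerpts survive (“isaac”, “asimov”, “american”, “biochemist”), which is the core of the message. Lowering the threshold lets in words that only a majority agrees on, such as “writer”. Word overlap says nothing about word order, which is where most of the errors discussed above occur, so this can only ever give a general idea.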

References:

  • Machine translation (March 30, 2011). In Wikipedia, the free encyclopedia. Retrieved April 6, 2011.

[1] Although Asimov’s original quote is already in English, this is the translation of a translation.
