Breaking the Language Barrier (Apr, 1958)
A very cool, if somewhat optimistic, article from 1958 about machine translation.
Breaking the Language Barrier
Each year, millions of reports on scientific research are published, a big fraction of them in foreign languages. In this mass of Russian, Dutch, Chinese, Hindustani data are clues to H-power, interplanetary flight, more powerful batteries, longer-wearing tires. The trouble is: Too few scientists and engineers read foreign languages. What we need is a machine to read one language and type in another: an automatic translator. We’re trying to build not one, but several. Engineering problems? Fantastic. Here’s where we stand now.
By Hartley Howe
THE girl sat at the keyboard and punched onto cards the words on the sheet before her. Vyelyichyina ugla opryedyelyayetsya, she banged out, otnoshyenyiyem dlyini dugi k radyiusu . . . Red lights flashed on and off across a central control panel as the cards were fed into a big IBM computer. There was a moment of suspense, finally broken by the chattering of the automatic printer. “Magnitude of angle,” it spelled across the page, “is determined by the relation of length of . . ”
The machine was “translating” the Russian sentences into English, automatically printing 2-1/2 lines a second.
It was only a demonstration. The Russian texts were preselected by the experts who programed the computer; the vocabulary was tiny and the sentences simple. But Georgetown University’s translating computer was a portent of things to come. Today, scientists in several countries, particularly the United States, Great Britain and Russia, are working out the theory behind machines that may break down the language barriers between nations.
THE urgent need is for quick, working translations of technical research reports and scientific papers. The linguists and mathematicians don’t expect their machines, once they get them, to translate poetry or plays or novels. Literary shades of meaning will be too delicate for even the most complicated machine.
In technology, it’s a different story. Today scientists can’t keep up with progress in their fields in other countries. Sometimes they are held up by problems that have been solved elsewhere. An example:
A paper on electric switching networks published in Russian went overlooked for five years by the Americans who needed it, while American scientists painfully duplicated much of the work, at an estimated unnecessary cost of $200,000. As for the Soviet moons: The truth is that American scientists worked frantically to tune in on their signals, only to find later that they could have learned the exact frequencies months beforehand from articles in a Soviet amateur-radio magazine that we had, but didn’t get around to translating.
RUSSIA does it differently: An army of linguists abstracts into Russian some 400,000 articles on engineering and science every year, as well as making full translations on request. Right now, the United States can’t come near matching that setup. Even if we double the number of Russian scientific journals that are translated or abstracted, which is the plan for 1958, it won’t begin to serve. Our scientists will be getting a look at fewer than half of those that they themselves rate as “significant for research.” And this doubled number of translated journals will still include only one in 12 of all Russian scientific journals.
Worse, it’s not just a matter of mining Russian journals. Experts say valuable material is to be found in at least 50 languages, and that there are people speaking at least 200 different languages who could use information now locked away in other tongues. Even if human translation weren’t slow and expensive, for many languages besides Russian there’s a frightening scarcity of trained linguists. We’re trying to train them, but over the long haul, the best answer now in sight is a partnership of human translators and machines.
IT WAS World War II use of computers for a special kind of translation (devising and breaking secret codes) that led scientists to consider the possibility of a mechanical translator. For theoretically there’s no reason why computers shouldn’t do three things as well as, or better than, any human translator:
• Remember as much language as their builders teach them.
• Locate the words fast.
• Deliver all their stored learning, translated.
How would a translating computer actually work? First step would be to fill the computer’s storage system or “memory” with a two-language dictionary: words in the “input” language and their equivalents in the “output” language, all stored in code.
In translating, each input word would be fed into the computer, which would search its coded memory for the same word. The computer then would “read” (pick out) the equivalent in the output language, decode it, and print it by teletypewriter. A simple dictionary of this sort, capable of translating a few German words into English, has been built at the University of Washington.
To see a simplified version of this operation, put a dime in a jukebox. You choose a title; that’s the input word. The machine searches for the title and pulls out the corresponding record. Then the machine plays the record and you hear the output in a new language: music.
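In modern terms, the dictionary machine described above amounts to a table lookup. The following is a minimal Python sketch of that idea, not a reconstruction of the University of Washington device: the four German entries and the function name are made up for illustration, and unknown words are simply passed through in brackets.

```python
# Hypothetical two-language dictionary: German input words -> English output.
# The entries are illustrative only; the article does not list the actual
# vocabulary stored in any of the machines it mentions.
DICTIONARY = {
    "der": "the",
    "winkel": "angle",
    "ist": "is",
    "klein": "small",
}


def translate_word_for_word(sentence: str) -> str:
    """Look up each input word and emit its stored equivalent.

    Words missing from the dictionary are passed through in brackets so a
    human reader can see what the machine could not find.
    """
    output = []
    for word in sentence.lower().split():
        output.append(DICTIONARY.get(word, f"[{word}]"))
    return " ".join(output)


print(translate_word_for_word("Der Winkel ist klein"))  # the angle is small
print(translate_word_for_word("Der Winkel ist spitz"))  # the angle is [spitz]
```

A lookup like this handles only exact matches, one meaning per word, in the input word order.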
But with mechanical translators, there are these complications:
• A single word can have several forms. In Russian, for example, one stem word may have 29 different endings. Somehow, the machine must recognize the various forms of the basic word.
• A word can have several meanings. In English, the word “run,” for instance, can mean 54 different things. The computer must pick the one right meaning.
• Word order is sometimes quite different in other languages. Think of the confusion if “man kills lion” were translated “lion kills man.”
• Certain words in some languages don’t exist in others. Russian, for example, has no words for “the” or “a.” These words are vital in English: “give a man air,” “give a man an air” and “give a man the air” are quite different.
Combining a machine with a human editor, who would need to know only one language, might solve some problems. The machine would print all possible meanings of doubtful words and an editor would choose the most likely one. Or an editor might go over the input copy in advance to adapt it for straight word-for-word translation.
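The machine-plus-editor arrangement can be sketched in a few lines of Python. This is only an illustration under assumed details: the small German vocabulary, the slash-separated printout and the function names are all hypothetical. The machine pass lists every stored meaning, and the editor pass records which alternative a human picked.

```python
# Hypothetical dictionary: each input word maps to every stored meaning.
MEANINGS = {
    "das": ["the"],
    "schloss": ["castle", "lock"],  # a doubtful word with two meanings
    "ist": ["is"],
    "alt": ["old"],
}


def machine_pass(sentence):
    """Return one list of candidate meanings per input word."""
    return [MEANINGS.get(w, [f"[{w}]"]) for w in sentence.lower().split()]


def render_for_editor(candidates):
    """Print doubtful words as slash-separated alternatives."""
    return " ".join("/".join(options) for options in candidates)


def editor_pass(candidates, choices):
    """Apply the editor's choices: word position -> index of chosen meaning.

    Positions the editor did not mark default to the first stored meaning.
    """
    return " ".join(
        options[choices.get(i, 0)] for i, options in enumerate(candidates)
    )


candidates = machine_pass("Das Schloss ist alt")
print(render_for_editor(candidates))        # the castle/lock is old
print(editor_pass(candidates, {1: 0}))      # the castle is old
```

The editor works only from the English printout, which matches the article's point that such an editor would need to know just one language.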
BUT most experts believe that the best answer is to build a machine that can match everything a human translator does. Solving the multiple-ending problems would be the easiest. A machine can be designed to separate word bases from endings, find their individual meanings, and put the two together for the correct translations.
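Separating word bases from endings might look roughly like the Python sketch below. It is a toy under stated assumptions: the two transliterated Russian stems, the four endings and the English templates are chosen for illustration and cover only a sliver of real Russian grammar, and the function name is hypothetical.

```python
# Hypothetical stem table (transliterated Russian feminine nouns).
STEMS = {"dlin": "length", "dug": "arc"}

# Hypothetical ending table: each ending maps to an English template.
ENDINGS = {
    "a": "{}",     # treated as nominative singular in this toy table
    "u": "{}",     # treated as accusative singular
    "y": "of {}",  # treated as genitive singular
    "i": "of {}",  # genitive singular, spelling variant used after g
}


def translate_inflected(word):
    """Split a word into known stem + known ending, then combine meanings.

    Tries the longest possible stem first; returns None if no split works.
    """
    for cut in range(len(word), 0, -1):
        stem, ending = word[:cut], word[cut:]
        if stem in STEMS and ending in ENDINGS:
            return ENDINGS[ending].format(STEMS[stem])
    return None


print(translate_inflected("dlina"))  # length
print(translate_inflected("dliny"))  # of length
print(translate_inflected("dugi"))   # of arc
```

With this split, the dictionary needs one entry per stem plus a shared list of endings, rather than a separate entry for every inflected form.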
A harder problem is to select the one right meaning of a word. But progress is being made. Research has shown that you usually understand a word’s meaning in a sentence because you understand the words on either side of it. The Georgetown-IBM translating computer worked on this principle. When a direct translation was impossible, the computer crosschecked the adjacent words. The Russian word “o,” for example, can mean either “about” or “of.” When the machine came to “o,” it was directed to check a code sign attached to the preceding word. If the code sign was 241, “o” was translated “about.” If code sign 242 was found, the machine used the second English meaning, “of.”
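The code-sign rule for “o” can be written out as a short Python sketch. Only the rule itself (241 gives “about,” 242 gives “of”) comes from the article; the two preceding words and the code signs attached to them are assumptions made up for this example, as is the fallback of printing both meanings when no code sign is found.

```python
# word -> (English equivalent, code sign).
# The code-sign assignments here are hypothetical; the article does not say
# which Russian words carried which sign.
LEXICON = {
    "govorit": ("speaks", 241),
    "nauka": ("science", 242),
}

# The rule described in the article for the ambiguous word "o".
O_RULE = {241: "about", 242: "of"}


def translate(words):
    """Translate word by word, resolving "o" from the preceding code sign."""
    out = []
    prev_code = None
    for w in words:
        if w == "o":
            # Assumed fallback: print both meanings if no sign applies.
            out.append(O_RULE.get(prev_code, "about/of"))
            prev_code = None
        else:
            english, code = LEXICON.get(w, (f"[{w}]", None))
            out.append(english)
            prev_code = code
    return " ".join(out)


print(translate(["govorit", "o"]))  # speaks about
print(translate(["nauka", "o"]))    # science of
```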
At the University of Michigan, where scientists at the Engineering Research Institute are working on a Russian-translation system, the approach is different. The method is statistical, based partly on the frequency of words in a language, as opposed to a literal linguistic approach. Yet so far the experimenters see no reason, again in theory, why such a method couldn’t eventually be used to “train” a computer to produce 90-to-95-percent-accurate translations.
Is 95-percent accuracy enough? That’s the standard also achieved by some other ways of using context that have been worked out. But some experts argue that 95 percent isn’t good enough. The most significant words in a sentence are likely to be the other five percent. In the long run, they believe, it will be quicker to learn more about the structure of language than we know now. We still have only a human’s-eye view of the languages we use. We must take a new look at grammar and structure from the machine’s point of view: figure out the patterns in the world’s languages in order to teach them to a computer and make a translator out of it.
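The article gives no detail on the Michigan method beyond its reliance on word frequency, so the following Python sketch should be read as one simple frequency-based idea and not as that system. It assumes a sample of already-translated word pairs (invented here) and “trains” by counting, then picks the translation seen most often.

```python
from collections import Counter, defaultdict

# Hypothetical training sample: (input word, translation a human chose).
SAMPLE = [
    ("o", "about"),
    ("o", "of"),
    ("o", "about"),
    ("dugi", "of arc"),
]


def train(pairs):
    """Count how often each input word was given each translation."""
    counts = defaultdict(Counter)
    for source, target in pairs:
        counts[source][target] += 1
    return counts


def most_frequent(counts, word):
    """Return the most frequently seen translation, or None if unseen."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]


counts = train(SAMPLE)
print(most_frequent(counts, "o"))     # about
print(most_frequent(counts, "dugi"))  # of arc
```

In this sample the guess for “o” would be wrong one time in three, which illustrates why a frequency rule alone tops out below perfect accuracy.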
Once the structural pattern of two languages has been worked out, their various elements can be matched, and the machine programed to build equivalent sentences in the output language from scratch. At Massachusetts Institute of Technology, a team of four linguists and a physicist are making such a study of German. Not surprisingly, they see five to 20 years’ work ahead.
Can we avoid the enormous job of pairing all important languages in this way? Yes, by developing a “master” language (either an existing one, such as English, or a new, artificial language) to which every other language system can be matched. Translations would have to be done twice (input to master, master to output), but much less basic research would be needed.
MEANWHILE there are still plenty of straightforward engineering problems for the experts to solve. One is speeding up input. At present, the input text must be copied on punch cards or tape, a real drawback when the computer rents for $30,000 a month. Some computers developed to handle checks in banks, however, can already “read” numerals printed in special magnetic ink, and adaptation to letters is already under way. Next step will be a photoelectric scanner that can read ordinary printing; it might be hard to persuade foreigners to use magnetic ink just to please our machines.
Still further in the future are translators that will pick up a spoken statement and turn out a printed text in another language. This will require teaming a computer with a machine that will transform sounds into written symbols: an electronic stenographer. Bell Laboratory’s AUDREY (Automatic Digit Recognizer) can already identify the spoken numbers from “zero” to “nine,” but only if a man says them in a clear voice. The problems are staggering because human voices and accents vary enormously. A machine that recognizes that a Georgia cracker and a British duke speak the same tongue will be a triumph of electronics.
This kind of problem isn’t limited to spoken language, of course. The meaning of any sentence depends not only on the actual words but on who wrote or spoke them, when, where, why. Not until the experts figure out some way to feed this information to their machines will they have the equal of a really good human translator: the scholar who knows that a good translation of Que sera sera might well be “That’s the way the cookie crumbles.”
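A little arithmetic shows how much pairing work a master language would save. The Python sketch below assumes that each translation direction (say, Russian to English) counts as one separately researched pairing, and it uses the figure of 50 languages cited earlier in the article; the function names are hypothetical.

```python
def direct_pairings(n):
    """One-way pairings needed when every language is matched to every other."""
    return n * (n - 1)


def master_pairings(n, master_is_existing=True):
    """One-way pairings needed when every language is matched to a master.

    If the master is one of the n languages, the other n - 1 each need a
    pairing into it and out of it. An artificial master adds one language,
    so all n need both directions.
    """
    return 2 * (n - 1) if master_is_existing else 2 * n


n = 50
print(direct_pairings(n))                            # 2450
print(master_pairings(n))                            # 98
print(master_pairings(n, master_is_existing=False))  # 100
```

The price, as the article notes, is that every translation between two non-master languages is done twice.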