In a world of deadlines, machine translation almost feels like cheating. Why wait for a translator when you can just pop a text into ChatGPT, Google Translate or some other machine translation tool and instantly receive a translation? Modern machine translation tools have reached a level where they are usually able to produce a good first draft of a text, sparing you the need to type the entire thing from scratch.
They are, however, very rarely capable of producing flawless translations on their own, and there are some things that you should take into consideration when using machine translation – especially regarding data security. With that in mind, here are some practical tips we have compiled about using machine translation.
A few words about machine translation tools
As of 2023, there are two primary types of machine translation tools: 1) traditional machine translation engines such as Google Translate and Bing Translator (most modern ones are based on neural networks), and 2) newer large language models (LLMs) such as ChatGPT.
Whereas traditional machine translation tools are engineered purely for translation, the main function of LLM-based tools is text generation in general. Thanks to the massive multilingual models they are built on, they have also learned to translate between several languages.
What kinds of texts are suitable for machine translation?
Machine translation engines and large language models are typically trained on extensive datasets comprising publicly accessible online content, particularly content that is available in multiple languages. EU texts are a classic example of a training dataset, as a large percentage of them are published in all the Union’s official languages. A good general rule is that the more a text type is available online in several languages, the better a machine translation tool tends to perform with it. As a result, machine translation tools generally perform relatively well with, for instance, legislative texts, privacy policies and generic, informative online articles.
Creative texts have, however, proven to be difficult for most machine translation tools. This shouldn’t come as a surprise, as a carefully constructed marketing text or narrative piece relies largely on evoking emotions and using words for more than their literal meaning. Translating such texts also requires consideration of the overall impact and intent of the text, rather than a mere word-for-word rendition. Large language models have shown greater proficiency than traditional engines in handling creative texts. They can draw on their ability to generate plausible texts across diverse contexts (blogs, correspondence, etc.), capturing not only the words but also the style and function of the text in another language. It’s up for debate whether this is (artificial) intelligence or not, but large language models are at least able to approximate a suitable version in another language.
While ChatGPT excels in English, it should be noted that its performance tends to drop with other languages. Speakers of smaller languages (including Finnish and Swedish) will probably find its output less idiomatic and useful. This is due to the simple fact that there is much less training material available for these languages, which is something that is unlikely to change quickly.
Don’t forget what the text will be used for
In addition to considering the text type being translated, it is important to take the intended use of the translation into account, in particular how serious the consequences of potential errors would be. For example, an error in a contract’s translation could lead to a high-stakes dispute, and incorrectly translated instructions could cause injury to a person operating a device. And it’s safe to say that it doesn’t leave a very good impression of a construction equipment manufacturer if their key product page about “cranes” is filled with references to birds.
Machine translation proves particularly useful when you simply need a general understanding of a text in an unfamiliar language. For instance, when faced with a massive RFP, machine translation can help you quickly assess whether it is worth pursuing without the costs of translating the entire thing. Raw machine translation also allows you to identify the most relevant sections of lengthy material (e.g. the same RFP) that might warrant a “proper” translation by a professional translator. This is an area where machine translation tools excel – 100% accuracy isn’t essential, as long as the reader gets a general understanding of the text’s meaning.
What are some typical errors in machine translations?
Compared to the old statistical machine translation tools that were still prevalent 5 to 10 years ago, modern machine translation tools are capable of producing translations that appear almost deceptively fluent. Upon closer examination, however, you might discover significant issues in content and meaning.
One example is so-called hallucinations, where the machine translation tool adds elements to the translation that don’t exist in the source text. Large language models are already somewhat notorious for this: if you ask ChatGPT to list sources for a fact, sources are what you’re going to get. The problem is that there’s no guarantee that even half of the ones it has listed actually exist.
Sometimes machine translation tools also mix up related concepts: a text with a reference to Germany might suddenly discuss France instead, or a list of pizza toppings might swap pineapple for cherry tomatoes (which some might call an improvement). Sometimes pieces of information might also be omitted entirely, which can cause the meaning of a phrase (“don’t do this thing”) to turn on its head (“do this thing”).
Machine translation tools also struggle with industry-specific terminology – especially when a term holds different meanings in different industries. While the tools will definitely produce some translation for a difficult term, there’s a relatively large chance that it won’t align with the terminology commonly used within the industry or that it might mislead the reader. Therefore, it is essential that someone proficient in both the source and target languages carefully reviews the machine translation, paying attention not only to how things are said but also to the precise meaning conveyed in sentences and individual words in both the source text and translation.
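For readers who like to automate part of the review, a terminology check of this kind can be roughly sketched in Python. This is only an illustration, not a description of any particular tool: the glossary, the function name find_term_mismatches and the example sentences are all made up for this purpose, and the simple substring matching will miss inflected forms whose stem changes. It supports a human reviewer but does not replace one.

```python
import re

# Hypothetical glossary: source term -> approved target term (illustrative only).
GLOSSARY = {"crane": "nosturi"}  # the lifting machine, not the bird ("kurki")


def find_term_mismatches(source, translation, glossary):
    """Return (source_term, approved_term) pairs where the source term occurs
    in `source` but the approved target term is missing from `translation`."""
    issues = []
    for src_term, tgt_term in glossary.items():
        # Match the source term at the start of a word, ignoring case.
        src_found = re.search(r"\b" + re.escape(src_term), source, re.IGNORECASE)
        # Naive substring match; assumes the target stem stays unchanged.
        tgt_found = tgt_term.lower() in translation.lower()
        if src_found and not tgt_found:
            issues.append((src_term, tgt_term))
    return issues


source_text = "The crane lifts up to 50 tonnes."
bad_translation = "Kurki nostaa jopa 50 tonnia."     # uses the bird
good_translation = "Nosturi nostaa jopa 50 tonnia."  # uses the machine

print(find_term_mismatches(source_text, bad_translation, GLOSSARY))
print(find_term_mismatches(source_text, good_translation, GLOSSARY))
```

A flagged segment only tells the reviewer where to look first; whether the term is actually wrong in context still has to be judged by someone who knows both languages.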
Are machine translation tools secure?
By default, machine translation tools are NOT secure, although this can vary between different tools. All publicly available free tools (Google Translate, ChatGPT, Bing Translator) record everything you enter into them, and offer no guarantees against selling your input or utilising the content for their own purposes. As such, it’s important never to input confidential information, business secrets, or highly personal data into these free machine translation tools. If you wouldn’t publish the same text simultaneously on your website, avoid entering it into a free machine translation tool.
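If a text still has to go through a free tool, one precaution is to mask obvious personal details first. The following Python snippet is a minimal sketch under clear assumptions: the patterns and the function name redact are our own illustration, they only catch e-mail addresses and phone-number-like digit sequences, and they will not recognise names, business secrets or other confidential content. It is therefore not a substitute for the rule of thumb above.

```python
import re

# Illustrative patterns only; real-world data needs far more thorough handling.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s-]{6,}\d")


def redact(text):
    """Replace e-mail addresses and phone-like numbers with placeholders."""
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


print(redact("Contact anna@example.com or +358 40 123 4567."))
# prints: Contact [EMAIL] or [PHONE].
```

The placeholders can be swapped back into the translated text by hand afterwards, which keeps the sensitive details out of the tool entirely.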
Machine translation tools with a paid plan generally offer better data security, but it is still advisable to carefully review the terms of service. A third option that is guaranteed to be secure is introduced in the next section.
Does Delingua also offer machine translations?
If reading this article made you feel like you were in a bit over your head – don’t worry! At Delingua, we have a machine translation solution that can assist you: either directly in your browser, similar to Google Translate, or separately for larger files and projects. Delingua’s machine translation engine is guaranteed to be secure and can also be trained on a customer level, leveraging the translations performed by our professional translators specifically for that customer and refining its suggestions accordingly.
Want to discuss your use case in more detail? Please leave us a contact request. We’re happy to help!