Basics of a Translation Memory: Translation Costs Per Word

A translation memory, or TM, is a centralized database of translated content. A TM leverages existing translations to improve consistency, reduce translation costs and save time. When you submit a project for translation, the translation company will run the content through an existing translation memory if you have one. If you don’t, the translation company will create a new translation memory. After this work, the translation company will give you a quote for the project based on the results of the TM analysis.

A translation memory stores content in segments. Think of a sentence as being a segment in writing. However, in a translation memory, a segment may not actually be a complete sentence, but rather a grouping of sequential words, which allows the translator to have some context to provide an accurate translation of the segment. In this blog, we’ll look at the types of TM matches and how they’re used to create a project quote.

Categories of Translation Memory

There are four categories of translation memory matches.

#1 – New

New words are segments that match neither an earlier segment in the source file nor anything in the existing translation memory. New words are the most expensive type of segment.

#2 – Repetitions

A repetition is a segment that exactly matches another segment within the same source file. For example, if the segment “black shoes” is used more than once in a project, the later occurrences will be quoted as repeated segments.

#3 – 100% Match

A 100% match is a segment that exactly matches an entry in an existing translation memory, rather than another segment in the current source file. A 100% match leverages existing translations from previous projects.

#4 – Fuzzy Match

A fuzzy match is a segment that’s very similar to an entry in an existing translation memory, but not identical to it. A fuzzy match may differ from the stored segment by an additional space, a different word or two, a change in punctuation or some other very minor difference.

A fuzzy match appears on a translation quote as a match percentage, either as a single 75% to 99% range or broken out into several bands within that range. However, it’s not quite as simple as a percentage. An algorithm determines whether a fuzzy segment is close enough to be a useful match. Matches below 75% aren’t useful, as a translator would most likely need as much time or more to fix the match as to translate the segment from scratch. A segment that scores below 75% is counted as new words and charged accordingly.
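The four categories above can be sketched in a few lines of Python. This is only an illustration: the function name `classify_segments` is made up for this post, and the similarity score comes from Python’s standard `difflib.SequenceMatcher`, which stands in for whatever proprietary algorithm a real translation memory tool uses. The sketch also assumes that the first occurrence of a repeated segment is treated like any other segment and only later occurrences count as repetitions.

```python
from difflib import SequenceMatcher

# From the article: matches scoring below 75% are counted as new words.
FUZZY_THRESHOLD = 0.75


def classify_segments(segments, memory):
    """Label each source segment with one of the four match categories.

    segments: source-language strings in document order.
    memory:   set of source-language strings already stored in the TM.
    (Both names are hypothetical; a real TM tool has its own data model.)
    """
    seen = set()   # segments already encountered in this source file
    labels = []
    for segment in segments:
        if segment in memory:
            # Exact match against the existing translation memory.
            labels.append("100% match")
        elif segment in seen:
            # Exact match against an earlier segment in the same file.
            labels.append("repetition")
        else:
            # Best similarity against any stored segment (0.0 if the TM is empty).
            # SequenceMatcher.ratio() is a stand-in for a real TM scoring algorithm.
            best = max(
                (SequenceMatcher(None, segment, stored).ratio() for stored in memory),
                default=0.0,
            )
            labels.append("fuzzy" if best >= FUZZY_THRESHOLD else "new")
        seen.add(segment)
    return labels


labels = classify_segments(
    ["black shoes", "black shoes", "red shoes", "Returns are free."],
    {"red shoes", "Returns are free!"},
)
print(labels)
```

Here the second “black shoes” is a repetition, “red shoes” is a 100% match, and “Returns are free.” is a fuzzy match because it differs from the stored segment only in its final punctuation mark.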

How a Translation Memory Affects Project Costs

A translation company prepares quotes using translation costs per word. Across the localization industry, repetitions and 100% matches are typically charged at 30% of the new word rate for a language. Fuzzy matches are typically charged at 70% of the new word rate, though some companies vary this by quoting separate ranges between 75% and 99%. But again, the percentage concept is misleading: it may not reflect the true proportion of a segment that differs.

This is consistent with how vendors pay translators for their review of these matches. A fuzzy match will take a translator more time to review than a full match (repetition or 100% match), so the cost is higher. A fuzzy match may need only a simple, quick edit, or it could require a complete retranslation. On average, though, a fuzzy match takes a translator less time to address than a new segment would.
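To see how these rates turn a TM analysis into a quote, here is a minimal sketch. The 30% and 70% shares come from the figures above; the function name `quote_total`, the word counts and the per-word rate of 0.20 are hypothetical values chosen for illustration, not real pricing.

```python
from decimal import Decimal

# Share of the new word rate charged per category, using the figures in this post.
RATE_SHARE = {
    "new": Decimal("1.00"),
    "fuzzy": Decimal("0.70"),
    "repetition": Decimal("0.30"),
    "100% match": Decimal("0.30"),
}


def quote_total(word_counts, new_word_rate):
    """Return the project cost for the word counts in each match category.

    word_counts:   dict mapping a category name to a number of words.
    new_word_rate: per-word rate for new words, as a Decimal.
    """
    total = sum(
        words * new_word_rate * RATE_SHARE[category]
        for category, words in word_counts.items()
    )
    # Round to two decimal places for display as a currency amount.
    return Decimal(total).quantize(Decimal("0.01"))


# Hypothetical TM analysis for a 2,000-word project.
analysis = {"new": 1000, "fuzzy": 500, "repetition": 200, "100% match": 300}
rate = Decimal("0.20")  # hypothetical new word rate

with_tm = quote_total(analysis, rate)
without_tm = quote_total({"new": sum(analysis.values())}, rate)
print(with_tm, without_tm)
```

With these made-up numbers, the project costs 300.00 with the translation memory and 400.00 if all 2,000 words were charged as new words.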

Get Started with a Translation Memory

As we mentioned, a translation company can get you started with a translation memory. If you have translated content, whether the content is in a document, website or software, you can send it to your translation company to create the TM. Every time you have a new translation project, you’ll leverage the existing translations. As a result, you’ll increase consistency while reducing translation costs and saving time.
