
Preparing Source Files for Language Translation

In several previous blogs, GPI has frequently referred to “source language” files or “source files.” For clients new to language translation services, the biggest surprise may be the number of source files that can be required for a complex project, e.g. an extensive website localization project. But just what is a source file?

In most instances, source files will refer to project assets/files which contain text that must be translated and localized. This could include graphics files (e.g. Adobe Illustrator or Adobe Photoshop) which contain editable text layers. The most common examples of a text-based source file would be a Microsoft Word file, a FrameMaker file or an XML file. In this blog we will explore some of the pre-translation and post-translation file preparation steps that are necessary to complete the translation process.

Source Files for Document Translation


Documentation source formats for language translation can include many file types, such as Word, Excel, PowerPoint, InDesign, FrameMaker and plain text (e.g. comma-separated text files). How your language translation services provider prepares source files for document translation depends a great deal on the file format.

Some document file formats require little or no pre-translation preparation by engineering resources, and are essentially “ready to go.” For example, a Word document can be prepared for translation in its native format, unless there are special instructions for handling particular text or for externalizing tags.

File format complexity for document translation varies

More complex document file formats like Adobe InDesign and unstructured Adobe FrameMaker cannot be prepared for translation in their native, binary format. InDesign documents must first be exported to the .INX format before being further modified to work with the language translation software used by linguists. Similarly, unstructured FrameMaker files must be saved to .MIF before being further modified for translation. With regular FrameMaker documents, some further steps by your language translation services company are necessary, like turning off change bars and hyphenation.

Different tools are used for preparation depending on the file format, for example Trados TagEditor for PowerPoint or Excel files and Trados S-Tagger for unstructured FrameMaker files.

The preparation process also includes analyzing source files using a tool like Trados Workbench. An accurate analysis for documentation source files depends on the integrity of file preparation based on the project requirements.

Post translation file processing steps

After translation, document files go through another round of linguistic engineering preparation before further steps towards a final deliverable can be taken. As with source file preparation, final preparation depends on the file format. In this step, intermediate bilingual files (which still display both the source and target language) need to be reviewed and prepared so that they can be delivered in the same format as the original document source file. TagEditor and S-Tagger are also among the tools used in this final process.


The role of Translation Memory (TM)

Translation Memory is a very important tool used during source file preparation, during the translation process itself and during post-translation file processing.

Translation Memory is a database that stores so-called “segments”, which can be sentences or sentence-like units (headings, titles or elements in a list) that have previously been translated. Translation memory may be product line- or project-specific.
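To make the idea of segments concrete, here is a minimal sketch in Python. It is only an illustration: the TranslationMemory class, the segment function and the naive "break after a period, exclamation mark or question mark" rule are hypothetical and do not reflect how Trados or any other commercial tool stores its data.

```python
import re


def segment(text):
    """Split text into sentence-like segments.

    Assumption: a naive rule that breaks after '.', '!' or '?'.
    Real tools use configurable, language-specific segmentation rules.
    """
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


class TranslationMemory:
    """Hypothetical in-memory TM: maps a source segment to its translation."""

    def __init__(self):
        self._units = {}

    def add(self, source, target):
        # Store one previously translated segment pair.
        self._units[source.strip()] = target.strip()

    def lookup(self, source):
        # Return the stored translation, or None if the segment is new.
        return self._units.get(source.strip())


tm = TranslationMemory()
tm.add("Save the file.", "Guarde el archivo.")

segments = segment("Save the file. Close the window.")
matches = [(s, tm.lookup(s)) for s in segments]
```

In this sketch the first segment is found in the memory and the second is not, so only the second would need to be translated from scratch.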

During analysis, the Translation Memories are used to calculate the word count for new and previously translated segments. Updating the Translation Memories is a very important step after translation. Over time, Translation Memories decrease translation costs, as more and more previously translated segments are stored and reused in future projects.
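The analysis step can be sketched roughly as follows. This is an assumption-laden illustration, not the algorithm used by Trados Workbench: the analyze function, the three categories and the 0.75 similarity threshold are hypothetical, and similarity is approximated with difflib from the Python standard library.

```python
from difflib import SequenceMatcher


def analyze(segments, tm_sources, fuzzy_threshold=0.75):
    """Count words per match category for a list of source segments.

    Assumptions: 'exact' means the segment is already in the memory,
    'fuzzy' means it closely resembles a stored segment, and 'new'
    means no useful match was found. The threshold is illustrative.
    """
    counts = {"exact": 0, "fuzzy": 0, "new": 0}
    for seg in segments:
        words = len(seg.split())
        if seg in tm_sources:
            counts["exact"] += words
            continue
        # Best character-level similarity against any stored segment.
        best = max(
            (SequenceMatcher(None, seg, s).ratio() for s in tm_sources),
            default=0.0,
        )
        counts["fuzzy" if best >= fuzzy_threshold else "new"] += words
    return counts


tm_sources = {"Save the file.", "Close the window."}
report = analyze(
    ["Save the file.", "Close the main window.", "Restart the printer now."],
    tm_sources,
)
```

A report like this is what allows new words to be counted separately from words that the memory already covers in full or in part.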

Source files for website translation


For website localization, the same preparation tools (e.g. TagEditor) can be used with source files, but additional steps are needed during both pre-translation and final preparation. Depending on the website's programming language, website files should be prepared by externalizing tags and internal code, such as HTML tags and C++, JavaScript or PHP code, so that only the translatable text is exposed to linguists.
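As a rough illustration of what externalizing tags means, the Python sketch below swaps each HTML tag for a numbered placeholder before translation and restores the tags afterwards. The function names and the {1}, {2} placeholder style are assumptions made for this example; tools such as TagEditor use their own internal tag handling, and a simple regular expression like this one would not cope with scripts, comments or attributes that contain translatable text.

```python
import re

# Assumption: anything between '<' and '>' is treated as a tag.
TAG_PATTERN = re.compile(r"<[^>]+>")


def externalize_tags(html):
    """Replace each tag with a numbered placeholder and return both parts."""
    tags = []

    def _stash(match):
        tags.append(match.group(0))
        return "{%d}" % len(tags)

    return TAG_PATTERN.sub(_stash, html), tags


def restore_tags(text, tags):
    """Put the original tags back in place of their placeholders."""
    return re.sub(r"\{(\d+)\}", lambda m: tags[int(m.group(1)) - 1], text)


source = '<p>Click <a href="/help">here</a> for help.</p>'
protected, tags = externalize_tags(source)

# The linguist translates only the text and keeps the placeholders.
translated = "{1}Haga clic {2}aquí{3} para obtener ayuda.{4}"
final = restore_tags(translated, tags)
```

Because the linguist only ever sees placeholders, the markup cannot be altered by accident, and the translated page keeps the same structure as the source.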