
Four Things to Avoid Before Translating Word Documents

Most Microsoft Word documents are not optimized for language translation. Although Word is by far the most common source file format for document translation, many people who author and edit documents in Word have never been trained in advanced Word techniques, or they continue to cling to bad habits. This blog covers four fairly common mistakes made by Word authors, which can reduce translation memory leverage and add billable time for post-translation document formatting. Be sure to also read our follow-up blog, “5 More Things to Avoid Before Translating Word Documents.”

1. Tabs separating text that should flow in Word table cells

One of the most common challenges that translation company DTP staff encounter with Word documents is text that has been entered “line-by-line”, with tabs, to simulate the appearance of “stacked” lines of text. The screen capture below shows the Microsoft Word “print” screen display of this type of text.

shane mcMich no symbols

The screen capture below shows the same Word document text with paragraph marks and hidden formatting symbols displayed. Some of the text is selected to indicate the order in which words were entered. Notice that tabs were used to separate text that appears to be in separate “columns”; the green circles indicate forced line breaks.

shane selected w symbols

Unfortunately, the translation software used by linguists at your translation company will “see” the words in the same order in which they were entered. In this case, translation memory cannot effectively leverage previously translated text, because the translation software will see the text as “Shane McMichael → +1 866-272-5874 → Director of Marketing.” Previously translated text would likely have the word order “Shane McMichael → Director of Marketing → Americas.”

The screen capture below shows the best way to format this type of stacked text in Microsoft Word. A simple one-row, two-column table was created, with the ruling turned off. Lines of text were entered as separate paragraphs, rather than as a single paragraph with forced line breaks. Text selection reveals that the words were entered in an order that produces higher translation memory leverage from your previously translated text.

shane in table cell
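A quick way to find this pattern before sending files out is to scan the document for paragraphs that mix tabs with forced line breaks. The following is a minimal sketch, not a definitive checker: it assumes the .docx is a standard Office Open XML package with its main text in word/document.xml, and the function names (read_document_xml, paragraph_text, find_tabbed_stacks) are hypothetical helpers introduced here for illustration.

```python
import xml.etree.ElementTree as ET
import zipfile

# WordprocessingML namespace used by elements in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def read_document_xml(docx_path: str) -> bytes:
    """Return the main document part of a .docx file (a ZIP archive)."""
    with zipfile.ZipFile(docx_path) as archive:
        return archive.read("word/document.xml")


def paragraph_text(para: ET.Element) -> str:
    """Linearize a paragraph in stored order, the order a translation tool reads."""
    parts = []
    # Only look inside runs, so tab-stop definitions in paragraph
    # properties are not mistaken for typed tab characters.
    for run in para.iter(W + "r"):
        for node in run:
            if node.tag == W + "t":
                parts.append(node.text or "")
            elif node.tag == W + "tab":
                parts.append("\t")
            elif node.tag == W + "br" and node.get(W + "type") in (None, "textWrapping"):
                parts.append("\n")  # forced line break, not a page break
    return "".join(parts)


def find_tabbed_stacks(document_xml) -> list[str]:
    """Return the text of paragraphs that contain both tabs and forced line breaks."""
    root = ET.fromstring(document_xml)
    texts = (paragraph_text(p) for p in root.iter(W + "p"))
    return [t for t in texts if "\t" in t and "\n" in t]
```

Calling find_tabbed_stacks(read_document_xml("brochure.docx")), where the file name is only a placeholder, would list candidate paragraphs to convert into borderless tables. A human should still review each hit, since some tabbed paragraphs are legitimate.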

2. Tables in Word with fixed row height

English source documents in Microsoft Word destined for translation sometimes have tables with fixed row heights. The author or editor may prefer matching row heights because they give the table a more pleasing appearance.

This style can cause expensive reformatting due to “hidden” expanded text in translated Word documents. If paragraph marks and other hidden formatting symbols are not displayed, it may not be evident that some text has wrapped below the bottom of the table cell.

The screen capture below shows a table with a fixed row height of one inch for both rows. The first row displays all of the English text. The second row, which contains translated German, has some hidden text that has wrapped below the bottom of the cell, as indicated by the red squares.

fixed row height

To correct this condition, select the affected rows in the table, right-click and choose Table Properties, select the Row tab, and change “Row height is:” from Exactly to At least.

change row height

The row height will now automatically adjust for expanded translated text, as shown in the screen capture below.

non fixed row height
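For documents with many tables, the same change can be located programmatically. The sketch below assumes the standard WordprocessingML representation, in which a row height is stored as a w:trHeight element whose w:hRule attribute is "exact" for Exactly and "atLeast" for At least, with w:val measured in twentieths of a point (1440 per inch). The function names are hypothetical, and the code only edits an in-memory XML tree; writing the result back into a .docx package is deliberately left out, because it is safer to do that with a dedicated library.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def fixed_height_rows(root: ET.Element) -> list[ET.Element]:
    """Return every row-height element whose rule is Exactly."""
    return [h for h in root.iter(W + "trHeight") if h.get(W + "hRule") == "exact"]


def height_in_inches(height: ET.Element) -> float:
    """Convert the stored height (twentieths of a point) to inches."""
    return int(height.get(W + "val", "0")) / 1440


def relax_row_heights(root: ET.Element) -> int:
    """Switch Exactly to At least in place and return how many rows changed."""
    rows = fixed_height_rows(root)
    for height in rows:
        height.set(W + "hRule", "atLeast")
    return len(rows)
```

Keeping the original w:val means each row still starts at the author's preferred height and only grows when translated text needs more room.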

For further guidance, see the instructions on how to resize all or part of a table on the website.

3. Text boxes in Word with no room for text expansion

Many Word authors are in the habit of resizing text boxes in illustrations to be just wide enough to display the source English text. Text can expand up to 30% in translation, which can cause the copy to wrap below the bottom of such a text box.

The screen capture below shows a photo with an example of English text in a “tight” text box near the top of the photo, and translated German text in the same size text box near the bottom left of the photo. Text expansion has caused the German text to wrap, dropping one character to a hidden second line. This style of text box would require manual resizing to accommodate text expansion.

two text boxes

The screen capture below shows a better solution for text boxes with translated text in Microsoft Word illustrations. In this case, the blue handles indicate that the text box has been horizontally resized to be wide enough to display expanded, translated text. Notice that the background fill pattern has been turned off to make the wider text box visually pleasing in all target languages.

EN text box.jpg
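As a rough planning aid, the 30% expansion figure mentioned above can be turned into a minimum width for a single-line text box. This is only a back-of-the-envelope sketch: it assumes width grows in proportion to text length, which ignores font metrics and language-specific differences, and min_box_width is a hypothetical helper name.

```python
def min_box_width(source_width: float, expansion: float = 0.30) -> float:
    """Width needed if the text grows by `expansion` (0.30 means 30% longer).

    Assumes a single-line box whose width scales linearly with text length.
    """
    if source_width <= 0:
        raise ValueError("source_width must be positive")
    if expansion < 0:
        raise ValueError("expansion must not be negative")
    return source_width * (1.0 + expansion)
```

For example, a box that is 2 inches wide for the English text would need about 2.6 inches under this assumption.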

4. Page breaks or section breaks adjacent to text

Improperly placed page breaks and section breaks can affect both translation integrity and post-linguistic formatting. To gain a sense of just how delicate Microsoft Word section breaks can be, review the webpage “Getting Rid of Section Breaks, but Not Section Formatting.”

Page breaks inserted immediately at the end of a sentence, instead of in a separate blank paragraph, can cause challenges in cleaning and preparing files for translation. The screen capture below illustrates a “high risk” page break: if the publisher carelessly deletes this page break with the backspace or delete key, formatting will be lost in the heading.

page break

Section breaks placed adjacent to numbered headings or numbered list items will often cause incorrect numbering in translated files. Recommendation: whenever possible, insert manual page breaks or section breaks in empty paragraphs.
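The recommendation can be checked mechanically by flagging any paragraph that contains both text and a break. The sketch below assumes the standard WordprocessingML layout: a manual page break is a w:br element with w:type="page", and a section break is a w:sectPr element inside the paragraph properties (w:pPr) of the last paragraph in the section. The function name risky_breaks is hypothetical, and the check is intentionally simple, so text boxes or other content nested inside a paragraph may produce extra hits.

```python
import xml.etree.ElementTree as ET

# WordprocessingML namespace used by elements in word/document.xml
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"


def risky_breaks(document_xml) -> list[tuple[str, str]]:
    """Return (break kind, paragraph text) for breaks that share a paragraph with text."""
    root = ET.fromstring(document_xml)
    findings = []
    for para in root.iter(W + "p"):
        text = "".join(t.text or "" for t in para.iter(W + "t"))
        if not text.strip():
            continue  # breaks in empty paragraphs follow the recommendation
        if any(br.get(W + "type") == "page" for br in para.iter(W + "br")):
            findings.append(("page break", text))
        props = para.find(W + "pPr")
        if props is not None and props.find(W + "sectPr") is not None:
            findings.append(("section break", text))
    return findings
```

Each finding points to a paragraph where the break could be moved into its own empty paragraph before the file goes to translation.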