Translation of the User's Manual

TXM documentation is hosted in the 'doc' project of the SVN repository:

The user manuals are in the manuals directory.

Translations of the user manuals are hosted in sub-directories named 'translation_<two-letter code of the target language>', for example 'translation_ru' for the Russian translation.

If you use OmegaT for the translation work, use the 'user manuals/translation_ru/omegat' directory as the OmegaT project:

  • the 'omegat/source' directory will contain (a copy of) the source manual
  • the 'omegat/target' directory will contain the translated manual

When finished, put (a copy of) the final translation document in 'user manuals/translation_ru' with a '-ru' suffix.
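The naming conventions above can be summarized in a small Python sketch. The function names are hypothetical, and placing the language suffix just before the file extension is an assumption, since the page only says "with a '-ru' suffix".

```python
from pathlib import Path


def translation_layout(manuals_root, lang):
    """Return the directories described above for a two-letter language code."""
    if len(lang) != 2 or not lang.isalpha():
        raise ValueError("expected a two-letter language code, e.g. 'ru'")
    base = Path(manuals_root) / f"translation_{lang.lower()}"
    return {
        "translation": base,                    # final translated documents
        "omegat": base / "omegat",              # OmegaT project directory
        "source": base / "omegat" / "source",   # copy of the source manual
        "target": base / "omegat" / "target",   # translated manual from OmegaT
    }


def final_document_name(source_name, lang):
    """Append '-<lang>' before the extension (assumed position of the suffix)."""
    path = Path(source_name)
    return f"{path.stem}-{lang.lower()}{path.suffix}"
```

For example, translation_layout("user manuals", "ru")["source"] points to 'user manuals/translation_ru/omegat/source'.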

Using OmegaT for the translation work

Translation is done with OmegaT (always use the “Latest” version, not the “Standard” one).

OmegaT includes a good tutorial and has a complete user manual.


The following workflow helps translate updates to the manual by reusing the translation of the previous version.

  1. Import Manuel de TXM 0.7 FR.odt into OmegaT → Manuel de TXM 0.7 FR.tmx
  2. Use OmegaT on Manuel de TXM 0.7 FR.tmx to translate
  3. Export the translation to ODT from OmegaT → Manuel de TXM 0.7 EN.odt (EN translation of the current version)
  4. Align Manuel de TXM 0.7 FR.odt and Manuel de TXM 0.7 EN.odt → Manuel de TXM 0.7 FR-EN.tmx
  5. Import Manuel de TXM X.X FR.odt into OmegaT → Manuel de TXM X.X FR.tmx
  6. Use OmegaT on Manuel de TXM X.X FR.tmx and Manuel de TXM 0.7 FR-EN.tmx to translate (every unchanged segment will get a 100% match)
  7. Export the translation to ODT from OmegaT → Manuel de TXM X.X EN.odt (EN translation of the new version)
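Step 6 relies on segments that did not change between the two versions being matched at 100%. The Python sketch below illustrates only that idea, with exact string matching on hypothetical data structures. OmegaT's own matching is fuzzy and far more elaborate, so this is not a description of how OmegaT works internally.

```python
def split_by_exact_match(new_segments, memory):
    """Separate segments already in the memory from those still to translate.

    memory maps a source (FR) segment to its known (EN) translation.
    Returns (reused, pending): reused is a dict of 100% matches,
    pending is the list of segments that need a translator.
    """
    reused = {}
    pending = []
    for segment in new_segments:
        key = segment.strip()
        if not key:
            continue  # ignore empty segments
        if key in memory:
            reused[key] = memory[key]
        elif key not in pending:
            pending.append(key)
    return reused, pending
```

With a memory built from the 0.7 FR-EN alignment, only the segments returned in pending would need new translation work for version X.X.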



Build a translation memory with a given FR and EN version of the TXM manual

We use maligna to produce a .tmx translation memory file from the FR and EN .odt files of the same version of the manual.

You can use Heartsome TMX Editor 8.0 to verify the tmx produced.

Then you can put the tmx file in the 'tm' directory of your OmegaT project.

OmegaT will automatically suggest translations based on that new TM, and all 100% matches can be transferred automatically.

Detailed procedure to produce the tmx:

  • get the maligna bitext aligner:
  • get ODT version of EN and FR TXM 0.5 manual:
  • edit the ODT manuals to:
    • remove the 'update tables' at the front of both versions (not usefully alignable)
    • add one translation to the French shortcut key definitions so that the tables align
    • re-sort the French glossary table into the English order
    • remove the Index from both versions (not usefully alignable)
    • 'save as' the manuals to TXT format
  • put the TXT versions in a 'txt' directory (for example)
  • run maligna to process alignment and produce the TMX

bin/maligna parse -c txt txt/TXM-manual-05-EN.txt txt/TXM-manual-05-FR.txt |
bin/maligna modify -c split-sentence |
bin/maligna modify -c trim |
bin/maligna align -c viterbi -a poisson -n word -s iterative-band |
bin/maligna select -c one-to-one |
bin/maligna format -c tmx -l en,fr TXM-manual-0.5-align.tmx
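Once the pipeline has written the TMX file, a quick scripted check can complement a visual inspection. The following Python sketch reads the <tu>/<tuv>/<seg> elements of a TMX file and counts the aligned pairs. The function names are hypothetical, and it assumes that the language of each <tuv> is given by an 'xml:lang' (or older 'lang') attribute whose value is exactly the code passed to '-l', here 'en' and 'fr'.

```python
import xml.etree.ElementTree as ET

# Expanded name under which ElementTree exposes the xml:lang attribute.
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tmx_pairs(source, src_lang, tgt_lang):
    """Return (src, tgt) segment pairs from a TMX file path or file object."""
    root = ET.parse(source).getroot()
    pairs = []
    for tu in root.iter("tu"):
        segments = {}
        for tuv in tu.findall("tuv"):
            lang = (tuv.get(XML_LANG) or tuv.get("lang") or "").lower()
            seg = tuv.find("seg")
            if seg is not None:
                # itertext() also collects text nested in inline markup
                segments[lang] = "".join(seg.itertext()).strip()
        if src_lang in segments and tgt_lang in segments:
            pairs.append((segments[src_lang], segments[tgt_lang]))
    return pairs


def summarize(pairs):
    """Count the pairs and those with an empty side."""
    empty = sum(1 for src, tgt in pairs if not src or not tgt)
    return {"units": len(pairs), "empty": empty}
```

For example, summarize(tmx_pairs("TXM-manual-0.5-align.tmx", "en", "fr")) gives the number of one-to-one pairs kept by the alignment and how many of them have an empty side.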

Build an FR-EN glossary

A nice project would be to build an FR-EN glossary with the help of TXM:

  • import 'Manuel de TXM 0.7 FR-EN.tmx' with the XML-TMX import module
    • this produces a new TXM corpus composed of two parts:
      • TXM 0.7 FR: the French version
      • TXM 0.7 EN: the English one
  • choose X FR words for which we want a translation candidate:
    • for each 'fr-word' word:
      • build the TXM 0.7 EN sub-corpus of all <tuv>s aligned to the corresponding <tuv>s containing the 'fr-word' in the TXM 0.7 FR corpus (aligned CQL query)
      • list the most specific words of that sub-corpus: choose the 'en-word' translation candidate among the top-ranked ones

We can write a TXM macro to build a complete table of EN translation candidates for X FR words based on the specificity model.
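Outside TXM, the same idea can be prototyped in Python on a list of aligned (FR, EN) segment pairs. This is a simplified sketch, not TXM's implementation: the function names and the whitespace tokenizer are hypothetical, and the score is the negative log10 of a hypergeometric tail probability, used here as a stand-in for the specificity model.

```python
import math
import sys
from collections import Counter


def tokenize(segment):
    """Naive whitespace tokenizer (an assumption; TXM tokenizes differently)."""
    return segment.lower().split()


def hypergeom_tail(k, F, t, T):
    """P(X >= k) when t tokens are drawn from T, F of which are the word."""
    total = math.comb(T, t)
    upper = min(F, t)
    hits = sum(math.comb(F, i) * math.comb(T - F, t - i) for i in range(k, upper + 1))
    return hits / total


def translation_candidates(pairs, fr_word, top=5):
    """Rank EN words over-represented in segments aligned to fr_word."""
    all_en, sub_en = Counter(), Counter()
    for fr, en in pairs:
        tokens = tokenize(en)
        all_en.update(tokens)
        if fr_word.lower() in tokenize(fr):
            sub_en.update(tokens)  # EN side of the aligned sub-corpus
    T, t = sum(all_en.values()), sum(sub_en.values())
    scored = []
    for word, k in sub_en.items():
        p = hypergeom_tail(k, all_en[word], t, T)
        # guard against underflow to 0.0 before taking the logarithm
        scored.append((word, -math.log10(max(p, sys.float_info.min))))
    scored.sort(key=lambda item: (-item[1], item[0]))
    return scored[:top]
```

Calling translation_candidates(pairs, 'fr-word') for each of the X FR words yields one ranked list of EN candidates per word, which can then be reviewed by hand.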

Use EN dictionaries

Use additional translation memories

Other stuff

public/traduction_manuel.txt · Last modified: 2017/08/02 11:59 by