Preparing your laptop for a TXM workshop

The objectives of the TXM Initiation workshop are not only to teach you how to use TXM, but also to allow you to do it in your own working environment, and to leave the workshop with TXM installed and configured properly (the configuration of some parameters, such as results export, is finalized during the workshop).

That is why we recommend everyone:

  • to come with their own laptop, and
  • to install TXM (and the associated TreeTagger software) before coming, to avoid taking up the group's time during the session.

Installation is normally straightforward, but computers can always hold surprises, hence our caution.

TXM download and install

The current version of TXM is 0.7.9. Installation instructions are given on the following page:

If you have an older version of TXM, we recommend downloading the new version and running its installer (your settings and corpora are kept). The move to 0.7.9 requires a fresh installation and cannot be done through the usual update mechanism.

If you need more information or run into installation difficulties, please first consult the FAQ (Frequently Asked Questions) section of the users' wiki (in French):

The FAQ is constantly evolving so remember to check the latest version when you need it.

Once the installation process is complete, you can make an initial check of your TXM installation:
comment_verifier_que_txm_s_est_installe_correctement (How to verify that TXM is properly installed?)

Furthermore, the TXM Manual is available online:

We will give you a printed version of the manual at the beginning of the workshop (for workshops that take place in Lyon).

TreeTagger software and linguistic models download and install

Installation and configuration of the associated TreeTagger software (for on-the-fly morphosyntactic tagging and lemmatization of the corpus):
http://txm.sourceforge.net/installtreetagger_en.html (the French version of the page may be more up to date)

If possible, install the English language model (in order to process the text samples used in the workshop). It is also worth installing the language models for the corpora you plan to analyze with TXM.

You can then test that TreeTagger is operational in your TXM:
comment_verifier_que_treetagger_est_bien_parametre_dans_txm (How to verify that TreeTagger is properly configured in TXM?)

In case of problems, you can find help in the FAQ:
treetagger_ne_fonctionne_pas_comment_bien_regler_treetagger_pour_txm (My TreeTagger doesn't work. How to properly set TreeTagger for TXM?)
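
Before opening TXM, a quick look at the files on disk can rule out the most basic problems. The sketch below is only an illustration and is not part of TXM: it assumes that the TreeTagger installation directory contains a `bin` subdirectory holding an executable named `tree-tagger` (or `tree-tagger.exe`), and that the language models are parameter files ending in `.par`. The function name `check_treetagger` and both directory arguments are hypothetical; adapt them to your own setup.

```python
from pathlib import Path


def check_treetagger(install_dir, models_dir):
    """Report whether a TreeTagger binary and some model files are present.

    Assumed layout (verify it against your own installation):
      <install_dir>/bin/tree-tagger or tree-tagger.exe
      <models_dir>/*.par  (one parameter file per language)
    """
    bin_dir = Path(install_dir) / "bin"
    models = Path(models_dir)

    # The executable name differs between Windows and other systems.
    binary_found = any(
        (bin_dir / name).is_file() for name in ("tree-tagger", "tree-tagger.exe")
    )

    # List the parameter files, or nothing if the directory does not exist.
    model_files = sorted(p.name for p in models.glob("*.par")) if models.is_dir() else []

    return {"binary_found": binary_found, "models": model_files}
```

Calling `check_treetagger("path/to/treetagger", "path/to/models")` with your own two directories returns a small dictionary; a missing binary or an empty model list suggests that the paths set in TXM should be checked against the FAQ entry above.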

The results tables exported by TXM (covered in the initiation workshop) and the text metadata tables imported with your corpora (covered in the corpus preparation and import workshop) use the .csv file format. To handle CSV files, we recommend installing LibreOffice or OpenOffice on your computer:

It is possible to use another spreadsheet program such as MS Excel, but it is much more complicated (Excel “hides” a lot of parameters to simplify common usage), and we usually do not have time during the workshop to resolve all the complications that can arise in different environments. Nevertheless, you can find help in the FAQ for some questions, for example:
Comment ouvrir dans un tableur l'export d'un tableau de résultats de TXM ? (How to open a results table exported from TXM in a spreadsheet?)
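
The parameters a spreadsheet asks for (or hides) when opening a CSV file are typically the character encoding, the column separator and the text delimiter. The sketch below spells them out using Python's standard `csv` module. The default values shown (UTF-8, tab, double quote) are assumptions for illustration only, not TXM's documented settings; use the values set in your own TXM export settings. The function name `read_table` is hypothetical.

```python
import csv


def read_table(path, encoding="utf-8", delimiter="\t", quotechar='"'):
    """Read a CSV file into a list of rows (each row is a list of strings).

    The three keyword arguments are the same settings a spreadsheet import
    dialog asks for; the defaults here are assumptions, not TXM's.
    """
    # newline="" lets the csv module handle line endings inside quoted cells.
    with open(path, newline="", encoding=encoding) as handle:
        reader = csv.reader(handle, delimiter=delimiter, quotechar=quotechar)
        return [row for row in reader]
```

If the columns come out merged into a single cell, the separator is probably wrong; if accented characters look garbled, the encoding is the likely culprit. The same two checks apply in a spreadsheet import dialog.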

Working corpus for the workshop

Common Corpus (essential)

Workshops in English: BROWN corpus

Just download the 'brown-bin.txm' and 'TXM import workshop support files 2018.zip' files and save them wherever you want on your hard drive, so that they are available at the time of the workshop. We will use them in TXM together.

Workshops in English: Wuthering Heights, Emily Brontë, 1847 - ELTeC edition

  • Corpus binary file ready to load into TXM: WUTHERING.txm (to discuss the Progression command)

Workshops in French: VOEUX corpus

Just download the file and save it where you want on your hard drive to have it available at the time of the workshop. We will load it into TXM together.

A document that you choose for rapid import into TXM (optional)

During the Initiation workshop, we will try out the simplest import, called “clipboard”, which is based on a simple copy and paste.

This is an opportunity to see what happens with one of your own files. Before you come, put on your computer a file containing a text (in the broadest sense: it can be the transcription of a recording, for example) of about a dozen pages (anywhere between three and a hundred), or a small corpus (or a sample of a large corpus) stored as a single file, preferably in a language for which you have installed a TreeTagger language model (parameter file). Choose a format in which the “copy” command works: a file that you can open in your word processor (.doc, .docx, .rtf, .odt, .txt…) or a fairly long web page. Avoid the .pdf format, or check beforehand that the text content can be selected and copied (no “image” PDF).


If you have trouble installing TXM, you have consulted the FAQ (which covers all questions about installation and launch), and your colleagues or staff cannot help you, contact us (textometrie AT groupes.renater.fr).

public/preparation_ordinateur_en.txt · Last modified: 2019/01/22 18:08 by slh@ens-lyon.fr