Tutorial: using the stylometric tools of the Stylo R package in TXM

Install the stylo package

If needed, install the Stylo R package into the R bundled with TXM
(Stylo must be installed in the R hosted by TXM in order to be applied to tables created in TXM):

[reference documentation for Stylo installation: stylo: R package for stylometric analyses]

  • switch to the R perspective
    • using the “R” perspective button in the toolbar or “View > Perspectives > R” menu
  • create a new R session
    • click on the [New session] button in the toolbar
      → a new “sessionX.R” script file is opened in a text editor
  • copy the following R code into the script:
    install.packages("stylo")
  • run the script
    • click on the green “Run” button in the toolbar
      → the Console displays installation messages and finishes with:

      Rserve>* DONE (stylo)


  • should the install process abort for some reason, you can try to install Stylo from R directly:
    • get the path to the R used by TXM in the 'TXM > Advanced > Statistics Engine > Path to the R executable file' preference:

[/usr/lib/TXM-0.8.3beta/../../../home/sheiden/.TXM-0.8.3/plugins/org.txm.statsengine.r.core.linux_1.0.0.202305171515/res/linux64/bin/R in this screenshot]

  • quit TXM
  • launch that R from the command line
  • install the R package as usual
  • restart TXM to let the TXM R session discover the new package
If you get the following error during the installation:
..bin/Rcmd: 64: exec: INSTALL: not found
you must add the execute ('x') permission to your 'INSTALL' script file and relaunch the installation. In this example:
chmod +x /usr/lib/TXM-0.8.3beta/.../res/linux64/bin/*
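
Whichever route was used, a quick check from a TXM R session script can confirm that the package is visible to the R used by TXM. The following is a minimal sketch in base R. The helper name ensure_package is hypothetical (it is not part of TXM or Stylo), and it only calls install.packages() when the package cannot be found.

```r
# Hypothetical helper (not part of TXM or Stylo): install a package only
# when the current R cannot find it, then report whether it is available.
ensure_package <- function(pkg) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
  requireNamespace(pkg, quietly = TRUE)
}

# Demonstration with a package that ships with every R installation
ok <- ensure_package("stats")
print(ok)

# In a TXM R session you would call: ensure_package("stylo")
```

If the call returns FALSE for "stylo", the installation did not reach the R that TXM is using.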

Create a corpus

If needed, import a new corpus into TXM:

  • let's install a Stylo sample corpus
  • download the A_Small_Collection_of_British_Fiction novels corpus source repository zip archive
    (see the “Resources” section of the Computational Stylistics Group web site)
  • unzip the archive and rename the “corpus” folder to “BritishFiction”
  • launch TXM
  • run the “File > Import > Corpus > TXT+CSV” import command
    • select the “BritishFiction” source folder
    • in the “Import parameters” form:

Load the stylo package in the TXM R workspace

Load the stylo package into TXM:

  • edit your R session script or create a new one (by clicking on the “New Session” button in the toolbar)
  • copy the following R code into the script:
library(stylo)
  • select the R code with the mouse and execute it through the contextual menu command “R > Execute selected text”

Call the stylo() R function on a TXM frequencies table

Build the frequencies table

  • if needed, switch to the “Corpus perspective”
  • let's compare all the texts of the corpus
    • run the “Partition” configuration tool on the “BRITISHFICTION” corpus with the “Corpus > Partition” main menu or contextual menu
      • in the parameters window:
        • use “Simple” mode
        • select the “text” structure
        • select the “id” property
      • click on “OK”
        → a new “text_id” partition is created
  • let's compare the texts by all their uses of the <Adj> <Noun> pattern (that is, all the bi-grams of an adjective immediately followed by a noun), counting the bi-grams by their lemmas:
    • run the “Index” analysis tool on the “BRITISHFICTION > text_id” partition (with the “Tools > Index” main menu or contextual menu):
      • in the Query field, use the [enpos="JJ.*"] [enpos="NN.*"] CQL query
      • in the “Properties” parameter, click on the “Edit” button and select only the “enlemma” property (to count the lemmas of the sequences instead of their graphical forms), then click on “OK”
      • in the parameters drawer
        • set the “Vmax” parameter to 500 (to only build the table of the 500 most frequent lemma sequences)
      • click on “Compute”
        → a new “BRITISHFICTION > text_id > [enpos="JJ.*"] [enpos="NN.*"]:enlemma” frequencies table is created (as a partition index)

  • send the frequencies table to R
    • send the Partition Index object to R with the “Tools > Send to R” main menu or contextual menu
      → the Console displays the name of the R symbol created (“PartitionIndex1” for example):

      <[enpos="JJ.*"] [enpos="NN.*"]>@enlemma ≤7,676,409 /500 has been copied in the PartitionIndex1 R variable.
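
Before calling Stylo, it can help to look at what was sent to R. The sketch below assumes, as the macro later in this tutorial does, that the table is available as PartitionIndex1$data, with one row per lemma sequence, a total frequency column named F and one column per text. When run outside TXM it falls back to a small stand-in whose text names and counts are invented for illustration.

```r
# Outside TXM, build a made-up stand-in with the assumed layout:
# rows = lemma sequences, column F = total frequency, other columns = texts.
if (!exists("PartitionIndex1")) {
  PartitionIndex1 <- list(data = data.frame(
    F = c(12, 9, 6),
    text_A = c(5, 2, 4),
    text_B = c(7, 7, 2),
    row.names = c("young_man", "old_woman", "great_deal")
  ))
}

tab <- PartitionIndex1$data

# Number of lemma sequences (rows) and of columns (F plus one per text)
print(dim(tab))

# The three most frequent lemma sequences according to the F column
top <- tab[order(tab[, "F"], decreasing = TRUE), , drop = FALSE]
print(head(top, 3))
```

In TXM, use the variable name reported in the Console if it differs from PartitionIndex1.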

Wrap the stylo() call into a TXM macro to catch the graphics it produces

By default, R graphics generated by Stylo are displayed in an external window and frozen (you can't zoom or pan in it).

To display Stylo graphics in regular TXM windows, embed the R code that calls Stylo in a Groovy script that manages the R graphics devices for you.

TXM macros are Groovy scripts that are easy to create and to call.

  • let's create a new “Stylo” macro to call Stylo on the PartitionIndex1 frequencies table
    • open the “Macro” view with the “View > Views > Macro” menu command
    • click on the [New macro] button
      • give it the name “Stylo”
        → a new “Stylo” macro is added to the Macro view
        → a new text editor is opened on the “StyloMacro.groovy” file
    • in the text editor, replace all the default content with the following content:
      (notice the “PartitionIndex1” variable name in the script)
import org.txm.statsengine.r.core.RWorkspace
import org.txm.rcpapplication.commands.*
import org.txm.rcp.commands.OpenBrowser

def r = RWorkspace.getRWorkspaceInstance()

// start logging R output in the console
r.setLog(true)

// use a temporary file to save the graphic
def f = File.createTempFile("txm", ".svg")

// execute R code generating the graphics in a SVG device
r.plot(f, "stylo(frequencies=t(subset(PartitionIndex1\$data, select= -F)), relative.frequencies=FALSE)")
println "Plot saved in ${f.getAbsolutePath()}"

// open a new window to display the graphics
monitor.syncExec(new Runnable() {
	@Override
	public void run() {
		OpenBrowser.openfile(f.getAbsolutePath(), "stylo plot")
	}
})

// stop logging R output
r.setLog(false)
  • save your edits
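
The R expression passed to r.plot() does two things before Stylo sees the table. First, subset(..., select = -F) drops the column named F (inside select, F refers to that column, not to the logical FALSE). Second, t() transposes the result so that texts become rows and lemma sequences become columns, which, as far as the Stylo documentation describes it, is the layout expected by the frequencies argument. Here is a minimal base R sketch on a made-up table. The names and counts are invented, and the layout of PartitionIndex1$data is assumed from the macro above.

```r
# Made-up stand-in for PartitionIndex1$data (assumed layout):
# rows = lemma sequences, column F = total frequency, other columns = texts.
toy <- data.frame(
  F = c(12, 9, 6),
  text_A = c(5, 2, 4),
  text_B = c(7, 7, 2),
  row.names = c("young_man", "old_woman", "great_deal")
)

# Drop the total frequency column; inside select, F is the column name
without_total <- subset(toy, select = -F)

# Transpose: texts become rows, lemma sequences become columns
freqs <- t(without_total)
print(freqs)
```

The resulting matrix has one row per text, so each text is treated as one sample.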

Call stylo() by calling the Stylo macro

  • double-click on the “Stylo” macro in the Macro view to run it
    • stylo() is called:
    • choose Stylo parameters
      • for example, go to the “Statistics” tab and click on OK to run the Cluster Analysis
        → a new TXM window is created to hold the result graphic:
      • you can zoom in and zoom out with Control-wheel and pan with horizontal and vertical scrollbars
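
To get a feel for what a cluster analysis does with such a table, here is a rough base R sketch on made-up counts. It is not Stylo's own code, and Stylo's actual defaults and preprocessing may differ. It standardizes each column to z-scores, uses the mean absolute difference of z-scores as a distance (the idea behind Burrows's Delta), and groups the texts with hclust().

```r
# Made-up counts (invented for illustration): rows = texts, columns = lemma sequences
freqs <- matrix(
  c(5, 2, 4,
    6, 3, 4,
    1, 7, 2,
    2, 8, 1),
  nrow = 4, byrow = TRUE,
  dimnames = list(c("text_A", "text_B", "text_C", "text_D"),
                  c("young_man", "old_woman", "great_deal"))
)

# Relative frequencies within each text, so text length does not dominate
rel <- freqs / rowSums(freqs)

# z-score each lemma sequence (column) across the texts
z <- scale(rel)

# Manhattan distance divided by the number of features,
# i.e. the mean absolute difference of z-scores between two texts
d <- dist(z, method = "manhattan") / ncol(z)

# Agglomerative clustering, then cut the tree into two groups
hc <- hclust(d, method = "ward.D")
groups <- cutree(hc, k = 2)
print(groups)
```

Calling plot(hc) on the result draws a dendrogram comparable in spirit to the one Stylo displays.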
public/tutorial_to_use_stylo_into_txm.txt · Last modified: 28/06/2023 16:51 by slh@ens-lyon.fr