public:perseus

Differences

Below are the differences between two revisions of the page.

public:perseus [2017/12/01 10:29]
alexei.lavrentev@ens-lyon.fr
public:perseus [2017/12/01 17:53]
benedicte.pincemin@ens-lyon.fr
Line 1: Line 1:
-This page is dedicated to project ​using TXM on texts taken from the Perseus Digital Library :+This page is dedicated to projects ​using TXM on texts taken from the Perseus Digital Library :
   * [[http://​www.perseus.tufts.edu/​hopper|Perseus Digital Library]]   * [[http://​www.perseus.tufts.edu/​hopper|Perseus Digital Library]]
     * XML edition (Github)     * XML edition (Github)
Line 7: Line 7:
  
 Anybody who has subscribed to the txm-users mailing list can edit this page. Anybody who has subscribed to the txm-users mailing list can edit this page.
 +
 +====== Projects ======
 +
 +  * [[public:perseus_201707_plato|July 2017, 29 Greek texts from Plato.]] Context : paper submitted to [[https://chs.harvard.edu/CHS/article/display/1167?menuId=66|Classics@]].
 +  * [[public:perseus_201705_cicero|May 2017, 29 Latin texts from Cicero.]] Context : Conference [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis|Digital Classics III – Re-thinking Text Analysis]], Concluding conference on the project //Der digital turn in den Altertumswissenschaften: Wahrnehmung - Dokumentation - Reflexion//, Heidelberg, May 11–13, 2017.
 +  * [[public:perseus_agdt_201705_plato|May 2017, 1 Greek annotated text from Plato (AGDT2).]] Context : Conference [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis|Digital Classics III – Re-thinking Text Analysis]], Concluding conference on the project //Der digital turn in den Altertumswissenschaften: Wahrnehmung - Dokumentation - Reflexion//, Heidelberg, May 11–13, 2017.
 +  * [[public:perseus_201212_plautus|December 2012, 20 Latin plays from Plautus.]] Context : presentation at the [[http://www.dh.uni-leipzig.de/wo/e-humanities-seminar/|University of Leipzig eHumanities Seminar]] on December 5th, 2012.
  
 ====== CICERO corpus : demonstration of Perseus Latin texts in TXM ====== ====== CICERO corpus : demonstration of Perseus Latin texts in TXM ======
 +
 +**[[public:​perseus|>>>​ Back to TXM Perseus Projects main page]]**
  
 ===== Project presentation ===== ===== Project presentation =====
Line 91: Line 100:
  
 <note> content loses all its markup; this is a real drawback, as tagged foreign words and italics are very often used in notes. <note> content loses all its markup; this is a real drawback, as tagged foreign words and italics are very often used in notes.
 +
 +**[[public:​perseus|>>>​ Back to TXM Perseus Projects main page]]**
  
 ===== XSL Perseus stylesheets used for this import ===== ===== XSL Perseus stylesheets used for this import =====
Line 361: Line 372:
 </​code>​ </​code>​
  
-====== PLATO corpus : demonstration of Perseus Greek & Treebank texts (AGDT 2) in TXM ====== +**[[public:perseus|>>> Back to TXM Perseus Projects main page]]**
- +
-===== Project presentation ===== +
- +
-  * context : Heidelberg, May 2017 : [[http://​www.altphil.uni-freiburg.de/​texte-messen/​digital-classics-iii-2013-re-thinking-text-analysis]] +
- +
-  * goal : +
-    * demonstrating that one can work on texts available from Perseus project in TXM +
-    * TEI compliant import +
-    * compatibility of TXM with the Greek language +
-    * showing that TXM can work on the POS annotation provided by the Treebank (TreeTagger is not the only way to get tagged texts in TXM). +
- +
-  * corpus +
-    * Plato's text Euthyphro from [[https://perseusdl.github.io/treebank_data/|AGDT 2]]: tlg0059.tlg001.perseus-grc1.tb.xml +
- +
-  * Available resources (approximate list) +
-    * txm-filter-perseustreebank-xmlw.xsl +
- +
-===== Solution ===== +
- +
-Make a directory (e.g. "plato") and put the XML text file(s) downloaded from Perseus AGDT inside it. +
- +
-Then run the TXM command File>​Import>​XML/​w + CSV with the following settings : +
- +
-1. Source directory is "​plato"​ (in our example). +
- +
-2. Import parameters : +
-  * Main Language : untick "​Annotate the corpus"​ (means : don't use TreeTagger) +
-  * Lexical Segmentation : no change - Default settings +
-  * Front XSL : indicate the copy of txm-filter-perseustreebank-xmlw.xsl in your file system +
-  * Editions : default setting (Build edition, Words per page = 500, Page break tag = pb) +
-  * Display font : default setting (Font name = <​default>​) +
-  * Commands : default setting (Concordance context structure limits = text) +
- +
-3. Click on "Start corpus import"​ (above - beginning of the page) +
- +
-===== Feedback ===== +
- +
-We made 2 changes in the stylesheet : +
-  * a correction : rename Perseus @id attribute on <w> words for compatibility with TXM +
-  * an improvement : add <lb/> elements after each sentence for better rendering in HTML Edition. +
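
To make these two changes concrete, here is a minimal before/after sketch. The fragment is hypothetical: it is modeled on the AGDT treebank format, its attribute values are illustrative rather than copied from tlg0059.tlg001.perseus-grc1.tb.xml, and the output assumes the sentence has no <annotator> child (hence the empty annotator attribute).

<code XML>
<!-- Hypothetical input, as found inside the <treebank> root element -->
<sentence id="1" subdoc="2a">
  <word id="1" form="τί" lemma="τίς" postag="p-s---nn-" relation="SBJ" head="2"/>
</sentence>

<!-- What the stylesheet should yield (attribute order may vary by XSLT processor) -->
<sentence id="1" subdoc="2a" annotator="">
  <w perseus-id="1" lemma="τίς" postag="p-s---nn-" relation="SBJ" head="2">τί</w>
</sentence>
<lb/>
</code>

The word form moves from the @form attribute into the text content of <w>, which is what the XML/w import reads as the token, while the remaining attributes (lemma, postag, etc.) stay available as word properties.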
- +
-===== XSL Perseus stylesheet used for this import ===== +
- +
-==== txm-filter-perseustreebank-xmlw.xsl ==== +
- +
-<code XML> +
-<?xml version="1.0"?> +
-<xsl:stylesheet +
-  xmlns:xd="http://www.pnp-software.com/XSLTdoc" +
-  xmlns:edate="http://exslt.org/dates-and-times" +
-  xmlns:xsl="http://www.w3.org/1999/XSL/Transform" +
-  xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"  +
-  xmlns:treebank="http://nlp.perseus.tufts.edu/syntax/treebank/1.5" +
-  exclude-result-prefixes="edate xd xsi treebank" version="2.0"> +
-   +
-   +
-  <xd:doc type="stylesheet"> +
-    <xd:short> +
-      A stylesheet to prepare PERSEUS Treebank XML texts to TXM XML/w import. +
-    </xd:short> +
-    <xd:detail> +
-      This stylesheet is free software; you can redistribute it and/or +
-      modify it under the terms of the GNU Lesser General Public +
-      License as published by the Free Software Foundation; either +
-      version 3 of the License, or (at your option) any later version. +
-       +
-      This stylesheet is distributed in the hope that it will be useful, +
-      but WITHOUT ANY WARRANTY; without even the implied warranty of +
-      MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the GNU +
-      Lesser General Public License for more details. +
-       +
-      You should have received a copy of GNU Lesser Public License with +
-      this stylesheet. If not, see http://www.gnu.org/licenses/lgpl.html +
-    </xd:detail> +
-    <xd:author>Alexei Lavrentiev alexei.lavrentev@ens-lyon.fr</xd:author> +
-    <xd:copyright>2012, CNRS / ICAR (ICAR3 LinCoBaTO)</xd:copyright> +
-  </xd:doc> +
-   +
- +
-  <xsl:output method="xml" encoding="utf-8" omit-xml-declaration="no"/> +
-   +
-  <xsl:template match="*"> +
-    <xsl:copy> +
-      <xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/>  +
-    </xsl:copy> +
-  </xsl:template> +
-   +
-  <xsl:template match="@*|comment()"> +
-    <xsl:copy/> +
-  </xsl:template> +
-   +
-  <xsl:template match="processing-instruction()"/> +
-   +
-  <xsl:template match="text()"><xsl:value-of select="."/></xsl:template> +
-   +
-<xsl:template match="treebank"> +
-  <text type="treebank" version="{@version}" date="{normalize-space(child::date[1])}" annotator-short="{normalize-space(child::annotator[1]/short)}" annotator-name="{normalize-space(child::annotator[1]/name)}" annotator-address="{normalize-space(child::annotator[1]/address)}"> +
-    <xsl:apply-templates select="descendant::sentence"/> +
-  </text> +
-</xsl:template> +
- +
-<xsl:template match="annotator"/> +
-   +
-<xsl:template match="sentence"> +
-  <xsl:copy> +
-    <xsl:apply-templates select="@*"/> +
-    <xsl:attribute name="annotator"><xsl:value-of select="child::annotator"/></xsl:attribute> +
-    <xsl:apply-templates/> +
-  </xsl:copy> +
-  <lb/> +
-</xsl:template> +
-   +
-  <xsl:template match="word"> +
-    <w> +
-      <xsl:apply-templates select="@*[not(name()='form')]"/> +
-      <xsl:value-of select="@form"></xsl:value-of> +
-    </w> +
-  </xsl:template> +
- +
-<xsl:template match="word/@id"> +
- <xsl:attribute name="perseus-id"><xsl:value-of select="."/></xsl:attribute> +
- +
-</xsl:template> +
-</xsl:stylesheet> +
-</code> +
- +
-====== PLAUTELAT & PLAUTEEN TXM demo ====== +
- +
-===== Goal ===== +
- +
-  * Context: the University of Leipzig eHumanities Seminar, 2012-12-05 +
-  * Goal: demo TXM on Plautus' Latin plays and their English translations from Perseus +
- +
-===== Corpus ===== +
- +
-Corpus of Plautus' plays in Latin and their English translations, from Perseus. +
- +
-Import parameters (updated from XML/w to XTZ): +
-  * 2-front : +
-    * txm-filter-teiperseus-xmlw.xsl +
-    * txm-filter-teip5-xmlw-preserve.xsl +
-  * lat.par TreeTagger model +
- +
-  * PLAUTELAT: corpus of Plautus' Latin plays +
-    * source: [[https://sharedocs.huma-num.fr/wl/?id=qftriVBBeFES4jmt2BIobq1IqtypXGnK|davs://sharedocs.huma-num.fr/dav.php/@Shares/(948)%20Cactus/(3792)%20Cactus/Projets/Textométrie/Corpus/src/plautelat-src.zip]] +
-    * binary: [[https://sharedocs.huma-num.fr/wl/?id=eOLdijlvM50Qep1BQTz7UICvYHS3bPDq|davs://sharedocs.huma-num.fr/dav.php/@Shares/(948)%20Cactus/(3792)%20Cactus/Projets/Textométrie/Corpus/bin/PLAUTELAT.txm]] +
-  * PLAUTEEN: corpus of English translations of Plautus' plays +
-    * todo +
- +
----- +
--> [[:|Back to the project list]]. +
public/perseus.txt · Last modified: 2017/12/01 17:54 by benedicte.pincemin@ens-lyon.fr