public:perseus

Differences

Below are the differences between two revisions of the page.

public:perseus [2017/05/12 08:09]
benedicte.pincemin@ens-lyon.fr
public:perseus [2017/12/01 17:53]
benedicte.pincemin@ens-lyon.fr
Line 1: Line 1:
-This page is dedicated to project using TXM on texts taken from the Perseus Digital Library :
+This page is dedicated to projects using TXM on texts taken from the Perseus Digital Library :
   * [[http://www.perseus.tufts.edu/hopper|Perseus Digital Library]]
     * XML edition (Github)
   * [[ https://perseusdl.github.io/treebank_data|The Ancient Greek and Latin Dependency Treebank]] (Github)
 
-Please take care that this is a public page.
+Please note that this is a public page.
 
 Anybody who has subscribed to the txm-users mailing list can edit this page.
+
+====== Projects ======
+
+  * [[public:perseus_201707_plato|July 2017, 29 Greek texts from Plato.]] Context: paper submitted to [[https://chs.harvard.edu/CHS/article/display/1167?menuId=66|Classics@]].
+  * [[public:perseus_201705_cicero|May 2017, 29 Latin texts from Cicero.]] Context: conference [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis|Digital Classics III – Re-thinking Text Analysis]], concluding conference of the project //Der digital turn in den Altertumswissenschaften: Wahrnehmung - Dokumentation - Reflexion//, Heidelberg, May 11–13, 2017.
+  * [[public:perseus_agdt_201705_plato|May 2017, 1 annotated Greek text from Plato (AGDT2).]] Context: conference [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis|Digital Classics III – Re-thinking Text Analysis]], concluding conference of the project //Der digital turn in den Altertumswissenschaften: Wahrnehmung - Dokumentation - Reflexion//, Heidelberg, May 11–13, 2017.
+  * [[public:perseus_201212_plautus|December 2012, 20 Latin plays from Plautus.]] Context: presentation at the [[http://www.dh.uni-leipzig.de/wo/e-humanities-seminar/|University of Leipzig eHumanities Seminar]] on December 5th, 2012.
  
 ====== CICERO corpus : demonstration of Perseus Latin texts in TXM ======
+
+**[[public:perseus|>>> Back to TXM Perseus Projects main page]]**
 
 ===== Project presentation =====
Line 14: Line 23:
   * context : Heidelberg, May 2017 : [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis]]
 
-  * objectif :
+  * goal :
     * demonstrating that one can work on texts available from Perseus project in TXM
     * TEI compliant import
Line 24: Line 33:
 
   * Available resources (approximate list)
+    * p4top5.xsl
+      * TEI P4 to P5 conversion
     * txm-filter-perseus-tei-xtz.xsl
-      * p4 to p5 conversion
-      * management of numbered div : div1, div2
-      * management of nested <text> : when <group> then includes <subtext> instead of <text>
-        * teiheader-to-metadata.xsl (?) : gets information from teiHeader and adds them as attribute to <text> element.
-    * a useful macro : text2metadata (to be checked) : generates a metadata.csv from the XML-TXM files of a corpus
+      * management of numbered div: div1, div2
+      * management of nested <text>: when <group> then includes <subtext> instead of <text>
+    * teiheader-to-metadata.xsl: gets information from teiHeader and adds them as attribute to <text> element.
+    * a useful macro : text2metadata: generates a metadata.csv from the XML-TXM files of a corpus. Must be used before starting import process.
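
Editorial illustration (not part of either revision): the metadata step listed above can be pictured with a minimal Python sketch. It is not the text2metadata macro itself. It assumes that metadata has already been copied onto each <text> element as attributes, and that the CSV starts with an "id" column holding the file name without extension; the function names and the sample attribute names are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip a '{namespace}' prefix from an ElementTree tag or attribute name."""
    return tag.rsplit("}", 1)[-1]


def text_attributes(xml_source):
    """Return the attributes of the first <text> element, or {} if there is none."""
    root = ET.fromstring(xml_source)
    for element in root.iter():  # document order, root included
        if local_name(element.tag) == "text":
            # "id" is reserved for the file identifier column (assumption).
            return {local_name(k): v for k, v in element.attrib.items()
                    if local_name(k) != "id"}
    return {}


def build_metadata_csv(documents):
    """Build CSV text; `documents` maps a file id to its XML source string."""
    rows = {doc_id: text_attributes(src) for doc_id, src in documents.items()}
    columns = sorted({key for attrs in rows.values() for key in attrs})
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id"] + columns)
    for doc_id in sorted(rows):
        writer.writerow([doc_id] + [rows[doc_id].get(c, "") for c in columns])
    return out.getvalue()
```

Missing attributes are written as empty cells, so every row has the same number of columns.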
  
 ===== Specifications =====
Line 46: Line 56:
 Distribute <milestone> attributes' information on word tokens (when available).
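
Editorial illustration (not part of either revision): one way to picture the milestone rule is the Python sketch below. It is only a sketch under assumptions, not the project's XSL stylesheet: it assumes un-namespaced <milestone unit="..." n="..."/> and <w> elements, and it names each propagated attribute after the milestone's unit, which is a hypothetical convention.

```python
import xml.etree.ElementTree as ET


def distribute_milestones(xml_source):
    """Copy the most recent milestone values onto every following <w> token."""
    root = ET.fromstring(xml_source)
    current = {}  # unit name -> latest n value seen so far
    for element in root.iter():  # iter() walks the tree in document order
        if element.tag == "milestone":
            unit, n = element.get("unit"), element.get("n")
            if unit and n:
                current[unit] = n
        elif element.tag == "w":
            for unit, n in current.items():
                element.set(unit, n)
    return root
```

A token that occurs before any milestone is left without the attribute, which matches the "when available" wording.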
  
-Get page number when available, put it as an @n attibute on <pb> element so thant TXM can use it to number pages in HTML Edition.
+Get page number when available, put it as an @n attribute on <pb> element so that TXM can use it to number pages in HTML Edition.
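
Editorial illustration (not part of either revision): the page-number rule could look like the Python sketch below. The page does not say where the source page number is stored, so the sketch assumes, purely hypothetically, an id attribute ending in digits (for example "p.16") on un-namespaced <pb> elements. The real stylesheet may read it from elsewhere.

```python
import re
import xml.etree.ElementTree as ET

# Trailing digits of the identifier, e.g. "16" in "p.16" (assumed format).
PAGE_NUMBER = re.compile(r"(\d+)$")


def number_page_breaks(xml_source):
    """Set @n on each <pb> that lacks it, when a page number can be found."""
    root = ET.fromstring(xml_source)
    for pb in root.iter("pb"):
        if pb.get("n"):
            continue  # already numbered, keep the existing value
        match = PAGE_NUMBER.search(pb.get("id", ""))
        if match:
            pb.set("n", match.group(1))
    return root
```

Page breaks with no recoverable number are left untouched.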
  
 Render foreign words (tagged with <foreign> element) and titles (<title> elements content) as italics.
Line 57: Line 67:
   * a copy of every XML file for Latin texts of Cicero downloaded from Perseus DL.
   * a directory named "xsl", which includes :
-    * a directory named "2-front", which includes :
+    * a subdirectory named "2-front", which includes :
       * p4top5.xsl
       * txm-front-teiperseus-xtz.xsl
-    * a directory named "3-posttok", which includes :
+    * a subdirectory named "3-posttok", which includes :
       * txm-posttok-addRef-perseus.xsl
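
Editorial illustration (not part of either revision): before launching an import, the layout listed above can be checked with a small Python sketch. The three relative paths come from the list; the function name and the idea of a pre-import check are assumptions, not a TXM feature.

```python
from pathlib import Path

# Stylesheets expected inside the corpus source directory, per the list above.
EXPECTED = (
    "xsl/2-front/p4top5.xsl",
    "xsl/2-front/txm-front-teiperseus-xtz.xsl",
    "xsl/3-posttok/txm-posttok-addRef-perseus.xsl",
)


def missing_stylesheets(source_dir):
    """Return the expected stylesheet paths that are absent from source_dir."""
    base = Path(source_dir)
    return [rel for rel in EXPECTED if not (base / rel).is_file()]
```

An empty list means every stylesheet is in place.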
  
Line 90: Line 100:
 
 <note> content loses all its markup; this is a real drawback, as tagged foreign words and italics are very often used in notes.
+
+**[[public:perseus|>>> Back to TXM Perseus Projects main page]]**
 
 ===== XSL Perseus stylesheets used for this import =====
Line 95: Line 107:
 ==== txm-front-teiperseus-xtz.xsl ====
 
-<code>
+<code XML>
 <?xml version="1.0"?>
 <xsl:stylesheet
Line 242: Line 254:
 ==== txm-posttok-addRef-perseus.xsl ====
 
-<code>
+<code XML>
 <?xml version="1.0"?>
 <xsl:stylesheet xmlns:edate="http://exslt.org/dates-and-times"
Line 360: Line 372:
 </code>
 
-    txm-filter-perseustreebank-xmlw.xsl
-
-====== PLAUTELAT & PLAUTEEN TXM demo ======
-
-===== Goal =====
-
-  * Context is 2012-12-05 University of Leipzig eHumanities Seminar
-  * goal was to demo TXM on Latin and English translations of Plautus' plays from Perseus
-
-===== Corpus =====
-
-Corpus of Plautus' plays in Latin and their translation in English from Perseus.
-
-Import parameters (updated from XML/w to XTZ):
-  * 2-front :
-    * txm-filter-teiperseus-xmlw.xsl
-    * txm-filter-teip5-xmlw-preserve.xsl
-  * lat.par TreeTagger model
-
-  * PLAUTELAT: corpus of Plautus' Latin plays
-    * source: [[https://sharedocs.huma-num.fr/wl/?id=qftriVBBeFES4jmt2BIobq1IqtypXGnK|davs://sharedocs.huma-num.fr/dav.php/@Shares/(948)%20Cactus/(3792)%20Cactus/Projets/Textométrie/Corpus/src/plautelat-src.zip]]
-    * binary: [[https://sharedocs.huma-num.fr/wl/?id=eOLdijlvM50Qep1BQTz7UICvYHS3bPDq|davs://sharedocs.huma-num.fr/dav.php/@Shares/(948)%20Cactus/(3792)%20Cactus/Projets/Textométrie/Corpus/bin/PLAUTELAT.txm]]
-  * PLAUTEEN: corpus of English translations of Plautus' plays
-    * todo
-
-----
--> [[:|Back to the list of projects]].
+**[[public:perseus|>>> Back to TXM Perseus Projects main page]]**
public/perseus.txt · Last modified: 2017/12/01 17:54 by benedicte.pincemin@ens-lyon.fr