public:perseus

Differences

Below are the differences between two revisions of the page.

public:perseus [2017/05/12 07:43]
benedicte.pincemin@ens-lyon.fr
public:perseus [2017/12/01 17:54]
benedicte.pincemin@ens-lyon.fr
Line 1: Line 1:
-This page is dedicated to project using TXM on texts taken from the Perseus Digital Library :
+This page is dedicated to projects using TXM on texts taken from the Perseus Digital Library :
   * [[http://www.perseus.tufts.edu/hopper|Perseus Digital Library]]
     * XML edition (Github)
   * [[ https://perseusdl.github.io/treebank_data|The Ancient Greek and Latin Dependency Treebank]] (Github)
  
-Please take care that this is a public page.
+Please note that this is a public page.
  
 Anybody who has subscribed to txm-users mailing list can edit this page.
  
-====== CICERO corpus : demonstration of Perseus Latin texts in TXM ======
+====== Projects ======
  
-===== Project presentation =====
-
-  * context : Heidelberg, May 2017 : [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis]]
-
+  * [[public:perseus_201707_plato|July 2017, 29 greek texts from Plato.]] Context : paper submitted to [[https://chs.harvard.edu/CHS/article/display/1167?menuId=66|Classics@]].
+  * [[public:perseus_201705_cicero|May 2017, 29 latin texts from Cicero.]] Context : Conference [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis|Digital Classics III – Re-thinking Text Analysis]], Concluding conference on the project //Der digital turn in den Altertumswissenschaften: Wahrnehmung - Dokumentation - Reflexion//, Heidelberg, May 11–13, 2017
+  * [[public:perseus_agdt_201705_plato|May 2017, 1 greek annotated text from Plato (AGDT2).]] Context : Conference [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis|Digital Classics III – Re-thinking Text Analysis]], Concluding conference on the project //Der digital turn in den Altertumswissenschaften: Wahrnehmung - Dokumentation - Reflexion//, Heidelberg, May 11–13, 2017
+  * [[public:perseus_201212_plautus|December 2012, 20 latin plays from Plautus.]] Context : presentation at the [[http://www.dh.uni-leipzig.de/wo/e-humanities-seminar/|University of Leipzig eHumanities Seminar]] on December 5th, 2012.
-  * objective :
-    * demonstrating that one can work on texts available from the Perseus project in TXM
-    * TEI compliant import
-    * if possible, nice editions (could be shown through another corpus)
-
-  * corpus
-    * Cicero's texts, latin edition, a copy is here : [[https://sharedocs.huma-num.fr/#/948/3789/Projets/Textom%C3%A9trie/Corpus/src/perseus/Cicero/170502latin]]
-      * we get all files ending with _lat, except cic.pet_lat.xml because it's a text from Q. Tullius Cicero instead of M. Tullius Cicero.
-
-  * Available resources (approximate list)
-    * txm-filter-perseus-tei-xtz.xsl
-      * p4 to p5 conversion
-      * management of numbered divs : div1, div2
-      * management of nested <text> : when <group> then includes <subtext> instead of <text>
-        * teiheader-to-metadata.xsl (?) : gets information from teiHeader and adds it as attributes to the <text> element.
-    * a useful macro : text2metadata (to be checked) : generates a metadata.csv from the XML-TXM files of a corpus
-
-===== Specifications =====
-
-Conversion from TEI P4 to TEI P5 (Sebastian Rahtz stylesheet).
-
-Metadata : from <teiHeader><fileDesc><titleStmt>, get
-  * first <title> content,
-  * first <author> content,
-  * first <editor> content.
-
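The metadata rule above (first `<title>`, `<author>` and `<editor>` inside `<titleStmt>`) can be sketched in Python. This is only an illustration, not the project's stylesheet: the function names and the sample document are hypothetical, and tag names are matched without their namespace so that both TEI P4 and TEI P5 files are handled.

```python
import xml.etree.ElementTree as ET


def local(tag):
    """Return a tag name without its '{namespace}' prefix, if any."""
    return tag.rsplit("}", 1)[-1]


def first_text(parent, name):
    """Whitespace-normalised text of the first descendant called `name`, or ''."""
    for elem in parent.iter():
        if local(elem.tag) == name:
            return " ".join("".join(elem.itertext()).split())
    return ""


def title_stmt_metadata(xml_source):
    """Collect the first title, author and editor from the first <titleStmt>."""
    root = ET.fromstring(xml_source)
    for elem in root.iter():
        if local(elem.tag) == "titleStmt":
            return {key: first_text(elem, key) for key in ("title", "author", "editor")}
    return {"title": "", "author": "", "editor": ""}


# Hypothetical sample document (placeholder values, not real Perseus data).
SAMPLE = """<TEI.2><teiHeader><fileDesc><titleStmt>
<title>Sample title</title><title type="sub">Second title</title>
<author>Sample author</author><editor>Sample editor</editor>
</titleStmt></fileDesc></teiHeader><text><body/></text></TEI.2>"""

print(title_stmt_metadata(SAMPLE))
```

The sketch assumes that the first `<titleStmt>` in document order is the one under `<teiHeader><fileDesc>`; it does not check the ancestor path.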
-Manage XML-TEI features which wouldn't work with CQP :
-  * div1, div2 -> div
-  * <text><group><text> -> <text><group><textgroupitem> (or other better tag name)
-
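The two renaming rules above can be illustrated with a short Python sketch. It is not the XSL stylesheet used in the project: the function name is hypothetical, the input is assumed to be un-namespaced (TEI P4 style), and `textgroupitem` is simply the tag name suggested in the list.

```python
import re
import xml.etree.ElementTree as ET

NUMBERED_DIV = re.compile(r"div\d+")


def normalise(elem, inside_text=False, nested_name="textgroupitem"):
    """Rename numbered divs to <div> and nested <text> elements to `nested_name`."""
    if NUMBERED_DIV.fullmatch(elem.tag):
        elem.tag = "div"
    is_text = elem.tag == "text"
    if is_text and inside_text:
        # A <text> found inside another <text> gets the replacement name.
        elem.tag = nested_name
    for child in elem:
        normalise(child, inside_text or is_text, nested_name)
    return elem


source = ("<text><group><text><body><div1><div2><p>x</p></div2></div1>"
          "</body></text></group></text>")
result = ET.tostring(normalise(ET.fromstring(source)), encoding="unicode")
print(result)
```

The original div level is discarded in this sketch; a real conversion might keep it in an attribute.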
-Distribute <milestone> attributes' information on word tokens (when available).
-
-Get page number when available, put it as an @n attribute on <pb> element so that TXM can use it to number pages in HTML Edition.
-
-Render foreign words (tagged with <foreign> element) and titles (<title> elements content) as italics.
-
-===== Solution. =====
-
-Make a directory (e.g. "cicero").
-
-This directory includes :
-  * a copy of every XML file for latin texts of Cicero downloaded from Perseus DL.
-  * a directory named "xsl", which includes :
-    * a directory named "2-front", which includes :
-      * p4top5.xsl
-      * txm-front-teiperseus-xtz.xsl
-    * a directory named "3-posttok", which includes :
-      * txm-posttok-addRef-perseus.xsl
-
-Then run the TXM command File>Import>XML-XTZ + CSV with the following settings :
-  * Source directory is "cicero" (in our example).
-  * Import parameters :
-    * Main Language : la (to use TreeTagger with Latin parameter if TreeTagger has been set up and associated with TXM)
-    * Lexical Segmentation : no change, default settings
-    * Editions : Build edition, Words per page = 750, Page break tag = pb
-    * Display font : default setting (Font name = <default>)
-    * Commands : Concordance context structure limits = text
-    * Textual planes :
-      * Outside-text = teiHeader,front,back
-      * Outside-text to edit = bibl
-      * Note elements = note
-      * Milestone elements = [nothing, leave blank]
-      * Options : default (= remove temporary directories)
-
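Before running the import, the source directory layout described in this section could be checked with a small script. The sketch below is an assumption-laden helper, not part of TXM: the function name is hypothetical, and it only verifies that at least one XML file and the three listed stylesheets are present.

```python
from pathlib import Path
import tempfile

# Stylesheets listed in the Solution section, keyed by their "xsl" subdirectory.
EXPECTED_XSL = {
    "2-front": ["p4top5.xsl", "txm-front-teiperseus-xtz.xsl"],
    "3-posttok": ["txm-posttok-addRef-perseus.xsl"],
}


def missing_items(source_dir):
    """Return a list of problems found in the source directory (empty if none)."""
    source = Path(source_dir)
    problems = []
    if not any(source.glob("*.xml")):
        problems.append("no XML source file")
    for step, names in EXPECTED_XSL.items():
        for name in names:
            path = source / "xsl" / step / name
            if not path.is_file():
                problems.append(path.relative_to(source).as_posix())
    return problems


# Demonstration on a throwaway, deliberately incomplete "cicero" directory.
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "cicero"
    (root / "xsl" / "2-front").mkdir(parents=True)
    (root / "xsl" / "2-front" / "p4top5.xsl").touch()
    (root / "sample_lat.xml").touch()
    print(missing_items(root))
```

An empty list means the directory matches the layout given above; the import settings themselves still have to be entered in the TXM dialog.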
-===== Planning =====
-
-==== Step 1 ====
-
-==== Step 2 ====
-
-etc.
-
-
-    * txm-filter-perseustreebank-xmlw.xsl
-
-====== PLAUTELAT & PLAUTEEN TXM demo ======
-
-===== Goal =====
-
-  * Context is 2012-12-05 University of Leipzig eHumanities Seminar
-  * goal was to demo TXM on Latin texts and English translations of Plautus' plays from Perseus
-
-===== Corpus =====
-
-Corpus of Plautus' plays in Latin and their translation in English from Perseus.
-
-Import parameters (updated from XML/w to XTZ):
-  * 2-front :
-    * txm-filter-teiperseus-xmlw.xsl
-    * txm-filter-teip5-xmlw-preserve.xsl
-  * lat.par TreeTagger model
-
-  * PLAUTELAT: corpus of Plautus' Latin plays
-    * source: [[https://sharedocs.huma-num.fr/wl/?id=qftriVBBeFES4jmt2BIobq1IqtypXGnK|davs://sharedocs.huma-num.fr/dav.php/@Shares/(948)%20Cactus/(3792)%20Cactus/Projets/Textométrie/Corpus/src/plautelat-src.zip]]
-    * binary: [[https://sharedocs.huma-num.fr/wl/?id=eOLdijlvM50Qep1BQTz7UICvYHS3bPDq|davs://sharedocs.huma-num.fr/dav.php/@Shares/(948)%20Cactus/(3792)%20Cactus/Projets/Textométrie/Corpus/bin/PLAUTELAT.txm]]
-  * PLAUTEEN: corpus of English translations of Plautus' plays
-    * todo
-
-----
--> [[:|Back to the list of projects]].
  
public/perseus.txt · Last modified: 2017/12/01 17:54 by benedicte.pincemin@ens-lyon.fr