public:perseus

Differences

Below are the differences between two revisions of the page.

public:perseus [2017/05/12 08:08]
benedicte.pincemin@ens-lyon.fr
public:perseus [2017/12/01 17:53]
benedicte.pincemin@ens-lyon.fr
Line 1: Line 1:
-This page is dedicated to project ​using TXM on texts taken from the Perseus Digital Library :+This page is dedicated to projects ​using TXM on texts taken from the Perseus Digital Library :
   * [[http://​www.perseus.tufts.edu/​hopper|Perseus Digital Library]]   * [[http://​www.perseus.tufts.edu/​hopper|Perseus Digital Library]]
     * XML edition (Github)     * XML edition (Github)
   * [[ https://​perseusdl.github.io/​treebank_data|The Ancient Greek and Latin Dependency Treebank]] (Github)   * [[ https://​perseusdl.github.io/​treebank_data|The Ancient Greek and Latin Dependency Treebank]] (Github)
  
-Please ​take care that this is a public page.+Please ​note that this is a public page.
  
 Anybody who has subscribed to the txm-users mailing list can edit this page. Anybody who has subscribed to the txm-users mailing list can edit this page.
 +
 +====== Projects ======
 +
 +  * [[public:perseus_201707_plato|July 2017, 29 Greek texts from Plato.]] Context: paper submitted to [[https://chs.harvard.edu/CHS/article/display/1167?menuId=66|Classics@]].
 +  * [[public:perseus_201705_cicero|May 2017, 29 Latin texts from Cicero.]] Context: conference [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis|Digital Classics III – Re-thinking Text Analysis]], concluding conference on the project //Der digital turn in den Altertumswissenschaften: Wahrnehmung - Dokumentation - Reflexion//, Heidelberg, May 11–13, 2017.
 +  * [[public:perseus_agdt_201705_plato|May 2017, 1 Greek annotated text from Plato (AGDT2).]] Context: conference [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis|Digital Classics III – Re-thinking Text Analysis]], concluding conference on the project //Der digital turn in den Altertumswissenschaften: Wahrnehmung - Dokumentation - Reflexion//, Heidelberg, May 11–13, 2017.
 +  * [[public:perseus_201212_plautus|December 2012, 20 Latin plays from Plautus.]] Context: presentation at the [[http://www.dh.uni-leipzig.de/wo/e-humanities-seminar/|University of Leipzig eHumanities Seminar]] on December 5th, 2012.
  
 ====== CICERO corpus : demonstration of Perseus Latin texts in TXM ====== ====== CICERO corpus : demonstration of Perseus Latin texts in TXM ======
 +
 +**[[public:​perseus|>>>​ Back to TXM Perseus Projects main page]]**
  
 ===== Project presentation ===== ===== Project presentation =====
Line 14: Line 23:
   * context : Heidelberg, May 2017 : [[http://​www.altphil.uni-freiburg.de/​texte-messen/​digital-classics-iii-2013-re-thinking-text-analysis]]   * context : Heidelberg, May 2017 : [[http://​www.altphil.uni-freiburg.de/​texte-messen/​digital-classics-iii-2013-re-thinking-text-analysis]]
  
-  * objectif ​:+  * goal :
     * demonstrating that one can work on texts available from the Perseus project in TXM     * demonstrating that one can work on texts available from the Perseus project in TXM
     * TEI compliant import     * TEI compliant import
Line 24: Line 33:
  
   * Available resources (approximate list)   * Available resources (approximate list)
 +    * p4top5.xsl
 +      * TEI P4 to P5 conversion
     * txm-filter-perseus-tei-xtz.xsl     * txm-filter-perseus-tei-xtz.xsl
-      ​* p4 to p5 conversion +      * management of numbered div: div1, div2 
-      ​* management of numbered div : div1, div2 +      * management of nested <​text>:​ when <​group>​ then includes <​subtext>​ instead of <​text>​ 
-      * management of nested <​text>​ : when <​group>​ then includes <​subtext>​ instead of <​text>​ +    * teiheader-to-metadata.xsl:​ gets information from teiHeader and adds them as attribute to <​text>​ element. 
-        * teiheader-to-metadata.xsl ​(?) : gets information from teiHeader and adds them as attribute to <​text>​ element. +    * a useful macro : text2metadata:​ generates a metadata.csv from the XML-TXM files of a corpus. Must be used before starting import process.
-    * a useful macro : text2metadata ​à vérifier(to be checked) ​: generates a metadata.csv from the XML-TXM files of a corpus+
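The metadata step in the list above (attributes stored on the <text> element, then a metadata.csv built from them) can be pictured with a minimal Python sketch. It is not the text2metadata macro itself, whose source is not reproduced on this page. It assumes that each file contains a TEI <text> element carrying the metadata as plain attributes and that metadata.csv has one row per file with a first column named id; the function name texts_to_metadata_csv and the sample attributes are hypothetical.

```python
import csv
import io
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"

# Hypothetical sample input: file name -> XML content.
SAMPLE = {
    "cic_a.xml": '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
                 '<text author="Cicero" title="Sample A"><body/></text></TEI>',
    "cic_b.xml": '<TEI xmlns="http://www.tei-c.org/ns/1.0">'
                 '<text author="Cicero"><body/></text></TEI>',
}


def texts_to_metadata_csv(files):
    """Return CSV text with one row per file, built from <text> attributes."""
    rows = []
    columns = []  # attribute names, in order of first appearance
    for name, xml in sorted(files.items()):
        root = ET.fromstring(xml)
        text_tag = "{%s}text" % TEI_NS
        # The root may itself be <text>; otherwise take the first descendant.
        text = root if root.tag == text_tag else root.find(".//" + text_tag)
        row = {"id": name.rsplit(".", 1)[0]}  # assumed: id = file name without extension
        for key, value in (text.attrib.items() if text is not None else []):
            # Skip namespaced attributes (e.g. xml:id) and any clash with "id".
            if key.startswith("{") or key == "id":
                continue
            row[key] = value
            if key not in columns:
                columns.append(key)
        rows.append(row)
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=["id"] + columns,
                            restval="", lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


print(texts_to_metadata_csv(SAMPLE))
```

Missing attributes are written as empty cells, so texts with partial metadata still get a complete row.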
  
 ===== Specifications ===== ===== Specifications =====
Line 46: Line 56:
 Distribute <​milestone>​ attributes'​ information on word tokens (when available). Distribute <​milestone>​ attributes'​ information on word tokens (when available).
  
-Get page number when available, put it as an @n attibute on <pb> element so thant TXM can use it to number pages in HTML Edition.+Get the page number when available and put it as an @n attribute on the <pb> element so that TXM can use it to number pages in the HTML edition.
  
 Render foreign words (tagged with <​foreign>​ element) and titles (<​title>​ elements content) as italics. Render foreign words (tagged with <​foreign>​ element) and titles (<​title>​ elements content) as italics.
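The first specification, distributing <milestone> information onto word tokens, can be sketched in Python as follows. This is only an illustration of the idea, not the stylesheet used for the import: it assumes tokens are already tagged as TEI <w> elements, and the choice to name each new attribute after the milestone's @unit value is a hypothetical convention. The sample text is made up.

```python
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"

# Made-up sample: two sections inside one chapter.
SAMPLE = (
    '<text xmlns="http://www.tei-c.org/ns/1.0"><body><p>'
    '<milestone unit="chapter" n="1"/><milestone unit="section" n="1"/>'
    '<w>Quo</w><w>usque</w>'
    '<milestone unit="section" n="2"/><w>tandem</w>'
    '</p></body></text>'
)


def distribute_milestones(root):
    """Copy the @n of the latest milestone of each @unit onto following <w>."""
    current = {}  # unit -> n of the most recent milestone seen so far
    for elem in root.iter():  # iter() walks the tree in document order
        if elem.tag == TEI + "milestone":
            unit, n = elem.get("unit"), elem.get("n")
            if unit is not None and n is not None:
                current[unit] = n
        elif elem.tag == TEI + "w":
            # Words before any milestone simply get no extra attribute.
            for unit, n in current.items():
                elem.set(unit, n)
    return root


root = distribute_milestones(ET.fromstring(SAMPLE))
for w in root.iter(TEI + "w"):
    print(w.text, w.get("chapter"), w.get("section"))
```

Because milestones are empty elements, "the latest one seen in document order" is the same thing as the nearest preceding milestone of that unit.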
Line 57: Line 67:
   * a copy of every XML file for Latin texts of Cicero downloaded from Perseus DL.   * a copy of every XML file for Latin texts of Cicero downloaded from Perseus DL.
   * a directory named "​xsl",​ which includes :   * a directory named "​xsl",​ which includes :
-    * a directory ​named "​2-front",​ which includes :+    * a subdirectory ​named "​2-front",​ which includes :
       * p4top5.xsl       * p4top5.xsl
       * txm-front-teiperseus-xtz.xsl       * txm-front-teiperseus-xtz.xsl
-    * a directory ​named "​3-posttok",​ which includes :+    * a subdirectory ​named "​3-posttok",​ which includes :
       * txm-posttok-addRef-perseus.xsl       * txm-posttok-addRef-perseus.xsl
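Before launching the import, the layout listed above can be checked with a few lines of Python. This helper is not part of TXM; it is a hypothetical convenience sketch that only tests for the three stylesheet paths named in the list, relative to a source directory passed as argument.

```python
from pathlib import Path

# Relative paths taken from the directory listing above.
EXPECTED_XSL = [
    "xsl/2-front/p4top5.xsl",
    "xsl/2-front/txm-front-teiperseus-xtz.xsl",
    "xsl/3-posttok/txm-posttok-addRef-perseus.xsl",
]


def missing_stylesheets(source_dir):
    """Return the expected stylesheet paths that are absent from source_dir."""
    root = Path(source_dir)
    return [rel for rel in EXPECTED_XSL if not (root / rel).is_file()]


if __name__ == "__main__":
    import sys
    # Usage (hypothetical script name): python check_layout.py /path/to/source
    for rel in missing_stylesheets(sys.argv[1] if len(sys.argv) > 1 else "."):
        print("missing:", rel)
```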
  
Line 90: Line 100:
  
 <note> content loses all its markup; this is a real drawback, as tagged foreign words and italics are very often used in notes. <note> content loses all its markup; this is a real drawback, as tagged foreign words and italics are very often used in notes.
 +
 +**[[public:​perseus|>>>​ Back to TXM Perseus Projects main page]]**
  
 ===== XSL Perseus stylesheets used for this import ===== ===== XSL Perseus stylesheets used for this import =====
Line 95: Line 107:
 ==== txm-front-teiperseus-xtz.xsl ==== ==== txm-front-teiperseus-xtz.xsl ====
  
-<​code>​+<​code ​XML>
 <?xml version="​1.0"?>​ <?xml version="​1.0"?>​
 <​xsl:​stylesheet <​xsl:​stylesheet
Line 242: Line 254:
 ==== txm-posttok-addRef-perseus.xsl ==== ==== txm-posttok-addRef-perseus.xsl ====
  
-    * txm-filter-perseustreebank-xmlw.xsl+<code XML> 
 +<?xml version="​1.0"?>​ 
 +<​xsl:​stylesheet xmlns:​edate="​http://​exslt.org/​dates-and-times"​ 
 +  xmlns:​xsl="​http://​www.w3.org/​1999/​XSL/​Transform"​ xmlns:​tei="​http://​www.tei-c.org/​ns/​1.0"​ 
 +  xmlns:txm="​http://​textometrie.org/​ns/​1.0"​ 
 +  exclude-result-prefixes="​tei edate" xpath-default-namespace="​http://​www.tei-c.org/​ns/​1.0"​ version="​2.0">​
  
-====== PLAUTELAT & PLAUTEEN TXM demo ======+  <!-- 
 +This software is dual-licensed:​
  
-===== Goal =====+1. Distributed under a Creative Commons Attribution-ShareAlike 3.0 
 +Unported License http://​creativecommons.org/​licenses/​by-sa/​3.0/ ​
  
-  * Context is 2012-12-05 University of Leipzig eHumanities Seminar +2. http://​www.opensource.org/​licenses/​BSD-2-Clause 
-  * goal was to demo TXM on Latin and English translations of Plaute'​ plays from Perseus+  
 +All rights reserved.
  
-===== Corpus =====+Redistribution and use in source and binary forms, with or without 
 +modification,​ are permitted provided that the following conditions are 
 +met:
  
-Corpus au Plaute'​s plays in Latin and their translation in English from Perseus.+* Redistributions of source code must retain the above copyright 
 +notice, this list of conditions ​and the following disclaimer.
  
-Import parameters (updated from XML/w to XTZ): +Redistributions in binary form must reproduce the above copyright 
-  ​2-front : +notice, this list of conditions and the following disclaimer in the 
-    * txm-filter-teiperseus-xmlw.xsl +documentation and/or other materials provided with the distribution.
-    * txm-filter-teip5-xmlw-preserve.xsl +
-  * lat.par TreeTagger model+
  
-  * PLAUTELAT: corpus ​of Plaute'​ Latin plays +This software is provided by the copyright holders and contributors 
-    * source: [[https://​sharedocs.huma-num.fr/​wl/?​id=qftriVBBeFES4jmt2BIobq1IqtypXGnK|davs://​sharedocs.huma-num.fr/​dav.php/​@Shares/​(948)%20Cactus/​(3792)%20Cactus/​Projets/​Textométrie/​Corpus/​src/​plautelat-src.zip]] +"as is" and any express or implied warranties, including, but not 
-    * binary: [[https://​sharedocs.huma-num.fr/​wl/?​id=eOLdijlvM50Qep1BQTz7UICvYHS3bPDq|davs://​sharedocs.huma-num.fr/​dav.php/​@Shares/​(948)%20Cactus/​(3792)%20Cactus/​Projets/​Textométrie/​Corpus/​bin/​PLAUTELAT.txm]] +limited to, the implied warranties ​of merchantability and fitness for 
-  * PLAUTEEN: corpus ​of Plaute'​ English translation ​of plays +a particular purpose are disclaimedIn no event shall the copyright 
-    * todo+holder or contributors be liable for any direct, indirect, incidental,​ 
 +special, exemplary, or consequential damages ​(including, but not 
 +limited to, procurement of substitute goods or services; loss of use, 
 +data, or profits; or business interruptionhowever caused and on any 
 +theory of liability, whether in contract, strict liability, or tort 
 +(including negligence or otherwisearising in any way out of the use 
 +of this software, even if advised ​of the possibility of such damage.
  
----- +      
--> [[:|Retour à la liste des projets]].+This stylesheet adds a ref attribute to w elements that will be used for 
 +references in TXM concordances. Can be used with TXM XTZ import module. 
 + 
 +Written by Alexei Lavrentiev, UMR 5317 IHRIM, 2017 
 +  ​--
 + 
 + 
 +  <​xsl:​output method="​xml"​ encoding="​utf-8" omit-xml-declaration="​no"/>​  
 +   
 +  ​ 
 +  <​!-- General patterns: all elements, attributes, comments and processing instructions are copied --> 
 +   
 +  <​xsl:​template match="​*"> ​      
 +        <​xsl:​copy>​ 
 +          <​xsl:​apply-templates select="​*|@*|processing-instruction()|comment()|text()"/>​ 
 +        </​xsl:​copy> ​    
 +  </​xsl:​template>​ 
 +   
 +  <​xsl:​template match="​*"​ mode="​position"><​xsl:​value-of select="​count(preceding-sibling::​*)"/></​xsl:​template>​ 
 + 
 +  <​xsl:​template match="​@*|comment()|processing-instruction()">​ 
 +    <​xsl:​copy/>​ 
 +  </​xsl:​template>​ 
 +   
 +  <​xsl:​variable name="​filename">​ 
 +    <​xsl:​analyze-string select="​document-uri(.)"​ regex="​^(.*)/​([^/​]+)\.xml$">​ 
 +      <​xsl:​matching-substring>​ 
 +        <​xsl:​value-of select="​regex-group(2)"/>​ 
 +      </​xsl:​matching-substring>​ 
 +    </​xsl:​analyze-string>​ 
 +  </​xsl:​variable>​ 
 +   
 +   
 +  <​xsl:​template match="​tei:​w">​ 
 +    <​xsl:​variable name="​ref">​ 
 +      <​xsl:​choose>​ 
 +        <​xsl:​when test="​ancestor::​tei:​text/​@*:​id">​ 
 +          <​xsl:​value-of select="​ancestor::​tei:​text[1]/@*:id[1]"/>​ 
 +        </​xsl:​when>​ 
 +        <​xsl:​otherwise>​ 
 +          <​xsl:​value-of select="​$filename"/>​ 
 +        </​xsl:​otherwise>​ 
 +      </​xsl:​choose>​ 
 +      <!-- Perseus addition --> 
 +      <xsl:if test="​preceding::​tei:​milestone[@unit='​chapter'​][1][@n]">​ 
 +        <​xsl:​text>,​ c</​xsl:​text>​ 
 +        <​xsl:​value-of select="​preceding::​tei:​milestone[@unit='​chapter'​][1]/​@n"/>​ 
 +      </​xsl:​if>​ 
 +      <xsl:if test="​preceding::​tei:​milestone[@unit='​section'​][1][@n]">​ 
 +        <​xsl:​text>,​ s. </​xsl:​text>​ 
 +        <​xsl:​value-of select="​preceding::​tei:​milestone[@unit='​section'​][1]/​@n"/>​ 
 +      </​xsl:​if>​ 
 +      <!-- end of Perseus addition --> 
 +       
 +      <xsl:if test="​preceding::​tei:​pb[1]/​@n">​ 
 +        <​xsl:​text>,​ p. </​xsl:​text>​ 
 +        <​xsl:​value-of select="​preceding::​tei:​pb[1]/​@n"/>​ 
 +      </​xsl:​if>​ 
 +      <xsl:if test="​ancestor::​tei:​p[@n]">​ 
 +        <​xsl:​text>,​ § </​xsl:​text>​ 
 +        <​xsl:​value-of select="​ancestor::​tei:​p/​@n"/>​ 
 +      </​xsl:​if>​ 
 +      <​!--<​xsl:​if test="​preceding::​tei:​lb[1]/​@n">​ 
 +        <​xsl:​text>,​ l. </​xsl:​text>​ 
 +        <​xsl:​value-of select="​preceding::​tei:​lb[1]/​@n"/>​ 
 +      </​xsl:​if>​-->​ 
 +    </​xsl:​variable>​ 
 +    <​xsl:​copy>​ 
 +      <​xsl:​apply-templates select="​@*"/>​ 
 +      <​xsl:​attribute name="​ref"><​xsl:​value-of select="​$ref"/></​xsl:​attribute>​ 
 +      <​xsl:​apply-templates select="​*|processing-instruction()|comment()|text()"/>​ 
 +    </​xsl:​copy>​ 
 +  </​xsl:​template>​ 
 + 
 +</​xsl:​stylesheet>​ 
 +</​code>​
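To make the shape of the resulting ref values concrete, here is a small Python sketch that mirrors the order of the xsl:if tests in the stylesheet above: text id (or file name), then chapter, section, page and paragraph, each added only when available. The function name build_ref and the sample values are hypothetical; the sketch takes the values as arguments and does not reproduce the XPath lookups (nearest preceding milestone or pb, ancestor p).

```python
def build_ref(text_id, chapter=None, section=None, page=None, paragraph=None):
    """Compose a reference string in the same order as the stylesheet.

    text_id stands for the id of the enclosing <text>, or the file name
    when no id is present. None means "not available".
    """
    parts = [text_id]
    if chapter is not None:
        parts.append("c" + chapter)       # stylesheet emits ", c" + @n
    if section is not None:
        parts.append("s. " + section)     # ", s. " + @n
    if page is not None:
        parts.append("p. " + page)        # ", p. " + @n
    if paragraph is not None:
        parts.append("\u00a7 " + paragraph)  # ", § " + @n
    return ", ".join(parts)


# Hypothetical values for illustration.
print(build_ref("phil1", chapter="2", section="5", page="14"))
print(build_ref("phil1"))
```

With these sample values the first call yields "phil1, c2, s. 5, p. 14"; a word with no milestone, page break or numbered paragraph before it gets the bare text id.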
  
 +**[[public:​perseus|>>>​ Back to TXM Perseus Projects main page]]**
public/perseus.txt · Last modified: 2017/12/01 17:54 by benedicte.pincemin@ens-lyon.fr