public:perseus

Differences

Below are the differences between two revisions of the page.

public:perseus [2017/12/01 10:29]
alexei.lavrentev@ens-lyon.fr
public:perseus [2017/12/01 17:54]
benedicte.pincemin@ens-lyon.fr
Line 1: Line 1:
-This page is dedicated to project using TXM on texts taken from the Perseus Digital Library : +This page is dedicated to projects using TXM on texts taken from the Perseus Digital Library :
   * [[http://​www.perseus.tufts.edu/​hopper|Perseus Digital Library]]   * [[http://​www.perseus.tufts.edu/​hopper|Perseus Digital Library]]
     * XML edition (Github)     * XML edition (Github)
Line 8: Line 8:
 Anybody who has subscribed to txm-users mailing list can edit this page. Anybody who has subscribed to txm-users mailing list can edit this page.
  
-====== CICERO corpus : demonstration of Perseus Latin texts in TXM ====== +====== Projects ======
  
-===== Project presentation ===== +  * [[public:perseus_201707_plato|July 2017, 29 Greek texts from Plato.]] Context: paper submitted to [[https://chs.harvard.edu/CHS/article/display/1167?menuId=66|Classics@]].
- +  * [[public:perseus_201705_cicero|May 2017, 29 Latin texts from Cicero.]] Context: Conference [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis|Digital Classics III – Re-thinking Text Analysis]], Concluding conference on the project //Der digital turn in den Altertumswissenschaften: Wahrnehmung – Dokumentation – Reflexion//, Heidelberg, May 11–13, 2017.
-  * context : Heidelberg, May 2017 : [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis]] +  * [[public:perseus_agdt_201705_plato|May 2017, 1 Greek annotated text from Plato (AGDT2).]] Context: Conference [[http://www.altphil.uni-freiburg.de/texte-messen/digital-classics-iii-2013-re-thinking-text-analysis|Digital Classics III – Re-thinking Text Analysis]], Concluding conference on the project //Der digital turn in den Altertumswissenschaften: Wahrnehmung – Dokumentation – Reflexion//, Heidelberg, May 11–13, 2017.
- +  * [[public:perseus_201212_plautus|December 2012, 20 Latin plays from Plautus.]] Context: presentation at the [[http://www.dh.uni-leipzig.de/wo/e-humanities-seminar/|University of Leipzig eHumanities Seminar]] on December 5th, 2012.
-  * goal : +
-    * demonstrating that one can work on texts available from Perseus project in TXM +
-    * TEI compliant import +
-    * if possible, nice editions (could be shown through another corpus) +
- +
-  * corpus +
-    * Cicero'​s texts, latin edition : a copy is here : [[https://sharedocs.huma-num.fr/#/948/3789/Projets/​Textom%C3%A9trie/​Corpus/​src/​perseus/​Cicero/​170502latin]] +
-      * we take all files ending with _lat, except cic.pet_lat.xml, because that text is by Q. Tullius Cicero rather than M. Tullius Cicero. +
- +
-  * Available resources (approximate list) +
-    * p4top5.xsl +
-      * TEI P4 to P5 conversion +
-    * txm-filter-perseus-tei-xtz.xsl +
-      * management of numbered div, div1, div2 +
-      * management of nested <text>: a <text> inside a <group> becomes <subtext> instead of <text> +
-    * teiheader-to-metadata.xsl: gets information from teiHeader and adds it as attributes on the <text> element. +
-    * a useful macro: text2metadata generates a metadata.csv from the XML-TXM files of a corpus. It must be run before starting the import process. +
- +
-===== Specifications ===== +
- +
-Conversion from TEI P4 to TEI P5 (Sebastian Ratz stylesheet). +
- +
-Metadata : from <​teiHeader><​fileDesc><​titleStmt>,​ get +
-  * first <​title>​ content, +
-  * first <​author>​ content, +
-  * first <​editor>​ content. +
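The metadata rule above can be sketched outside TXM as follows. This is only an illustration, not the stylesheet used by the project: it assumes the file has already been converted to TEI P5 (so elements are in the `http://www.tei-c.org/ns/1.0` namespace), and the function name and sample document are made up for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: the source has been converted to TEI P5, so elements are namespaced.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}


def first_title_stmt_fields(tei_xml: str) -> dict:
    """Return the first title, author and editor found in titleStmt."""
    root = ET.fromstring(tei_xml)
    stmt = root.find("tei:teiHeader/tei:fileDesc/tei:titleStmt", TEI_NS)
    fields = {}
    for name in ("title", "author", "editor"):
        # find() returns the first matching child, or None when there is none.
        node = stmt.find(f"tei:{name}", TEI_NS) if stmt is not None else None
        text = "".join(node.itertext()) if node is not None else ""
        fields[name] = " ".join(text.split())  # collapse runs of whitespace
    return fields


# Hypothetical sample, not taken from a Perseus file.
METADATA_SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader><fileDesc><titleStmt>
    <title>De Amicitia</title>
    <title type="sub">Machine readable text</title>
    <author>M. Tullius Cicero</author>
    <editor>Hypothetical Editor</editor>
  </titleStmt></fileDesc></teiHeader>
  <text><body><p>Q. Mucius augur</p></body></text>
</TEI>"""

if __name__ == "__main__":
    print(first_title_stmt_fields(METADATA_SAMPLE))
```

Only the first occurrence of each element is kept, so a subtitle or a second editor is ignored, which matches the "first ... content" wording of the specification.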
- +
-Manage XML-TEI features which wouldn'​t work with CQP : +
-  * div1, div2 -> div +
-  * <​text><​group><​text>​ -> <​text><​group><​textgroupitem>​ (or other better tag name) +
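The two renaming rules above can be illustrated with a short script. It is a sketch under assumptions, not the project's XSLT: it assumes TEI P5 namespaced input, uses `subtext` as the replacement tag name (the specification leaves the name open), and the function name and sample are hypothetical.

```python
import xml.etree.ElementTree as ET

TEI = "http://www.tei-c.org/ns/1.0"
# Qualified names of the numbered divisions div1 .. div7.
NUMBERED_DIVS = {f"{{{TEI}}}div{i}" for i in range(1, 8)}


def flatten_for_cqp(root: ET.Element) -> ET.Element:
    """Rename numbered divs to div, and text nested in group to subtext."""
    for el in root.iter():
        if el.tag in NUMBERED_DIVS:
            el.tag = f"{{{TEI}}}div"  # attributes such as @n are kept as-is
    for group in root.iter(f"{{{TEI}}}group"):
        for child in group:
            if child.tag == f"{{{TEI}}}text":
                child.tag = "subtext"  # assumed name, placed in no namespace
    return root


# Hypothetical sample document.
NESTED_SAMPLE = (
    '<TEI xmlns="http://www.tei-c.org/ns/1.0"><text><group><text><body>'
    '<div1 n="1"><div2 n="2"><p>x</p></div2></div1>'
    "</body></text></group></text></TEI>"
)

if __name__ == "__main__":
    doc = flatten_for_cqp(ET.fromstring(NESTED_SAMPLE))
    print([el.tag.split("}")[-1] for el in doc.iter()])
```

Renaming in place keeps the nesting and the attributes untouched, so only the element names change.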
- +
-Distribute <​milestone>​ attributes'​ information on word tokens (when available). +
- +
-Get the page number when available and put it in an @n attribute on the <pb> element, so that TXM can use it to number pages in the HTML edition. +
- +
-Render foreign words (tagged with <​foreign>​ element) and titles (<​title>​ elements content) as italics. +
- +
-===== Solution ===== +
- +
-Make a directory (e.g. "​cicero"​). +
- +
-This directory includes : +
-  * a copy of every XML file for latin texts of Cicero ​downloaded from Perseus DL. +
-  * a directory named "​xsl",​ which includes : +
-    * a subdirectory named "​2-front",​ which includes : +
-      * p4top5.xsl +
-      * txm-front-teiperseus-xtz.xsl +
-    * a subdirectory named "​3-posttok",​ which includes : +
-      * txm-posttok-addRef-perseus.xsl +
- +
-Then run the TXM command File>​Import>​XML-XTZ + CSV with the following settings : +
- +
-1. Source directory is "​cicero"​ (in our example). +
- +
-2. Import parameters : +
-  * Main Language : la (to use TreeTagger with the Latin parameter file, if TreeTagger has been set up and associated with TXM) +
-  * Lexical Segmentation : no change - Default settings +
-  * Editions : Build edition, Words per page = 750, Page break tag = pb +
-  * Display font : default setting (Font name = <​default>​) +
-  * Commands : Concordance context structure limits = text +
-  * Textual planes : +
-    * Outside-text = teiHeader,​front,​back +
-    * Outside-text to edit = bibl +
-    * Note elements = note +
-    * Milestone elements = [nothing, leave blank] +
-    * Options : default (= remove temporary directories) +
- +
-3. Click on "Start corpus import" (at the top of the page) +
- +
- +
-A second import can be run with a metadata.csv file added, in order to get more metadata than the fields automatically extracted from teiHeader (title, first author, first editor). +
- +
-===== Feedback ===== +
- +
-Some features of the XML-XTZ import have not been implemented yet; in particular, the @rend attribute does not seem to be used to interpret <emph> and <hi> elements. So, in the front XSL (import step #2), we changed some <hi> elements into <emph> where we wanted italics in the HTML edition. +
- +
-<note> content loses all its markup. This is a real drawback, as tagged foreign words and italics are very often used in notes. +
- +
-===== XSL Perseus stylesheets used for this import ===== +
- +
-==== txm-front-teiperseus-xtz.xsl ==== +
- +
-<code XML> +
-<?xml version="​1.0"?>​ +
-<​xsl:​stylesheet +
-  xmlns:​xd="​http://​www.pnp-software.com/​XSLTdoc"​ +
-  xmlns:​edate="​http://​exslt.org/​dates-and-times"​ +
-  xmlns:​xsl="​http://​www.w3.org/​1999/​XSL/​Transform"​ xmlns:​tei="​http://​www.tei-c.org/​ns/​1.0"​ +
-  exclude-result-prefixes="​tei edate xd" version="​2.0">​ +
-   +
-  <xd:doc type="​stylesheet">​ +
-    <​xd:​short>​ +
-      A stylesheet to prepare PERSEUS XML-TEI texts to TXM import. +
-    </​xd:​short>​ +
-    <​xd:​detail>​ +
-      This stylesheet is free software; you can redistribute it and/or +
-      modify it under the terms of the GNU Lesser General Public +
-      License as published by the Free Software Foundation; either +
-      version 3 of the License, or (at your option) any later version. +
-       +
-      This stylesheet is distributed in the hope that it will be useful, +
-      but WITHOUT ANY WARRANTY; without even the implied warranty of +
-      MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. ​ See the GNU +
-      Lesser General Public License for more details. +
-       +
-      You should have received a copy of GNU Lesser Public License with +
-      this stylesheet. If not, see http://​www.gnu.org/​licenses/​lgpl.html +
-    </​xd:​detail>​ +
-    <​xd:​author>​Alexei Lavrentiev alexei.lavrentev@ens-lyon.fr</​xd:​author>​ +
-    <​xd:​copyright>​2017,​ CNRS / IHRIM (Groupe CACTUS)</​xd:​copyright>​ +
-  </​xd:​doc>​ +
-   +
- +
-  <​xsl:​output method="​xml"​ encoding="​utf-8"​ omit-xml-declaration="​no"/>​ +
-   +
-  <​xsl:​template match="​node()|@*">​ +
-    <!-- Copy the current node --> +
-    <​xsl:​copy>​ +
-      <!-- Including any attributes it has and any child nodes --> +
-      <​xsl:​apply-templates select="​@*|node()"/>​ +
-    </​xsl:​copy>​ +
-  </​xsl:​template>​ +
-   +
-<!-- This template had better be commented if one uses a metadata file with the same information : --> +
-  <​xsl:​template match="/​tei:​TEI/​tei:​text">​ +
-    <​xsl:​copy>​ +
-      <​xsl:​copy-of select="​@*"/>​ +
-      <​xsl:​attribute name="​author"><​xsl:​value-of select="//​tei:​teiHeader/​tei:​fileDesc/​tei:​titleStmt/​tei:​author[1]"/></​xsl:attribute>​ +
-      <​xsl:​attribute name="​title"><​xsl:​value-of select="//​tei:​teiHeader/​tei:​fileDesc/​tei:​titleStmt/​tei:​title[1]"/></​xsl:​attribute>​ +
-      <​xsl:​attribute name="​editor"><​xsl:​value-of select="//​tei:​teiHeader/​tei:​fileDesc/​tei:​titleStmt/​tei:​editor[1]"/></​xsl:​attribute>​ +
-      <​xsl:​apply-templates/>​ +
-    </​xsl:​copy>​ +
-  </​xsl:​template>​ +
- +
-<​xsl:​template match="​tei:​group/​tei:​text">​ +
-  <​xsl:​element name="​subtext">​ +
-    <​xsl:​apply-templates select="​@*|node()"/>​ +
-  </​xsl:​element>​ +
-</​xsl:​template>​ +
-   +
-  <​xsl:​template match="​tei:​pb">​ +
-    <​xsl:​copy>​ +
-      <​xsl:​attribute name="​n">​ +
-        <​xsl:​choose>​ +
-          <​xsl:​when test="​@n"><​xsl:​value-of select="​@n"/></​xsl:​when>​ +
-          <​xsl:​when test="​@*:​id">​ +
-            <​xsl:​value-of select="​replace(@*:​id,'​^p\.',''​)"/>​ +
-          </​xsl:​when>​ +
-          <​xsl:​otherwise><​xsl:​text>​[s.n.]</​xsl:​text></​xsl:​otherwise>​ +
-        </​xsl:​choose>​ +
-      </​xsl:​attribute>​ +
-    </​xsl:​copy>​ +
-  </​xsl:​template>​ +
- +
-<​xsl:​template match="​tei:​div1|tei:​div2|tei:​div3|tei:​div4|tei:​div5|tei:​div6|tei:​div7">​ +
-  <​xsl:​element name="​div"​ namespace="​http://www.tei-c.org/​ns/​1.0">​ +
-    <​xsl:​apply-templates select="​@*|node()"/>​ +
-  </​xsl:​element>​ +
-</​xsl:​template>​ +
- +
-<​xsl:​template match="​tei:​choice">​ +
-  <​xsl:​apply-templates select="​tei:​expan|tei:​corr|tei:​reg"/>​ +
-</​xsl:​template>​ +
- +
-<​xsl:​template match="​tei:​choice/​tei:​expan">​ +
-  <w xmlns="​http://​www.tei-c.org/ns/​1.0">​ +
-    <​xsl:​attribute name="​abbr"><​xsl:​value-of select="​normalize-space(parent::​tei:​choice/tei:​abbr)"/></​xsl:​attribute>​ +
-    <​xsl:​apply-templates select="​@*|node()"/>​ +
-  </​w>​ +
-</​xsl:​template>​ +
-   +
-  <​xsl:​template match="​tei:​choice/​tei:​corr">​ +
-    <​xsl:​copy>​ +
-      <​xsl:​attribute name="​sic"><​xsl:​value-of select="​normalize-space(parent::​tei:​choice/​tei:​sic)"/></​xsl:​attribute>​ +
-      <​xsl:​apply-templates select="​@*|node()"/>​ +
-    </​xsl:​copy>​ +
-  </​xsl:​template>​ +
-   +
-  <​xsl:​template match="​tei:​choice/​tei:​reg">​ +
-    <​xsl:​copy>​ +
-      <​xsl:​attribute name="​orig"><​xsl:​value-of select="​normalize-space(parent::​tei:​choice/​tei:​orig)"/></​xsl:​attribute>​ +
-      <​xsl:​apply-templates select="​@*|node()"/>​ +
-    </​xsl:​copy>​ +
-  </​xsl:​template>​ +
- +
-<!-- Temporary patch for TXM indexing quote elements in notes --> +
- +
-  <​xsl:​template match="​tei:​note//​tei:​quote">​ +
-    <​quote-note>​ +
-      <​xsl:​apply-templates select="​@*|node()"/>​ +
-    </​quote-note>​ +
-  </​xsl:​template>​ +
- +
-<!--  +
-(i) adding an <emph> element in order to point out some elements' content (e.g. foreign, title) in TXM edition ; +
-(ii) adding a <w> element to prevent tokenisation from analysing some content (e.g. foreign)  +
---> +
- +
-<​xsl:​template match="​tei:​foreign[not(ancestor::​tei:​note)]">​ +
-<emph rend="​italic"​ xmlns="​http:​//www.tei-c.org/​ns/​1.0">​ +
-  <​xsl:​copy>​ +
-    <w xmlns="​http://​www.tei-c.org/​ns/​1.0"> ​  +
-    <​xsl:​apply-templates select="​@*|node()"/>​ +
-    </​w> ​  +
-  </​xsl:​copy>​ +
-</​emph>​ +
-</​xsl:​template>​ +
- +
-<​xsl:​template match="​tei:​title">​ +
-<emph rend="​italic"​ xmlns="​http://​www.tei-c.org/​ns/​1.0">​ +
-  <​xsl:​copy>​ +
-    <​xsl:​apply-templates select="​@*|node()"/>​ +
-  </​xsl:​copy>​ +
-</​emph>​ +
-</​xsl:​template>​ +
- +
-<!-- Temporary patch to get the correct rendering for <hi @rend="italic"> content in TXM editions: must use <emph> instead of <hi> --> +
- +
-<​xsl:​template match="​tei:​hi[matches(@rend,'​italic'​)]"​ priority="​1">​ +
-  <​xsl:​element name="​emph"​ namespace="​http:​//www.tei-c.org/​ns/​1.0">​ +
-    <​xsl:​apply-templates select="​@*|node()"/>​ +
-  </​xsl:​element>​ +
-</​xsl:​template>​ +
- +
-</​xsl:​stylesheet>​ +
-</​code>​ +
- +
-==== txm-posttok-addRef-perseus.xsl ==== +
- +
-<code XML> +
-<?xml version="​1.0"?>​ +
-<​xsl:​stylesheet xmlns:​edate="​http://​exslt.org/​dates-and-times"​ +
-  xmlns:​xsl="​http://​www.w3.org/​1999/​XSL/​Transform"​ xmlns:​tei="​http://​www.tei-c.org/​ns/​1.0"​ +
-  xmlns:​txm="​http://​textometrie.org/​ns/​1.0"​ +
-  exclude-result-prefixes="​tei edate" xpath-default-namespace="​http://​www.tei-c.org/​ns/​1.0"​ version="​2.0">​ +
- +
-  <!-- +
-This software is dual-licensed:​ +
- +
-1. Distributed under a Creative Commons Attribution-ShareAlike 3.0 +
-Unported License http://​creativecommons.org/​licenses/​by-sa/​3.0/​  +
- +
-2. http://​www.opensource.org/​licenses/​BSD-2-Clause +
-  +
-All rights reserved. +
- +
-Redistribution and use in source and binary forms, with or without +
-modification, are permitted provided that the following conditions are +
-met: +
- +
-* Redistributions of source code must retain the above copyright +
-notice, this list of conditions and the following disclaimer. +
- +
-* Redistributions in binary form must reproduce the above copyright +
-notice, this list of conditions and the following disclaimer in the +
-documentation and/or other materials provided with the distribution. +
- +
-This software is provided by the copyright holders and contributors +
-"as is" and any express or implied warranties, including, but not +
-limited to, the implied warranties of merchantability and fitness for +
-a particular purpose are disclaimed. In no event shall the copyright +
-holder or contributors be liable for any direct, indirect, incidental,​ +
-special, exemplary, or consequential damages (including, but not +
-limited to, procurement of substitute goods or services; loss of use, +
-data, or profits; or business interruption) however caused and on any +
-theory of liability, whether in contract, strict liability, or tort +
-(including negligence or otherwise) arising in any way out of the use +
-of this software, even if advised of the possibility of such damage. +
- +
-      +
-This stylesheet adds a ref attribute to w elements that will be used for +
-references in TXM concordances. Can be used with TXM XTZ import module. +
- +
-Written by Alexei Lavrentiev, UMR 5317 IHRIM, 2017 +
-  --> +
- +
- +
-  <​xsl:​output method="​xml"​ encoding="​utf-8"​ omit-xml-declaration="​no"/>​  +
-   +
-   +
-  <!-- General patterns: all elements, attributes, comments and processing instructions are copied --> +
-   +
-  <​xsl:​template match="​*"> ​      +
-        <​xsl:​copy>​ +
-          <​xsl:​apply-templates select="​*|@*|processing-instruction()|comment()|text()"/>​ +
-        </​xsl:​copy> ​    +
-  </​xsl:​template>​ +
-   +
-  <​xsl:​template match="​*"​ mode="​position"><​xsl:​value-of select="​count(preceding-sibling::​*)"/></​xsl:​template>​ +
- +
-  <​xsl:​template match="​@*|comment()|processing-instruction()">​ +
-    <​xsl:​copy/>​ +
-  </​xsl:​template>​ +
-   +
-  <​xsl:​variable name="​filename">​ +
-    <​xsl:​analyze-string select="​document-uri(.)"​ regex="​^(.*)/​([^/​]+)\.xml$">​ +
-      <​xsl:​matching-substring>​ +
-        <​xsl:​value-of select="​regex-group(2)"/>​ +
-      </​xsl:​matching-substring>​ +
-    </​xsl:​analyze-string>​ +
-  ​</​xsl:​variable>​ +
-   +
-   +
-  <​xsl:​template match="​tei:​w">​ +
-    <​xsl:​variable name="​ref">​ +
-      <​xsl:​choose>​ +
-        <​xsl:​when test="​ancestor::​tei:​text/​@*:​id">​ +
-          <​xsl:​value-of select="​ancestor::​tei:​text[1]/@*:id[1]"/>​ +
-        </xsl:when> +
-        <​xsl:​otherwise>​ +
-          <​xsl:​value-of select="​$filename"/>​ +
-        </​xsl:​otherwise>​ +
-      </​xsl:​choose>​ +
-      <!-- ajout Perseus --> +
-      <xsl:if test="​preceding::​tei:​milestone[@unit='​chapter'​][1][@n]">​ +
-        <​xsl:​text>​c. </​xsl:​text>​ +
-        <​xsl:​value-of select="​preceding::​tei:​milestone[@unit='​chapter'​][1]/​@n"/>​ +
-      </​xsl:​if>​ +
-      <xsl:if test="​preceding::​tei:​milestone[@unit='​section'​][1][@n]">​ +
-        <xsl:text>, s</​xsl:​text>​ +
-        <​xsl:​value-of select="​preceding::​tei:​milestone[@unit='​section'​][1]/​@n"/>​ +
-      </​xsl:​if>​ +
-      <!-- fin ajout Perseus --> +
-       +
-      <xsl:if test="​preceding::​tei:​pb[1]/​@n">​ +
-        <​xsl:​text>,​ p. </​xsl:​text>​ +
-        <​xsl:​value-of select="​preceding::​tei:​pb[1]/​@n"/>​ +
-      </​xsl:​if>​ +
-      <xsl:if test="​ancestor::​tei:​p[@n]">​ +
-        <​xsl:​text>,​ § </​xsl:​text>​ +
-        <​xsl:​value-of select="​ancestor::​tei:​p/​@n"/>​ +
-      </​xsl:​if>​ +
-      <​!--<​xsl:​if test="​preceding::​tei:​lb[1]/​@n">​ +
-        <​xsl:​text>,​ l. </​xsl:​text>​ +
-        <​xsl:​value-of select="​preceding::​tei:​lb[1]/​@n"/>​ +
-      </​xsl:​if>​-->​ +
-    </​xsl:​variable>​ +
-    <​xsl:​copy>​ +
-      <​xsl:​apply-templates select="​@*"/>​ +
-      <​xsl:​attribute name="​ref"><​xsl:​value-of select="​$ref"/></​xsl:​attribute>​ +
-      <​xsl:​apply-templates select="​*|processing-instruction()|comment()|text()"/>​ +
-    </​xsl:​copy>​ +
-  </​xsl:​template>​ +
- +
-</​xsl:​stylesheet>​ +
-</​code>​ +
- +
-====== PLATO corpus : demonstration of Perseus Greek & Treebank texts (AGDT 2) in TXM ====== +
- +
-===== Project presentation ===== +
- +
-  * context : Heidelberg, May 2017 : [[http://​www.altphil.uni-freiburg.de/​texte-messen/​digital-classics-iii-2013-re-thinking-text-analysis]] +
- +
-  * goal : +
-    * demonstrating that one can work on texts available from Perseus project in TXM +
-    * TEI compliant import +
-    * compatibility of TXM with greek language +
-    * showing that TXM can work on the POS annotation provided by the Treebank (TreeTagger is not the only way to get tagged texts in TXM). +
- +
-  * corpus +
-    * Plato'​s text Euthyphro from [[https://​perseusdl.github.io/​treebank_data/​|AGDT 2]]: tlg0059.tlg001.perseus-grc1.tb.xml +
- +
-  * Available resources (approximate list) +
-    * txm-filter-perseustreebank-xmlw.xsl +
- +
-===== Solution ===== +
- +
-Make a directory (e.g. "plato"), and put inside it the XML text file(s) downloaded from Perseus AGDT. +
- +
-Then run the TXM command File>​Import>​XML/​w + CSV with the following settings : +
- +
-1. Source directory is "​plato"​ (in our example). +
- +
-2. Import parameters : +
-  * Main Language : untick "​Annotate the corpus"​ (means : don't use TreeTagger) +
-  * Lexical Segmentation : no change - Default settings +
-  * Front XSL : indicate the copy of txm-filter-perseustreebank-xmlw.xsl in your file system +
-  * Editions : default setting (Build edition, Words per page = 500, Page break tag = pb) +
-  * Display font : default setting (Font name = <​default>​) +
-  * Commands : default setting (Concordance context structure limits = text) +
- +
-3. Click on "Start corpus import" (at the top of the page) +
- +
-===== Feedback ===== +
- +
-We made 2 changes in the stylesheet : +
-  * a correction : rename Perseus @id attribute on <w> words for compatibility with TXM +
-  * an improvement : add <lb/> elements after each sentence for better rendering ​in HTML Edition. +
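The two changes listed above (renaming the word @id, adding a line break after each sentence) can be illustrated with a small conversion script. This is a simplified sketch, not the stylesheet itself: it ignores the annotator metadata, assumes un-namespaced `treebank`, `sentence` and `word` elements, and the function name, sample sentence and its attribute values are made up.

```python
import xml.etree.ElementTree as ET


def treebank_to_xmlw(treebank_xml: str) -> ET.Element:
    """Turn treebank <word form="..."/> elements into <w>form</w> tokens."""
    src = ET.fromstring(treebank_xml)
    text = ET.Element("text", {"type": "treebank"})
    for sentence in src.iter("sentence"):
        out_sentence = ET.SubElement(text, "sentence", dict(sentence.attrib))
        for word in sentence.iter("word"):
            # Keep every annotation except @form (it becomes the token text)
            # and @id (renamed so it cannot clash with ids used by the importer).
            attrs = {k: v for k, v in word.attrib.items() if k not in ("form", "id")}
            if "id" in word.attrib:
                attrs["perseus-id"] = word.attrib["id"]
            token = ET.SubElement(out_sentence, "w", attrs)
            token.text = word.get("form", "")
        ET.SubElement(text, "lb")  # one line break after each sentence
    return text


# Hypothetical two-word sample; attribute values are illustrative only.
TREEBANK_SAMPLE = """<treebank version="1.5">
  <sentence id="1">
    <word id="1" form="τί" lemma="τίς" postag="p-s---nn-" relation="PRED" head="0"/>
    <word id="2" form="νεώτερον" lemma="νέος" postag="a-s---nnc" relation="ATR" head="1"/>
  </sentence>
</treebank>"""

if __name__ == "__main__":
    print(ET.tostring(treebank_to_xmlw(TREEBANK_SAMPLE), encoding="unicode"))
```

Because the part-of-speech and lemma annotations travel as attributes on each `<w>`, they remain available as word properties without running a tagger.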
- +
-===== XSL Perseus stylesheet used for this import ===== +
- +
-==== txm-filter-perseustreebank-xmlw.xsl ==== +
- +
-<code XML> +
-<?xml version="​1.0"?>​ +
-<xsl:stylesheet +
-  xmlns:​xd="​http://​www.pnp-software.com/​XSLTdoc"​ +
-  xmlns:​edate="​http://​exslt.org/​dates-and-times"​ +
-  xmlns:​xsl="​http:​//www.w3.org/​1999/​XSL/​Transform"​ +
-  xmlns:​xsi="​http://​www.w3.org/​2001/​XMLSchema-instance"​  +
-  xmlns:​treebank="​http://​nlp.perseus.tufts.edu/​syntax/​treebank/​1.5"​ +
-  exclude-result-prefixes="​edate xd xsi treebank"​ version="​2.0">​ +
-   +
-   +
-  <xd:doc type="​stylesheet">​ +
-    <​xd:​short>​ +
-      A stylesheet to prepare PERSEUS Treebank XML texts to TXM XML/w import. +
-    </​xd:​short>​ +
-    <​xd:​detail>​ +
-      This stylesheet is free software; you can redistribute it and/or +
-      modify it under the terms of the GNU Lesser General Public +
-      License as published by the Free Software Foundation; either +
-      version 3 of the License, or (at your option) any later version. +
-       +
-      This stylesheet is distributed in the hope that it will be useful, +
-      but WITHOUT ANY WARRANTY; without even the implied warranty of +
-      MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. ​ See the GNU +
-      Lesser General Public License for more details. +
-       +
-      You should have received a copy of GNU Lesser Public License with +
-      this stylesheet. If not, see http://www.gnu.org/licenses/lgpl.html +
-    </​xd:​detail>​ +
-    <​xd:​author>​Alexei Lavrentiev alexei.lavrentev@ens-lyon.fr</​xd:​author>​ +
-    <​xd:​copyright>​2012,​ CNRS / ICAR (ICAR3 LinCoBaTO)</​xd:​copyright>​ +
-  ​</​xd:​doc>​ +
-   +
- +
-  <​xsl:​output method="​xml"​ encoding="​utf-8"​ omit-xml-declaration="​no"/>​ +
-   +
-  <​xsl:​template match="​*">​ +
-    <​xsl:​copy>​ +
-      <​xsl:​apply-templates select="​*|@*|processing-instruction()|comment()|text()"/>​  +
-    </​xsl:​copy>​ +
-  </​xsl:​template>​ +
-   +
-  <​xsl:​template match="​@*|comment()">​ +
-    <​xsl:​copy/>​ +
-  </​xsl:​template>​ +
-   +
-  <​xsl:​template match="​processing-instruction()"/>​ +
-   +
-  <​xsl:​template match="​text()"><​xsl:​value-of select="​."/></​xsl:​template>​ +
-   +
-<​xsl:​template match="​treebank">​ +
-  <text type="​treebank"​ version="​{@version}"​ date="​{normalize-space(child::​date[1])}" annotator-short="​{normalize-space(child::​annotator[1]/​short)}"​ annotator-name="​{normalize-space(child::​annotator[1]/​name)}"​ annotator-address="​{normalize-space(child::​annotator[1]/​address)}">​ +
-    <​xsl:​apply-templates select="​descendant::​sentence"/>​ +
-  </​text>​ +
-</​xsl:​template>​ +
- +
-<​xsl:​template match="​annotator"/>​ +
-   +
-<​xsl:​template match="​sentence">​ +
-  <​xsl:​copy>​ +
-    <​xsl:​apply-templates select="​@*"/>​ +
-    <​xsl:​attribute name="​annotator"><​xsl:​value-of select="​child::​annotator"/></​xsl:​attribute>​ +
-    <​xsl:​apply-templates/>​ +
-  </​xsl:​copy>​ +
-  <​lb/>​ +
-</​xsl:​template>​ +
-   +
-  <​xsl:​template match="​word">​ +
-    <w> +
-      <​xsl:​apply-templates select="​@*[not(name()='​form'​)]"/>​ +
-      <​xsl:​value-of select="​@form"></​xsl:​value-of>​ +
-    </​w>​ +
-  </​xsl:​template>​ +
- +
-<​xsl:​template match="​word/​@id">​ +
- <​xsl:​attribute name="​perseus-id"><​xsl:​value-of select="​."/></​xsl:​attribute>​ +
- +
-</​xsl:​template>​ +
-</​xsl:​stylesheet>​ +
-</​code>​ +
- +
-====== PLAUTELAT & PLAUTEEN TXM demo ====== +
- +
-===== Goal ===== +
- +
-  * Context: University of Leipzig eHumanities Seminar, 2012-12-05 +
-  * Goal: demo TXM on the Latin texts and English translations of Plautus's plays from Perseus +
- +
-===== Corpus ===== +
- +
-Corpus of Plautus's plays in Latin and their English translations, from Perseus. +
- +
-Import parameters (updated from XML/w to XTZ): +
-  * 2-front : +
-    * txm-filter-teiperseus-xmlw.xsl +
-    * txm-filter-teip5-xmlw-preserve.xsl +
-  * lat.par TreeTagger model +
- +
-  * PLAUTELAT: corpus of Plautus's Latin plays +
-    * source: [[https://sharedocs.huma-num.fr/​wl/?​id=qftriVBBeFES4jmt2BIobq1IqtypXGnK|davs://​sharedocs.huma-num.fr/dav.php/@Shares/​(948)%20Cactus/​(3792)%20Cactus/​Projets/​Textométrie/​Corpus/​src/​plautelat-src.zip]] +
-    * binary: [[https://​sharedocs.huma-num.fr/wl/?​id=eOLdijlvM50Qep1BQTz7UICvYHS3bPDq|davs://​sharedocs.huma-num.fr/​dav.php/​@Shares/​(948)%20Cactus/​(3792)%20Cactus/​Projets/​Textométrie/​Corpus/​bin/​PLAUTELAT.txm]] +
-  * PLAUTEEN: corpus of the English translations of Plautus's plays +
-    * todo +
- +
----- +
--> [[:|Back to the list of projects]]. +
  
public/perseus.txt · Last modified: 2017/12/01 17:54 by benedicte.pincemin@ens-lyon.fr