For this experiment we selected every text except numbers 15, 16, 17, 29, 33, 35 and 36 (for scientific reasons, not technical ones: the solution should work with these texts too, but has not been fully tested on them).
All these XML-TEI files of Plato's texts are then grouped into one directory named plato170720 (that is, the name we have chosen for the TXM corpus).
When we prepared this corpus in June and July 2017, the TEI encoding of Plato's texts in Perseus was heterogeneous. We had to deal with several states of the files, with last updates dating from 2017, 2015, 2014 and 1992. The 2017 texts clearly belonged to a new generation. Two texts (27 = Ion and 30 = Republic) made markedly different encoding choices, for instance regarding the use of <div> and the marking of sections.
We decided not to modify the sources (which keep evolving and improving thanks to the Perseus community), but to apply limited, automated changes as part of the import process, so as to get a usable corpus, even if the TXM user has to cope with some inherited heterogeneity.
The most relevant import format for this Perseus corpus is the XML-XTZ + CSV import (available since TXM 0.7.8), as it handles XML-TEI files and allows for many simple and useful settings. Preliminary documentation is available online, pending its inclusion in the TXM user manual (see TXM Documentation).
As a basis we take the XSL stylesheets prepared for the previous experiment on Perseus texts (Cicero, Heidelberg 2017), also described on the txm-users wiki. These stylesheets already handle some XML-TEI features of Perseus texts (nested <div> or <text> elements) in order to make them compliant with TXM processing (especially for the CQP search engine component embedded in TXM).
Automatically get text information from teiHeader :
As the formulation of dates varies widely throughout the corpus, we also encode this information in normalized form in a metadata.csv file passed as a parameter to the TXM XTZ import (this produces the update10 property on the text structure, so named because the last change date is written with 10 characters).
The title is useful as the default identifier of the text in concordance references. Special cases: we have to deal with some long titles such as “Republic (Greek). Machine readable text” (30), and likewise for Laws (34). We developed two solutions:
Nevertheless, the CTS URN information must still be available and can be chosen to locate words in the corpus (cf. the ctsurn and ctsurn5 properties of the word structure, and the id property of the text structure).
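As an illustration of how a long title can be shortened, here is a minimal Python sketch. It reuses the regular expression that the txm-front-teiperseus-xtz.xsl stylesheet applies to the title attribute (keep what precedes the first '.', ',', '(', ':' or ';'). The function name short_title is hypothetical and the sketch is not part of the import itself.

```python
import re

# Same pattern as the one applied to the title in the 2-front stylesheet:
# keep everything before the first '.', ',', '(', ':' or ';',
# without the spaces that precede that punctuation mark.
_SHORT_TITLE = re.compile(r"^([^.,(:;]*[^.,(:;\s]+)\s*.*$")


def short_title(full_title: str) -> str:
    """Return the part of a title that precedes the first punctuation mark."""
    # Collapse runs of whitespace first, like XPath normalize-space().
    normalized = " ".join(full_title.split())
    # If the pattern does not match, the normalized title is returned unchanged.
    return _SHORT_TITLE.sub(r"\1", normalized)


republic = short_title("Republic (Greek). Machine readable text")
alcibiades = short_title("Alcibiades 1")
print(republic)
print(alcibiades)
```

Titles that contain none of these punctuation marks, such as “Alcibiades 1”, pass through unchanged.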
To locate word occurrences precisely in TXM, we would like to have by default the text title and the number of the Stephanus section.
<div> usage is heterogeneous at the moment. The best solution is to use the @n attribute of <milestone unit="section"> elements to locate words in the Stephanus reference system. We just have to handle the exception of texts 27 and 30, which encode this information only on <div type="section"> or <div subtype="section"> elements.
Moreover, since section numbers may be used as sort keys, we want a version of these numbers encoded with a fixed length (e.g. 0015a instead of 15a), so that sorting them as strings gives the expected order.
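The effect of fixed-length section numbers on string sorting can be shown with a short Python sketch. The function name pad_section and the width of 4 digits (inferred from the example 0015a) are assumptions made for this illustration; the actual import performs this step in XSL.

```python
import re


def pad_section(n: str, width: int = 4) -> str:
    """Left-pad the numeric part of a Stephanus section number with zeros."""
    match = re.fullmatch(r"(\d+)(.*)", n)
    if match is None:
        # No leading digits: leave the value untouched.
        return n
    digits, suffix = match.groups()
    return digits.zfill(width) + suffix


sections = ["100a", "15a", "2b"]
# Sorting the raw strings puts "100a" before "15a" and "2b".
raw_order = sorted(sections)
# Sorting the padded strings follows the numeric order of the pages.
padded_order = sorted(pad_section(s) for s in sections)
print(raw_order)
print(padded_order)
```

With the padded form, a plain string sort is enough, so no numeric parsing is needed wherever the values are sorted.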
We can take edition pages into account: in all the files of our corpus the information is available in <milestone unit="page"> elements, in the @n attribute. Solution: during the XTZ import, at the 2-front stage, the txm-front-teiperseus-xtz.xsl stylesheet turns these milestones into <pb> elements, which TXM can use.
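For readers who prefer to see the page conversion outside XSL, the following Python sketch performs the same kind of rewrite with the standard xml.etree.ElementTree module. It is only an illustration under stated assumptions: the function name page_milestones_to_pb is hypothetical, and the removal of a trailing "a" from @n mirrors what the stylesheet's template for page milestones does.

```python
import re
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"


def page_milestones_to_pb(root: ET.Element) -> ET.Element:
    """Rename every <milestone unit="page"> to <pb>, keeping only @n."""
    # Build the list first so that renaming does not disturb the iteration.
    for elem in list(root.iter(f"{{{TEI_NS}}}milestone")):
        if elem.get("unit") != "page":
            continue  # section milestones are left as they are
        n = re.sub(r"a$", "", elem.get("n", ""))
        elem.tag = f"{{{TEI_NS}}}pb"
        elem.attrib.clear()
        elem.set("n", n)
    return root


sample = (
    '<div xmlns="http://www.tei-c.org/ns/1.0">'
    '<milestone unit="page" n="327"/>some text'
    '<milestone unit="section" n="327a"/>more text'
    "</div>"
)
converted = page_milestones_to_pb(ET.fromstring(sample))
print(ET.tostring(converted, encoding="unicode"))
```

Renaming the element in place keeps the text that follows it, so the surrounding content is not affected.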
Encoding of speech turns is heterogeneous too :
<p> elements are sometimes used, sometimes not, and can be either outside <said>, or inside <said> and following <label>,…
We want to keep and clearly show speech turn and speaker information, without indexing the speaker's name as a word to be counted and searched as such.
Solution :
Cast lists given in files 23 to 27 should be ignored for textometric analysis.
Bibliographic citation references encoded with <bibl> elements should be distinguished from the Ancient Greek text.
Versified text (encoded with <l> elements) should be distinguished in TXM text edition.
Make a directory (e.g. “plato”).
Add files and subdirectories in it, so that this directory includes :
These two bugs have been reported and should be corrected in one of the next versions of TXM, so try and see if file name corrections are still needed.
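The directory layout used in this tutorial (a source directory with a css subdirectory and the xsl/1-split-merge and xsl/2-front subdirectories) can be prepared by hand or with a few lines of Python. This is only a convenience sketch: the function name make_import_skeleton is hypothetical, and the XML-TEI files, the CSS file and the stylesheets still have to be copied into place afterwards.

```python
import tempfile
from pathlib import Path

# Subdirectories mentioned in this tutorial, relative to the source directory.
SUBDIRS = ("css", "xsl/1-split-merge", "xsl/2-front")


def make_import_skeleton(root) -> list:
    """Create the subdirectories and return the relative paths found."""
    root = Path(root)
    for sub in SUBDIRS:
        (root / sub).mkdir(parents=True, exist_ok=True)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )


# Demonstration in a temporary location; use your own "plato" directory instead.
with tempfile.TemporaryDirectory() as tmp:
    created = make_import_skeleton(Path(tmp) / "plato")
print(created)
```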
Then run the TXM command File>Import>XML-XTZ + CSV with the following settings :
1. Source directory is “plato” (in our example).
2. Import parameters:
3. Click on “Start corpus import” (at the top of the form).
Our metadata.csv looks like this in spreadsheet software (such as Calc or Excel):
And here is a view of its content as a tabulated text (same file, opened with a plain text editor) :
"id","title1","update10" "tlg0059_tlg001_perseus-grc1","Euthyphro","2017-03-13" "tlg0059_tlg002_perseus-grc2","Apology","2017-03-16" "tlg0059_tlg003_perseus-grc2","Crito","2017-03-16" "tlg0059_tlg004_perseus-grc2","Phaedo","2017-03-28" "tlg0059_tlg005_perseus-grc2","Cratylus","2017-03-29" "tlg0059_tlg006_perseus-grc2","Theaetetus","2017-03-30" "tlg0059_tlg007_perseus-grc2","Sophist","2017-04-06" "tlg0059_tlg008_perseus-grc2","Statesman","2017-04-11" "tlg0059_tlg009_perseus-grc2","Parmenides","2017-04-13" "tlg0059_tlg010_perseus-grc2","Philebus","2017-04-18" "tlg0059_tlg011_perseus-grc2","Symposium","2017-04-19" "tlg0059_tlg012_perseus-grc2","Phaedrus","2017-04-18" "tlg0059_tlg013_perseus-grc2","Alcibiades 1","2017-05-12" "tlg0059_tlg014_perseus-grc2","Alcibiades 2","2017-05-12" "tlg0059_tlg015_perseus-grc2","Hipparchus","2017-05-17" "tlg0059_tlg016_perseus-grc2","Lovers","2017-05-17" "tlg0059_tlg017_perseus-grc2","Theages","2017-05-17" "tlg0059_tlg018_perseus-grc2","Charmides","2017-06-06" "tlg0059_tlg019_perseus-grc2","Laches","2017-06-06" "tlg0059_tlg020_perseus-grc2","Lysis","2017-06-07" "tlg0059_tlg021_perseus-grc2","Euthydemus","2017-06-13" "tlg0059_tlg022_perseus-grc2","Protagoras","2017-06-19" "tlg0059_tlg023_perseus-grc2","Gorgias","2017-06-23" "tlg0059_tlg024_perseus-grc2","Meno","2017-07-10" "tlg0059_tlg025_perseus-grc1","Hippias Major","2014-07-01" "tlg0059_tlg026_perseus-grc1","Hippias Minor","2014-07-01" "tlg0059_tlg027_perseus-grc1","Ion","2014-07-01" "tlg0059_tlg028_perseus-grc1","Menexenus","2014-07-01" "tlg0059_tlg029_perseus-grc1","Cleitophon","2014-07-01" "tlg0059_tlg030_perseus-grc1","Republic","1992-07-01" "tlg0059_tlg030_perseus-grc2","Republic","2015-04-15" "tlg0059_tlg031_perseus-grc1","Timaeus","2014-07-01" "tlg0059_tlg032_perseus-grc1","Critias","2014-07-01" "tlg0059_tlg033_perseus-grc1","Minos","2014-07-01" "tlg0059_tlg034_perseus-grc1","Laws","1992-07-01" "tlg0059_tlg035_perseus-grc1","Epinomis","2014-07-01" 
"tlg0059_tlg036_perseus-grc1","Epistles","1992-07-01"
This file is placed in the css subdirectory (inside the corpus import directory, next to the XML-TEI files of Plato's texts).
/*
Copyright © 2017 ENS de Lyon, CNRS, University of Franche-Comté
Licensed under the terms of the GNU General Public License (http://www.gnu.org/licenses)
@author cbourdot
@author sheiden
TXM default CSS 06-2017
*/

.txmeditionpage {
	background-color: #f8f7ee;
	font-family: brill, 'Arial Unicode MS',ubuntu,verdana; /*junicode (Greek is displayed with italics font)*/
	font-size: 14px;
	text-indent:0px;
	text-align: justify;
	box-shadow: .3125em .3125em .625em #888;
	margin: 1.25em auto;
	padding: 1.25em;
	/*width: 400px;*/
	min-height: 90%;
}

.txmeditionpb {
	text-align: center;
}

.txmeditionpb:before {
	content: "- ";
}

.txmeditionpb:after {
	content: " -";
}

.txmlettrinep:first-letter {
	float: left;
	font-size: 6em;
	line-height: 1;
	margin-right: 0.2em;
}

.editionpage {
	display:block;
	text-align:center;
	color:gray;
}

a {
	color:#802520;
}

h1 {
	font-size: 20px;
	font-variant: small-caps;
	text-align: center;
	color:#802520;
}

h2 {
	font-size: 18px;
	font-variant: small-caps;
	text-align: center;
	color:#802520;
}

h3 {
	font-size: 16px;
	font-variant: small-caps;
	text-align: center;
	color:#802520;
}

p {
	text-indent: 0.2cm;
	text-align: justify;
	text-justify: inter-word;
}

img {
	margin: 10px 10px 10px 10px;
}

td[rend="table-cell-align-right"] {
	text-align: right;
}

td[rend="table-cell-align-left"] {
	text-align: left;
}

td[rend="table-cell-align-center"] {
	text-align: center;
}

.bibl {
	color:gray;
}

.bibl:before {
	content:"(";
}

.bibl:after {
	content:")";
}

.hi-italic {
	font-style:italic;
}

.foreign {
	font-style:italic;
	color:darkred;
}

.label, .speaker {
	font-style:italic;
	color:gray;
}

.l {
	display:block;
}
This file is placed in the xsl/1-split-merge subdirectory (inside the corpus import directory, next to the XML-TEI files of Plato's texts).
It has been added to work around a bug in an early version of TXM 0.7.8. See the note about the 1-split-merge subdirectory above.
<!-- The Identity Transformation -->
<xsl:stylesheet version="2.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">

  <xsl:param name="output-directory">
    <xsl:analyze-string select="document-uri(.)" regex="^(.*)/([^/]+)\.[^/.]+$">
      <xsl:matching-substring>
        <xsl:value-of select="regex-group(1)"/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:param>

  <xsl:variable name="filename">
    <xsl:analyze-string select="document-uri(.)" regex="^(.*)/([^/]+)\.[^/.]+$">
      <xsl:matching-substring>
        <xsl:value-of select="regex-group(2)"/>
      </xsl:matching-substring>
    </xsl:analyze-string>
  </xsl:variable>

  <xsl:template match="/">
    <xsl:result-document href="{$output-directory}/{replace($filename,'\.','_')}.xml">
      <xsl:copy-of select="."></xsl:copy-of>
    </xsl:result-document>
    <warning>Result file written to <xsl:value-of select="concat($output-directory,'/',replace($filename,'\.','_'),'.xml')"/></warning>
  </xsl:template>

</xsl:stylesheet>
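The only visible effect of this stylesheet is on file names: dots inside the base name are replaced by underscores and the copy is written with an .xml extension. The Python sketch below reproduces that renaming rule on a file name, as an illustration only. The function name txm_safe_name and the dotted input name are assumptions chosen for the example, picked so that the output matches an id listed in metadata.csv.

```python
from pathlib import PurePosixPath


def txm_safe_name(filename: str) -> str:
    """Replace dots in the base name with underscores and append .xml."""
    # .stem drops the last extension only, like regex-group(2) in the stylesheet.
    stem = PurePosixPath(filename).stem
    return stem.replace(".", "_") + ".xml"


renamed = txm_safe_name("tlg0059.tlg001.perseus-grc1.xml")
print(renamed)
```

The same rule can be applied by hand or with a batch rename if the file names need to be corrected before the import.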
This file is placed in the xsl/2-front subdirectory (inside the corpus import directory, next to the XML-TEI files of Plato's texts).
Note that this file has been edited to deal with Perseus texts where some pointers already have the “#” character (see comment in the file).
<?xml version="1.0"?> <xsl:stylesheet xmlns:edate="http://exslt.org/dates-and-times" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://www.tei-c.org/ns/1.0" exclude-result-prefixes="tei edate" version="1.0"> <!-- P4 to P5 converter Sebastian Rahtz <sebastian.rahtz@oucs.ox.ac.uk> $Date: 2007-11-01 16:33:34 +0000 (Thu, 01 Nov 2007) $ $Id: p4top5.xsl 3927 2007-11-01 16:33:34Z rahtz $ Copyright 2007 TEI Consortium Permission is hereby granted, free of charge, to any person obtaining a copy of this software and any associated documentation gfiles (the ``Software''), to deal in the Software without restriction, including without limitation the rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit persons to whom the Software is furnished to do so, subject to the following conditions: The above copyright notice and this permission notice shall be included in all copies or substantial portions of the Software. --> <xsl:output method="xml" encoding="utf-8" cdata-section-elements="tei:eg" omit-xml-declaration="yes"/> <xsl:variable name="processor"> <xsl:value-of select="system-property('xsl:vendor')"/> </xsl:variable> <xsl:variable name="today"> <xsl:choose> <xsl:when test="function-available('edate:date-time')"> <xsl:value-of select="edate:date-time()"/> </xsl:when> <xsl:when test="contains($processor,'SAXON')"> <xsl:value-of select="Date:toString(Date:new())" xmlns:Date="/java.util.Date"/> </xsl:when> <xsl:otherwise>0000-00-00</xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:variable name="uc">ABCDEFGHIJKLMNOPQRSTUVWXYZ</xsl:variable> <xsl:variable name="lc">abcdefghijklmnopqrstuvwxyz</xsl:variable> <xsl:template match="*"> <xsl:choose> <xsl:when test="namespace-uri()=''"> <xsl:element namespace="http://www.tei-c.org/ns/1.0" name="{local-name(.)}"> <xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/> </xsl:element> </xsl:when> <xsl:otherwise> <xsl:copy> 
<xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/> </xsl:copy> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template match="@*|processing-instruction()|comment()"> <xsl:copy/> </xsl:template> <xsl:template match="text()"> <xsl:value-of select="."/> </xsl:template> <!-- change of name, or replaced by another element --> <xsl:template match="teiCorpus.2"> <teiCorpus xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/> </teiCorpus> </xsl:template> <xsl:template match="witness/@sigil"> <xsl:attribute name="xml:id"> <xsl:value-of select="."/> </xsl:attribute> </xsl:template> <xsl:template match="witList"> <listWit xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/> </listWit> </xsl:template> <xsl:template match="TEI.2"> <TEI xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/> </TEI> </xsl:template> <xsl:template match="xref"> <xsl:element namespace="http://www.tei-c.org/ns/1.0" name="ref"> <xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/> </xsl:element> </xsl:template> <xsl:template match="xptr"> <xsl:element namespace="http://www.tei-c.org/ns/1.0" name="ptr"> <xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/> </xsl:element> </xsl:template> <xsl:template match="figure[@url]"> <figure xmlns="http://www.tei-c.org/ns/1.0"> <graphic xmlns="http://www.tei-c.org/ns/1.0"> <xsl:copy-of select="@*"/> </graphic> <xsl:apply-templates/> </figure> </xsl:template> <xsl:template match="figure/@url"/> <xsl:template match="figure/@entity"/> <xsl:template match="figure[@entity]"> <figure xmlns="http://www.tei-c.org/ns/1.0"> <graphic xmlns="http://www.tei-c.org/ns/1.0" url="{unparsed-entity-uri(@entity)}"> <xsl:apply-templates select="@*"/> </graphic> <xsl:apply-templates/> </figure> </xsl:template> 
<xsl:template match="event"> <incident xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="@*|*|text()|comment()|processing-instruction()"/> </incident> </xsl:template> <xsl:template match="state"> <refState xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="@*|*|text()|comment()|processing-instruction()"/> </refState> </xsl:template> <!-- lost elements --> <xsl:template match="dateRange"> <date xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/> </date> </xsl:template> <xsl:template match="dateRange/@from"> <xsl:copy-of select="."/> </xsl:template> <xsl:template match="dateRange/@to"> <xsl:copy-of select="."/> </xsl:template> <xsl:template match="language"> <xsl:element namespace="http://www.tei-c.org/ns/1.0" name="language"> <xsl:if test="@id"> <xsl:attribute name="ident"> <xsl:value-of select="@id"/> </xsl:attribute> </xsl:if> <xsl:apply-templates select="*|processing-instruction()|comment()|text()"/> </xsl:element> </xsl:template> <!-- attributes lost --> <!-- dropped from TEI. Added as new change records later --> <xsl:template match="@date.created"/> <xsl:template match="@date.updated"/> <!-- dropped from TEI. 
No replacement --> <xsl:template match="refsDecl/@doctype"/> <!-- attributes changed name --> <xsl:template match="date/@value"> <xsl:attribute name="when"> <xsl:value-of select="."/> </xsl:attribute> </xsl:template> <xsl:template match="@url"> <xsl:attribute name="target"> <xsl:value-of select="."/> </xsl:attribute> </xsl:template> <xsl:template match="@doc"> <xsl:attribute name="target"> <xsl:value-of select="unparsed-entity-uri(.)"/> </xsl:attribute> </xsl:template> <xsl:template match="@id"> <xsl:choose> <xsl:when test="parent::lang"> <xsl:attribute name="ident"> <xsl:value-of select="."/> </xsl:attribute> </xsl:when> <xsl:otherwise> <xsl:attribute name="xml:id"> <xsl:value-of select="."/> </xsl:attribute> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template match="@lang"> <xsl:attribute name="xml:lang"> <xsl:value-of select="."/> </xsl:attribute> </xsl:template> <xsl:template match="change/@date"/> <xsl:template match="date/@certainty"> <xsl:attribute name="cert"> <xsl:value-of select="."/> </xsl:attribute> </xsl:template> <!-- all pointing attributes preceded by # --> <xsl:template match="variantEncoding/@location"> <xsl:copy-of select="."/> </xsl:template> <!-- Modified for Perseus texts where some pointers already have # --> <xsl:template match="@ana|@active|@adj|@adjFrom|@adjTo|@children|@children|@class|@code|@code|@copyOf|@corresp|@decls|@domains|@end|@exclude|@fVal|@feats|@follow|@from|@hand|@inst|@langKey|@location|@mergedin|@new|@next|@old|@origin|@otherLangs|@parent|@passive|@perf|@prev|@render|@resp|@sameAs|@scheme|@script|@select|@since|@start|@synch|@target|@targetEnd|@to|@to|@value|@value|@who|@wit"> <xsl:attribute name="{name(.)}"> <xsl:choose> <xsl:when test="starts-with(.,'#')"> <xsl:copy-of select="."/> </xsl:when> <xsl:otherwise> <xsl:call-template name="splitter"> <xsl:with-param name="val"> <xsl:value-of select="."/> </xsl:with-param> </xsl:call-template> </xsl:otherwise> </xsl:choose> </xsl:attribute> </xsl:template> <xsl:template 
name="splitter"> <xsl:param name="val"/> <xsl:choose> <xsl:when test="contains($val,' ')"> <xsl:text>#</xsl:text> <xsl:value-of select="substring-before($val,' ')"/> <xsl:text> </xsl:text> <xsl:call-template name="splitter"> <xsl:with-param name="val"> <xsl:value-of select="substring-after($val,' ')"/> </xsl:with-param> </xsl:call-template> </xsl:when> <xsl:otherwise> <xsl:text>#</xsl:text> <xsl:value-of select="$val"/> </xsl:otherwise> </xsl:choose> </xsl:template> <!-- fool around with selected elements --> <!-- imprint is no longer allowed inside bibl --> <xsl:template match="bibl/imprint"> <xsl:apply-templates/> </xsl:template> <xsl:template match="editionStmt/editor"> <respStmt xmlns="http://www.tei-c.org/ns/1.0"> <resp><xsl:value-of select="@role"/></resp> <name><xsl:apply-templates/></name> </respStmt> </xsl:template> <!-- header --> <xsl:template match="teiHeader"> <teiHeader xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="@*|*|comment()|processing-instruction()"/> <xsl:if test="not(revisionDesc) and (@date.created or @date.updated)"> <revisionDesc xmlns="http://www.tei-c.org/ns/1.0"> <xsl:if test="@date.updated"> <change xmlns="http://www.tei-c.org/ns/1.0">> <label>updated</label> <date xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="@date.updated"/> </date> <label xmlns="http://www.tei-c.org/ns/1.0">Date edited</label> </change> </xsl:if> <xsl:if test="@date.created"> <change xmlns="http://www.tei-c.org/ns/1.0"> <label>created</label> <date xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="@date.created"/> </date> <label xmlns="http://www.tei-c.org/ns/1.0">Date created</label> </change> </xsl:if> </revisionDesc> </xsl:if> <!-- <change when="{$today}" xmlns="http://www.tei-c.org/ns/1.0">Converted to TEI P5 XML by p4top5.xsl written by Sebastian Rahtz at Oxford University Computing Services.</change> </revisionDesc> </xsl:if> --> </teiHeader> </xsl:template> <xsl:template match="revisionDesc"> <revisionDesc 
xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="@*|*|comment()|processing-instruction()"/> </revisionDesc> </xsl:template> <xsl:template match="publicationStmt"> <publicationStmt xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="@*|*|comment()|processing-instruction()"/> <!-- <availability xmlns="http://www.tei-c.org/ns/1.0"> <p xmlns="http://www.tei-c.org/ns/1.0">Licensed under <ptr xmlns="http://www.tei-c.org/ns/1.0" target="http://creativecommons.org/licenses/by-sa/2.0/uk/"/></p> </availability> --> </publicationStmt> </xsl:template> <!-- space does not have @extent any more --> <xsl:template match="space/@extent"> <xsl:attribute name="quantity"> <xsl:value-of select="."/> </xsl:attribute> </xsl:template> <!-- tagsDecl has a compulsory namespace child now --> <xsl:template match="tagsDecl"> <xsl:if test="*"> <tagsDecl xmlns="http://www.tei-c.org/ns/1.0"> <namespace name="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="*|comment()|processing-instruction"/> </namespace> </tagsDecl> </xsl:if> </xsl:template> <!-- orgTitle inside orgName? 
redundant --> <xsl:template match="orgName/orgTitle"> <xsl:apply-templates/> </xsl:template> <!-- no need for empty <p> in sourceDesc --> <xsl:template match="sourceDesc/p[string-length(.)=0]"/> <!-- start creating the new choice element --> <xsl:template match="corr[@sic]"> <choice xmlns="http://www.tei-c.org/ns/1.0"> <corr xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="text()" /> </corr> <sic xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="@sic" /> </sic> </choice> </xsl:template> <xsl:template match="sic[@corr]"> <choice xmlns="http://www.tei-c.org/ns/1.0"> <sic xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="text()" /> </sic> <corr xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="@corr" /> </corr> </choice> </xsl:template> <xsl:template match="abbr[@expan]"> <choice xmlns="http://www.tei-c.org/ns/1.0"> <abbr xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="text()" /> </abbr> <expan xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="@expan" /> </expan> </choice> </xsl:template> <xsl:template match="expan[@abbr]"> <choice xmlns="http://www.tei-c.org/ns/1.0"> <expan xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="text()" /> </expan> <abbr xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="@abbr" /> </abbr> </choice> </xsl:template> <!-- special consideration for <change> element --> <xsl:template match="change"> <change xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="date"/> <xsl:if test="respStmt/resp"> <label> <xsl:value-of select="respStmt/resp/text()"/> </label> </xsl:if> <xsl:for-each select="respStmt/name"> <name xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="@*|*|comment()|processing-instruction()|text()"/> </name> </xsl:for-each> <xsl:for-each select="item"> <xsl:apply-templates select="@*|*|comment()|processing-instruction()|text()"/> </xsl:for-each> </change> </xsl:template> <xsl:template match="respStmt[resp]"> <respStmt 
xmlns="http://www.tei-c.org/ns/1.0"> <xsl:choose> <xsl:when test="resp/name"> <resp xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="resp/text()"/> </resp> <xsl:for-each select="resp/name"> <name xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates/> </name> </xsl:for-each> </xsl:when> <xsl:otherwise> <xsl:apply-templates/> <name xmlns="http://www.tei-c.org/ns/1.0"> </name> </xsl:otherwise> </xsl:choose> </respStmt> </xsl:template> <xsl:template match="q/@direct"/> <xsl:template match="q"> <q xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="@*|*|comment()|processing-instruction()|text()"/> </q> </xsl:template> <!-- if we are reading the P4 with a DTD, we need to avoid copying the default values of attributes --> <xsl:template match="@targOrder"> <xsl:if test="not(translate(.,$uc,$lc) ='u')"> <xsl:attribute name="targOrder"> <xsl:value-of select="."/> </xsl:attribute> </xsl:if> </xsl:template> <xsl:template match="@opt"> <xsl:if test="not(translate(.,$uc,$lc) ='n')"> <xsl:attribute name="opt"> <xsl:value-of select="."/> </xsl:attribute> </xsl:if> </xsl:template> <xsl:template match="@to"> <xsl:if test="not(translate(.,$uc,$lc) ='ditto')"> <xsl:attribute name="to"> <xsl:value-of select="."/> </xsl:attribute> </xsl:if> </xsl:template> <xsl:template match="@default"> <xsl:choose> <xsl:when test="translate(.,$uc,$lc)= 'no'"/> <xsl:otherwise> <xsl:attribute name="default"> <xsl:value-of select="."/> </xsl:attribute> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template match="@part"> <xsl:if test="not(translate(.,$uc,$lc) ='n')"> <xsl:attribute name="part"> <xsl:value-of select="."/> </xsl:attribute> </xsl:if> </xsl:template> <xsl:template match="@full"> <xsl:if test="not(translate(.,$uc,$lc) ='yes')"> <xsl:attribute name="full"> <xsl:value-of select="."/> </xsl:attribute> </xsl:if> </xsl:template> <xsl:template match="@from"> <xsl:if test="not(translate(.,$uc,$lc) ='root')"> <xsl:attribute name="from"> <xsl:value-of select="."/> 
</xsl:attribute> </xsl:if> </xsl:template> <xsl:template match="@status"> <xsl:choose> <xsl:when test="parent::teiHeader"> <xsl:if test="not(translate(.,$uc,$lc) ='new')"> <xsl:attribute name="status"> <xsl:value-of select="."/> </xsl:attribute> </xsl:if> </xsl:when> <xsl:when test="parent::del"> <xsl:if test="not(translate(.,$uc,$lc) ='unremarkable')"> <xsl:attribute name="status"> <xsl:value-of select="."/> </xsl:attribute> </xsl:if> </xsl:when> <xsl:otherwise> <xsl:attribute name="status"> <xsl:value-of select="."/> </xsl:attribute> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template match="@place"> <xsl:if test="not(translate(.,$uc,$lc) ='unspecified')"> <xsl:attribute name="place"> <xsl:value-of select="."/> </xsl:attribute> </xsl:if> </xsl:template> <xsl:template match="@sample"> <xsl:if test="not(translate(.,$uc,$lc) ='complete')"> <xsl:attribute name="sample"> <xsl:value-of select="."/> </xsl:attribute> </xsl:if> </xsl:template> <xsl:template match="@org"> <xsl:if test="not(translate(.,$uc,$lc) ='uniform')"> <xsl:attribute name="org"> <xsl:value-of select="."/> </xsl:attribute> </xsl:if> </xsl:template> <xsl:template match="teiHeader/@type"> <xsl:if test="not(translate(.,$uc,$lc) ='text')"> <xsl:attribute name="type"> <xsl:value-of select="."/> </xsl:attribute> </xsl:if> </xsl:template> <!-- yes|no to boolean --> <xsl:template match="@anchored"> <xsl:attribute name="anchored"> <xsl:choose> <xsl:when test="translate(.,$uc,$lc)='yes'">true</xsl:when> <xsl:when test="translate(.,$uc,$lc)='no'">false</xsl:when> </xsl:choose> </xsl:attribute> </xsl:template> <xsl:template match="sourceDesc/@default"/> <xsl:template match="@tei"> <xsl:attribute name="tei"> <xsl:choose> <xsl:when test="translate(.,$uc,$lc)='yes'">true</xsl:when> <xsl:when test="translate(.,$uc,$lc)='no'">false</xsl:when> </xsl:choose> </xsl:attribute> </xsl:template> <xsl:template match="@langKey"/> <xsl:template match="@TEIform"/> <!-- assorted atts --> <xsl:template match="@old"/> 
<xsl:template match="@mergedin"> <xsl:attribute name="mergedIn"> <xsl:value-of select="."/> </xsl:attribute> </xsl:template> <!-- deal with the loss of div0 --> <xsl:template match="div1|div2|div3|div4|div5|div6"> <xsl:variable name="divName"> <xsl:choose> <xsl:when test="ancestor::div0"> <xsl:text>div</xsl:text> <xsl:value-of select="number(substring-after(local-name(.),'div')) + 1"/> </xsl:when> <xsl:otherwise> <xsl:value-of select="local-name()"/> </xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:element name="{$divName}" namespace="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/> </xsl:element> </xsl:template> <xsl:template match="div0"> <div1 xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/> </div1> </xsl:template> </xsl:stylesheet>
This file is placed in the xsl/2-front subdirectory (inside the corpus import directory, next to the XML-TEI files of Plato's texts).
<?xml version="1.0"?> <xsl:stylesheet xmlns:xd="http://www.pnp-software.com/XSLTdoc" xmlns:edate="http://exslt.org/dates-and-times" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://www.tei-c.org/ns/1.0" exclude-result-prefixes="#all" version="2.0"> <xd:doc type="stylesheet"> <xd:short> A stylesheet to prepare PERSEUS XML-TEI texts to TXM import. </xd:short> <xd:detail> This stylesheet is free software; you can redistribute it and/or modify it under the terms of the GNU Lesser General Public License as published by the Free Software Foundation; either version 3 of the License, or (at your option) any later version. This stylesheet is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU Lesser General Public License for more details. You should have received a copy of GNU Lesser Public License with this stylesheet. If not, see http://www.gnu.org/licenses/lgpl.html </xd:detail> <xd:author>Alexei Lavrentiev alexei.lavrentev@ens-lyon.fr</xd:author> <xd:copyright>2017, CNRS / IHRIM (Groupe CACTUS)</xd:copyright> </xd:doc> <xsl:output method="xml" encoding="utf-8" omit-xml-declaration="no"/> <xsl:template match="node()|@*"> <!-- Copy the current node --> <xsl:copy> <!-- Including any attributes it has and any child nodes --> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </xsl:template> <!-- This template automatically extracts information from teiHeader to use them as metadata information associated with each text. 
It had better be commented if one uses a metadata file with the same information (same metadata labels) : --> <xsl:template match="/tei:TEI/tei:text"> <xsl:variable name="changes" as="element()"> <changes> <xsl:choose> <xsl:when test="ancestor::*:TEI/*:teiHeader/*:revisionDesc//*:change[@when]"> <xsl:for-each select="ancestor::*:TEI/*:teiHeader/*:revisionDesc//*:change"> <xsl:sort order="descending" select="@when"/> <change><xsl:value-of select="@when"/></change> </xsl:for-each> </xsl:when> <xsl:when test="/*:TEI/*:teiHeader/*:revisionDesc//*:change/*:date"> <change><xsl:value-of select="/*:TEI/*:teiHeader/*:revisionDesc//*:change[not(following-sibling::*:change)]/*:date"/></change> </xsl:when> <xsl:otherwise> <change>unknown</change> </xsl:otherwise> </xsl:choose> </changes> </xsl:variable> <xsl:copy> <xsl:copy-of select="@*"/> <xsl:attribute name="author"><xsl:value-of select="normalize-space(//tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:author[1])"/></xsl:attribute> <xsl:attribute name="title"><xsl:value-of select="replace(normalize-space(/tei:TEI/tei:teiHeader[1]/tei:fileDesc[1]/tei:titleStmt[1]/tei:title[1]),'^([^.,\(:;]*[^.,\(:;\s]+)\s*.*$','$1')"/></xsl:attribute> <xsl:attribute name="editor"><xsl:value-of select="normalize-space(//tei:teiHeader/tei:fileDesc/tei:titleStmt/tei:editor[1])"/></xsl:attribute> <xsl:attribute name="lastchange"><xsl:value-of select="normalize-space($changes//change[1])"/></xsl:attribute> <xsl:apply-templates/> </xsl:copy> </xsl:template> <!-- Texts in a text group are renamed as subtext (texts cannot be nested for TXM processing) --> <xsl:template match="tei:group/tei:text"> <xsl:element name="subtext"> <xsl:apply-templates select="@*|node()"/> </xsl:element> </xsl:template> <xsl:template match="tei:pb"> <xsl:copy> <xsl:attribute name="n"> <xsl:choose> <xsl:when test="@n"><xsl:value-of select="@n"/></xsl:when> <xsl:when test="@*:id"> <xsl:value-of select="replace(@*:id,'^p\.','')"/> </xsl:when> 
<xsl:otherwise><xsl:text>[s.n.]</xsl:text></xsl:otherwise> </xsl:choose> </xsl:attribute> </xsl:copy> </xsl:template> <!-- The CQP search engine used by TXM doesn't work well with structures both numbered and nested, it gets confused ; so numbered div elements are converted to simple nested div elements. --> <xsl:template match="tei:div1|tei:div2|tei:div3|tei:div4|tei:div5|tei:div6|tei:div7"> <xsl:element name="div" namespace="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="@*|node()"/> </xsl:element> </xsl:template> <!-- Some texts of our selection (Plato, taken from github on 2017, July 3rd) are in a transitory edition state, with no milestone element for section encoding : sections are defined only through div elements. This template adds milestones for sections when missing, as to get a uniform encoding of the information in the corpus.--> <xsl:template match="tei:div[@type='section' or @subtype='section']"> <xsl:copy> <xsl:apply-templates select="@*"/> <!--<xsl:if test="not(//tei:milestone[@unit='page']) and (matches(@n,'a$'))"> <pb xmlns="http://www.tei-c.org/ns/1.0" n="{replace(@n,'a$','')}"/> </xsl:if>--> <xsl:if test="not(//tei:milestone[@unit='section'])"> <milestone unit="section" xmlns="http://www.tei-c.org/ns/1.0" n="{@n}"></milestone> </xsl:if> <xsl:apply-templates/> </xsl:copy> </xsl:template> <!-- Milestones for pages are converted into pb elements (easier to process in TXM import) --> <xsl:template match="tei:milestone[@unit='page']"> <pb xmlns="http://www.tei-c.org/ns/1.0" n="{replace(@n,'a$','')}"/> </xsl:template> <xsl:template match="tei:choice"> <xsl:apply-templates select="tei:expan|tei:corr|tei:reg"/> </xsl:template> <xsl:template match="tei:choice/tei:expan"> <w xmlns="http://www.tei-c.org/ns/1.0"> <xsl:attribute name="abbr"><xsl:value-of select="normalize-space(parent::tei:choice/tei:abbr)"/></xsl:attribute> <xsl:apply-templates select="@*|node()"/> </w> </xsl:template> <xsl:template match="tei:choice/tei:corr"> <xsl:copy> 
<xsl:attribute name="sic"><xsl:value-of select="normalize-space(parent::tei:choice/tei:sic)"/></xsl:attribute> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </xsl:template> <xsl:template match="tei:choice/tei:reg"> <xsl:copy> <xsl:attribute name="orig"><xsl:value-of select="normalize-space(parent::tei:choice/tei:orig)"/></xsl:attribute> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </xsl:template> <!-- Dialogs' processing : the following templates unify encoding in the corpus (Plato, taken from github on 2017, July 3rd) so as to have clear editions (a speech turn looks like a paragraph introduced by the speaker's name). --> <xsl:template match="tei:said"> <xsl:choose> <xsl:when test="not(ancestor::tei:p or descendant::tei:p)"> <p xmlns="http://www.tei-c.org/ns/1.0"> <xsl:call-template name="process-said"/> </p> </xsl:when> <xsl:otherwise> <xsl:call-template name="process-said"/> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template name="process-said"> <xsl:copy> <xsl:apply-templates select="@*"/> <xsl:if test="@who and not(descendant::tei:label) and not(ancestor::tei:said)"> <!-- <emph xmlns="http://www.tei-c.org/ns/1.0">--> <label xmlns="http://www.tei-c.org/ns/1.0"> <xsl:value-of select="replace(@who,'^#','')"/> <xsl:text>.</xsl:text> </label> <!--</emph>--> <xsl:text> </xsl:text> </xsl:if> <xsl:apply-templates/> </xsl:copy> </xsl:template> <!-- The following processing has been moved in a later stage --> <!-- <xsl:template match="tei:label|tei:speaker"> <emph xmlns="http://www.tei-c.org/ns/1.0"> <xsl:copy> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </emph> </xsl:template>--> <!-- Temporary patch for TXM indexing quote elements in notes --> <xsl:template match="tei:note//tei:quote"> <quote-note> <xsl:apply-templates select="@*|node()"/> </quote-note> </xsl:template> <!-- following are examples of : (i) adding an <emph> element in order to point out some elements' content (e.g. 
foreign, title) in TXM edition ; (ii) adding a <w> element to prevent tokenisation from analysing some content (e.g. foreign, num) (iii) adding emph element inside all elements with @rend containing 'italic' for automatic styling in TXM editions --> <xsl:template match="tei:foreign[not(ancestor::tei:note)]"> <!--<emph rend="italic" xmlns="http://www.tei-c.org/ns/1.0">--> <xsl:copy> <w xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="@*|node()"/> </w> </xsl:copy> <!--</emph>--> </xsl:template> <!--<xsl:template match="tei:title"> <emph rend="italic" xmlns="http://www.tei-c.org/ns/1.0"> <xsl:copy> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </emph> </xsl:template>--> <!-- Temporary patch to get the correct rendering for <hi @rend="italic"> content in TXM editions : must use <emph> instead of <hi> --> <!--<xsl:template match="tei:hi[matches(@rend,'italic')]" priority="1"> <xsl:element name="emph" namespace="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="@*|node()"/> </xsl:element> </xsl:template>--> <!-- The problem with the following first attempt is that bold is kept, it should be replaced : <xsl:template match="*[matches(@rend,'italic','i')]"> <xsl:copy> <xsl:apply-templates select="@*"/> <emph> <xsl:apply-templates/> </emph> </xsl:copy> </xsl:template> --> <!-- not relevant in current TXM version as only hi and emph elements' @rend are used in editions --> <!--<xsl:template match="@rend"> <xsl:copy> <xsl:choose> <xsl:when test="matches(.,'italics','i')"> <xsl:value-of select="replace(.,'italics','italic','i')"/> </xsl:when> <xsl:otherwise> <xsl:value-of select="."/> </xsl:otherwise> </xsl:choose> </xsl:copy> </xsl:template>--> <!-- Not relevant because numerals are composed with underscores and not dot separators, so there is no problem with tokenisation : <xsl:template match="tei:num"> <xsl:copy> <w xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates select="@*|node()"/> </w> </xsl:copy> </xsl:template> --> <!-- This 
template is relevant for latin texts with greek fragments not to be indexed, not for greek ones. But doesn't work well anyway -> to be fixed. <xsl:template match="*[@xml:lang='greek']"> <xsl:copy> <xsl:apply-templates select="@*"/> <foreign xml:lang="greek" xmlns="http://www.tei-c.org/ns/1.0"> <xsl:apply-templates/> </foreign> </xsl:copy> </xsl:template> --> <!-- Verse quotations temporary processing (replaced by a post-processing dedicated to HTML edition preparing - XSLT 4th stage in TXM XML-TEI-XTZ import) --> <!--<xsl:template match="tei:l"> <lb xmlns="http://www.tei-c.org/ns/1.0"/> <xsl:copy> <xsl:apply-templates select="@*|node()"/> </xsl:copy> </xsl:template> <xsl:template match="tei:bibl"> <xsl:copy> <xsl:apply-templates select="@*|node()"/> </xsl:copy> <lb xmlns="http://www.tei-c.org/ns/1.0"/> </xsl:template> <xsl:template match="tei:quote[descendant::tei:l and not(following-sibling::tei:quote[1][descendant::tei:l]) and not(following-sibling::*[1][self::tei:bibl])]"> <xsl:copy> <xsl:apply-templates select="@*|node()"/> </xsl:copy> <lb xmlns="http://www.tei-c.org/ns/1.0"/> </xsl:template>--> </xsl:stylesheet>
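To check what the title attribute will contain before running a full import, the regular expression used in the stylesheet above can be tried outside TXM. The following is only a rough Python sketch of that single step (it is not part of TXM or of the stylesheet); the function name short_title and the sample titles are hypothetical, and Python's whitespace handling is assumed to be close enough to normalize-space() for this purpose.

```python
import re

# Same pattern as in the replace() call that fills the title attribute:
# keep everything up to the first '.', ',', '(', ':' or ';', without trailing spaces.
TITLE_PATTERN = re.compile(r'^([^.,\(:;]*[^.,\(:;\s]+)\s*.*$')


def short_title(raw_title: str) -> str:
    """Approximate normalize-space() followed by replace(..., '$1')."""
    # Collapse whitespace runs and trim both ends (rough normalize-space()).
    normalized = " ".join(raw_title.split())
    # When the pattern does not match, the string is returned unchanged,
    # which is also what the XPath replace() function does.
    return TITLE_PATTERN.sub(r"\1", normalized)


if __name__ == "__main__":
    # Hypothetical titles, for illustration only.
    for title in ["Euthyphro (Greek). Machine readable text",
                  "  Republic \n",
                  "Ion: a dialogue"]:
        print(repr(short_title(title)))
```

The short form matters because the same value is reused later as the text label in concordance references.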
The following stylesheet, which adds reference attributes to the w elements, goes in the xsl/3-posttok subdirectory (inside the corpus import directory, alongside the XML-TEI files of Plato's texts).
<?xml version="1.0"?> <xsl:stylesheet xmlns:edate="http://exslt.org/dates-and-times" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:txm="http://textometrie.org/ns/1.0" exclude-result-prefixes="tei edate" xpath-default-namespace="http://www.tei-c.org/ns/1.0" version="2.0"> <!-- This software is dual-licensed: 1. Distributed under a Creative Commons Attribution-ShareAlike 3.0 Unported License http://creativecommons.org/licenses/by-sa/3.0/ 2. http://www.opensource.org/licenses/BSD-2-Clause All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the copyright holder or contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage. This stylesheet adds a ref attribute to w elements that will be used for references in TXM concordances. Can be used with TXM XTZ import module. 
Written by Alexei Lavrentiev, UMR 5317 IHRIM, 2017 --> <!-- Modified version of the stylesheet : customization for an experiment on Plato's texts, July 2017, B. Pincemin & S. Marchand --> <xsl:output method="xml" encoding="utf-8" omit-xml-declaration="no"/> <!-- General patterns: all elements, attributes, comments and processing instructions are copied --> <xsl:template match="*"> <xsl:copy> <xsl:apply-templates select="*|@*|processing-instruction()|comment()|text()"/> </xsl:copy> </xsl:template> <xsl:template match="*" mode="position"><xsl:value-of select="count(preceding-sibling::*)"/></xsl:template> <xsl:template match="@*|comment()|processing-instruction()"> <xsl:copy/> </xsl:template> <xsl:variable name="filename"> <xsl:analyze-string select="document-uri(.)" regex="^(.*)/([^/]+)\.xml$"> <xsl:matching-substring> <xsl:value-of select="regex-group(2)"/> </xsl:matching-substring> </xsl:analyze-string> </xsl:variable> <xsl:template match="tei:w"> <xsl:variable name="ctsurn"> <xsl:choose> <xsl:when test="ancestor::tei:text/@*:id"> <xsl:value-of select="ancestor::tei:text[1]/@*:id[1]"/> </xsl:when> <xsl:otherwise> <xsl:value-of select="$filename"/> </xsl:otherwise> </xsl:choose> <xsl:if test="preceding::tei:milestone[@unit='section'][1][@n]"> <xsl:text>:</xsl:text> <xsl:value-of select="preceding::tei:milestone[@unit='section'][1]/@n"/> </xsl:if> </xsl:variable> <xsl:variable name="currentsection"> <xsl:choose> <xsl:when test="preceding::tei:milestone[@unit='section'][1][@n]"> <xsl:analyze-string select="preceding::tei:milestone[@unit='section'][1]/@n" regex="^(\d+)(.*)$"> <xsl:matching-substring> <xsl:value-of select="concat(format-number(number(regex-group(1)),'0000'),regex-group(2))"></xsl:value-of> </xsl:matching-substring> <xsl:non-matching-substring><xsl:value-of select="."/></xsl:non-matching-substring> </xsl:analyze-string> </xsl:when> <xsl:otherwise>SN</xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:variable name="ref"> <xsl:choose> <!-- Following 
condition to be fixed : this doesn't catch the title1 attribute value --> <!--<xsl:when test="ancestor::tei:text/@*:title1"> <xsl:value-of select="ancestor::tei:text[1]/@*:title1[1]"/> </xsl:when>--> <!--<xsl:when test="/tei:TEI/tei:teiHeader[1]/tei:fileDesc[1]/tei:titleStmt[1]/tei:title[1]">--> <xsl:when test="/tei:TEI/tei:text[1]/@title"> <!--<xsl:value-of select="replace(normalize-space(/tei:TEI/tei:teiHeader[1]/tei:fileDesc[1]/tei:titleStmt[1]/tei:title[1]),'^([^.,\(:;]*[^.,\(:;\s]+)\s*.*$','$1')"/>--> <xsl:value-of select="/tei:TEI/tei:text[1]/@title"/> </xsl:when> <xsl:otherwise> <xsl:value-of select="$filename"/> </xsl:otherwise> </xsl:choose> <xsl:if test="preceding::tei:milestone[@unit='section'][1][@n]"> <xsl:text>, s. </xsl:text> <!--<xsl:value-of select="preceding::tei:milestone[@unit='section'][1]/@n"/>--> <xsl:value-of select="$currentsection"/> </xsl:if> </xsl:variable> <xsl:variable name="ctsurn5"> <xsl:choose> <xsl:when test="ancestor::tei:text/@*:id"> <xsl:value-of select="ancestor::tei:text[1]/@*:id[1]"/> </xsl:when> <xsl:otherwise> <xsl:value-of select="$filename"/> </xsl:otherwise> </xsl:choose> <xsl:if test="preceding::tei:milestone[@unit='section'][1][@n]"> <xsl:text>:</xsl:text> <xsl:value-of select="$currentsection"/> </xsl:if> </xsl:variable> <xsl:copy> <xsl:apply-templates select="@*"/> <xsl:attribute name="ctsurn"><xsl:value-of select="$ctsurn"/></xsl:attribute> <xsl:attribute name="ctsurn5"><xsl:value-of select="$ctsurn5"/></xsl:attribute> <xsl:attribute name="ref"><xsl:value-of select="$ref"/></xsl:attribute> <xsl:attribute name="section"><xsl:value-of select="$currentsection"/></xsl:attribute> <xsl:apply-templates select="*|processing-instruction()|comment()|text()"/> </xsl:copy> </xsl:template> </xsl:stylesheet>
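The attribute values produced by the stylesheet above can be predicted by hand. The Python sketch below mirrors the logic of the currentsection, ref, ctsurn and ctsurn5 variables, under the assumption that the section number comes from the nearest preceding section milestone; the function names and sample values are hypothetical and the code is not part of TXM. The zero-padding to four digits presumably makes string sorting follow numeric order, which is an interpretation rather than something the stylesheet states.

```python
import re
from typing import Optional


def pad_section(n: Optional[str]) -> str:
    """Mirror the currentsection variable: '17a' becomes '0017a'."""
    if n is None:
        return "SN"  # no preceding section milestone carrying an n attribute
    match = re.fullmatch(r"(\d+)(.*)", n)
    if match is None:
        return n  # values without leading digits are kept unchanged
    return f"{int(match.group(1)):04d}{match.group(2)}"


def build_ref(title: str, n: Optional[str]) -> str:
    """Mirror the ref variable: text label, then ', s. ' and the padded section."""
    return title if n is None else f"{title}, s. {pad_section(n)}"


def build_ctsurn(text_id: str, n: Optional[str], padded: bool = False) -> str:
    """Mirror ctsurn (raw section) and ctsurn5 (padded section)."""
    if n is None:
        return text_id
    return f"{text_id}:{pad_section(n) if padded else n}"


if __name__ == "__main__":
    print(build_ref("Ion", "530a"))
    print(build_ctsurn("some-text-id", "2b"))
    print(build_ctsurn("some-text-id", "2b", padded=True))
```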
The following stylesheet, which builds the HTML edition, goes in the xsl/4-edition subdirectory (inside the corpus import directory, alongside the XML-TEI files of Plato's texts).
<?xml version="1.0"?> <xsl:stylesheet xmlns:edate="http://exslt.org/dates-and-times" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:tei="http://www.tei-c.org/ns/1.0" xmlns:txm="http://textometrie.org/1.0" xmlns:xs="http://www.w3.org/2001/XMLSchema" exclude-result-prefixes="#all" version="2.0"> <xsl:output method="xml" encoding="UTF-8" omit-xml-declaration="no" indent="no"/> <!-- <xsl:output method="xml" encoding="UTF-8" omit-xml-declaration="no" indent="no" doctype-public="-//W3C//DTD XHTML 1.0 Transitional//EN" doctype-system="http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"/> --> <xsl:strip-space elements="*"/> <xsl:param name="pagination-element">pb</xsl:param> <xsl:variable name="word-element"> <xsl:choose> <xsl:when test="//tei:c//txm:form">c</xsl:when> <xsl:otherwise>w</xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:variable name="page-number-adjust" as="xs:integer"> <xsl:choose> <xsl:when test="//tei:c//txm:form">1</xsl:when> <xsl:otherwise>2</xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:variable name="inputtype"> <xsl:choose> <xsl:when test="//tei:w//txm:form">xmltxm</xsl:when> <xsl:otherwise>xmlw</xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:variable name="filename"> <xsl:analyze-string select="document-uri(.)" regex="^(.*)/([^/]+)\.[^/.]+$"> <xsl:matching-substring> <xsl:value-of select="regex-group(2)"/> </xsl:matching-substring> </xsl:analyze-string> </xsl:variable> <xsl:template match="/"> <html> <head> <title><xsl:choose> <xsl:when test="//tei:text/@id"><xsl:value-of select="//tei:text[1]/@id"/></xsl:when> <xsl:otherwise><xsl:value-of select="$filename"/></xsl:otherwise> </xsl:choose></title> <meta http-equiv="Content-Type" content="text/html;charset=UTF-8"/> <link rel="stylesheet" media="all" type="text/css" href="css/txm.css" /> </head> <xsl:apply-templates select="descendant::tei:text"/> </html> </xsl:template> <xsl:template match="tei:text"> <body> <xsl:if test="$word-element='w'"> <a class="txm-page" title="1" 
next-word-id="w_0"/> <div class="metadata-page"> <h1><xsl:value-of select="@id"></xsl:value-of></h1> <br/> <table> <xsl:for-each select="@*"> <tr> <td><xsl:value-of select="name()"/></td> <td><xsl:value-of select="."/></td> </tr> </xsl:for-each> </table> </div> </xsl:if> <xsl:apply-templates/> </body> </xsl:template> <xsl:template match="*"> <xsl:choose> <xsl:when test="descendant::tei:p|descendant::tei:ab"> <div> <xsl:call-template name="addClass"/> <xsl:apply-templates/></div> <xsl:text>
</xsl:text> </xsl:when> <xsl:otherwise><span> <xsl:call-template name="addClass"/> <xsl:if test="self::tei:add[@del]"> <xsl:attribute name="title"><xsl:value-of select="@del"/></xsl:attribute> </xsl:if> <xsl:apply-templates/></span> <xsl:call-template name="spacing"/> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template match="@*|processing-instruction()|comment()"> <!--<xsl:copy/>--> </xsl:template> <!-- <xsl:template match="comment()"> <xsl:copy/> </xsl:template> --> <xsl:template match="text()"> <xsl:value-of select="normalize-space(.)"/> </xsl:template> <xsl:template name="addClass"> <xsl:attribute name="class"> <xsl:value-of select="local-name(.)"/> <xsl:if test="@type"><xsl:value-of select="concat('-',@type)"/></xsl:if> <xsl:if test="@subtype"><xsl:value-of select="concat('-',@subtype)"/></xsl:if> <xsl:if test="@rend"><xsl:value-of select="concat('-',@rend)"/></xsl:if> </xsl:attribute> </xsl:template> <xsl:template match="tei:p|tei:ab|tei:lg"> <p> <xsl:call-template name="addClass"/> <xsl:apply-templates/> </p> <xsl:text>
</xsl:text> </xsl:template> <xsl:template match="tei:head"> <h2> <xsl:call-template name="addClass"/> <xsl:apply-templates/> </h2> </xsl:template> <xsl:template match="tei:gap"> <span class="gap"> <xsl:if test="@quantity and @unit"> <xsl:attribute name="title"><xsl:value-of select="concat(@quantity,' ',@unit)"/></xsl:attribute> </xsl:if> <xsl:text>[...]</xsl:text> </span> <xsl:call-template name="spacing"/> </xsl:template> <xsl:template match="//tei:lb"> <xsl:variable name="lbcount"> <xsl:choose> <xsl:when test="ancestor::tei:ab"><xsl:number from="tei:ab" level="any"/></xsl:when> <xsl:when test="ancestor::tei:p"><xsl:number from="tei:p" level="any"/></xsl:when> <xsl:otherwise>999</xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:if test="@rend='hyphen(-)'"><span class="hyphen">-</span></xsl:if> <xsl:if test="@rend='hyphen(=)'"><span class="hyphen">=</span></xsl:if> <xsl:if test="ancestor::tei:w and not(contains(@rend,'hyphen'))"><span class="hyphen-added">-</span></xsl:if> <xsl:if test="not($lbcount=1) or preceding-sibling::node()[matches(.,'\S')]"><br/><xsl:text>
</xsl:text></xsl:if> <xsl:if test="@n and not(@rend='prose')"> <xsl:choose> <xsl:when test="matches(@n,'^[0-9]*[05]$')"> <!--<a title="{@n}" class="verseline" style="position:relative"> </a>--> <!--<span class="verseline"><span class="verselinenumber"><xsl:value-of select="@n"/></span></span>--> <span class="verselinenumber"><xsl:value-of select="@n"/></span> </xsl:when> <xsl:when test="matches(@n,'[^0-9]')"> <!--<a title="{@n}" class="verseline" style="position:relative"> </a>--> <!--<span class="verseline"><span class="verselinenumber"><xsl:value-of select="@n"/></span></span>--> <span class="verselinenumber"><xsl:value-of select="@n"/></span> </xsl:when> <xsl:otherwise> </xsl:otherwise> </xsl:choose> </xsl:if> </xsl:template> <!-- Page breaks --> <xsl:template match="//*[local-name()=$pagination-element]"> <xsl:variable name="next-word-position" as="xs:integer"> <xsl:choose> <xsl:when test="following::*[local-name()=$word-element]"> <xsl:value-of select="count(following::*[local-name()=$word-element][1]/preceding::*[local-name()=$word-element])"/> </xsl:when> <xsl:otherwise>0</xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:variable name="next-pb-position" as="xs:integer"> <xsl:choose> <xsl:when test="following::*[local-name()=$pagination-element]"> <xsl:value-of select="count(following::*[local-name()=$pagination-element][1]/preceding::*[local-name()=$word-element])"/> </xsl:when> <xsl:otherwise>999999999</xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:variable name="next-word-id"> <xsl:choose> <xsl:when test="$next-pb-position - $next-word-position = 999999999">w_0</xsl:when> <xsl:when test="$next-pb-position > $next-word-position"><xsl:value-of select="following::*[local-name()=$word-element][1]/@id"/></xsl:when> <xsl:otherwise>w_0</xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:variable name="editionpagetype"> <xsl:choose> <xsl:when test="ancestor::tei:ab">editionpageverse</xsl:when> <xsl:otherwise>editionpage</xsl:otherwise> </xsl:choose> 
</xsl:variable> <xsl:variable name="pagenumber"> <xsl:choose> <xsl:when test="@n"><xsl:value-of select="@n"/></xsl:when> <xsl:when test="@facs"><xsl:value-of select="substring-before(@facs,'.')"/></xsl:when> <xsl:otherwise>[NN]</xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:variable name="page_id"><xsl:value-of select="count(preceding::*[local-name()=$pagination-element])"/></xsl:variable> <xsl:if test="//tei:note[not(@place='inline') and not(matches(@type,'intern|auto'))][following::*[local-name()=$pagination-element][1][count(preceding::*[local-name()=$pagination-element]) = $page_id]]"> <xsl:text>
</xsl:text> <br/> <br/> <span style="display:block;border-top-style:solid;border-top-width:1px;border-top-color:gray;padding-top:5px"> <xsl:for-each select="//tei:note[not(@place='inline') and not(matches(@type,'intern|auto'))][following::*[local-name()=$pagination-element][1][count(preceding::*[local-name()=$pagination-element]) = $page_id]]"> <xsl:variable name="note_count"><xsl:value-of select="count(preceding::tei:note[matches(@type,'^(apparat|identification|zoologique)$')]) + 1"/></xsl:variable> <!--<p><xsl:value-of select="$note_count"/>. <a href="#noteref_{$note_count}" name="note_{$note_count}">[<xsl:value-of select="preceding::tei:cb[1]/@xml:id"/>, l. <xsl:value-of select="preceding::tei:lb[1]/@n"/>]</a><xsl:text> </xsl:text> <xsl:value-of select="."/></p>--> <span class="note"> <span style="position:absolute;left:-30px"><a href="#noteref_{$note_count}" name="note_{$note_count}"><xsl:value-of select="$note_count"/></a>. </span> <xsl:apply-templates mode="#current"/> </span> </xsl:for-each></span><xsl:text>
</xsl:text> </xsl:if> <xsl:text>
</xsl:text> <br/><xsl:text>
</xsl:text> <a class="txm-page" title="{count(preceding::*[local-name()=$pagination-element]) + $page-number-adjust}" next-word-id="{$next-word-id}"/> <span class="{$editionpagetype}"> &lt;<xsl:value-of select="$pagenumber"/>&gt; </span><br/><xsl:text>
</xsl:text> </xsl:template> <!-- Notes --> <xsl:template match="tei:note"> <!--<span style="color:violet"> [<b>Note :</b> <xsl:apply-templates/>] </span>--> <xsl:variable name="note_count"><xsl:value-of select="count(preceding::tei:note[matches(@type,'^(apparat|identification|zoologique)$')]) + 1"/></xsl:variable> <xsl:variable name="note_content"> <xsl:choose> <xsl:when test="descendant::txm:form"> <xsl:for-each select="descendant::txm:form"> <xsl:value-of select="."/> <xsl:if test="not(matches(following::txm:form[1],'^[.,\)]')) and not(matches(.,'^\S+[''’]$|^[‘\(]$'))"> <xsl:text> </xsl:text> </xsl:if> </xsl:for-each> </xsl:when> <xsl:otherwise><xsl:value-of select="normalize-space(.)"/></xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:choose> <xsl:when test="matches(@type,'intern|auto')"></xsl:when> <xsl:when test="@place='inline'"><span class="note"> (Note : <xsl:value-of select="$note_content"/>)</span></xsl:when> <xsl:when test="not(@place='inline') and not(matches(@type,'intern|auto'))"> <a title="{$note_content}" style="font-size:75%;position:relative;top:-5px" href="#note_{$note_count}" name="noteref_{$note_count}">[<xsl:value-of select="$note_count"/>]</a> </xsl:when> <xsl:otherwise><span class="noteref" title="{$note_content}">[•]</span></xsl:otherwise> </xsl:choose> <xsl:call-template name="spacing"/> </xsl:template> <!--<xsl:template match="tei:bibl"> <span class="noteref" title="{normalize-space(.)}">[•]</span> </xsl:template>--> <xsl:template match="a[@class='txmpageref']"> <xsl:copy-of select="."/> </xsl:template> <!--<xsl:template match="tei:note[@place='inline']"> <span class="noteinline"> <xsl:apply-templates/> </span> </xsl:template> --> <xsl:template match="//tei:w"><span class="w"> <xsl:choose> <xsl:when test="descendant::tei:c//txm:form"> <xsl:apply-templates select="descendant::tei:c"/> </xsl:when> <xsl:otherwise> <xsl:if test="@*:id"> <xsl:attribute name="id"><xsl:value-of select="@*:id"/></xsl:attribute> </xsl:if> <xsl:attribute 
name="title"> <xsl:for-each select="@*[not(matches(local-name(.),'id'))]"> <xsl:value-of select="concat(name(.),' : ',.,' ; ')"/> </xsl:for-each> <xsl:if test="descendant::txm:ana"> <xsl:for-each select="descendant::txm:ana"> <xsl:value-of select="concat(substring-after(@type,'#'),' : ',.,' ; ')"/> </xsl:for-each> </xsl:if> </xsl:attribute> <xsl:choose> <xsl:when test="descendant::txm:form"> <xsl:apply-templates select="txm:form"/> </xsl:when> <xsl:otherwise><xsl:apply-templates/></xsl:otherwise> </xsl:choose> </xsl:otherwise> </xsl:choose> </span><xsl:call-template name="spacing"/></xsl:template> <!-- <xsl:template match="//txm:form"> <xsl:apply-templates/> </xsl:template> --> <xsl:template name="spacing"> <xsl:choose> <xsl:when test="$inputtype='xmltxm'"> <xsl:call-template name="spacing-xmltxm"/> </xsl:when> <xsl:otherwise> <xsl:call-template name="spacing-xmlw"/> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template name="spacing-xmlw"> <xsl:choose> <xsl:when test="ancestor::tei:w"/> <xsl:when test="following::tei:w[1][matches(.,'^\s*[.,)\]]+\s*$')]"/> <xsl:when test="matches(.,'^\s*[(\[‘]+$|\w(''|’)\s*$')"></xsl:when> <xsl:when test="position()=last() and (ancestor::tei:choice or ancestor::tei:supplied[not(@rend='multi_s')])"></xsl:when> <xsl:when test="following-sibling::*[1][self::tei:note]"></xsl:when> <xsl:when test="following::tei:w[1][matches(.,'^\s*[:;!?]+\s*$')]"> <xsl:text> </xsl:text> </xsl:when> <xsl:otherwise> <xsl:text> </xsl:text> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template name="spacing-xmltxm"> <xsl:choose> <xsl:when test="ancestor::tei:w"/> <xsl:when test="following::tei:w[1][matches(descendant::txm:form[1],'^[.,)\]]+$')]"/> <xsl:when test="matches(descendant::txm:form[1],'^[(\[‘]+$|\w(''|’)$')"></xsl:when> <xsl:when test="position()=last() and (ancestor::tei:choice or ancestor::tei:supplied[not(@rend='multi_s')])"></xsl:when> <xsl:when test="following-sibling::*[1][self::tei:note]"></xsl:when> <xsl:when 
test="following::tei:w[1][matches(descendant::txm:form[1],'^[:;!?]+$')]"> <xsl:text> </xsl:text> </xsl:when> <xsl:otherwise> <xsl:text> </xsl:text> </xsl:otherwise> </xsl:choose> </xsl:template> </xsl:stylesheet>
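When writing CSS rules for the edition, it helps to know which class names the addClass template above will produce. The Python sketch below reproduces that naming rule only (element local name, then the type, subtype and rend values, each prefixed with a hyphen when the attribute is present). The function name css_class and the sample attribute values are hypothetical.

```python
from typing import Optional


def css_class(local_name: str,
              type_: Optional[str] = None,
              subtype: Optional[str] = None,
              rend: Optional[str] = None) -> str:
    """Mirror the addClass template: local name plus '-type', '-subtype', '-rend'."""
    parts = [local_name]
    # Order matters: type first, then subtype, then rend, as in the template.
    for value in (type_, subtype, rend):
        if value is not None:  # attribute present on the source element
            parts.append("-" + value)
    return "".join(parts)


if __name__ == "__main__":
    print(css_class("div", type_="textpart", subtype="section"))
    print(css_class("hi", rend="italic"))
    print(css_class("said"))
```

A CSS selector such as .hi-italic would then target the spans generated for hi elements carrying that rend value.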
The following stylesheet, which splits the HTML edition into one file per page, also goes in the xsl/4-edition subdirectory (inside the corpus import directory, alongside the XML-TEI files of Plato's texts).
<?xml version="1.0" encoding="utf-8"?> <xsl:stylesheet exclude-result-prefixes="#all" xmlns:xsl="http://www.w3.org/1999/XSL/Transform" xmlns:xs="http://www.w3.org/2001/XMLSchema" version="2.0"> <!-- This software is dual-licensed: 1. Distributed under a Creative Commons Attribution-ShareAlike 3.0 Unported License http://creativecommons.org/licenses/by-sa/3.0/ 2. http://www.opensource.org/licenses/BSD-2-Clause All rights reserved. Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met: * Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer. * Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution. This software is provided by the copyright holders and contributors "as is" and any express or implied warranties, including, but not limited to, the implied warranties of merchantability and fitness for a particular purpose are disclaimed. In no event shall the copyright holder or contributors be liable for any direct, indirect, incidental, special, exemplary, or consequential damages (including, but not limited to, procurement of substitute goods or services; loss of use, data, or profits; or business interruption) however caused and on any theory of liability, whether in contract, strict liability, or tort (including negligence or otherwise) arising in any way out of the use of this software, even if advised of the possibility of such damage. $Id$ This stylesheet is based on TEI processpb.xsl by Sebastian Rahtz available at https://github.com/TEIC/Stylesheets/blob/master/tools/processpb.xsl and is adapted by Alexei Lavrentiev to split an HTML edition for TXM platform. 
--> <xsl:output indent="no" method="html"/> <xsl:param name="css-name-txm">txm</xsl:param> <xsl:param name="css-name"><!--<xsl:value-of select="$current-corpus-name"/>-->perseus</xsl:param> <xsl:param name="edition-name">default</xsl:param> <xsl:param name="number-words-per-page">999999</xsl:param> <xsl:param name="pagination-element">a[@class='txm-page']</xsl:param> <xsl:param name="output-directory"><xsl:value-of select="concat($current-file-directory,'/',$edition-name)"/></xsl:param> <xsl:variable name="current-file-name"> <xsl:analyze-string select="document-uri(.)" regex="^(.*)/([^/]+)\.[^/]+$"> <xsl:matching-substring> <xsl:value-of select="regex-group(2)"/> </xsl:matching-substring> </xsl:analyze-string> </xsl:variable> <xsl:variable name="current-file-directory"> <xsl:analyze-string select="document-uri(.)" regex="^(.*)/([^/]+)\.[^/]+$"> <xsl:matching-substring> <xsl:value-of select="regex-group(1)"/> </xsl:matching-substring> </xsl:analyze-string> </xsl:variable> <xsl:variable name="current-corpus-name"> <xsl:analyze-string select="$current-file-directory" regex="^(.*)/([^/]+)$"> <xsl:matching-substring> <xsl:value-of select="regex-group(2)"/> </xsl:matching-substring> </xsl:analyze-string> </xsl:variable> <xsl:template match="html/body"> <xsl:variable name="pages"> <xsl:copy> <xsl:apply-templates select="@*"/> <xsl:apply-templates select="*|processing-instruction()|comment()|text()"/> </xsl:copy> </xsl:variable> <xsl:for-each select="$pages"> <xsl:apply-templates mode="pass2"/> </xsl:for-each> <!-- creating title page with metadata --> </xsl:template> <!-- first (recursive) pass. 
look for <pb> elements and group on them --> <xsl:template match="comment()|@*|processing-instruction()|text()"> <xsl:copy-of select="."/> </xsl:template> <xsl:template match="*"> <xsl:call-template name="checkpb"> <xsl:with-param name="eName" select="local-name()"/> </xsl:call-template> </xsl:template> <xsl:template match="a[@class='txm-page']"> <!-- <xsl:variable name="next-word-position" as="xs:integer"> <xsl:choose> <xsl:when test="following::span[@class='w']"> <xsl:value-of select="count(following::span[@class='w'][1]/preceding::span[@class='w'])"/> </xsl:when> <xsl:otherwise>20</xsl:otherwise> </xsl:choose> </xsl:variable> <xsl:variable name="next-pb-position" as="xs:integer"> <xsl:choose> <xsl:when test="following::a[@class='txm-page']"> <xsl:value-of select="count(following::a[@class='txm-page'][1]/preceding::span[@class='w'])"/> </xsl:when> <xsl:otherwise>999999999</xsl:otherwise> </xsl:choose> </xsl:variable> <!-\-<xsl:value-of select="count(following::a[@class='txm-page'][1]/preceding::a[@class='w'])"/>-\-> <xsl:variable name="next-word-id"> <xsl:choose> <xsl:when test="$next-pb-position - $next-word-position = 999999999"><!-\-w_0-\-><xsl:value-of select="concat($next-pb-position,' - ',$next-word-position)"/></xsl:when> <xsl:when test="$next-pb-position > $next-word-position"><xsl:value-of select="following::*:span[@class='w'][1]/@id"/></xsl:when> <xsl:otherwise><!-\- w_0 -\-><xsl:value-of select="concat($next-pb-position,' - ',$next-word-position)"/></xsl:otherwise> </xsl:choose> </xsl:variable>--> <!-- <a xmlns="http://www.w3.org/1999/xhtml"> --> <a> <xsl:copy-of select="@*"/> <!--<xsl:attribute name="next-word-id"><xsl:value-of select="$next-word-id"/></xsl:attribute>--> </a> </xsl:template> <xsl:template name="checkpb"> <xsl:param name="eName"/> <xsl:choose> <xsl:when test="not(.//a[@class='txm-page'])"> <xsl:copy-of select="."/> </xsl:when> <xsl:otherwise> <xsl:variable name="pass"> <xsl:call-template name="groupbypb"> <xsl:with-param name="Name" 
select="$eName"/> </xsl:call-template> </xsl:variable> <xsl:for-each select="$pass"> <xsl:apply-templates/> </xsl:for-each> </xsl:otherwise> </xsl:choose> </xsl:template> <xsl:template name="groupbypb"> <xsl:param name="Name"/> <xsl:for-each-group select="node()" group-starting-with="a[@class='txm-page']"> <xsl:choose> <xsl:when test="self::a[@class='txm-page']"> <xsl:copy-of select="."/> <xsl:element name="{$Name}"> <xsl:attribute name="rend">CONTINUED</xsl:attribute> <xsl:apply-templates select="current-group() except ."/> </xsl:element> </xsl:when> <xsl:otherwise> <xsl:element name="{$Name}"> <xsl:for-each select=".."> <xsl:copy-of select="@*"/> <xsl:apply-templates select="current-group()"/> </xsl:for-each> </xsl:element> </xsl:otherwise> </xsl:choose> </xsl:for-each-group> </xsl:template> <!-- second pass. group by <pb> (now all at top level) and wrap groups in <page> --> <xsl:template match="*" mode="pass2"> <xsl:copy> <xsl:apply-templates select="@*|*|processing-instruction()|comment()|text()" mode="pass2"/> </xsl:copy> </xsl:template> <xsl:template match="comment()|@*|processing-instruction()|text()" mode="pass2"> <xsl:copy-of select="."/> </xsl:template> <!-- <xsl:variable name="style"> <xsl:copy-of select="/html/head[1]/style[1]"></xsl:copy-of> </xsl:variable>--> <xsl:template match="*[a[@class='txm-page']]" mode="pass2" > <xsl:copy> <xsl:apply-templates select="@*"/> <xsl:for-each-group select="*" group-starting-with="a[@class='txm-page']"> <xsl:choose> <xsl:when test="self::a[@class='txm-page']"> <xsl:comment> Page <xsl:value-of select="@title"/> déplacée vers <xsl:value-of select="concat($output-directory,'/',$current-file-name,'_',@title,'.html')"/></xsl:comment> <xsl:result-document href="{$output-directory}/{$current-file-name}_{@title}.html"> <html> <head> <meta name="txm:first-word-id" content="{@next-word-id}"/> <title><xsl:value-of select="concat($current-file-name,', Page ',@title)"/></title> <meta http-equiv="Content-Type" 
content="text/html;charset=UTF-8"/> <link rel="stylesheet" media="all" type="text/css" href="css/{$css-name-txm}.css"/> <xsl:if test="matches($css-name,'\S')"><link rel="stylesheet" media="all" type="text/css" href="css/{$css-name}.css"/></xsl:if> <!--<xsl:copy-of select="$style"/>--> </head> <body> <div class="txmeditionpage"> <xsl:copy-of select="current-group() except ."/> </div> </body> </html> </xsl:result-document> </xsl:when> <xsl:otherwise> <xsl:copy-of select="current-group()"/> </xsl:otherwise> </xsl:choose> </xsl:for-each-group> </xsl:copy> </xsl:template> </xsl:stylesheet>
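To locate the page files written by the stylesheet above, the output path can be derived from the URI of the input document. The Python sketch below mirrors the current-file-name, current-file-directory and output-directory definitions, and follows the path given in the stylesheet's own comment (directory, edition name, then file name, underscore, page title and the .html extension). The function name page_file and the sample URI are hypothetical.

```python
import re

# Same pattern as in the current-file-name and current-file-directory variables.
URI_PATTERN = re.compile(r"^(.*)/([^/]+)\.[^/]+$")


def page_file(document_uri: str, page_title: str, edition_name: str = "default") -> str:
    """Return the path of the HTML file holding one page of the edition."""
    match = URI_PATTERN.match(document_uri)
    # In the stylesheet, a non-matching URI leaves both variables empty.
    directory = match.group(1) if match else ""
    base_name = match.group(2) if match else ""
    output_directory = f"{directory}/{edition_name}"
    return f"{output_directory}/{base_name}_{page_title}.html"


if __name__ == "__main__":
    # Hypothetical location of one source file of the corpus.
    print(page_file("file:/data/plato170720/tlg0059.tlg001.xml", "3"))
```

Only the last extension is removed from the file name, so a name containing several dots keeps all but the final one.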