Module 2: The TEI Header
1. Desiderius Erasmus: Colloquia familiaria
This example features the TEI header for the transcription of Colloquia familiaria, a series of colloquia written by Desiderius Erasmus. They are encoded and made available by the Stoa Consortium, University of Kentucky.
This is an excellent example of a TEI header. The file description provides the minimally required sections, with information about the title and responsibilities of the electronic text, its publication, and its source. Editorial principles are documented in <encodingDesc>, which also has a statement about sampling decisions in <samplingDesc> (see section 2.3.2 The Sampling Declaration of the TEI Guidelines). It also contains a formal declaration of a reference system, for which it makes use of <refState> elements (see section 2.3.5.3 Milestone Method of the TEI Guidelines). Two classification systems are declared in <classDecl>: Library of Congress Subject Headings and Library of Congress Classification. The next header section, <profileDesc>, contains the actual classification of the text according to both systems, in <textClass>. This is a nice illustration of two classification strategies: using natural language keywords (<keywords>) or abstract classification codes (<classCode>). The languages of the text are also formally declared in <langUsage>. Finally, a complete revision history is available in <revisionDesc>.
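The two classification strategies can be sketched as follows. This is a minimal illustration, not a copy of the Stoa header: the xml:id values and the bracketed placeholders are assumptions made for this sketch, and the required <fileDesc> is left out for brevity.

```xml
<teiHeader>
  <!-- <fileDesc> omitted for brevity; it is required in a real header -->
  <encodingDesc>
    <classDecl>
      <!-- xml:id values are hypothetical labels chosen for this sketch -->
      <taxonomy xml:id="LCSH">
        <bibl>Library of Congress Subject Headings</bibl>
      </taxonomy>
      <taxonomy xml:id="LCC">
        <bibl>Library of Congress Classification</bibl>
      </taxonomy>
    </classDecl>
  </encodingDesc>
  <profileDesc>
    <textClass>
      <!-- natural language keywords, pointing to the first taxonomy -->
      <keywords scheme="#LCSH">
        <term>[subject heading]</term>
      </keywords>
      <!-- abstract classification code, pointing to the second taxonomy -->
      <classCode scheme="#LCC">[classification code]</classCode>
    </textClass>
  </profileDesc>
</teiHeader>
```

The scheme attributes tie each classification in <textClass> back to the system declared in <classDecl>, so a processor can tell which vocabulary a keyword or code belongs to.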
2. Thomas Wentworth Higginson: “Letter of 7 November 1885”
This example shows the TEI header of the digital edition of a letter of 7 November 1885 by the American minister and writer Thomas Wentworth Higginson, encoded and made available by the Lincoln Electronic Text Center of the University of Nebraska.
This TEI header provides detailed documentation about the electronic text in <fileDesc>. The title statement identifies not only the people responsible for transcription and markup, but also those responsible for the technical processing of the letters by means of stylesheets. The <extent> section still needs to be completed; of course, this can only be done once the encoding is finished. Notice the detailed statement of availability in <availability>. The source text in which this letter has been published is described using the <biblFull> element; notice how its sections mirror those of the file description in the TEI header of the electronic text (apart from the <sourceDesc> section). The <notesStmt> seems to be used to record some loose annotations about the source text.
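A <biblFull> description of a source could look roughly like the sketch below. The bracketed values are placeholders and do not reproduce the content of the Higginson header; the point is only to show how the subsections of <biblFull> parallel those of <fileDesc>.

```xml
<sourceDesc>
  <biblFull>
    <!-- same subsections as <fileDesc>, here describing the printed source -->
    <titleStmt>
      <title>[title of the printed source]</title>
      <author>[author of the source]</author>
    </titleStmt>
    <extent>[size of the source]</extent>
    <publicationStmt>
      <publisher>[publisher]</publisher>
      <pubPlace>[place of publication]</pubPlace>
      <date>[date of publication]</date>
    </publicationStmt>
    <notesStmt>
      <note>[free annotation about the source]</note>
    </notesStmt>
  </biblFull>
</sourceDesc>
```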
The encoding description section only contains a description of the editorial practice in <editorialDecl>. This is done in a prose paragraph. The header is concluded by a minimal revision description, recording only one change.
3. Christopher Marlowe: The Tragedie of Doctor Faustus (B text)
This example contains the TEI header of the digital edition of Christopher Marlowe’s The Tragedie of Doctor Faustus (B text), encoded and made available by the Perseus Digital Library.
This TEI header provides decent descriptions of the publication details of the electronic text (<publicationStmt>), and the languages occurring in the text (<langUsage>). A reference system is declared in the <encodingDesc> section of the header, using <refState> elements (see section 2.3.5.3 Milestone Method of the TEI Guidelines).
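A reference system declared with the milestone method might be sketched as below. The units ("page" and "line"), the delimiter, and the length are assumptions for illustration only; they are not taken from the Perseus header.

```xml
<encodingDesc>
  <refsDecl>
    <!-- each <refState> names one component of a canonical reference;
         the units, delimiter and length below are hypothetical -->
    <refState unit="page" delim="."/>
    <refState unit="line" length="4"/>
  </refsDecl>
</encodingDesc>
```

Under these assumptions, a reference would consist of a page number, a full stop, and a line number of up to four characters, each resolved against milestone elements in the text.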
The revision description is interesting in both a positive and a negative way. It clearly contains a detailed list of the changes. The list seems to be generated by an automated versioning system, which makes it possible to keep complete track of a file’s historical states and to document changes with log messages. Integrating automated revision control in the <revisionDesc> section of the TEI header is an interesting idea, as it combines processability and expressiveness. However, on the encoding level, this integration could be improved. In this case, a single <change> element is (ab)used to record the complete revision history. If the output of the automated version control system were formatted as a distinct <change> element per revision (either directly, or via a post-processing step), the information would comply much better with the semantics of the TEI header.
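The suggested improvement can be sketched as follows. The revision numbers, dates, and log messages are invented placeholders, not entries from the Perseus file; they only show the shape of one <change> element per revision.

```xml
<revisionDesc>
  <!-- one <change> per revision, most recent first;
       all values below are hypothetical -->
  <change n="1.2" when="2000-02-01">[log message of the second revision]</change>
  <change n="1.1" when="2000-01-02">[log message of the first revision]</change>
</revisionDesc>
```

With the revision number in the n attribute and the date in the when attribute, each revision can be selected and sorted individually, which is not possible when the whole log sits inside one element.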
One essential point of criticism concerns the missing description of the source document in <sourceDesc>. In this case, the title and author of the source work (which can be gathered from the information in the <titleStmt> subsection) still provide cues to its origin, but this could be much harder for lesser-known texts. It is reasonable to suppose that the source texts of the files in the Perseus Digital Library are documented externally, but then the TEI headers of these files should at least contain a pointer to these resources.
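A source description that at least points to external documentation might look like the sketch below. The target URL is a made-up placeholder (example.org), since the location of any such documentation is not known here.

```xml
<sourceDesc>
  <bibl>
    <author>Christopher Marlowe</author>
    <title>The Tragedie of Doctor Faustus (B text)</title>
    <!-- hypothetical URL standing in for externally maintained documentation -->
    <ref target="https://example.org/source-documentation">Description of the source edition</ref>
  </bibl>
</sourceDesc>
```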
4. William Shakespeare: “Sonnet 17”
The following example illustrates the TEI header for a sonnet by William Shakespeare, containing a detailed metrical analysis of the poem. Both the electronic text and its source are bibliographically described in the <fileDesc> section. The text encoding process is described in <encodingDesc>, providing details about the encoding project (<projectDesc>), the editorial policy (<editorialDecl>), and the system used to analyse the metre of the poem (<metDecl>). Notice how the <editorialDecl> subsection had to be repeated, as it documents both features that can be encoded in a TEI category (<segmentation> and <interpretation>), and features for which no such TEI labels are available (<p>). The standard TEI scheme does not allow both systems (formal and informal) to be mixed, hence the repetition of the <editorialDecl> section. The same goes for the <metDecl> sections: as both a formal (<metSym>) and an informal (<p>) description are provided for the metrical system, repeating the <metDecl> element was the easiest solution. Of course, this could also have been addressed by adapting the TEI schema.
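The repetition described above can be sketched roughly as follows. The metrical symbols and their glosses are assumptions chosen for illustration, and the bracketed text stands in for the actual prose of the header.

```xml
<encodingDesc>
  <projectDesc>
    <p>[aims of the encoding project]</p>
  </projectDesc>
  <!-- first <editorialDecl>: practices that have a dedicated TEI element -->
  <editorialDecl>
    <segmentation>
      <p>[how the text was segmented]</p>
    </segmentation>
    <interpretation>
      <p>[what interpretive information was added]</p>
    </interpretation>
  </editorialDecl>
  <!-- second <editorialDecl>: practices described in free prose -->
  <editorialDecl>
    <p>[other editorial practices]</p>
  </editorialDecl>
  <!-- formal declaration of the metrical notation; symbols are hypothetical -->
  <metDecl>
    <metSym value="+">stressed syllable</metSym>
    <metSym value="-">unstressed syllable</metSym>
    <metSym value="|">foot boundary</metSym>
  </metDecl>
  <!-- informal prose description of the same system -->
  <metDecl>
    <p>[prose explanation of the metrical analysis]</p>
  </metDecl>
</encodingDesc>
```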
5. Walt Whitman: “After the Argument”
This example contains the TEI header of the digital edition of a manuscript draft of “After the Argument,” a poem by Walt Whitman. It was encoded and made available by the Walt Whitman Archive.
This TEI header contains a detailed description of the electronic text in <fileDesc>. Apart from the required subsections, the edition of the electronic text is identified briefly in <editionStmt>. The <notesStmt> element contains a general remark about the dating of the manuscript.
Besides the file description, the header contains a detailed account of the file’s history in <revisionDesc>.
As the header of a manuscript transcription, however, it could have been expected to contain at least an <encodingDesc> documenting how the electronic version relates to the source text. Seen in isolation, this header falls short in explaining the editorial choices (which are referred to, however, in the <revisionDesc>). Of course, this text probably features in the wider context of the Walt Whitman Archive, where uniform encoding practices were used for all texts. Still, without repeating boilerplate information in each text of the archive, it would have made sense to provide an <editorialDecl> section with at least pointers to the external documentation of these practices, available at https://www.whitmanarchive.org/about/editorial.html and https://www.whitmanarchive.org/mediawiki/index.php/Whitman_Encoding_Guidelines. Furthermore, as the transcription records editorial phenomena (additions, deletions, substitutions) in considerable detail, it could have made sense to identify the different document hands in <profileDesc>.
(Of course, these are only minor remarks, relative to the quality of the surrounding documentation of the archive in which this text is embedded. Yet, even if such external documentation exists, it makes sense to provide pointers in the document.)
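Both suggestions could be realised along the lines of the sketch below. The wording of the paragraph, the xml:id values, and the hand descriptions are assumptions made for illustration; only the two URLs come from the Walt Whitman Archive itself.

```xml
<teiHeader>
  <!-- <fileDesc> omitted for brevity; it is required in a real header -->
  <encodingDesc>
    <editorialDecl>
      <p>The editorial practices are documented in the
        <ref target="https://www.whitmanarchive.org/about/editorial.html">editorial policy statement</ref>
        and the
        <ref target="https://www.whitmanarchive.org/mediawiki/index.php/Whitman_Encoding_Guidelines">encoding guidelines</ref>
        of the Walt Whitman Archive.</p>
    </editorialDecl>
  </encodingDesc>
  <profileDesc>
    <handNotes>
      <!-- hypothetical identifiers and descriptions -->
      <handNote xml:id="h1">[description of the main hand]</handNote>
      <handNote xml:id="h2">[description of a revising hand]</handNote>
    </handNotes>
  </profileDesc>
</teiHeader>
```

Additions and deletions in the transcription could then refer to these hands through their identifiers, without repeating the descriptions in every file.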
6. Oscar Wilde: The Importance of Being Earnest
This example contains the TEI header for an electronic edition of Oscar Wilde’s The Importance of Being Earnest, encoded and made available by the Corpus of Electronic Texts (CELT), a project of University College, Cork.
This is an excellent TEI header example, featuring quality descriptions of the electronic text (<fileDesc>), its relation to the source text (<encodingDesc>), the context in which it came about (<profileDesc>), and a revision history (<revisionDesc>).
An outstanding feature of this example is the level of detail of the bibliographic description of the source text in <sourceDesc>. It contains a complete bibliography, in three sections: “select editions,” “select bibliography,” and “the edition used in the digital edition.” The first two categories consist of bibliographic lists, with a <listBibl> element grouping the separate <bibl> elements. The actual edition used for the electronic text is described in detail with a <biblStruct> element.
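The three-part structure can be sketched as follows. This is an approximation, not the CELT header itself: the bracketed entries are placeholders, the grouping of the <biblStruct> inside a third <listBibl> is an assumption, and the details of the edition are taken from the bibliography of this module.

```xml
<sourceDesc>
  <listBibl>
    <head>Select editions</head>
    <bibl>[first edition entry]</bibl>
    <bibl>[second edition entry]</bibl>
  </listBibl>
  <listBibl>
    <head>Select bibliography</head>
    <bibl>[first secondary source]</bibl>
  </listBibl>
  <listBibl>
    <head>The edition used in the digital edition</head>
    <!-- structured description: the play (analytic) within the volume (monogr) -->
    <biblStruct>
      <analytic>
        <author>Oscar Wilde</author>
        <title level="a">The Importance of Being Earnest</title>
      </analytic>
      <monogr>
        <title level="m">Plays, Prose Writings and Poems</title>
        <imprint>
          <pubPlace>London</pubPlace>
          <publisher>Everyman</publisher>
          <date>1930</date>
        </imprint>
      </monogr>
    </biblStruct>
  </listBibl>
</sourceDesc>
```

Loosely structured <bibl> entries suit the reading lists, while <biblStruct> enforces a fixed order of components for the one edition whose details matter most.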
Bibliography
- Erasmus, Desiderius. 1867-1872. Desiderii Erasmi Roterodami colloquia familiaria. Lipsiae: sumptibus Ottonis Holtze. Encoded and made available by the Stoa Consortium, University of Kentucky at https://web.archive.org/web/20160220004338/https://www.stoa.org/hopper/text.jsp?doc=Stoa:text:2003.02.0006.
- Higginson, Thomas Wentworth. 1885. “Letter of November 7, 1885.” Encoded and made available by the Lincoln Electronic Text Center of the University of Nebraska at https://higginson.unl.edu/letters/LC1885k07.html.
- Islam, Mubina. 2004. “A Selection of Sonnets: electronic edition encoded in XML with a TEI DTD.” Unpublished Master’s Dissertation, London: University College London.
- Marlowe, Christopher. 1616. The Tragedie of Doctor Faustus. Encoded and made available by the Perseus Digital Library. Available online at https://www.perseus.tufts.edu/hopper/text?doc=Perseus:text:1999.03.0011.
- Shakespeare, William. 1978. The Complete Works of William Shakespeare. Edited by Peter Alexander. London: Collins.
- Whitman, Walt. 1890. “After the Argument.” Manuscript encoded and made available by the Walt Whitman Archive at https://www.whitmanarchive.org/manuscripts/transcriptions/loc.00001.html.
- Wilde, Oscar. 1930. “The Importance of Being Earnest.” In: Plays, Prose Writings and Poems. London: Everyman. Encoded and made available by CELT: Corpus of Electronic Texts: a project of University College, Cork. Available online at https://www.ucc.ie/celt/published/E850003-002/.