TEI by Example
Module 2: The TEI Header
Ron Van den Branden, Edward Vanhoutte, Melissa Terras
Association for Literary and Linguistic Computing (ALLC)
Centre for Data, Culture and Society, University of Edinburgh, UK
Centre for Digital Humanities (CDH), University College London, UK
Centre for Computing in the Humanities (CCH), King’s College London, UK
Centre for Scholarly Editing and Document Studies (CTB), Royal Academy of Dutch Language and Literature, Belgium
Koningstraat 18, 9000 Gent, Belgium
ctb@kantl.be
Licensed under a Creative Commons Attribution ShareAlike 3.0 License
9 July 2010. TEI by Example. Edited by Edward Vanhoutte, Ron Van den Branden, and Melissa Terras.
TEI by Example offers a series of freely available online tutorials that walk individuals through the different stages of marking up a document in TEI (Text Encoding Initiative). Besides a general introduction to text encoding, step-by-step tutorial modules provide example-based introductions to eight different aspects of electronic text markup for the humanities. Each tutorial module is accompanied by a dedicated examples section, illustrating actual TEI encoding practice with real-life examples. The theory of the tutorial modules can be tested in interactive tests and exercises.
Language: en-GB. Revision history: technical revision; corrected significant typo (biblStruct for biblFull), removed ref around gi; release; corrected typos + examples; creation.
Module 2: The TEI Header
Desiderius Erasmus: Colloquia familiaria
This example features the TEI header for the transcription of Colloquia familiaria, a series of colloquia written by Desiderius Erasmus. They are encoded and made available by the Stoa Consortium, University of Kentucky.
This is an excellent example of a TEI header. The file description provides the minimally required information about the title and responsibilities of the electronic text, its publication, and its source. Editorial principles are documented in encodingDesc, which also has a statement about sampling decisions in samplingDesc (see section 2.3.2 The Sampling Declaration of the TEI Guidelines). It also contains a formal declaration of a reference system, for which it makes use of refState elements (see the section on the Milestone Method in the TEI Guidelines). Two classification systems are declared in classDecl: Library of Congress Subject Headings and Library of Congress Classification. The next header section, profileDesc, contains the actual classification of the text according to both systems, in textClass. This is a nice illustration of two classification strategies: using natural language keywords (keywords) or abstract classification codes (classCode). Also, the languages of the text are formally declared in langUsage. Finally, a complete revision history is available in revisionDesc.
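The two classification strategies can be sketched as follows. This is a minimal illustration in TEI P5 syntax, not the actual Stoa Consortium header: the xml:id values, the keyword, and the classification code are assumptions chosen for the example.

```xml
<encodingDesc>
  <classDecl>
    <!-- One taxonomy per classification system; the xml:id values are hypothetical -->
    <taxonomy xml:id="lcsh">
      <bibl>Library of Congress Subject Headings</bibl>
    </taxonomy>
    <taxonomy xml:id="lcc">
      <bibl>Library of Congress Classification</bibl>
    </taxonomy>
  </classDecl>
</encodingDesc>
<profileDesc>
  <textClass>
    <!-- Strategy 1: natural language keywords, pointing at the declared scheme -->
    <keywords scheme="#lcsh">
      <term>Dialogues, Latin</term> <!-- illustrative term, not taken from the source header -->
    </keywords>
    <!-- Strategy 2: an abstract classification code -->
    <classCode scheme="#lcc">PA</classCode> <!-- illustrative code only -->
  </textClass>
</profileDesc>
```

The scheme attribute on keywords and classCode is what ties each actual classification in profileDesc back to the system declared in classDecl.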
Thomas Wentworth Higginson: Letter of 7 November 1885
This example shows the TEI header of the digital edition of a letter of 7 November 1885 by the American minister and writer Thomas Wentworth Higginson, encoded and made available by the Lincoln Electronic Text Center of the University of Nebraska.
This TEI header provides detailed documentation about the electronic text in fileDesc. The title statement identifies not only the people responsible for transcription and markup, but also those responsible for the technical processing of the letters by means of stylesheets. The extent section still needs to be completed; of course, this can only be done after completion of the encoding. Notice the detailed statement of availability in availability. The source text in which this letter has been published is described using the biblFull element; notice how its sections reflect the actual file description in the TEI header of the electronic text (apart from the sourceDesc section). The notesStmt seems to be used to record some loose annotations about the source text.
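The parallel between biblFull and fileDesc can be seen in a skeleton like the one below. It is only a sketch in TEI P5 syntax: the bracketed placeholders stand for content that is not reproduced here, and the selection of subsections is an assumption, not a copy of the Higginson header.

```xml
<sourceDesc>
  <!-- biblFull reuses the fileDesc subsections to describe the source -->
  <biblFull>
    <titleStmt>
      <title>[title of the source publication]</title>
      <author>[author of the source]</author>
    </titleStmt>
    <extent>[extent of the source]</extent>
    <publicationStmt>
      <publisher>[publisher]</publisher>
      <pubPlace>[place of publication]</pubPlace>
      <date>[date of publication]</date>
    </publicationStmt>
    <notesStmt>
      <note>[loose annotation about the source text]</note>
    </notesStmt>
  </biblFull>
</sourceDesc>
```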
The encoding description section only contains a description of the editorial practice in editorialDecl. This is done in a prose paragraph. The header is concluded by a minimal revision description, recording only one change.
Christopher Marlowe: The Tragedie of Doctor Faustus (B text)
This example contains the TEI header of the digital edition of Christopher Marlowe’s The Tragedie of Doctor Faustus (B text), encoded and made available by the Perseus Digital Library.
This TEI header provides decent descriptions of the publication details of the electronic text (publicationStmt), and the languages occurring in the text (langUsage). A reference system is declared in the encodingDesc section of the header, using refState elements (see the section on the Milestone Method in the TEI Guidelines).
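A reference system declared with refState elements could look like the sketch below. The units and the delimiter are hypothetical and do not reproduce the declaration in the Perseus file; each refState names the milestone unit that supplies one component of a canonical reference.

```xml
<encodingDesc>
  <refsDecl>
    <!-- Hypothetical two-level scheme: references of the form "scene.line" -->
    <refState unit="scene" delim="."/>
    <refState unit="line"/>
  </refsDecl>
</encodingDesc>
```

Under these assumptions, a reference is resolved by locating the milestone elements in the text whose unit and n values match each component in turn.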
The revision description is interesting both in a positive and a negative way. It clearly contains a detailed list of the changes. The list seems to be generated by an automated versioning system, which allows one to keep complete track of a file’s historical states, and document changes with log messages. Integrating automated revision control in the revisionDesc section of the TEI header is an interesting idea, as it combines processability and expressiveness. However, on the encoding level, this integration could be improved. In this case, a single change element is (ab)used to record the complete revision history. If the output of the automated version control system were formatted as distinct change elements per revision (either directly, or via a post-processing step), the information would comply much better with the semantics of the TEI header.
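The suggested improvement amounts to something like the following sketch, in which every revision gets its own change element. The dates are hypothetical and the bracketed text stands for the log messages of the versioning system; none of it is taken from the Perseus file.

```xml
<revisionDesc>
  <!-- One change element per revision, most recent first; dates are hypothetical -->
  <change when="2005-03-14">[log message of the third revision]</change>
  <change when="2005-02-01">[log message of the second revision]</change>
  <change when="2005-01-10">[log message of the first revision]</change>
</revisionDesc>
```

With this structure, each revision can be addressed, dated, and processed separately, which a single change element wrapping the whole log does not allow.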
One essential point of critique concerns the missing description of the source document in sourceDesc. In this case, the title and author of the source work (which can be gathered from the information in the titleStmt subsection) still provide cues to its origin, but this could be much harder for less well-known texts. It is reasonable to suppose that the source texts of the files in the Perseus Digital Library are documented externally, but then the TEI header sections of these files should at least contain a pointer to these resources.
William Shakespeare: Sonnet 17
The following example illustrates the TEI header for a sonnet by William Shakespeare, containing a detailed metrical analysis of the poem. Both the electronic text and its source are bibliographically described in the fileDesc section. The text encoding process is described in encodingDesc, providing details about the encoding project (projectDesc), the editorial policy (editorialDecl), and the system used to analyse the metre of the poem (metDecl). Notice how the editorialDecl subsection had to be repeated, as it documents both features that can be encoded in a TEI category (segmentation and interpretation), and features for which no such TEI labels are available (p). The standard TEI scheme does not allow both systems (formal and informal) to be mixed, hence the repetition of the editorialDecl section. The same goes for the metDecl sections: as both a formal (metSym) and an informal (p) description are provided for the metrical system, repeating the metDecl element was the easiest solution. Of course, this could have been addressed as well by adapting the TEI schema.
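The repetition described above can be sketched as follows, in TEI P5 syntax. The bracketed placeholders and the metrical symbols are assumptions for illustration and are not taken from the edition itself.

```xml
<encodingDesc>
  <projectDesc>
    <p>[aim of the encoding project]</p>
  </projectDesc>
  <!-- Formal description: dedicated TEI elements only -->
  <editorialDecl>
    <segmentation>
      <p>[how the text was segmented]</p>
    </segmentation>
    <interpretation>
      <p>[what analytic information was added]</p>
    </interpretation>
  </editorialDecl>
  <!-- Informal description: prose only, hence a second editorialDecl -->
  <editorialDecl>
    <p>[editorial features without a dedicated TEI element]</p>
  </editorialDecl>
  <!-- Formal metrical notation; the symbols are hypothetical -->
  <metDecl type="met">
    <metSym value="+">stressed syllable</metSym>
    <metSym value="-">unstressed syllable</metSym>
    <metSym value="|">foot boundary</metSym>
  </metDecl>
  <!-- Informal prose account of the same system -->
  <metDecl>
    <p>[prose description of the metrical analysis]</p>
  </metDecl>
</encodingDesc>
```

Each element holds either the formal or the prose description, never both, which is why two instances are needed in each case.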
Walt Whitman: After the Argument
This example contains the TEI header of the digital edition of a manuscript draft of After the Argument, a poem by Walt Whitman. It was encoded and made available by the Walt Whitman Archive.
This TEI header contains a detailed description of the electronic text in fileDesc. Apart from the required subsections, the edition of the electronic text is identified briefly in editionStmt. The notesStmt element contains a general remark about the dating of the manuscript.
Besides the file description, the header contains a detailed account of the file’s history in revisionDesc.
For the header of a manuscript transcription, however, one would have expected at least an encodingDesc, documenting how the electronic version relates to the source text. When this text is seen in isolation, this header falls short in explaining the editorial choices (which are referred to, however, in the revisionDesc). Of course, this text probably features in the wider context of the Walt Whitman Archive, where uniform encoding practices were used for all texts. Still, without repeating boilerplate information in each text of the archive, it would have made sense to provide an editorialDecl section with at least pointers to the archive's external documentation of these practices. Furthermore, as the transcription is fairly detailed in the recording of editorial phenomena (additions, deletions, substitutions), identification of the different document hands in profileDesc could have made sense.
(Of course, these are only minor remarks, relative to the quality of the surrounding documentation of the archive in which this text is embedded. Yet, even if such external documentation exists, it makes sense to provide pointers in the document.)
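What such additions might look like is sketched below. The URL, the xml:id values, and the bracketed descriptions are all hypothetical placeholders and do not reflect the actual documentation or hands of the Walt Whitman Archive.

```xml
<encodingDesc>
  <editorialDecl>
    <!-- Pointer to external documentation; the target URL is a placeholder -->
    <p>The transcription follows the encoding guidelines of the archive; see
      <ref target="https://example.org/encoding-guidelines">[title of the external documentation]</ref>.</p>
  </editorialDecl>
</encodingDesc>
<profileDesc>
  <handNotes>
    <!-- One handNote per hand distinguished in the transcription -->
    <handNote xml:id="h1" scope="major">[description of the main hand]</handNote>
    <handNote xml:id="h2" scope="minor">[description of a revising hand]</handNote>
  </handNotes>
</profileDesc>
```

Additions and deletions in the transcription could then point to the relevant hand with a hand attribute such as hand="#h2".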
Oscar Wilde: The Importance of Being Earnest
This example contains the TEI header for an electronic edition of Oscar Wilde’s The Importance of Being Earnest, encoded and made available by Corpus of Electronic Texts (CELT), a project of University College, Cork.
This is an excellent TEI header example, featuring quality descriptions of the electronic text (fileDesc), its relation to the source text (encodingDesc), the context in which it came about (profileDesc), and a revision history (revisionDesc).
An outstanding feature of this example is the level of detail for the bibliographic description of the source text, in sourceDesc. It contains a complete bibliography, in three sections: select editions, select bibliography, and the edition used in the digital edition. The former two categories consist of bibliographic lists, with a listBibl element grouping the separate bibl elements. The actual edition used for the electronic text is described in detail with a biblStruct element.
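The three-part structure can be sketched as follows. The biblStruct details are taken from the bibliography entry for this example; the bracketed bibl contents are placeholders, and the exact arrangement of the CELT header may differ.

```xml
<sourceDesc>
  <listBibl>
    <head>Select editions</head>
    <bibl>[loosely structured reference to an edition]</bibl>
  </listBibl>
  <listBibl>
    <head>Select bibliography</head>
    <bibl>[loosely structured reference to a secondary work]</bibl>
  </listBibl>
  <listBibl>
    <head>The edition used in the digital edition</head>
    <!-- biblStruct separates the play (analytic) from the volume containing it (monogr) -->
    <biblStruct>
      <analytic>
        <author>Oscar Wilde</author>
        <title level="a">The Importance of Being Earnest</title>
      </analytic>
      <monogr>
        <title level="m">Plays, Prose Writings and Poems</title>
        <imprint>
          <pubPlace>London</pubPlace>
          <publisher>Everyman</publisher>
          <date when="1930">1930</date>
        </imprint>
      </monogr>
    </biblStruct>
  </listBibl>
</sourceDesc>
```

The contrast between bibl and biblStruct is the point here: bibl allows free ordering of its parts, while biblStruct imposes a fixed structure that is easier to process.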
Erasmus, Desiderius. 1867-1872. Desiderii Erasmi Roterodami colloquia familiaria. Lipsiae: sumptibus Ottonis Holtze. Encoded and made available by the Stoa Consortium, University of Kentucky.
Higginson, Thomas Wentworth. 1885. Letter of November 7, 1885. Encoded and made available by the Lincoln Electronic Text Center of the University of Nebraska.
Islam, Mubina. 2004. A Selection of Sonnets: electronic edition encoded in XML with a TEI DTD. Unpublished Master’s Dissertation, London: University College London.
Marlowe, Christopher. 1616. The Tragedie of Doctor Faustus. Encoded and made available by the Perseus Digital Library. Available online.
Shakespeare, William. 1978. The Complete Works of William Shakespeare. Edited by Alexander, Peter. London: Collins.
Whitman, Walt. 1890. After the Argument. Manuscript encoded and made available by the Walt Whitman Archive.
Wilde, Oscar. 1930. The Importance of Being Earnest. In: Plays, Prose Writings and Poems. London: Everyman. Encoded and made available by CELT: Corpus of Electronic Texts: a project of University College, Cork. Available online.