Module 2: The TEI Header

2. Exploring a Minimal TEI Header

Let's start this section with a mental exercise (though you are free to make it as physical as you want). Before the holidays, your partner presents you with a short list of book titles she would like to read. Since it is you who took a day off early, you take this wish list and set out to the public library. Most of the titles are easy to find, except for the somewhat more cryptic entry:
Balzac or Zola (don't know exactly) ? something about a magic donkey (in English please!!!) -- sorry, dear, you're the best!
There are many ways you could approach this problem:
  • pull all works by Zola and Balzac from the library shelves and try to find the one(s) dealing with magic donkeys
  • have a look at the available titles in the 'translated literature' section
  • try to google for more information first
Depending on how much you value your free time, you will probably start, or end up, by asking the librarian, who will consult either her own knowledge of world literature or a catalogue of library records. Or, if you live in the 21st century, you will probably move to one of the library's computer terminals, search for 'Balzac' or 'Zola' in the author field, narrow the search to 'English' translations, and try 'donkey' in the title field. If you lived in the 22nd century, the search robot could probably analyse your search query, propose alternatives for unsuccessful search terms, and even suggest that you try 'ass' instead of 'donkey'. For the time being, however, you'll have to depend on your own (or your librarian's) world knowledge, patience, and/or creativity in order to find the following information:
It is the last field of this library catalogue record that will guide you to the right library shelf and a superb holiday. This exercise illustrates, in simplified form, why primary information about bibliographic objects is abstracted into fixed categories. In the analogue world, these categories were recorded on printed catalogue cards; nowadays they are entered as digital records in library catalogue databases. Together, these fixed categories make up an 'identity card' of a literary work.
The TEI Guidelines consider such a virtual 'identity card' an essential part of each TEI document. It must be encoded within a <teiHeader> element, before the actual text contents in the <text> part. The 'ID categories' of the TEI Header are the subject of this tutorial module. As a trade-off between exhaustiveness and usability, the TEI Guidelines define a wide range of specific TEI Header elements, only a few of which are mandatory. A minimal TEI header for the above work would look as follows:
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>The Wild Ass's Skin: an electronic edition</title>
    </titleStmt>
    <publicationStmt>
      <p>Published as an example for the header module of TBE.</p>
    </publicationStmt>
    <sourceDesc>
      <p>Honoré de Balzac (1906). The Wild Ass's Skin.</p>
    </sourceDesc>
  </fileDesc>
</teiHeader>
This example shows that a <teiHeader> element must contain a <fileDesc> (file description) element, which describes the electronic file. To be complete, the file description must contain at least three subsections, in this order:
  • <titleStmt>: a title statement about the electronic text
  • <publicationStmt>: information on the publication of the electronic text
  • <sourceDesc>: a bibliographic description of the source for the electronic text

Note:

In a minimal <teiHeader>, only a description of the electronic text must be given in the <fileDesc> element. Such a minimal file description must consist of <titleStmt>, <publicationStmt>, and <sourceDesc> sections. Moreover, they must occur in this order.
The <titleStmt> element must minimally contain a title for the electronic text. Depending on the nature of this text, the title may repeat the original's title, followed by a qualifier such as 'an electronic version', 'an electronic transcription', or 'an electronic edition'. Details about the publication and source of the electronic text, in <publicationStmt> and <sourceDesc> respectively, may consist of informal prose in loose paragraphs. More specialised elements can be used as well. These are covered in detail in the next section of this tutorial.
You will have noticed that this minimal example of a TEI header does quite a poor job of providing an identity card for this novel, compared to the library record example above. However, two things are worth noting:
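As a rough preview of what more specialised elements could look like, the prose paragraphs of the minimal header might be replaced by structured elements. This is only a sketch: <publisher> and <bibl> (with <author>, <title>, and <date>) are standard TEI elements, but treating TBE as the publisher and splitting the source citation into these fields are assumptions made for illustration, not part of the original example.

```xml
<teiHeader>
  <fileDesc>
    <titleStmt>
      <title>The Wild Ass's Skin: an electronic edition</title>
    </titleStmt>
    <publicationStmt>
      <!-- assumption: TBE is treated as the publisher, for illustration only -->
      <publisher>TBE</publisher>
    </publicationStmt>
    <sourceDesc>
      <!-- the same source as in the prose version, split into separate fields -->
      <bibl>
        <author>Honoré de Balzac</author>
        <title>The Wild Ass's Skin</title>
        <date>1906</date>
      </bibl>
    </sourceDesc>
  </fileDesc>
</teiHeader>
```

The three mandatory subsections and their order are unchanged. Only the content inside <publicationStmt> and <sourceDesc> has moved from free prose to named fields, which brings the header a little closer to a library record.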
  • the TEI header is an integral part of any TEI document, and must precede the <text> element with its actual content
  • the TEI header minimally documents aspects of the title, publication, and source text of the electronic text
Of course, the TEI header allows for much more descriptive sophistication. The most important sections of the TEI header are treated in the next sections of this tutorial.
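The placement rule noted above (header first, then <text>) can be sketched as a complete document skeleton. The <TEI> root element and its namespace follow TEI P5 conventions. The paragraph inside <body> is a placeholder assumed for illustration, not text from the novel.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- the header comes first: the 'identity card' of the electronic text -->
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>The Wild Ass's Skin: an electronic edition</title>
      </titleStmt>
      <publicationStmt>
        <p>Published as an example for the header module of TBE.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Honoré de Balzac (1906). The Wild Ass's Skin.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <!-- the actual text follows the header -->
  <text>
    <body>
      <p>(placeholder: the transcribed text goes here)</p>
    </body>
  </text>
</TEI>
```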

Summary

The TEI header contains meta-information about the electronic text, and is considered an integral part of it. Therefore, the <teiHeader> element must precede the <text> part of any TEI text, documenting at least some aspects of the electronic text in a <fileDesc> element. A file description minimally contains information about the title of the electronic text in <titleStmt>, about its publication in <publicationStmt>, and bibliographic information about the source document from which it is derived in <sourceDesc>.