Module 2: The TEI Header

2. Exploring a Minimal TEI Header

Let’s start this section with a mental exercise (though you are free to make it as physical as you want). Before the holidays, your partner presents you with a short list of book titles she would like to read. Since it is you who took a day off early, you take this wish list and set out to the public library. Most of the titles are easy to find, except for the somewhat more cryptic entry:

Balzac or Zola (don't know exactly) ? something about a magic donkey (in English please!!!) -- sorry, dear, you're the best!

There are many ways you could approach this problem:

  • pull all works by Zola and Balzac off the library shelves and try to find the one(s) dealing with magic donkeys
  • have a look at the available titles in the “translated literature” section
  • try to google for more information first

Depending on how much you value your free time, you will probably start by, or end up, asking the librarian, who will either scan her current knowledge of world literature or a catalogue of library records. Or, if you live in the twenty-first century, you will probably move to one of the library’s computer terminals, search for “Balzac” or “Zola” in the author field, narrow the search to “English” translations, and give it a try with “donkey” in the title field. If you lived in the twenty-second century, the search robot could probably analyse your search query, propose alternatives for unsuccessful search terms, and even suggest that you try “ass” instead of “donkey.” For the time being, however, you’ll have to depend on your (librarian’s) world knowledge, patience, and/or creativity in order to find the following information:

It is the last field of this library catalogue record that will guide you to the right library shelf and a superb holiday. This exercise illustrates, in simplified form, the motivation for abstracting primary information about bibliographic objects into fixed categories. In the analog world, this happened on printed library catalogue records; nowadays these are entered as digital records in databases of library catalogues. These fixed categories together make up an “identity card” of a literary work.

The TEI Guidelines consider such a virtual “identity card” an essential part of each TEI document. It must be encoded within a <teiHeader> element, before the actual text contents in the <text> part. The “ID categories” of the TEI header are the subject of this tutorial module. As a trade-off between exhaustivity and usability, the TEI Guidelines define a wide range of specific TEI Header elements, only a few of which are mandatory. A minimal TEI header for the work described in the catalogue record above would look as follows:

<teiHeader xmlns="http://www.tei-c.org/ns/1.0">
  <fileDesc>
    <titleStmt>
      <title>The Wild Ass’s Skin: an electronic edition</title>
    </titleStmt>
    <publicationStmt>
      <p>Published as an example for the header module of TBE.</p>
    </publicationStmt>
    <sourceDesc>
      <p>Honoré de Balzac (1906). The Wild Ass’s Skin.</p>
    </sourceDesc>
  </fileDesc>
</teiHeader>
Example 1. A minimal TEI header.

This example shows that a <teiHeader> element must contain a <fileDesc> (file description) element, which provides a description of the electronic file. In order to be complete, <fileDesc> must consist of three subsections, in this order:

  1. <titleStmt>: a title statement about the electronic text
  2. <publicationStmt>: information on the publication of the electronic text
  3. <sourceDesc>: a bibliographic description of the source for the electronic text

The <titleStmt> element must minimally contain a title for the electronic text. Depending on the nature of this text, this title may repeat the original’s title, followed by a qualifier such as “electronic version/transcription/edition.” Details about the publication and source of the electronic text must be provided in <publicationStmt> and <sourceDesc> respectively. These details can be given either as informal prose in loose paragraphs, or in specialised elements. Those are covered in detail in the next sections of this tutorial.
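
As a brief preview of the second option, the file description of Example 1 could be sketched with specialised elements instead of prose paragraphs. This is only an illustration under stated assumptions: the choice of <publisher> and <bibl> (with <author>, <title>, and <date>) is one plausible encoding, not the only one, and naming TBE as publisher is an assumption made here for the sake of the example.

```xml
<fileDesc xmlns="http://www.tei-c.org/ns/1.0">
  <titleStmt>
    <title>The Wild Ass’s Skin: an electronic edition</title>
  </titleStmt>
  <publicationStmt>
    <!-- assumption: TBE is named as publisher for illustration only -->
    <publisher>TBE</publisher>
  </publicationStmt>
  <sourceDesc>
    <!-- the prose source description of Example 1, split into fields -->
    <bibl>
      <author>Honoré de Balzac</author>
      <title>The Wild Ass’s Skin</title>
      <date>1906</date>
    </bibl>
  </sourceDesc>
</fileDesc>
```

Compared to the prose version, each piece of information now sits in its own element, much like the fixed categories of a library catalogue record.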

You will have noticed that this minimal example of a TEI header does quite a poor job of providing an identity card of this novel, compared to the library record example above. However, there are two things of note:

  1. the TEI header is an integral part of any TEI document, and must precede the <text> element with the actual text content
  2. the TEI header minimally documents aspects of the title, publication, and source of the electronic text
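
To make the first point concrete, the following sketch places the minimal header of Example 1 inside a complete TEI document. The <TEI> root element and the <body> wrapper are standard TEI; the placeholder paragraph inside <body> is hypothetical and merely stands in for the transcribed text.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <!-- the header comes first ... -->
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>The Wild Ass’s Skin: an electronic edition</title>
      </titleStmt>
      <publicationStmt>
        <p>Published as an example for the header module of TBE.</p>
      </publicationStmt>
      <sourceDesc>
        <p>Honoré de Balzac (1906). The Wild Ass’s Skin.</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <!-- ... followed by the actual text -->
  <text>
    <body>
      <!-- hypothetical placeholder for the transcription -->
      <p>(text of the novel)</p>
    </body>
  </text>
</TEI>
```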

Of course, the TEI header allows for much more descriptive sophistication. The most important sections of the TEI header are treated in the next sections of this tutorial.

Summary

The TEI header contains meta-information about the electronic text, and is considered an integral part of it. Therefore, the <teiHeader> element must precede the <text> part of any TEI text, documenting at least some aspects of the electronic text in a <fileDesc> element. A file description minimally contains information about the title of the electronic text in <titleStmt>, about its publication in <publicationStmt>, and bibliographic information about the source document from which it is derived in <sourceDesc>.