Module 0: Introduction to Text Encoding and the TEI

5. TEI P3 (SGML)

The sample text could be encoded in TEI P3 as well. Like every version of TEI, P3 is a descriptive encoding scheme that lets encoders make explicit the structure and semantics of the textual features they want to analyse. Our sample shows the typical features of a TEI document (although some of the names have changed since version P3): the document is encoded in a <TEI.2> element, which contains a <teiHeader> section for the meta-information and a <text> part for the actual text contents. The header must contain a minimal amount of meta-information, while the text content itself is encoded in <body>. Inside the text, both structural elements (heading: <head>, paragraph: <p>, footnote: <note place=foot>) and semantic features (title: <title>, emphasis: <emph>, term: <term>) can be fully expressed with comprehensible tag names.

Notice, however, that this is SGML, not XML: some elements can occur without end tags (<title>, <body>, <p>, <head>), and attribute values can occur without surrounding quotes (“place=foot”).

<TEI.2>
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Review: an electronic transcription
      </titleStmt>
      <publicationStmt>
        <p>Published as an example for the Introduction module of TBE.
      </publicationStmt>
      <sourceDesc>
        <p>No source: born digital.
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <head>Review
      <p><title>Die Leiden des jungen Werther <note place=foot>by <name>Goethe</name> is an <emph>exceptionally</emph> good example of a book full of <term>Weltschmerz</term>.
  </text>
</TEI.2>
Example 4. A TEI P3 SGML example.
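
The two SGML features noted above (omitted end tags and unquoted attribute values) are exactly what an XML parser refuses to accept. The following minimal sketch illustrates this with Python's standard xml.etree.ElementTree module. The helper name is_well_formed_xml is hypothetical, the fragments are shortened from the sample, and the position of the closing </note> tag in the note fragments is an assumption about where the footnote ends.

```python
import xml.etree.ElementTree as ET


def is_well_formed_xml(markup: str) -> bool:
    """Return True if an XML parser accepts the markup, False otherwise.

    Hypothetical helper for illustration; it checks XML well-formedness
    only, not validity against any TEI DTD or schema.
    """
    try:
        ET.fromstring(markup)
        return True
    except ET.ParseError:
        return False


# Fragments shortened from the sample. The SGML forms rely on features
# that XML does not allow.
fragments = {
    "omitted end tag": "<head>Review",
    "explicit end tag": "<head>Review</head>",
    # Assumption: the footnote ends right after the author's name.
    "unquoted attribute": "<note place=foot>by <name>Goethe</name></note>",
    "quoted attribute": '<note place="foot">by <name>Goethe</name></note>',
}

for label, markup in fragments.items():
    print(f"{label}: {is_well_formed_xml(markup)}")
```

Only the fragments with explicit end tags and quoted attribute values are reported as well-formed. Processing the SGML forms would instead require an SGML parser that knows the DTD and can therefore infer the omitted tags.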