TEI by Example
Module 5: Drama
Ron Van den Branden, Edward Vanhoutte, Melissa Terras
Association for Literary and Linguistic Computing (ALLC); Centre for Data, Culture and Society, University of Edinburgh, UK; Centre for Digital Humanities (CDH), University College London, UK; Centre for Computing in the Humanities (CCH), King’s College London, UK; Centre for Scholarly Editing and Document Studies (CTB), Royal Academy of Dutch Language and Literature, Belgium
Centre for Scholarly Editing and Document Studies (CTB), Royal Academy of Dutch Language and Literature, Koningstraat 18, 9000 Gent, Belgium, ctb@kantl.be
Licensed under a Creative Commons Attribution ShareAlike 3.0 License
9 July 2010. TEI By Example. Editors: Edward Vanhoutte, Ron Van den Branden, Melissa Terras.
TEI By Example offers a series of freely available online tutorials walking individuals through the different stages in marking up a document in TEI (Text Encoding Initiative). Besides a general introduction to text encoding, step-by-step tutorial modules provide example-based introductions to eight different aspects of electronic text markup for the humanities. Each tutorial module is accompanied by a dedicated examples section, illustrating actual TEI encoding practice with real-life examples. The theory of the tutorial modules can be tested in interactive tests and exercises.
Module 5: Drama
Henrik Ibsen: The Wild Duck
The following example is a fragment (the front matter, and pages 102 to 105, belonging to the fifth act) of Henrik Ibsen’s play
The Wild Duck, encoded and made available by the University of Virginia Library, for their Text Collection.
The text of the play is preceded by front matter, consisting of a title page, and a table of contents.
The body of the play (body) consists of 5 acts, in which no further scenes are discerned. Acts are encoded in div1 elements, with an act value for their type attributes. The first act is preceded by a character list, encoded in a separate div1 element of type section. This character list is transcribed as part of the text’s body, in the form of a simple list, with role names and descriptions as plain text inside item elements. Inside the same div1 element, the cast list is followed by two paragraphs (p). As descriptions of global aspects of the play’s setting, they could have been wrapped in a more expressive set element, had they been transcribed as part of the text’s front part (set is only allowed as a child element of front). Inside the acts, each speech is marked with sp, indicating the speaker as it occurs in the source (speaker), without formal reference to the character’s definition in the cast list. This link could be provided with a who attribute on sp.
Stage instructions are encoded inside stage. The speeches are encoded as prose paragraphs (p). Notice, however, that this encoding abstracts away from the physical lines of the source: these are explicitly encoded using the lb element.
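The link between a speech and a character definition can be sketched as follows. This is only an illustration of the pattern, not the University of Virginia encoding: the xml:id value role-example is hypothetical, the text content is elided, and the who attribute is shown in its TEI P5 pointer form.

```xml
<!-- Sketch only: "role-example" is a hypothetical id, text content is elided. -->
<body>
  <div1 type="section">
    <list>
      <!-- an xml:id on the list item gives sp something to point at -->
      <item xml:id="role-example"><!-- role name and description as plain text --></item>
    </list>
  </div1>
  <div1 type="act" n="5">
    <sp who="#role-example">
      <speaker><!-- speaker label as printed in the source --></speaker>
      <stage><!-- stage instruction --></stage>
      <!-- the paragraph is the logical unit; lb records the printed line break -->
      <p><!-- first printed line --><lb/><!-- second printed line --></p>
    </sp>
  </div1>
</body>
```

With such a pointer in place, all speeches of one character can be retrieved by the value of who, regardless of how the speaker label is spelled in the source.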
Besides the regular drama elements, this fragment also contains one footnote, which is transcribed as:
literally "the life-lie."
right before the corresponding page break (pb). From this encoding it is not clear, however, whether this is a transcribed authorial annotation, or an annotation made by the editor; the resp attribute could have avoided this confusion. Moreover, as it apparently concerns a translation, the contents of the note could have been encoded more semantically as a term - gloss pair. The note indicator in the running text is encoded as
where it occurs in the text.
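The two suggestions above (a resp attribute and a term - gloss pair) could be combined roughly as follows. This is a hedged sketch: the ids term-example and editor-example are hypothetical, the annotated word is elided, and the place value is an assumption.

```xml
<!-- Sketch only: ids are hypothetical and the annotated word is not reproduced. -->
<p><!-- running text -->
  <term xml:id="term-example"><!-- annotated word as printed --></term>
  <!-- resp states who is responsible for the note (assumed here: an editor
       identified elsewhere in the document as "editor-example") -->
  <note place="bottom" resp="#editor-example">literally
    <gloss target="#term-example">"the life-lie."</gloss>
  </note>
  <pb/>
</p>
```

Here the target attribute on gloss ties the translation to the term it explains, so the relation no longer depends on the position of the note in the text.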
Christopher Marlowe: The Tragedie of Doctor Faustus (B text)
The following example is a fragment (the front matter, scene 2 of the first act, and back matter) of Christopher Marlowe’s
The Tragedie of Doctor Faustus (B text), encoded and made available by the Perseus Digital Library.
The text of the play is preceded by front matter, consisting of a character list and a prologue. The character list is encoded as a castList structure within a div container in the front part. The cast list mainly consists of loose descriptions of the roles’ names (role) per character (castItem); some have a role description in roleDesc. The Sins are grouped in a labeled castGroup element; another castGroup groups Charles, Darius, and Alexander without explicit label. The cast list is concluded by a list of minor characters, grouped in a castItem type="list" element, which overrides the default value of role for the type attribute on castItem. The front matter is concluded with a prologue (prologue) consisting of a speech (sp) of 28 lines (l) spoken by the Chorus, as indicated by the who attribute on sp, which refers to the ID code of the Chorus role in the cast list.
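The front matter structure described above can be sketched like this. It is not a copy of the Perseus file: the type value castlist, the id chorus, the use of head as the group label, and the elided contents are all assumptions made for illustration, and who is shown in its TEI P5 pointer form.

```xml
<!-- Sketch only: ids, type values and elided contents are assumptions. -->
<front>
  <div type="castlist">
    <castList>
      <castItem><role xml:id="chorus">Chorus</role></castItem>
      <!-- a labeled group (label assumed to be a head element) -->
      <castGroup>
        <head>The Sins</head>
        <castItem><role><!-- one Sin per castItem --></role></castItem>
      </castGroup>
      <!-- an unlabeled group -->
      <castGroup>
        <castItem><role>Charles</role></castItem>
        <castItem><role>Darius</role></castItem>
        <castItem><role>Alexander</role></castItem>
      </castGroup>
      <!-- type="list" overrides the default value "role" -->
      <castItem type="list"><!-- minor characters --></castItem>
    </castList>
  </div>
  <prologue>
    <sp who="#chorus">
      <speaker>Chorus</speaker>
      <l><!-- first of 28 verse lines --></l>
    </sp>
  </prologue>
</front>
```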
The play is concluded by an 8 line epilogue (spoken by the Chorus), an epigraph, and trailing material in trailer. These are grouped in the back section.
The body of the play (body) consists of 20 scenes, grouped into 6 acts. Acts are encoded in div1 elements, in which the scenes occur as div2 elements. Each speech is marked with sp, containing the indication of the speaker as it occurs in the source text (speaker), as well as a formal indication (using the who attribute). Stage instructions are encoded inside stage. Notice how the first 10 speeches contain paragraphs (p), while the last 4 are made up of verse lines (l).
Finally, notice how this text is encoded as any other text, resulting in the use of many common TEI elements (name, foreign, orig / reg, add, ...). A system of milestone unit="page" elements is used to mark the page boundaries (as an equivalent to the shorter pb element), while each visual line break is explicitly marked with a lb element, if it does not coincide with a verse line.
In this transcription, the join element is used to group the lines of the play in alternative groups, thus overriding the structural organisation in speeches. Although the purpose of this alternative grouping is unknown to us, it could well be for analytical reasons. The join element lists pointers to the identification codes of the elements to be grouped as a space-separated list in the target attribute. The purpose of this element is to formally indicate elements that should be joined. The actual join is supposed to be performed in further processing (e.g., by means of XSLT transformations). For a detailed account of the use of join, see section 16.7 Aggregation of the TEI Guidelines.
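The join mechanism can be illustrated with a small sketch. The line ids, the speakers, and the result value are hypothetical and do not reproduce the Perseus transcription; pointers are given in TEI P5 form.

```xml
<!-- Sketch only: ids, grouping and the result value are assumptions. -->
<div2 type="scene">
  <sp>
    <speaker><!-- first speaker --></speaker>
    <l xml:id="l-a"><!-- verse line --></l>
    <l xml:id="l-b"><!-- verse line --></l>
  </sp>
  <sp>
    <speaker><!-- second speaker --></speaker>
    <l xml:id="l-c"><!-- verse line --></l>
  </sp>
  <!-- declares that three lines from two speeches form one virtual group;
       the element only records the grouping, it does not move any text -->
  <join target="#l-a #l-b #l-c" result="lg"/>
</div2>
```

A later processing step (for instance an XSLT stylesheet) would resolve the pointers in target and assemble the referenced lines into the virtual element named in result.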
Herman Melville: Moby-Dick or, The Whale
This example features the first two pages of chapter 40 of Herman Melville’s novel Moby-Dick or, The Whale.
This example nicely illustrates a mixture of different genres. The main structure is a novel, divided into chapters, most of which consist of narrative paragraphs. However, this chapter (recognisable as such by the heading Chapter XL) has the form of embedded drama, with speeches (sp), containing indications of the speaking characters (speaker) and the speech contents. Moreover, some of the speeches of this drama fragment consist of prose paragraphs (p), while others are expressed in verse lines (l). The second speech on p. 214 even mixes paragraphs and verse lines. Notice, also, how stage directions (stage) occur between speaker indications and speech contents. The first speech of p. 215 contains an embedded stage direction.
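The mixture of prose, verse, and stage directions inside speeches might be encoded along these lines. The text content is elided and the type and n values on div are assumptions; only the arrangement of the elements follows the description above.

```xml
<!-- Sketch only: text content is elided, attribute values are assumptions. -->
<div type="chapter" n="40">
  <head>Chapter XL</head>
  <sp>
    <speaker><!-- speaking character --></speaker>
    <!-- a direction between the speaker label and the speech itself -->
    <stage><!-- stage direction --></stage>
    <!-- one speech mixing a prose paragraph and verse lines -->
    <p><!-- prose part --></p>
    <l><!-- verse line --></l>
    <l><!-- verse line --></l>
  </sp>
  <sp>
    <speaker><!-- speaking character --></speaker>
    <!-- a direction embedded in the running speech -->
    <p><!-- speech text --><stage><!-- embedded direction --></stage><!-- speech continues --></p>
  </sp>
</div>
```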
Of course, the main structure of this text will have the form of a novel, consisting of chapter text divisions, without any traditional drama front matter (such as cast lists, epilogues, etc.).
William Shakespeare: Titus Andronicus
The following example is a fragment (the front matter, and scene 2 of the second act) of William Shakespeare’s
Titus Andronicus, encoded and made available by the Perseus Digital Library.
The text of the play is preceded by front matter, consisting of a character list and a prologue. The character list is encoded as a castList structure within a div1 container in the front part. The cast list consists of castItem elements, listing the roles (role) with their description (roleDesc). Each role is identified with the xml:id attribute. Three named groups of characters are grouped into castGroup elements; one nameless group of minor characters is listed as castItem type="list". Notice how, in the latter type of list, role and roleDesc seem at first sight to be used somewhat indiscriminately (e.g., both
Goths and Romans
occur). On second sight, however, role appears to be used for all speaking characters, who are formally identified with an xml:id attribute. The front matter is concluded with a prologue (prologue) consisting of 28 lines spoken by the Chorus. The cast list is succeeded by a general description of the setting in which the action takes place, in the set element.
The body of the play (body) consists of 14 scenes, grouped into 5 acts. Acts are encoded in div1 elements, in which the scenes occur as div2 elements. Each speech is marked with sp, containing the indication of the speaker as it occurs in the source text (speaker), as well as a formal indication (using the who attribute). Stage instructions are encoded inside stage. The speeches are encoded as verse lines (l). Notice, however, how logical lines (l) are distinguished from typographic lines: the latter are explicitly encoded with the lb element, occurring inside l.
Notice how the lb elements in this example make use of the ed (edition) attribute, for indicating the specific edition in which the specific line breaks occur. For an explanation of this feature, see section 3.10.3 Milestone Elements of the TEI Guidelines.
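The distinction between logical and typographic lines can be sketched as follows. The id role-example and the edition siglum ed1 are hypothetical, and the verse text is elided.

```xml
<!-- Sketch only: "role-example" and "ed1" are hypothetical values. -->
<sp who="#role-example">
  <speaker><!-- speaker label as printed --></speaker>
  <!-- one logical verse line that is broken over two printed lines
       in the edition identified by the siglum "ed1" -->
  <l><!-- first part of the verse --><lb ed="ed1"/><!-- rest of the verse --></l>
  <l><!-- a verse line printed on a single line --></l>
</sp>
```

Because the break is tied to one edition through ed, line breaks from several editions can coexist in the same transcription without ambiguity.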
Oscar Wilde: The Importance of Being Earnest
This example features a fragment (the front matter and first page) of Oscar Wilde’s
The Importance of Being Earnest, a play in three acts. In this transcription, no further scenes are discerned within the acts.
The actual text is preceded by a character list and a list of the scenes, both encoded as div elements inside the front part of the text, with appropriate values for their type attributes. The character list is encoded as a plain list structure, containing item elements for the characters (divided into sub-lists of male and female characters). Role descriptions are encoded with emph elements. Whereas the specialised castList, castGroup and castItem, role, and roleDesc elements could have been used, this is a perfectly valid (though less expressive) interpretation and application of the TEI elements. The scenes are listed in a stage element, which is a bit more controversial, as the TEI Guidelines make a clear distinction between the stage element (stage directions in or in between speeches) and set (a description of the setting, time, locale, appearance, etc., of the action of a play, typically found in the front matter of a printed performance text (not a stage direction)) elements. Because it is wrapped inside a div structure, this is valid TEI, but the encoding could probably be improved to:
THE SCENES OF THE PLAY:
Act I. Algernon Moncrieff's Flat in Half-Moon Street, W.
Act II. The Garden at the Manor House, Woolton.
Act III. Drawing-room at the Manor House, Woolton.
TIME: The Present
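One possible rendering of this suggestion in markup is sketched below. The choice of head, list, and p inside set is an assumption; other arrangements of the same text would be equally valid.

```xml
<!-- Sketch only: the inner structure of set is one of several valid options. -->
<front>
  <!-- the division holding the character list is omitted here -->
  <set>
    <head>THE SCENES OF THE PLAY:</head>
    <list>
      <item>Act I. Algernon Moncrieff's Flat in Half-Moon Street, W.</item>
      <item>Act II. The Garden at the Manor House, Woolton.</item>
      <item>Act III. Drawing-room at the Manor House, Woolton.</item>
    </list>
    <p>TIME: The Present</p>
  </set>
</front>
```

Note that set appears here as a direct child of front rather than inside a div, in line with the placement constraint mentioned in the discussion of The Wild Duck.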
The play itself is encoded as a div1 level text division, in which each act is wrapped in a div2 element. Inside the speeches (sp), the speakers are transcribed as speaker, and the speech as prose paragraphs (p). Stage directions (stage) occur between and in the speeches. Notice how at the beginning of the act, the view element is used inside a stage direction, to describe the visual aspects of the setting. This is probably a liberal interpretation of the semantics of this element, which is geared more towards the visual context of some part of a screenplay, viz. the description of what is on the screen. The view element doesn’t seem strictly necessary here: a stage type="setting" would probably convey the same information.
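The alternative without view might look roughly like this. The text content is elided and the type and n values on div2 are assumptions; setting is one of the values the TEI Guidelines suggest for the type attribute on stage.

```xml
<!-- Sketch only: text content is elided, div2 attributes are assumptions. -->
<div2 type="act" n="1">
  <!-- the opening description of the scenery, typed instead of wrapped in view -->
  <stage type="setting"><!-- description of the room as printed at the start of the act --></stage>
  <sp>
    <speaker><!-- speaker label --></speaker>
    <p><!-- speech text --><stage><!-- direction inside the speech --></stage></p>
  </sp>
</div2>
```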
Ibsen, Henrik. 1918. The Wild Duck. New York: Boni and Liveright, Inc. Encoded and made available by the University of Virginia Library, Text Collection at .
Marlowe, Christopher. 1616. The Tragedie of Doctor Faustus. Encoded and made available by the Perseus Digital Library. Available online at .
Melville, Herman. 1922. Moby-Dick or, The Whale. London, Bombay, Sydney: Constable and Company Ltd., p. 214–215. Facsimile available from Internet Archive at .
Shakespeare, William. 1594. Titus Andronicus. Encoded and made available by the Perseus Digital Library. Available online at .
Wilde, Oscar. 1930. The Importance of Being Earnest. In: Plays, Prose Writings and Poems. London: Everyman. Encoded and made available by CELT: Corpus of Electronic Texts: a project of University College, Cork. Available online at .