Module 5: Drama
1. Henrik Ibsen: The Wild Duck
The following example is a fragment (the front matter, and pages 102 to 105, belonging to the fifth act) of Henrik Ibsen’s play The Wild Duck, encoded and made available by the University of Virginia Library, for their Text Collection.
The text of the play is preceded by front matter, consisting of a title page, and a table of contents.
The body of the play (<body>) consists of 5 acts, in which no further scenes are discerned. Acts are encoded in <div1> elements, with an "act" value for their @type attributes. The first act is preceded by a character list, encoded in a separate <div1> element, of @type "section". This character list is transcribed as part of the text’s body, in the form of a simple <list>, with role names and descriptions as plain text inside <item> elements. Inside the same <div1> element, the cast list is followed by two paragraphs (<p>). As descriptions of global aspects of the play’s settings, they could have been wrapped in a more expressive <set> element, were they transcribed as part of the text’s <front> part (<set> is only allowed as a child element of <front>). Inside the acts, each speech is marked with <sp>, indicating the speaker as it occurs in the source (<speaker>), without formal reference to the character’s “definition” in the cast list. This link could be provided with a @who attribute on <sp>.
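The link between a speech and the character list suggested above could look roughly like this. It is only a sketch under stated assumptions: the xml:id values (gregers, hialmar) are hypothetical and do not occur in the original encoding, role descriptions are left out, and the speech content is a placeholder rather than a quotation from the source.

```xml
<body>
  <div1 type="section">
    <list>
      <!-- hypothetical identifiers, added only to make the link possible -->
      <item xml:id="gregers">Gregers Werle</item>
      <item xml:id="hialmar">Hialmar Ekdal</item>
    </list>
  </div1>
  <div1 type="act">
    <!-- @who points to the matching entry in the character list -->
    <sp who="#gregers">
      <speaker>Gregers.</speaker>
      <p><!-- speech text as transcribed from the source --></p>
    </sp>
  </div1>
</body>
```

With such pointers in place, all speeches of one character can be retrieved regardless of how the speaker's name is printed in a given speech.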
Stage instructions are encoded inside <stage>. The speeches are encoded as prose paragraphs (<p>). Notice, however, that a paragraph abstracts away from the physical lines of the source: these are therefore recorded explicitly using the <lb> element.
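As an illustration of this pattern, a prose speech with a stage direction and explicit line breaks might be encoded as follows. All text content here is placeholder wording, not taken from the play.

```xml
<sp>
  <speaker>Hialmar.</speaker>
  <stage>(placeholder for a stage direction)</stage>
  <!-- one logical paragraph; each lb marks where a printed line ends -->
  <p>placeholder for the first printed line of the speech<lb/>
    placeholder for the second printed line<lb/>
    placeholder for the last printed line</p>
</sp>
```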
Besides the regular drama elements, this fragment also contains one footnote, which is transcribed as a note right before the corresponding page break (<pb>). From this encoding it is not clear, however, whether this is a transcribed authorial annotation or an annotation made by the editor; the @resp attribute could have avoided this confusion. Moreover, as it apparently concerns a translation, the contents of the note could have been encoded more semantically as a <term> - <gloss> pair. The note indicator in the running text is encoded as <ref target="#note5">*</ref> where it occurs in the text.
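A sketch of what the suggested improvements could look like is given below. The identifier note5 follows the <ref> target quoted above; the @resp value #editor is a hypothetical pointer to whoever is responsible for the note, and the term and its gloss are left elided because the note's wording is not reproduced here.

```xml
<div1 type="act">
  <sp>
    <speaker>...</speaker>
    <p>... <ref target="#note5">*</ref> ...</p>
  </sp>
  <!-- hypothetical @resp value; it should point to the person responsible for the note -->
  <note xml:id="note5" resp="#editor">
    <term>...</term>
    <gloss>...</gloss>
  </note>
  <pb/>
</div1>
```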
2. Christopher Marlowe: The Tragedie of Doctor Faustus (B text)
The following example is a fragment (the front matter, scene 2 of the first act, and back matter) of Christopher Marlowe’s The Tragedie of Doctor Faustus (B text), encoded and made available by the Perseus Digital Library.
The text of the play is preceded by front matter, consisting of a character list, and a prologue. The character list is encoded as a <castList> structure within a <div> container in the <front> part. Most cast items (<castItem>) contain only the name of the role (<role>); some also have a role description in <roleDesc>. The “Sins” are grouped in a labeled <castGroup> element; another <castGroup> groups Charles, Darius, and Alexander without an explicit label. The cast list is concluded by a list of minor characters, grouped in a <castItem type="list"> element, which overrides the default value of "role" for the @type attribute on <castItem>. The front matter is concluded with a prologue (<prologue>) consisting of a speech (<sp>) of 28 lines (<l>) spoken by the Chorus, as indicated by the @who attribute on <sp>, which refers to the ID code of the Chorus <role> in the cast list.
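The structure described in this paragraph can be sketched as follows. This is an approximation, not the Perseus encoding itself: the xml:id values, the @type value on <div>, the wording of the group label, and the names in the list of minor characters are all assumptions made for the sake of the example.

```xml
<front>
  <div type="castlist">
    <castList>
      <castItem>
        <!-- hypothetical identifier, referenced by @who in the prologue -->
        <role xml:id="chorus">Chorus</role>
      </castItem>
      <castItem>
        <role xml:id="faustus">Faustus</role>
      </castItem>
      <castGroup>
        <!-- a labeled group; the label wording is an assumption -->
        <head>The Seven Deadly Sins</head>
        <castItem><role>Pride</role></castItem>
        <castItem><role>Covetousness</role></castItem>
        <!-- remaining Sins omitted -->
      </castGroup>
      <!-- type="list" overrides the default value "role" -->
      <castItem type="list">Devils, Scholars, Soldiers</castItem>
    </castList>
  </div>
  <prologue>
    <sp who="#chorus">
      <speaker>Chorus</speaker>
      <l><!-- first of the 28 verse lines --></l>
    </sp>
  </prologue>
</front>
```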
The play is concluded by an 8 line <epilogue> (spoken by the Chorus), an <epigraph>, and trailing material in <trailer>. These are grouped in the <back> section.
The body of the play (<body>) consists of 20 scenes, grouped into 6 acts. Acts are encoded in <div1> elements, in which the scenes occur as <div2> elements. Each speech is marked with <sp>, containing the indication of the speaker as it occurs in the source text (<speaker>), as well as a formal indication (using the @who attribute). Stage instructions are encoded inside <stage>. Notice how the first 10 speeches contain paragraphs (<p>), while the last 4 are made up of verse lines (<l>).
Finally, notice how this text is encoded as any other text, resulting in the use of many common TEI elements (<name>, <foreign>, <orig> / <reg>, <add>, ...). A system of <milestone unit="page"/> elements is used to mark the page boundaries (as an equivalent to the shorter <pb> element), while each visual line break that does not coincide with the end of a verse line is explicitly marked with an <lb> element.
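The two kinds of boundary markers mentioned here might appear together as in the sketch below. The @n value, the #faustus pointer and the verse text are placeholders chosen for illustration.

```xml
<sp who="#faustus">
  <speaker>Faustus</speaker>
  <!-- page boundary; the shorter equivalent would be <pb n="1"/> -->
  <milestone unit="page" n="1"/>
  <!-- a single verse line that the printer broke over two physical lines -->
  <l>placeholder for the first part of a verse line<lb/>
    placeholder for the part printed on the next line</l>
  <!-- no lb is needed here: the printed line ends where the verse line ends -->
  <l>placeholder for a verse line printed on one line</l>
</sp>
```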
Note
In this transcription, the <join> element is used to group the lines of the play in alternative groups, thus overriding the structural organisation in speeches. Although the purpose of this alternative grouping is unknown to us, it could well be for analytical reasons. The <join> element lists pointers to the identification codes of the elements to be grouped as a space-separated list in the @target attribute. The purpose of this element is to formally indicate elements that should be joined. The actual join is supposed to be performed in further processing (e.g., by means of XSLT transformations). For a detailed account of the use of <join>, see section 16.7 Aggregation of the TEI Guidelines.
3. Herman Melville: Moby-Dick or, The Whale
This example features the first two pages of chapter 40 of Herman Melville’s novel Moby-Dick.
This example nicely illustrates a mixture of different genres. The main structure is a novel, divided into chapters, most of which consist of narrative paragraphs. However, this chapter (recognisable as such by the heading “Chapter XL”) has the form of embedded drama, with speeches (<sp>) containing indications of the speaking characters (<speaker>) and the speech contents. Moreover, some of the speeches of this drama fragment consist of prose paragraphs (<p>), while others are expressed in verse lines (<l>). The second speech on p. 214 even mixes paragraphs and verse lines. Notice, also, how stage directions (<stage>) occur between speaker indications and speech contents. The first speech of p. 215 contains an embedded stage direction.
Of course, the main structure of this text will have the form of a novel, consisting of chapter text divisions, without any traditional drama front matter (such as cast lists, epilogues, etc.).
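The features discussed for this chapter can be summarised in a schematic encoding like the one below. Only the heading is taken from the description above; the @type and @n values, speaker labels and all speech and stage direction text are placeholders, so this sketch should not be read as a transcription of pages 214 and 215.

```xml
<div type="chapter" n="40">
  <head>CHAPTER XL</head>
  <sp>
    <speaker>placeholder speaker</speaker>
    <!-- stage direction between the speaker label and the speech contents -->
    <stage>(placeholder stage direction)</stage>
    <l>placeholder verse line</l>
    <l>placeholder verse line</l>
  </sp>
  <sp>
    <speaker>placeholder speaker</speaker>
    <!-- prose and verse mixed inside one speech -->
    <p>placeholder prose
      <stage>(placeholder embedded stage direction)</stage></p>
    <l>placeholder verse line</l>
  </sp>
</div>
```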
4. William Shakespeare: Titus Andronicus
The following example is a fragment (the front matter, and scene 2 of the second act) of William Shakespeare’s Titus Andronicus, encoded and made available by the Perseus Digital Library.
The text of the play is preceded by front matter, consisting of a character list, and a prologue. The character list is encoded as a <castList> structure within a <div1> container in the <front> part. The cast list consists of <castItem> elements, listing the roles (<role>) with their description (<roleDesc>). Each role is identified with the @xml:id attribute. Three named groups of characters are grouped into <castGroup> elements; one nameless group of minor characters is listed as <castItem type="list">. Notice how, in the latter type of list, <role> and <roleDesc> seem at first sight to be used somewhat indiscriminately (e.g., both <roleDesc>Romans</roleDesc> and <role xml:id="tit-11">Goths and Romans</role> occur). On closer inspection, however, <role> appears to be used for all speaking characters, who are formally identified with an @xml:id attribute. The front matter is concluded with a prologue (<prologue>) consisting of 28 lines spoken by the Chorus. The cast list is succeeded by a general description of the setting in which the action takes place, in the <set> element.
The body of the play (<body>) consists of 14 scenes, grouped into 5 acts. Acts are encoded in <div1> elements, in which the scenes occur as <div2> elements. Each speech is marked with <sp>, containing the indication of the speaker as it occurs in the source text (<speaker>), as well as a formal indication (using the @who attribute). Stage instructions are encoded inside <stage>. The speeches are encoded as verse lines (<l>). Notice, however, how logical lines (<l>) are distinguished from typographic lines: the latter are explicitly encoded with the <lb> element, occurring inside <l>.
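The nesting of acts, scenes, speeches and lines described here might look as follows. The pointer #tit-1 is a hypothetical identifier modelled on the tit-11 value quoted above, the @ed value F1 is an assumed edition code, and the line and stage direction text are placeholders.

```xml
<div1 type="act" n="2">
  <div2 type="scene" n="2">
    <stage>placeholder for the opening stage direction</stage>
    <!-- @who gives the formal link to the role in the cast list -->
    <sp who="#tit-1">
      <speaker>Tit.</speaker>
      <!-- one logical verse line, printed on two typographic lines -->
      <l>placeholder for the first part of the verse line<lb ed="F1"/>
        placeholder for the part printed on the next line</l>
    </sp>
  </div2>
</div1>
```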
Note
Notice how the <lb> elements in this example make use of the @ed (edition) attribute, for indicating the specific edition in which the specific line breaks occur. For an explanation of this feature, see section 3.10.3 Milestone Elements of the TEI Guidelines.
5. Oscar Wilde: The Importance of Being Earnest
This example features a fragment (the front matter and first page) of Oscar Wilde’s The Importance of Being Earnest, a play in three acts. In this transcription, no further scenes are discerned within the acts.
The actual text is preceded by a character list and a list of the scenes, both encoded as <div> elements inside the <front> part of the <text>, with appropriate values for their @type attributes. The character list is encoded as a plain <list> structure, containing <item> elements for the characters (divided into sub-lists of male and female characters). Role descriptions are encoded with <emph> elements. Whereas the specialised <castList>, <castGroup>, <castItem>, <role>, and <roleDesc> elements could have been used, this is a perfectly valid (though less expressive) interpretation and application of the TEI elements. The scenes are listed in a <stage> element, which is more debatable, as the TEI Guidelines make a clear distinction between the <stage> element (stage directions in or in between speeches) and the <set> element (“a description of the setting, time, locale, appearance, etc., of the action of a play, typically found in the front matter of a printed performance text (not a stage direction)”). Because it is wrapped inside a <div> structure, this is valid TEI, but the encoding could probably be improved by describing the scenes in a <set> element directly inside <front>.
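One possible form of such an encoding with <set> is sketched below. The heading and the wording of the scene descriptions follow the usual printed text of the play and are given for illustration only; they may differ from the CELT transcription.

```xml
<front>
  <!-- set replaces the div plus stage combination for the list of scenes -->
  <set>
    <head>The Scenes of the Play</head>
    <list>
      <item>Act I. Algernon Moncrieff's Flat in Half-Moon Street, W.</item>
      <item>Act II. The Garden at the Manor House, Woolton.</item>
      <item>Act III. Drawing-Room at the Manor House, Woolton.</item>
    </list>
  </set>
</front>
```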
The play itself is encoded as a <div1> level text division, in which each act is wrapped in a <div2> element. Inside the speeches (<sp>), the speakers are transcribed as <speaker>, and the speech as prose paragraphs (<p>). Stage directions (<stage>) occur between and in the speeches. Notice how at the beginning of the act, the <view> element is used inside a stage direction, to describe the visual aspects of the setting. This is probably a liberal interpretation of the semantics of this element, which is more geared to “the visual context of some part of a screen play,” viz. the description of what’s on a screen. The <view> element doesn’t seem strictly necessary here: a <stage type="setting"> would probably convey the same information.
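The alternative suggested at the end of this paragraph could be sketched like this. The wording of the stage direction follows the usual printed text of the play and may differ from the transcription; the @type and @n values on <div2> are assumptions.

```xml
<div2 type="act" n="1">
  <!-- a typed stage direction instead of a view element nested in stage -->
  <stage type="setting">Morning-room in Algernon's flat in Half-Moon Street.
    The room is luxuriously and artistically furnished.</stage>
  <sp>
    <speaker>Algernon.</speaker>
    <p><!-- speech text as transcribed from the source --></p>
  </sp>
</div2>
```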
Bibliography
- Ibsen, Henrik. 1918. The Wild Duck. New York: Boni and Liveright, Inc. Encoded and made available by the University of Virginia Library, Text Collection at https://etext.lib.virginia.edu/toc/modeng/public/IbsWild.html.
- Marlowe, Christopher. 1616. The Tragedie of Doctor Faustus. Encoded and made available by the Perseus Digital Library. Available online at https://www.perseus.tufts.edu/hopper/text?doc=Perseus:text:1999.03.0011.
- Melville, Herman. 1922. Moby-Dick or, The Whale. London, Bombay, Sydney: Constable and Company LTD, pp. 214–215. Facsimile available from Internet Archive at https://www.archive.org/details/mobydickorwhale01melvuoft.
- Shakespeare, William. 1594. Titus Andronicus. Encoded and made available by the Perseus Digital Library. Available online at https://www.perseus.tufts.edu/hopper/text?doc=Perseus:text:1999.03.0037.
- Wilde, Oscar. 1930. “The Importance of Being Earnest.” In: Plays, Prose Writings and Poems. London: Everyman. Encoded and made available by CELT: Corpus of Electronic Texts: a project of University College, Cork. Available online at https://www.ucc.ie/celt/published/E850003-002/.