Module 7: Critical Editing
4. Encoding Textual Variants
4.1. Basic Organisation of an Apparatus Entry
Traditionally, printed critical editions have developed efficient mechanisms to represent textual variants in as little physical space as possible, in what is commonly called a critical apparatus. Many types of apparatus exist, depending on the editorial theory, but all tend to put the different readings found in the different text witnesses on a par with one version of the text, which is commonly called the base text. The TEI Guidelines offer an analogous mechanism for representing textual variants in a concise way. A piece of text with corresponding variants in the different text witnesses is encoded in an <app> (apparatus entry) element, which holds all different readings. Each reading must be encoded in a <rdg> (reading) element, which can be associated with its respective text witness by means of the @wit attribute. Its value should point to the definition of the text witness in a <listWit> element elsewhere in the edition (see section 3). For example, let’s have a closer look at the chapter title in our sample:
| Witness | Chapter title |
|---|---|
| p2 | Chapter 2<br>A GENTLE INTRODUCTION TO SGML |
| p3 | Chapter 2<br>A Gentle Introduction to SGML |
| p4 | 2 A Gentle Introduction to XML |
| p5 | v<br>A Gentle Introduction to XML |
In the example above, every stretch of text except the word “A” differs from the corresponding fragment in at least one other witness; only “A” is shared between all text witnesses. In a digital edition of our sample, these stretches of variant text could be encoded in two apparatus entries:
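A minimal sketch of such an encoding, assuming that the witnesses are defined elsewhere in a <listWit> with the @xml:id values p2, p3, p4, and p5, and that the line break after the chapter number is recorded with an <lb/> element:

```xml
<!-- First apparatus entry: the chapter number and what surrounds it -->
<app>
  <rdg wit="#p2">Chapter 2<lb/></rdg>
  <rdg wit="#p3">Chapter 2<lb/></rdg>
  <rdg wit="#p4">2</rdg>
  <rdg wit="#p5">v<lb/></rdg>
</app>
<!-- "A" is shared by all witnesses, so it stays outside any <app> -->
A
<!-- Second apparatus entry: the remainder of the title -->
<app>
  <rdg wit="#p2">GENTLE INTRODUCTION TO SGML</rdg>
  <rdg wit="#p3">Gentle Introduction to SGML</rdg>
  <rdg wit="#p4">Gentle Introduction to XML</rdg>
  <rdg wit="#p5">Gentle Introduction to XML</rdg>
</app>
```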
In this example, the two stretches of variant text are encoded as two apparatus entries, with four readings each. Each <rdg> element points to the definition of its corresponding text witness by means of the siglum in its @wit attribute. Notice how each siglum starts with a # sign, because it addresses the @xml:id value of a <witness> element in the edition.
Note
Notice how the TEI Guidelines offer the means to encode textual variation without imposing any theoretical assumptions on how to encode an apparatus for the variants in different texts. The treatment of variation in different text versions is an explicit theoretical act of interpretation, and it is up to the encoder to determine corresponding text fragments, and where to delimit stretches of variation. Likewise, the examples in this TBE tutorial module are fairly theory-neutral, in that they tend to use the maximal length of differing text fragments as the guiding principle for the demarcation of textual variants.
In printed critical editions, the assumption of a base text against which all other versions are compared is quite common. Therefore, besides readings, a TEI apparatus entry can also contain a <lem> (lemma) element, identifying the reading it contains as a “preferred” reading, according to the editor’s theory of the text. Notice that if a <lem> element is used, it must occur as the first element inside <app>. If version p2 were considered the base text of the edition of this sample, the previous example could be encoded as follows:
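A minimal sketch with p2 as base text, assuming the witness identifiers p2 to p5 and an <lb/> element for the line break after the chapter number. The p2 readings move into a <lem> element placed first in each <app>:

```xml
<app>
  <!-- Preferred reading from the base text p2, listed first -->
  <lem wit="#p2">Chapter 2<lb/></lem>
  <rdg wit="#p3">Chapter 2<lb/></rdg>
  <rdg wit="#p4">2</rdg>
  <rdg wit="#p5">v<lb/></rdg>
</app>
A
<app>
  <lem wit="#p2">GENTLE INTRODUCTION TO SGML</lem>
  <rdg wit="#p3">Gentle Introduction to SGML</rdg>
  <rdg wit="#p4">Gentle Introduction to XML</rdg>
  <rdg wit="#p5">Gentle Introduction to XML</rdg>
</app>
```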
Note
Because in the context of electronic critical editing a “preferred” reading in a <lem> element is fairly theory-dependent, the examples in this TBE tutorial module will mostly just list all variants as equal <rdg> elements. Keep in mind, however, that each <app> element may always specify one of its readings as lemma (<lem>) as well.
In order to make this representation more efficient, identical readings can be collapsed into one single <rdg> element, by combining the sigla into a whitespace-separated list in the @wit attribute:
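A minimal sketch of collapsed readings, assuming the witness identifiers p2 to p5. Readings with identical content share one <rdg>, whose @wit lists all witnesses concerned:

```xml
<app>
  <!-- p2 and p3 read identically here, so their sigla are combined -->
  <rdg wit="#p2 #p3">Chapter 2<lb/></rdg>
  <rdg wit="#p4">2</rdg>
  <rdg wit="#p5">v<lb/></rdg>
</app>
A
<app>
  <rdg wit="#p2">GENTLE INTRODUCTION TO SGML</rdg>
  <rdg wit="#p3">Gentle Introduction to SGML</rdg>
  <!-- p4 and p5 read identically here -->
  <rdg wit="#p4 #p5">Gentle Introduction to XML</rdg>
</app>
```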
Remember how we distinguished different witness groups in the previous section of this tutorial? This allows us to rewrite the sigla of readings shared by the versions of the TEI Guidelines dealing with either SGML or XML, using the group identification code for the corresponding group of witnesses:
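A minimal sketch using group identifiers, assuming that the witness groups were given the @xml:id values teiSGML (covering p2 and p3) and teiXML (covering p4 and p5) in the witness list:

```xml
<app>
  <!-- #teiSGML stands for both p2 and p3 -->
  <rdg wit="#teiSGML">Chapter 2<lb/></rdg>
  <rdg wit="#p4">2</rdg>
  <rdg wit="#p5">v<lb/></rdg>
</app>
A
<app>
  <rdg wit="#p2">GENTLE INTRODUCTION TO SGML</rdg>
  <rdg wit="#p3">Gentle Introduction to SGML</rdg>
  <!-- #teiXML stands for both p4 and p5 -->
  <rdg wit="#teiXML">Gentle Introduction to XML</rdg>
</app>
```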
You should consider an <app> element as a cross-section of a text fragment over all of the different text witnesses. This means that all <lem> and <rdg> contents should be interpreted as mutually exclusive alternatives. Therefore, each text witness listed in the @wit attributes inside an <app> element should occur only once. Ideally, this should be the minimal requirement as well, so that each apparatus entry contains one corresponding text fragment across all different text witnesses included in the edition (although this is not strictly necessary when the edition uses one base text: see section 5).
Summary
Each variant in a TEI encoded critical edition should be encoded as an apparatus entry, in an <app> element. An apparatus entry contains the different textual variants found in the text witnesses, encoded in different <rdg> (reading) elements. If the edition considers one of the text witnesses as the base text, the readings from that witness can be encoded as a lemma instead, in a <lem> element. Each <lem> or <rdg> element should indicate the text witness(es) it corresponds to in a @wit attribute. The value of this attribute consists of a whitespace-separated list of pointers to the @xml:id code(s) of the <witness> element(s) describing the corresponding text witness(es).
4.2. Grouping Readings
In both variants considered so far, arguments could be made for (re)grouping the readings. In the first apparatus entry, reading p5 is set apart from all others because of the diverging chapter number. In the second apparatus entry, one possible case for explicit grouping could be the “genetic” similarity of the variants in those versions of the TEI Guidelines dealing with SGML or XML.
One way of grouping readings is provided by a <rdgGrp> element. It can be wrapped around <rdg> elements in an apparatus entry, in order to indicate their relatedness in some way. This <rdgGrp> really is nothing more than a wrapper for grouping related readings. For example, the readings in the previous example could be grouped as follows:
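A minimal sketch of such grouping, assuming the group identifiers teiSGML and teiXML and the witness identifiers p2 to p5. The choice of which readings to group is one possible interpretation among several:

```xml
<app>
  <!-- Readings that refer to chapter 2, set apart from p5 -->
  <rdgGrp>
    <rdg wit="#teiSGML">Chapter 2<lb/></rdg>
    <rdg wit="#p4">2</rdg>
  </rdgGrp>
  <rdg wit="#p5">v<lb/></rdg>
</app>
A
<app>
  <!-- Readings from the SGML versions of the Guidelines -->
  <rdgGrp>
    <rdg wit="#p2">GENTLE INTRODUCTION TO SGML</rdg>
    <rdg wit="#p3">Gentle Introduction to SGML</rdg>
  </rdgGrp>
  <!-- Readings from the XML versions of the Guidelines -->
  <rdgGrp>
    <rdg wit="#teiXML">Gentle Introduction to XML</rdg>
  </rdgGrp>
</app>
```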
When you take a closer look at these variants, you’ll see that some of these readings contain common text as well. In the first variant, the number “2” is shared between both teiSGML readings, and the p4 reading. In the last variant, the p2 and p3 readings are set apart by the common phrase “SGML,” as opposed to “XML” in the teiXML readings. Yet, both p2 and p3 text witnesses vary internally in their use of capitals. Such refinements can’t be expressed using the <rdgGrp> grouping mechanism, as a <rdgGrp> element can only contain <rdg> or <lem> elements. If this grouping is to be maintained, you could express them in a more fine-grained manner using another grouping mechanism: introducing nesting <app> elements in the <rdg> elements that share common text as well as variant readings:
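A minimal sketch of nesting <app> elements, assuming the identifiers teiSGML, teiXML, and p2 to p5. Empty <rdg/> elements mark witnesses in which a fragment is absent, and whitespace handling is simplified for readability:

```xml
<app>
  <!-- Group of readings that refer to chapter 2 -->
  <rdg wit="#teiSGML #p4">
    <app>
      <rdg wit="#teiSGML">Chapter</rdg>
      <rdg wit="#p4"/>
    </app>
    2
    <app>
      <rdg wit="#teiSGML"><lb/></rdg>
      <rdg wit="#p4"/>
    </app>
  </rdg>
  <rdg wit="#p5">v<lb/></rdg>
</app>
A
<app>
  <!-- Group of readings from the SGML versions -->
  <rdg wit="#teiSGML">
    <app>
      <rdg wit="#p2">GENTLE INTRODUCTION TO</rdg>
      <rdg wit="#p3">Gentle Introduction to</rdg>
    </app>
    SGML
  </rdg>
  <rdg wit="#teiXML">Gentle Introduction to XML</rdg>
</app>
```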
In the first variant, the apparatus distinguishes between those readings whose heading refers to the second chapter (teiSGML and p4), and reading p5, which refers to chapter five. However, as the first group of readings shows internal variation, this can be expressed in further nesting <app> elements (see the nesting <app> elements for the “Chapter” sub-variant, and the line break). The common text can be encoded as plain text contents of the grouping <rdg> element (see the “2,” which occurs in all readings of the group: teiSGML, and p4). In the second variant, the readings corresponding to the text witnesses dealing with SGML are set apart from those dealing with XML. Since the first group of readings contains internal variation, the variant text (“Gentle Introduction to”) is wrapped in a nesting <app> element, while the common text (“SGML”) appears as plain text inside the grouping <rdg> element.
Summary
When so desired, related readings can be grouped using one of two mechanisms. The first one wraps a dedicated <rdgGrp> element around related readings. This element can only contain <lem> and <rdg> elements. A more sophisticated way of grouping readings is provided by using nesting <app> structures inside a <rdg> element.
4.3. Classification
So far, the most elaborate encoding of the chapter’s title in the different text witnesses is the one with nesting <app> elements described at the end of section 4.2.
Admittedly, this organisation is not the most intuitive one, mostly because it mixes different perspectives:
- a content-oriented one in the first apparatus entry, grouping those variants with a common reading (i.e., the chapter number referred to)
- a genetic-oriented one in the second apparatus entry, grouping the readings according to the groups of witnesses (i.e., those occurring in the versions of the TEI Guidelines dealing with SGML or XML)
However, this is not necessarily the most interesting perspective, for it obscures some obvious correspondences. For example, there is no way of deducing the correspondence between the <lb> readings occurring in three of the four witnesses, as they are “buried” in two different reading groups. There is no reason, however, not to reorganise these apparatus entries into more atomic units:
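A minimal sketch of such a reorganisation, assuming the identifiers teiSGML, teiXML, and p2 to p5. Each apparatus entry now covers one small unit of variation, so the shared <lb/> surfaces as a single reading:

```xml
<app>
  <rdg wit="#teiSGML">Chapter</rdg>
  <rdg wit="#teiXML"/>
</app>
<app>
  <rdg wit="#teiSGML #p4">2</rdg>
  <rdg wit="#p5">v</rdg>
</app>
<app>
  <!-- The line break shared by three of the four witnesses -->
  <rdg wit="#teiSGML #p5"><lb/></rdg>
  <rdg wit="#p4"/>
</app>
A
<app>
  <rdg wit="#p2">GENTLE INTRODUCTION TO</rdg>
  <rdg wit="#p3 #teiXML">Gentle Introduction to</rdg>
</app>
<app>
  <rdg wit="#teiSGML">SGML</rdg>
  <rdg wit="#teiXML">XML</rdg>
</app>
```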
One could argue that on closer examination, not all of these variants have the same “status”: some are more substantive than others. This may be pointed out at the level of the individual readings, by means of a @type attribute. In this way, we could for example distinguish between "orthographic" readings (differing only in their spelling or presentation) and "substantive" readings (differing in meaning):
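A minimal sketch of typed readings, assuming the identifiers teiSGML, teiXML, and p2 to p5. The assignment of "orthographic" and "substantive" to each entry is an illustrative judgement, not a fixed classification:

```xml
<app>
  <rdg wit="#teiSGML" type="substantive">Chapter</rdg>
  <rdg wit="#teiXML" type="substantive"/>
</app>
<app>
  <rdg wit="#teiSGML #p4" type="substantive">2</rdg>
  <rdg wit="#p5" type="substantive">v</rdg>
</app>
<app>
  <!-- Presence or absence of a line break: presentation only -->
  <rdg wit="#teiSGML #p5" type="orthographic"><lb/></rdg>
  <rdg wit="#p4" type="orthographic"/>
</app>
A
<app>
  <!-- Capitalisation only -->
  <rdg wit="#p2" type="orthographic">GENTLE INTRODUCTION TO</rdg>
  <rdg wit="#p3 #teiXML" type="orthographic">Gentle Introduction to</rdg>
</app>
<app>
  <rdg wit="#teiSGML" type="substantive">SGML</rdg>
  <rdg wit="#teiXML" type="substantive">XML</rdg>
</app>
```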
With this distinction in place, the type of reading could be adopted as the guiding principle to derive larger stretches of variation: two consecutive variants can be merged into one apparatus entry only when both contain merely orthographically different readings. Notice also how, in this case, all readings within each apparatus entry share the same type. This can be encoded at the higher level of the apparatus entry as well, simply by providing a @type attribute for the <app> element:
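A minimal sketch with the classification moved to the <app> level, assuming the identifiers teiSGML, teiXML, and p2 to p5, and the same illustrative type assignments:

```xml
<app type="substantive">
  <rdg wit="#teiSGML">Chapter</rdg>
  <rdg wit="#teiXML"/>
</app>
<app type="substantive">
  <rdg wit="#teiSGML #p4">2</rdg>
  <rdg wit="#p5">v</rdg>
</app>
<app type="orthographic">
  <rdg wit="#teiSGML #p5"><lb/></rdg>
  <rdg wit="#p4"/>
</app>
A
<app type="orthographic">
  <rdg wit="#p2">GENTLE INTRODUCTION TO</rdg>
  <rdg wit="#p3 #teiXML">Gentle Introduction to</rdg>
</app>
<app type="substantive">
  <rdg wit="#teiSGML">SGML</rdg>
  <rdg wit="#teiXML">XML</rdg>
</app>
```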
The <rdgGrp>, too, can have a @type attribute for specifying the nature of the group of readings it holds. For example, we could revisit the earlier grouping example using <rdgGrp>:
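A minimal sketch of a typed reading group, assuming the identifiers teiXML, p2, and p3. Here the group of SGML readings is labelled "orthographic" because its members differ only in capitalisation; both the grouping and the label are illustrative choices:

```xml
<app>
  <!-- p2 and p3 differ from each other only in their use of capitals -->
  <rdgGrp type="orthographic">
    <rdg wit="#p2">GENTLE INTRODUCTION TO SGML</rdg>
    <rdg wit="#p3">Gentle Introduction to SGML</rdg>
  </rdgGrp>
  <rdg wit="#teiXML">Gentle Introduction to XML</rdg>
</app>
```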
Summary
The readings inside <rdg> and <lem> can be categorised with a @type attribute, in order to indicate what type of variant they contain. When readings are grouped using <rdgGrp>, the @type attribute equally can indicate what type of variants the reading group consists of. When an apparatus entry only contains variants of the same type, this may be expressed by the @type attribute at the <app> level.
4.4. Reading Details
Besides witness (@wit) and type information (@type), readings and lemmas can provide more information about the readings they hold, in dedicated attributes. One type of information that is particularly useful for critical editions of manuscript source materials is the identification of the document hand responsible for a certain reading, especially when its text witness has been written by different hands. This can be expressed in a @hand attribute, which points to the definition of that hand in the TEI header (see Module 2: The TEI Header, section 3.3.4). This could be applied to our example texts: although the TEI Guidelines are not manuscripts, they are written collaboratively by a team of editors who could be considered document hands. If we could determine who was responsible for which change in the different versions included in our example critical edition, this could be encoded as follows:
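A minimal sketch of hand attribution, assuming the witness identifiers p2 to p5. The hand identifiers ed1, ed2, and ed3 are hypothetical placeholders for hands assumed to be declared in the TEI header, and the attributions themselves are invented for illustration:

```xml
<app>
  <!-- ed1, ed2, ed3: hypothetical hand identifiers -->
  <rdg wit="#p2" hand="#ed1">Chapter 2<lb/></rdg>
  <rdg wit="#p3" hand="#ed2">Chapter 2<lb/></rdg>
  <rdg wit="#p4" hand="#ed2">2</rdg>
  <rdg wit="#p5" hand="#ed3">v<lb/></rdg>
</app>
A
<app>
  <rdg wit="#p2" hand="#ed1">GENTLE INTRODUCTION TO SGML</rdg>
  <rdg wit="#p3" hand="#ed2">Gentle Introduction to SGML</rdg>
  <rdg wit="#p4" hand="#ed2">Gentle Introduction to XML</rdg>
  <rdg wit="#p5" hand="#ed3">Gentle Introduction to XML</rdg>
</app>
```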
Of course this attribution is subject to a greater or lesser deal of interpretation (especially in this contrived example). Therefore, it makes sense to indicate who is responsible for this interpretation. This can be expressed in a @resp attribute, which can point to an individual responsible for some aspects of the digital edition, as identified in the TEI header (see Module 2: The TEI Header, section 3.1.1). As always, the @resp attribute applies to all aspects of the element it is attached to, and can equally be used to indicate the responsibility for an unsure transcription of a reading. As the hand attribution in the previous example can be considered quite putative, it makes sense to provide responsibility information as well:
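A minimal sketch for the first apparatus entry, assuming the witness identifiers p2 to p5. The hand identifiers ed1 to ed3 and the identifier enc1 for the person responsible for the attribution are hypothetical, and are assumed to be declared in the TEI header:

```xml
<app>
  <!-- @hand: who wrote the reading; @resp: who made that attribution -->
  <rdg wit="#p2" hand="#ed1" resp="#enc1">Chapter 2<lb/></rdg>
  <rdg wit="#p3" hand="#ed2" resp="#enc1">Chapter 2<lb/></rdg>
  <rdg wit="#p4" hand="#ed2" resp="#enc1">2</rdg>
  <rdg wit="#p5" hand="#ed3" resp="#enc1">v<lb/></rdg>
</app>
```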
Using attributes on <rdg> carries the risk of overgeneralisation, as in the following example:
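A deliberately incorrect sketch, assuming the witness identifiers p2 to p5 and the hypothetical hand identifiers ed1, ed2, and ed3, with p3 supposedly written by ed2 and p5 by ed3. Collapsing readings while keeping a single @hand value attaches that hand to every witness listed in @wit:

```xml
<app>
  <!-- INCORRECT: also claims hand ed1 for p3 -->
  <rdg wit="#p2 #p3" hand="#ed1">Chapter 2<lb/></rdg>
  <rdg wit="#p4" hand="#ed2">2</rdg>
  <rdg wit="#p5" hand="#ed3">v<lb/></rdg>
</app>
A
<app>
  <rdg wit="#p2" hand="#ed1">GENTLE INTRODUCTION TO SGML</rdg>
  <rdg wit="#p3" hand="#ed2">Gentle Introduction to SGML</rdg>
  <!-- INCORRECT: also claims hand ed2 for p5 -->
  <rdg wit="#p4 #p5" hand="#ed2">Gentle Introduction to XML</rdg>
</app>
```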
This example is incorrect because the first reading of the first apparatus entry overgeneralises the hand information for the p3 witness, and the last reading of the last entry incorrectly attributes the hand information for the p5 witness. Witness-specific information of this kind can be given, however, in a dedicated <witDetail> element, which is intended to provide more information about a specific reading in an apparatus entry. It must have a @wit attribute, identifying the specific text witness it provides more information for. In order to anchor it to a specific <rdg> element, a @target attribute can be used to point to the @xml:id of the <rdg> element concerned. This implies that the reading concerned must be formally identified with an @xml:id attribute. For example, the previous example could be corrected as:
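A minimal sketch of such a correction, assuming the witness identifiers p2 to p5 and the hypothetical hand identifiers ed1, ed2, and ed3. The reading identifiers rdg1 and rdg2 and the prose inside <witDetail> are likewise illustrative:

```xml
<app>
  <rdg xml:id="rdg1" wit="#p2 #p3">Chapter 2<lb/></rdg>
  <!-- One <witDetail> per witness, each anchored to the shared reading -->
  <witDetail target="#rdg1" wit="#p2">Attributed to hand ed1.</witDetail>
  <witDetail target="#rdg1" wit="#p3">Attributed to hand ed2.</witDetail>
  <rdg wit="#p4" hand="#ed2">2</rdg>
  <rdg wit="#p5" hand="#ed3">v<lb/></rdg>
</app>
A
<app>
  <rdg wit="#p2" hand="#ed1">GENTLE INTRODUCTION TO SGML</rdg>
  <rdg wit="#p3" hand="#ed2">Gentle Introduction to SGML</rdg>
  <rdg xml:id="rdg2" wit="#p4 #p5">Gentle Introduction to XML</rdg>
  <witDetail target="#rdg2" wit="#p4">Attributed to hand ed2.</witDetail>
  <witDetail target="#rdg2" wit="#p5">Attributed to hand ed3.</witDetail>
</app>
```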
The <witDetail> element is a specialised type of <note>, which means it can occur in many places in the document: either inline at the place of the reading needing further specification, or grouped together elsewhere in the document. The TEI Guidelines recommend placing this element inside <app>, immediately after the <lem> or <rdg> element it provides more information for.
Summary
Lemma (<lem>) and readings (<rdg>) can be further qualified by means of attributes. The @resp attribute can be used to identify the person responsible for the encoding of the reading, while the document hand responsible for that particular reading can be referred to in a @hand attribute. When more detailed information is to be given for a particular reading in a particular text witness, this can be done in a <witDetail> element, whose @wit attribute must point to the concerned text witness, and whose @target attribute can be used to point to the identification code of the affected reading(s).