Module 7: Critical Editing

5. Encoding Variation in Texts

After this discussion of the encoding of textual variation itself, it is time to have a look at the bigger picture: how do you integrate these variants into an electronic critical edition? The TEI Guidelines provide 3 different mechanisms for integrating apparatus entries in the encoding of texts (don’t let the names intimidate you):

location-referenced method: apparatus entries are linked to the identified text blocks in a base text that contain the respective lemmas [I, E]
double end-point attachment method: apparatus entries are linked to explicitly identified start and end positions in a base text [I, E]
parallel segmentation method: apparatus entries are encoded inside a transcription of the common (invariant) text of all text witnesses [I]

In this overview, the [I] and [E] labels indicate where an apparatus encoded with that method can be physically located with regard to the transcription of the (base) text it is linked to:

[E]: external apparatus: the apparatus is located outside the transcription of a base text, either in some other part of the TEI document containing the transcription, or in a physically distinct document
→ location-referenced, double end-point attachment
[I]: internal apparatus: each apparatus entry is located inline in the transcription of a (base) text, at the place where the variant occurs
→ location-referenced, double end-point attachment, parallel segmentation

The method chosen and the physical location of the apparatus must be encoded in the TEI Header, in the <variantEncoding> element inside the <encodingDesc> section. This is an empty element with two mandatory attributes (see Module 2: The TEI Header, section 3.2.7):

@method: indicates the method of linking the critical apparatus to the text: either "location-referenced", "double-end-point", or "parallel-segmentation".
@location: indicates the location of the critical apparatus with regard to the text: either "external" or "internal".
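
As a minimal sketch of such a declaration, the fragment below shows what the header entry for a parallel segmentation apparatus might look like. Only the element names and attribute values come from the description above; the rest of the header is left out.

```xml
<encodingDesc>
  <!-- method: how the apparatus is linked to the text;
       location: whether the apparatus sits inside or outside the transcription -->
  <variantEncoding method="parallel-segmentation" location="internal"/>
</encodingDesc>
```

Since the parallel segmentation method only allows an inline apparatus, "internal" is the only @location value that can accompany it.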

Summary

The TEI Guidelines offer 3 methods for linking the critical apparatus to the text. The method chosen must be documented in the <encodingDesc> section of the TEI header, in a special <variantEncoding> element. This is an empty element with 2 mandatory attributes. The @method attribute specifies the method of linking the apparatus to the text (either "location-referenced", "double-end-point", or "parallel-segmentation"). The @location attribute specifies the location of the apparatus relative to the text (either "external" or "internal").

5.1. The Location-Referenced Method

The location-referenced method links an apparatus entry to a base text by anchoring it to the text structure in the base text where the variant occurs. This can be done either internally (inside the running text), or externally (outside the running text).

In an internal location-referenced apparatus, the apparatus entries are encoded within the text structures in which the variants occur. The exact location, however, is unimportant. For example, the second paragraph could be encoded as follows:

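<TEI xmlns="http://www.tei-c.org/ns/1.0">

<teiHeader>

<!-- ... -->

<encodingDesc>

<variantEncoding method="location-referenced" location="internal"/>

</encodingDesc>

</teiHeader>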

<text>

<body>

<p>The encoding scheme defined by these Guidelines is

<app>

</app>

formulated

<app>

<rdg wit="#p4">either</rdg>

</app>

as an application of the Extensible

<app>

<rdg wit="#p2 #p3">a system known as the Standard Generalized</rdg>

<rdg wit="#p4">the ISO Standard Generalized</rdg>

</app>

Markup Language (XML) (Bray et al. (eds.) (2006)). XML is widely used

<app>

<rdg wit="#p2">(SGML).

<note>

<bibl>

<editor>International Organization for Standardization</editor>

<title>ISO 8879: Information processing--Text and office systems--Standard Generalized Mark-up Language (SGML)</title>

, ([

<pubPlace>Geneva</pubPlace>

).</bibl>

Although widely said to be short for the surnames of its progenitors, the official expansion of this abbreviation is "Standard Generalized Markup Language."</note>

SGML is an international standard</rdg>

<rdg wit="#p3">(SGML).

<note>

<bibl>

<editor>International Organization for Standardization</editor>

<title>ISO 8879: Information processing - Text and office systems - Standard Generalized Markup Language (SGML)</title>

, ([

<pubPlace>Geneva</pubPlace>

)</bibl>

</note>

SGML is an international standard</rdg>

<rdg wit="#p4">(SGML)

<note>

<bibl>

<editor>International Organization for Standardization</editor>

<title>ISO 8879: Information processing - Text and office systems - Standard Generalized Markup Language (SGML)</title>

, ([

<pubPlace>Geneva</pubPlace>

)</bibl>

</note>

or of the more recently developed W3C Extensible Markup Language (XML)

<note>

<bibl>

<editor>World Wide Web Consortium</editor>

<title>Extensible Markup Language (XML) 1.0</title>

, available from

</bibl>

</note>

. Both SGML and XML are widely-used</rdg>

</app>

for the definition of device-independent, system-independent methods of storing and processing

<app>

<rdg wit="#p4">storing and processing</rdg>

<rdg wit="#p2 #p3">representing</rdg>

</app>

texts in electronic form. It is now also the interchange and communication format used by many applications on the World Wide Web. In

<app>

<rdg wit="#p2 #p3">. This chapter presents a brief tutorial guide to its main features, for those readers who have not encountered it before. For a more technical account of TEI practice in using</rdg>

<rdg wit="#p4">; XML being in fact a simplification or derivation of SGML. In the present chapter we introduce informally the basic concepts underlying such markup languages and attempt to explain to</rdg>

</app>

the present chapter we informally introduce some of its basic concepts and attempt to explain to the reader encountering them for the first time how and why they are

<app>

<rdg wit="#p2">SGML standard, see chapter 30, "TEI Conformance," [in separate fascicle]; for a more technical description of the subset of SGML</rdg>

<rdg wit="#p3">SGML standard, see chapter 28, "Conformance," on page 727. For a more technical description of the subset of SGML</rdg>

<rdg wit="#p4">reader encountering them for the first time how they are actually used in the TEI

<app>

<rdg wit="#p2 #p3 #p4">encoding</rdg>

</app>

scheme. Except where the two are explicitly distinguished, references to XML in what follows may be understood to apply equally well to the TEI usage of SGML. For a more technical account of TEI practice see chapter 28

<hi>Conformance</hi>

; for a more technical description of the subset of SGML</rdg>

</app>

used in

<app>

</app>

the TEI scheme. More detailed technical accounts of TEI practice in this respect are provided in chapters

<hi>23. Using the TEI</hi>

,

<hi>1. The TEI Infrastructure</hi>

, and

<hi>22. Documentation Elements</hi>

of these Guidelines.

<app>

<rdg wit="#p2">, see chapter 39, "Formal Grammar for the TEI-Interchange-Format Subset of SGML," [in separate fascicle]</rdg>

<rdg wit="#p3">, see chapter 39, "Formal Grammar for the TEI-Interchange-Format Subset of SGML," on page 1247</rdg>

<rdg wit="#p4">, see chapter 39

<hi>Formal Grammar for the TEI-Interchange-Format Subset of SGML</hi>

</rdg>

</app>

</p>

</body>

</text>

</TEI>

Example 22. Encoding variation with an internal “location-referenced” apparatus.
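
Because the example above is long, the pattern may be easier to see in a reduced sketch. The sentence and the witness siglum #B below are invented for illustration and are not part of the edition discussed here. The base text is transcribed in full, and the apparatus entry sits somewhere inside the same <p> element.

```xml
<p>The quick brown fox jumps over the lazy dog.
  <!-- Hypothetical entry: witness B reads "swift" somewhere in this paragraph.
       Nothing in the markup says which word of the base text it replaces. -->
  <app>
    <rdg wit="#B">swift</rdg>
  </app>
</p>
```

The entry is tied to the paragraph only by being its child, which is why its exact position inside the paragraph does not matter.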

Notice how the apparatus entries can occur anywhere, as long as they are inside the text structure (in this case, the <p> element) that contains their variants. The same method can be used for an external apparatus, in which the textual variants are encoded either at a different place in the TEI document that contains the base text, or in a physically distinct TEI document. In such an external apparatus, each apparatus entry must have a specific attribute: @loc. Its value should refer to the canonical reference of the text structure that contains the variants concerned. In an external apparatus, the previous example could look as follows:

Note

Notice how the @loc attribute does not refer to an @xml:id value of the text structure concerned, but to its “canonical reference.” For more information, see the documentation of the <app> element, and section 2.3.5 The Reference System Declaration of the TEI Guidelines.

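<TEI xmlns="http://www.tei-c.org/ns/1.0">

<teiHeader>

<!-- ... -->

<encodingDesc>

<variantEncoding method="location-referenced" location="external"/>

</encodingDesc>

</teiHeader>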

<text>

<body>

<p n="par2">The encoding scheme defined by these Guidelines is formulated as an application of the Extensible Markup Language (XML) (Bray et al. (eds.) (2006)). XML is widely used for the definition of device-independent, system-independent methods of storing and processing texts in electronic form. It is now also the interchange and communication format used by many applications on the World Wide Web. In the present chapter we informally introduce some of its basic concepts and attempt to explain to the reader encountering them for the first time how and why they are used in the TEI scheme. More detailed technical accounts of TEI practice in this respect are provided in chapters

<hi>23. Using the TEI</hi>

,

<hi>1. The TEI Infrastructure</hi>

, and

<hi>22. Documentation Elements</hi>

of these Guidelines.</p>

</body>

<back>

<div>

<p>

<app loc="par2">

</app>

<app loc="par2">

<rdg wit="#p4">either</rdg>

</app>

<app loc="par2">

<rdg wit="#p2 #p3">a system known as the Standard Generalized</rdg>

<rdg wit="#p4">the ISO Standard Generalized</rdg>

</app>

</p>

</div>

</back>

</text>

</TEI>

Example 23. Encoding variation with an external “location-referenced” apparatus.
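
The link between an external entry and its text structure can be reduced to the following sketch. The sentence, the siglum #B, and the reference value "1" are hypothetical, and the sketch assumes that the edition's reference system identifies paragraphs by their @n value.

```xml
<text>
  <body>
    <!-- @n carries the canonical reference that @loc cites below -->
    <p n="1">The quick brown fox jumps over the lazy dog.</p>
  </body>
  <back>
    <div>
      <p>
        <!-- Hypothetical entry for paragraph 1: witness B reads "swift" -->
        <app loc="1">
          <rdg wit="#B">swift</rdg>
        </app>
      </p>
    </div>
  </back>
</text>
```

The value of @loc is a reference, not a pointer: it has no "#" prefix and does not depend on an @xml:id in the base text.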

In these examples, the P5 version of the TEI Guidelines is adopted as the base text to which the apparatus entries are linked. This is the sole text witness for which a full transcription is provided in the electronic critical edition using this reference method. Because of this, the reading of this base text may be omitted from the <app> elements, as in the examples above. Due to the implicit nature of the location references of the apparatus entries, it may be hard to identify the exact places with textual variation. Therefore, the reading of the base text may equally be provided in the apparatus entries inside a <lem> element; combined with string matching, this can help the user of the edition to find out where the actual variation occurs (but notice the difficulty with apparatus entries encoding additions to the base text, as in the second <app> element of the following example):

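<TEI xmlns="http://www.tei-c.org/ns/1.0">

<teiHeader>

<!-- ... -->

<encodingDesc>

<variantEncoding method="location-referenced" location="external"/>

</encodingDesc>

</teiHeader>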

<text>

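<body>

<p n="par2">The encoding scheme defined by these Guidelines is formulated as an application of the Extensible Markup Language (XML) (Bray et al. (eds.) (2006)). XML is widely used for the definition of device-independent, system-independent methods of storing and processing texts in electronic form. It is now also the interchange and communication format used by many applications on the World Wide Web. In the present chapter we informally introduce some of its basic concepts and attempt to explain to the reader encountering them for the first time how and why they are used in the TEI scheme. More detailed technical accounts of TEI practice in this respect are provided in chapters

<hi>23. Using the TEI</hi>

,

<hi>1. The TEI Infrastructure</hi>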

, and

<hi>22. Documentation Elements</hi>

of these Guidelines.</p>

</body>

<back>

<div>

<p>

<app loc="par2">

</app>

<app loc="par2">

<lem wit="#p5"/>

<rdg wit="#p4">either</rdg>

</app>

<app loc="par2">

<lem wit="#p5">the Extensible</lem>

<rdg wit="#p2 #p3">a system known as the Standard Generalized</rdg>

<rdg wit="#p4">the ISO Standard Generalized</rdg>

</app>

</p>

</div>

</back>

</text>

</TEI>

Example 24. Including the base text in an external “location-referenced” apparatus.
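
The difference between a lemma that can be found by string matching and one that cannot may be sketched as follows. The readings, the sigla #A and #B, and the reference value "1" are invented for illustration.

```xml
<p>
  <!-- A substitution: the string "quick" can be searched for in paragraph 1 -->
  <app loc="1">
    <lem wit="#A">quick</lem>
    <rdg wit="#B">swift</rdg>
  </app>
  <!-- An addition in witness B: the lemma is empty, so there is no string
       to search for and the position of the addition stays undetermined -->
  <app loc="1">
    <lem wit="#A"/>
    <rdg wit="#B">very</rdg>
  </app>
</p>
```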

Summary

The location-referenced method uses an implicit anchoring technique to link the apparatus entries with the base text. In an internal apparatus, the apparatus entries can occur anywhere inside the text structure in which their variants occur. In an external apparatus, the link is established through the use of the @loc attribute on the <app> elements, which points to a canonical reference of the relevant text structures in the base text.

5.2. The Double End-Point Attachment Method

The double end-point attachment method links an apparatus entry to a base text by anchoring it to the exact start and end positions of its lemma in the base text. This can be done either internally (inside the running text), or externally (outside the running text).

In an internal double end-point attachment apparatus, the apparatus entries occur immediately after their lemma in the transcription of the base text. A specific @from attribute must be used to point exactly at the starting point of the preceding lemma in the text. Its value should be a pointer to the formal identification code of an element in the base text that corresponds to the start of the lemma. If this point coincides with the start of an existing text structure, the identification code of its element may be used; otherwise, an empty <anchor> element must be inserted in the base text, whose sole purpose is to provide a formal code in its @xml:id attribute. For example, an internal double end-point attachment apparatus for the example in the previous section could look as follows:

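<TEI xmlns="http://www.tei-c.org/ns/1.0">

<teiHeader>

<!-- ... -->

<encodingDesc>

<variantEncoding method="double-end-point" location="internal"/>

</encodingDesc>

</teiHeader>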

<text>

<body>

<p>The encoding scheme defined by these Guidelines

is

</app>

formulated

<rdg wit="#p4">either</rdg>

</app>

as an application of the

Extensible

<rdg wit="#p2 #p3">a system known as the Standard Generalized</rdg>

<rdg wit="#p4">the ISO Standard Generalized</rdg>

</app>

Markup Language

(XML) (Bray et al. (eds.) (2006)). XML is widely used

<rdg wit="#p2">(SGML).

<note>

<bibl>

<editor>International Organization for Standardization</editor>

<title>ISO 8879: Information processing--Text and office systems--Standard Generalized Mark-up Language (SGML)</title>

, ([

<pubPlace>Geneva</pubPlace>

).</bibl>

Although widely said to be short for the surnames of its progenitors, the official expansion of this abbreviation is "Standard Generalized Markup Language."</note>

SGML is an international standard</rdg>

<rdg wit="#p3">(SGML).

<note>

<bibl>

<editor>International Organization for Standardization</editor>

<title>ISO 8879: Information processing - Text and office systems - Standard Generalized Markup Language (SGML)</title>

, ([

<pubPlace>Geneva</pubPlace>

)</bibl>

</note>

SGML is an international standard</rdg>

<rdg wit="#p4">(SGML)

<note>

<bibl>

<editor>International Organization for Standardization</editor>

<title>ISO 8879: Information processing - Text and office systems - Standard Generalized Markup Language (SGML)</title>

, ([

<pubPlace>Geneva</pubPlace>

)</bibl>

</note>

or of the more recently developed W3C Extensible Markup Language (XML)

<note>

<bibl>

<editor>World Wide Web Consortium</editor>

<title>Extensible Markup Language (XML) 1.0</title>

, available from

</bibl>

</note>

. Both SGML and XML are widely-used</rdg>

</app>

for the definition of device-independent, system-independent methods of

storing and processing

<rdg wit="#p2 #p3">representing</rdg>

<rdg wit="#p4">storing and processing</rdg>

</app>

texts in electronic form

. It is now also the interchange and communication format used by many applications on the World Wide Web. In

</app>

the

present chapter we informally introduce some of its basic concepts and attempt to explain to the reader encountering them for the first time how and why they are

<rdg wit="#p2">SGML standard, see chapter 30, "TEI Conformance," [in separate fascicle]; for a more technical description of the subset of SGML</rdg>

<rdg wit="#p3">SGML standard, see chapter 28, "Conformance," on page 727. For a more technical description of the subset of SGML</rdg>

<rdg wit="#p4">reader encountering them for the first time how they are actually used in the TEI scheme. Except where the two are explicitly distinguished, references to XML in what follows may be understood to apply equally well to the TEI usage of SGML. For a more technical account of TEI practice see chapter 28

<hi>Conformance</hi>

; for a more technical description of the subset of SGML</rdg>

</app>

used

in

</app>

the TEI

<rdg wit="#p2 #p3 #p4">encoding</rdg>

</app>

scheme

. More detailed technical accounts of TEI practice in this respect are provided in chapters

<hi>23. Using the TEI</hi>

,

<hi>1. The TEI Infrastructure</hi>

, and

<hi>22. Documentation Elements</hi>

of these Guidelines

<rdg wit="#p2">, see chapter 39, "Formal Grammar for the TEI-Interchange-Format Subset of SGML," [in separate fascicle]</rdg>

<rdg wit="#p3">, see chapter 39, "Formal Grammar for the TEI-Interchange-Format Subset of SGML," on page 1247</rdg>

<rdg wit="#p4">, see chapter 39

<hi>Formal Grammar for the TEI-Interchange-Format Subset of SGML</hi>

</rdg>

</app>

.</p>

</body>

</text>

</TEI>

Example 25. Encoding variation with an internal “double end-point attachment” apparatus.
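
Reduced to a single variant, the internal form of this method can be sketched as below. The sentence, the siglum #B, and the identifier "lem1-start" are hypothetical.

```xml
<p>The
  <!-- Hypothetical anchor marking where the lemma "quick" begins -->
  <anchor xml:id="lem1-start"/>quick
  <!-- The entry follows its lemma directly; @from points back to the anchor -->
  <app from="#lem1-start">
    <rdg wit="#B">swift</rdg>
  </app>
  brown fox jumps over the lazy dog.</p>
```

No end point needs to be marked here, because the position of the <app> element itself shows where the lemma stops.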

An external double end-point attachment apparatus is very similar to its internal equivalent, apart from the fact that the apparatus entries are located outside of the running text. Due to this physical separation, the need arises to explicitly point out the end point of the lemma in the base text as well (again, either using the @xml:id attribute of an existing text structure, or that of an explicit <anchor> element). In order to refer to this end point of the textual variation, the <app> element must have another attribute: @to, pointing at the identification code of the relevant point in the base text. For example, an external apparatus for the previous example could look as follows:

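<TEI xmlns="http://www.tei-c.org/ns/1.0">

<teiHeader>

<!-- ... -->

<encodingDesc>

<variantEncoding method="double-end-point" location="external"/>

</encodingDesc>

</teiHeader>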

<text>

<body>

<p>The encoding scheme defined by these Guidelines

is

formulated

as an application of the

Extensible

Markup Language

(XML) (Bray et al. (eds.) (2006)). XML is widely used

for the definition of device-independent, system-independent methods of

storing and processing

texts in electronic form

. It is now also the interchange and communication format used by many applications on the World Wide Web. In

the

present chapter we informally introduce some of its basic concepts and attempt to explain to the reader encountering them for the first time how and why they are

used

in

the TEI

scheme

. More detailed technical accounts of TEI practice in this respect are provided in chapters

<hi>23. Using the TEI</hi>

,

<hi>1. The TEI Infrastructure</hi>

, and

<hi>22. Documentation Elements</hi>

of these Guidelines

.</p>

</body>

<back>

<div>

<p>

</app>

<rdg wit="#p4">either</rdg>

</app>

<rdg wit="#p2 #p3">a system known as the Standard Generalized</rdg>

<rdg wit="#p4">the ISO Standard Generalized</rdg>

</app>

</p>

</div>

</back>

</text>

</TEI>

Example 26. Encoding variation with an external “double end-point attachment” apparatus.
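
The external form can be sketched in the same reduced way. Again, the sentence, the siglum #B, and the identifiers "lem1-start" and "lem1-end" are invented for illustration.

```xml
<text>
  <body>
    <!-- Two hypothetical anchors enclose the lemma "quick" -->
    <p>The <anchor xml:id="lem1-start"/>quick<anchor xml:id="lem1-end"/> brown fox jumps over the lazy dog.</p>
  </body>
  <back>
    <div>
      <p>
        <!-- @from and @to point to the start and end of the lemma in the base text -->
        <app from="#lem1-start" to="#lem1-end">
          <rdg wit="#B">swift</rdg>
        </app>
      </p>
    </div>
  </back>
</text>
```

Unlike @loc, both attributes are pointers, so each value starts with "#" and must match an @xml:id in the base text.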

Of course, here too, the lemma of the base text can be explicitly recorded in the apparatus entries as well:

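<TEI xmlns="http://www.tei-c.org/ns/1.0">

<teiHeader>

<!-- ... -->

<encodingDesc>

<variantEncoding method="double-end-point" location="external"/>

</encodingDesc>

</teiHeader>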

<text>

<body>

<p>The encoding scheme defined by these Guidelines

is

formulated

as an application of the

Extensible

Markup Language

(XML) (Bray et al. (eds.) (2006)). XML is widely used

for the definition of device-independent, system-independent methods of

storing and processing

texts in electronic form

. It is now also the interchange and communication format used by many applications on the World Wide Web. In

the

present chapter we informally introduce some of its basic concepts and attempt to explain to the reader encountering them for the first time how and why they are

used

in

the TEI

scheme

. More detailed technical accounts of TEI practice in this respect are provided in chapters

<hi>23. Using the TEI</hi>

,

<hi>1. The TEI Infrastructure</hi>

, and

<hi>22. Documentation Elements</hi>

of these Guidelines

.</p>

</body>

<back>

<div>

<p>

</app>

<rdg wit="#p4">either</rdg>

</app>

<lem wit="#p5">the Extensible</lem>

<rdg wit="#p2 #p3">a system known as the Standard Generalized</rdg>

<rdg wit="#p4">the ISO Standard Generalized</rdg>

</app>

</p>

</div>

</back>

</text>

</TEI>

Example 27. Including the base text in an external “double end-point attachment” apparatus.

Summary

The double end-point attachment method provides a means to explicitly anchor an apparatus entry to the exact position where its lemma in the base text differs from one of the other readings. In an internal apparatus, the apparatus entries should be placed immediately after the base text’s lemma. Each <app> element must have a @from attribute pointing to the @xml:id identification code of an element indicating the start of the lemma in the base text. In an external apparatus, the apparatus entries must formally identify the end point of the lemma as well, using a @to attribute that points to the @xml:id identification code of an element indicating the end of the lemma in the base text. If no other elements are available, these @xml:id attributes may be encoded on empty <anchor> elements inside the base text.

5.3. The Parallel Segmentation Method

Unlike the other two methods, the parallel segmentation method only allows for the encoding of an inline apparatus. Like an internal double end-point attachment apparatus entry, a parallel segmented apparatus entry is encoded inline, at the exact place where the variation occurs. However, a parallel segmented apparatus entry encodes all readings as equal variants, thus interweaving the common (invariant) text of all text witnesses with apparatus entries that contain all the alternative readings. In this sense, the notions of a base text and lemma become obsolete: all text that is common is shared; all varying text is encoded as a separate reading in an apparatus entry. Because of this exact anchoring at the place of occurrence in the “palimpsest” text, no specific attributes are necessary for the <app> element. For example, the preceding example can be expressed as a parallel segmented apparatus as follows:

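<TEI xmlns="http://www.tei-c.org/ns/1.0">

<teiHeader>

<!-- ... -->

<encodingDesc>

<variantEncoding method="parallel-segmentation" location="internal"/>

</encodingDesc>

</teiHeader>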

<text>

<body>

<p>The encoding scheme defined by these Guidelines

<app>

</app>

formulated

<app>

<rdg wit="#p4">either </rdg>

</app>

as an application of

<app>

<rdg wit="#p2 #p3">a system known as the Standard Generalized</rdg>

<rdg wit="#p4">the ISO Standard Generalized</rdg>

<rdg wit="#p5">the Extensible</rdg>

</app>

Markup Language

<app>

<rdg wit="#p2">(SGML).

<note>

<bibl>

<editor>International Organization for Standardization</editor>

<title>ISO 8879: Information processing--Text and office systems--Standard Generalized Mark-up Language (SGML)</title>

, ([

<pubPlace>Geneva</pubPlace>

).</bibl>

Although widely said to be short for the surnames of its progenitors, the official expansion of this abbreviation is "Standard Generalized Markup Language."</note>

SGML is an international standard</rdg>

<rdg wit="#p3">(SGML).

<note>

<bibl>

<editor>International Organization for Standardization</editor>

<title>ISO 8879: Information processing - Text and office systems - Standard Generalized Markup Language (SGML)</title>

, ([

<pubPlace>Geneva</pubPlace>

)</bibl>

</note>

SGML is an international standard</rdg>

<rdg wit="#p4">(SGML)

<note>

<bibl>

<editor>International Organization for Standardization</editor>

<title>ISO 8879: Information processing - Text and office systems - Standard Generalized Markup Language (SGML)</title>

, ([

<pubPlace>Geneva</pubPlace>

)</bibl>

</note>

or of the more recently developed W3C Extensible Markup Language (XML)

<note>

<bibl>

<editor>World Wide Web Consortium</editor>

<title>Extensible Markup Language (XML) 1.0</title>

, available from

</bibl>

</note>

. Both SGML and XML are widely-used</rdg>

<rdg wit="#p5">(XML) (Bray et al. (eds.) (2006)). XML is widely used</rdg>

</app>

for the definition of device-independent, system-independent methods of

<app>

<rdg wit="#p2 #p3">representing</rdg>

<rdg wit="#p4 #p5">storing and processing</rdg>

</app>

texts in electronic form

<app>

<rdg wit="#p5">. It is now also the interchange and communication format used by many applications on the World Wide Web. In</rdg>

</app>

the

<app>

<rdg wit="#p2">SGML standard, see chapter 30, "TEI Conformance," [in separate fascicle]; for a more technical description of the subset of SGML</rdg>

<rdg wit="#p3">SGML standard, see chapter 28, "Conformance," on page 727. For a more technical description of the subset of SGML</rdg>

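<rdg wit="#p4">reader encountering them for the first time how they are actually used in the TEI scheme. Except where the two are explicitly distinguished, references to XML in what follows may be understood to apply equally well to the TEI usage of SGML. For a more technical account of TEI practice see chapter 28

<hi>Conformance</hi>

; for a more technical description of the subset of SGML</rdg>

<rdg wit="#p5">present chapter we informally introduce some of its basic concepts and attempt to explain to the reader encountering them for the first time how and why they are</rdg>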

</app>

used

<app>

</app>

the TEI

<app>

<rdg wit="#p2 #p3 #p4">encoding</rdg>

</app>

scheme

<app>

<rdg wit="#p2">, see chapter 39, "Formal Grammar for the TEI-Interchange-Format Subset of SGML," [in separate fascicle]</rdg>

<rdg wit="#p3">, see chapter 39, "Formal Grammar for the TEI-Interchange-Format Subset of SGML," on page 1247</rdg>

<rdg wit="#p4">, see chapter 39

<hi>Formal Grammar for the TEI-Interchange-Format Subset of SGML</hi>

</rdg>

<rdg wit="#p5">. More detailed technical accounts of TEI practice in this respect are provided in chapters

<hi>23. Using the TEI</hi>

,

<hi>1. The TEI Infrastructure</hi>

, and

<hi>22. Documentation Elements</hi>

of these Guidelines</rdg>

</app>

.</p>

</body>

</text>

</TEI>

Example 28. Encoding variation with an (internal) “parallel segmentation” apparatus.
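
A single variant encoded with this method can be sketched as follows. The sentence and the sigla #A and #B are hypothetical.

```xml
<p>The
  <!-- Every witness gets its own reading; none of them is singled out as the lemma -->
  <app>
    <rdg wit="#A">quick</rdg>
    <rdg wit="#B">swift</rdg>
  </app>
  brown fox jumps over the lazy dog.</p>
```

The text of any one witness can be recovered by reading the shared text and, at each <app> element, choosing the <rdg> whose @wit value names that witness.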

Summary

The parallel segmentation method encodes all variants as equal readings inside apparatus entries that are located at their precise place of occurrence in all texts. This results in a single text that offers an integrated view of both the common text and the textual variants. Because of this, the notions of base text and lemma become irrelevant.

