Module 1: Common Structure, Elements, and Attributes
5. Global Attributes
Like any XML element, TEI elements can carry one or more attributes, which provide additional information and function as qualifiers and quantifiers of the element. The full list of all attributes defined in TEI is available as Appendix D Attributes of the TEI Guidelines. Some of these attributes can occur on all TEI elements: they are defined as “global attributes” in the att.global attribute class and its subclasses. Not all of those subclasses are present in every TEI schema (see Module 8: Customising TEI, ODD, Roma, section 5.1 for more information on including TEI modules in a TEI schema), but a number of attribute classes are always present in any TEI schema, since they are defined in the tei module. Together, they define 11 global attributes, available on any TEI element:
- att.global
  - @xml:id: provides a unique identifier for an element.
  - @n: provides a number or other label for an element, which does not need to be unique within the document.
  - @xml:lang: indicates the language of an element using a “tag” generated according to BCP 47.
  - @xml:base: provides a base URI reference with which applications can resolve relative URI references into absolute URI references.
  - @xml:space: signals an intention about how white space should be managed by applications.
- att.global.rendition
  - @rend: indicates how the element was rendered or presented in the source text.
  - @style: contains an expression in some formal style definition language which defines the rendering or presentation used for this element in the source text.
  - @rendition: points to a description of the rendering or presentation used for this element in the source text.
- att.global.responsibility
  - @cert: indicates the encoder’s certainty concerning an intervention or interpretation represented by the markup.
  - @resp: indicates the person or agency considered responsible for some aspect of the information encoded by an element.
- att.global.source
  - @source: specifies the source from which some aspect of this element is drawn.
5.1. @xml:id
The @xml:id attribute provides a unique identifier for the element bearing the attribute. The identifier must be unique in the whole XML document. If another element in the XML document bears the same identifier as the value of this attribute, a validating XML parser will signal an error. In conformance with the World Wide Web Consortium’s XML Recommendations, the attribute value must be a legal name, which means that it must start with a letter or the underscore character and contain no characters other than letters, digits, hyphens, underscores, full stops, and certain combining and extension characters. The use of the colon in a unique identifier is forbidden, as it has the specific purpose of indicating namespace prefixes in XML.
Challenge
Which one of the following examples demonstrates a correct use of @xml:id and why?
- <p xml:id="p:1">For the first time in twenty-five years, Dr Burt Diddledygook decided not to turn up to the annual meeting of the Royal Academy of Whoopledywhaa.</p><p xml:id="p:2">It was a sunny day in late September 1960 bang on noontime and Dr Burt was looking forward to a stroll in the park instead.</p><p xml:id="p:2">He hoped his fellow members of the Royal Academy weren't even going to notice his absence.</p>
- <p xml:id="1">For the first time in twenty-five years, Dr Burt Diddledygook decided not to turn up to the annual meeting of the Royal Academy of Whoopledywhaa.</p><p xml:id="2">It was a sunny day in late September 1960 bang on noontime and Dr Burt was looking forward to a stroll in the park instead.</p><p xml:id="3">He hoped his fellow members of the Royal Academy weren't even going to notice his absence.</p>
- <p xml:id="p1">For the first time in twenty-five years, Dr Burt Diddledygook decided not to turn up to the annual meeting of the Royal Academy of Whoopledywhaa.</p><p xml:id="p2">It was a sunny day in late September 1960 bang on noontime and Dr Burt was looking forward to a stroll in the park instead.</p><p xml:id="p3">He hoped his fellow members of the Royal Academy weren't even going to notice his absence.</p>
Solution
Example 3 demonstrates a correct use of the @xml:id attribute: each attribute value is unique, starts with a letter, and contains no illegal characters. The attribute values in example 1 are not unique ("p:2" occurs twice) and contain a colon, which is forbidden. The attribute values in example 2 start with a digit, which is not allowed.
5.2. @n
The @n attribute also provides a label for an element, but its value doesn’t need to be a legal XML name, nor does it need to be unique. @n values may therefore be repeated inside the XML document, and they may start with and contain any character. Typically, @n is used to number or label elements. All @n values in the following examples are legal:
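A minimal sketch of such values, using invented sample text in the spirit of the challenge above. The labels themselves ("Chapter One", "1a", "*") are hypothetical and only serve to show that spaces, leading digits, symbols, and repeated values are all acceptable:

```xml
<div n="Chapter One">
  <!-- a value containing a space -->
  <p n="1">Dr Burt Diddledygook decided not to turn up to the annual meeting.</p>
  <!-- a value starting with a digit -->
  <p n="1a">It was a sunny day in late September 1960.</p>
  <!-- the same value used twice: @n does not need to be unique -->
  <p n="*">He was looking forward to a stroll in the park.</p>
  <p n="*">He hoped nobody would notice his absence.</p>
</div>
```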
Although by no means mandatory, it often makes sense to enrich the structural units of a document (e.g., lines in a poem) with some sort of identification (in @xml:id) or reference mechanism (in @n). Of course, when dealing with complex and/or long documents, this labelling could become a rather demanding task in itself. Fortunately, this job can be done automatically by an XML processor, which can identify the sequential position of one element within another in an XML document without any additional tagging. Instead of manually providing mechanical references for a long poem or collection of poems, you could just as well instruct an XML processor either to enrich the TEI encoding by adding @xml:id or @n attributes with appropriate values, or to derive such reference systems automatically from your markup and present them while rendering the document (e.g., in an HTML version of a poem).
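As an illustration of the first option, the following XSLT 1.0 stylesheet is one possible sketch, not a prescribed method. It assumes that verse lines are encoded as <l> elements in the TEI namespace and that numbering should restart within each parent element. It copies the document unchanged, except that every <l> without an @n attribute receives one holding its position among its sibling lines:

```xml
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:tei="http://www.tei-c.org/ns/1.0">

  <!-- identity template: copy every node and attribute as is -->
  <xsl:template match="@*|node()">
    <xsl:copy>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

  <!-- verse lines without @n: add one, counting <l> siblings within the same parent -->
  <xsl:template match="tei:l[not(@n)]">
    <xsl:copy>
      <xsl:attribute name="n">
        <xsl:number count="tei:l" level="single"/>
      </xsl:attribute>
      <xsl:apply-templates select="@*|node()"/>
    </xsl:copy>
  </xsl:template>

</xsl:stylesheet>
```

Lines that already carry an @n attribute are left untouched by the second template, so existing manual labels are preserved.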
Reference
See section 3.10.2 Creating New Reference Systems of the TEI Guidelines for guidance on creating sensible reference systems for text structures.
5.3. @xml:lang
The language of the content of a given element may be documented as the value of an @xml:lang attribute. If it is not specified, the value is inherited from that of the immediately enclosing element. Therefore, it is simplest to specify the base language of a text on the <TEI> element and override that with @xml:lang attributes only for those elements with a different language.
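A minimal sketch of this pattern, with invented sample text and the header left out for brevity (so the fragment is well-formed but not a complete TEI document). The base language is declared once on <TEI>, and only the French phrase overrides it:

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0" xml:lang="en">
  <!-- teiHeader omitted for brevity -->
  <text>
    <body>
      <!-- this paragraph inherits xml:lang="en" from <TEI> -->
      <p>Dr Burt muttered <foreign xml:lang="fr">tant pis</foreign> and went for a stroll in the park instead.</p>
    </body>
  </text>
</TEI>
```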
Reference
The values for the @xml:lang attribute must be constructed in a uniform way as explained in section vi.1. Language identification of the TEI Guidelines.
5.4. @xml:base
Many TEI attributes take a URI reference as their value. Such references can be either absolute (starting with a scheme, such as http: or ftp:) or relative (consisting of a local file name, such as names.xml, a fragment identifier, such as #EV, or both). The @xml:base attribute can be used to set a context for all relative URI references appearing within the element on which the @xml:base attribute is specified. For example:
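A minimal sketch of such a setup. The @xml:base value ../xml/ is an assumption inferred from the resolved URI discussed in this section, and the sentence and the abbreviated name are invented for illustration:

```xml
<body xml:base="../xml/">
  <!-- the relative reference below is resolved against the base set on <body> -->
  <p>This module was proofread by <name ref="names.xml#EV">E.V.</name> before publication.</p>
</body>
```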
In this example, the relative URI names.xml#EV will be resolved against a folder named xml that sits next to the folder containing the electronic text in which the reference occurs. Hence, the URI reference will be evaluated as ../xml/names.xml#EV.
5.5. @xml:space
This global attribute provides a mechanism for indicating to systems processing an XML file how they should treat white space. It has two possible values: "default" (white space will most probably be normalised during processing) and "preserve" (white space should be preserved as is during processing).
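A minimal sketch of the "preserve" value, assuming an invented misprint with irregular spacing that the encoder wants processors to keep intact:

```xml
<choice>
  <!-- the three spaces inside <sic> should survive processing -->
  <sic xml:space="preserve">Whoopledy   whaa</sic>
  <corr>Whoopledywhaa</corr>
</choice>
```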
In this example, the @xml:space on the <sic> element specifies that the (unusual) spacing in the original form should be preserved when this document is being processed.
Note that the @xml:space attribute is rarely used in TEI documents, because such layout features are generally expressed more reliably and more descriptively with TEI elements such as <lb> or <space>, or with the renditional attributes described next.
5.6. @rend
The @rend attribute is used to document information about the physical appearance of the text in the source. In the following example, it is used to indicate that the title, the French phrase, and the name of the Royal Academy are printed in italics:
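A sketch of such an encoding, with an invented sentence and an invented title. The keyword "italic" is an assumption, since the values of @rend are not prescribed and a project is free to choose its own:

```xml
<p>In <title rend="italic">The Annual Report</title>, Dr Burt explained,
  <foreign xml:lang="fr" rend="italic">sans rancune</foreign>, why he had skipped
  the meeting of the <name rend="italic">Royal Academy of Whoopledywhaa</name>.</p>
```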
The value for @rend can take the form of a white space separated list of idiosyncratic keywords, which an XML processor can act upon when rendering the document. This means that multiple renditional features can be enumerated with @rend.
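For instance, a phrase printed both in bold and underlined could be sketched as follows, where "bold" and "underline" are hypothetical project-specific keywords:

```xml
<!-- two renditional features, separated by white space -->
<p>Dr Burt was <hi rend="bold underline">not</hi> going to attend.</p>
```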
5.7. @style
The @style attribute can also be used to document information about the physical appearance of the text in the source. Contrary to @rend, @style must express this information in some formal style definition language. This will most often be CSS, although others are possible as well. The name of that formal style definition language can be given in the <encodingDesc> section of the header, in a <styleDefDecl> element:
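A minimal sketch, assuming CSS as the style language and an invented sentence. The two fragments belong to different parts of the same TEI document (header and text), which is why they are shown separately:

```xml
<!-- in the header: declare the style definition language -->
<encodingDesc>
  <styleDefDecl scheme="css"/>
</encodingDesc>

<!-- in the text: describe the source's appearance with a CSS expression -->
<p>Dr Burt skipped the meeting of the
  <name style="font-style: italic;">Royal Academy of Whoopledywhaa</name>.</p>
```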
5.8. @rendition
Whereas the @rend and @style attributes document the appearance of text locally, i.e., attached to an element, the @rendition attribute points to a description of the rendering or appearance in the header (<teiHeader>), more specifically inside a <tagsDecl> element in the <encodingDesc> section. This description is given in free text or in a formal language inside a <rendition> element. This way, the description of a rendering needs to be given only once, and can then be referred to with @rendition attributes on elements in the text. The advantage of this system becomes clear when both @rendition and @rend are used for occurrences of a given element. While the former refers to an overall description of the appearance of that element in the source, the latter documents the local deviation from that generally imposed rendition.
In the following example, we see a description of the overall rendering of <hi> elements in a document, given in the <tagsDecl> element inside the <encodingDesc> section of <teiHeader>. The @gi attribute of <tagUsage> names the element for which the rendition described in <rendition> is documented. The formal namespace in which the elements described in <tagUsage> are defined must be specified in the @name attribute of a surrounding <namespace> element. The value of the @rendition attribute of <tagUsage> refers to <rendition> by way of the latter’s @xml:id attribute. This way, all <hi> elements inside <text> have the style defined as "italic" as their default rendition. The third occurrence of the <hi> element in the text documents a deviant rendition by means of the @rend attribute.
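A sketch that follows this description. The identifier "italic", the CSS expression, the prose inside <tagUsage>, the sample sentence, and the deviant value "bold" are all assumptions made for illustration, and the rest of the header is omitted:

```xml
<teiHeader>
  <!-- fileDesc omitted for brevity -->
  <encodingDesc>
    <tagsDecl>
      <!-- one shared description of the rendering -->
      <rendition xml:id="italic" scheme="css">font-style: italic;</rendition>
      <namespace name="http://www.tei-c.org/ns/1.0">
        <!-- <hi> elements point to that description as their default -->
        <tagUsage gi="hi" rendition="#italic">Highlighted words, printed in italics unless stated otherwise.</tagUsage>
      </namespace>
    </tagsDecl>
  </encodingDesc>
</teiHeader>
<text>
  <body>
    <!-- the third <hi> records a local deviation from the default -->
    <p>Dr Burt skipped the <hi>annual</hi> meeting for a <hi>stroll</hi> in the park,
      which was <hi rend="bold">most</hi> unlike him.</p>
  </body>
</text>
```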
5.9. @cert
The @cert attribute provides a method of indicating the encoder’s certainty concerning an intervention or interpretation represented by the markup. This can be done with an informal classification, such as "high", "medium", or "low", or more formal systems, such as a probability scale between "1" and "0".
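A minimal sketch with two competing readings of a hard-to-read word. The use of <unclear> inside <choice> and both readings are assumptions chosen for illustration:

```xml
<choice>
  <!-- the encoder considers the first reading much more likely -->
  <unclear cert="high">Whoopledywhaa</unclear>
  <unclear cert="low">Whoopledywhoo</unclear>
  <!-- with a probability scale, the values could be e.g. cert="0.8" and cert="0.2" -->
</choice>
```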
In this example, two alternatives are presented for the transcription of the original form, with an indication of the certainty in their respective @cert attributes.
5.10. @resp
The @resp attribute is used to indicate the person or agency considered responsible for some aspects of the information encoded by an element. This responsible party should be identified formally in an element with an @xml:id attribute, either in the same document, or elsewhere.
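A sketch of this mechanism within a single document. The identifier "ed1", the editor's name, and the corrected misprint are hypothetical:

```xml
<!-- in the header: the responsible party, formally identified with @xml:id -->
<respStmt xml:id="ed1">
  <resp>corrections</resp>
  <name>First Editor</name>
</respStmt>

<!-- in the text: the correction is attributed to that party -->
<choice>
  <sic>Whopledywhaa</sic>
  <corr resp="#ed1">Whoopledywhaa</corr>
</choice>
```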
5.11. @source
The @source attribute is used to indicate the source of an element and its content, for example by pointing to a bibliographic citation.
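A minimal sketch, assuming an invented bibliographic entry with the hypothetical identifier "diddledygook1960" and an invented quotation:

```xml
<!-- the citation, e.g. in a bibliography in the header or back matter -->
<bibl xml:id="diddledygook1960">
  <author>Burt Diddledygook</author>,
  <title>Notes on a Stroll in the Park</title>,
  <date>1960</date>.
</bibl>

<!-- the quotation points to the citation it is drawn from -->
<quote source="#diddledygook1960">It was a sunny day in late September.</quote>
```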