Customising TEI, ODD, Roma

2. Customising TEI: why and how?

This TBE tutorial module starts like any other: from a concrete text example. This time, we'll consider Lewis Carroll's Alice's Adventures in Wonderland. To get a sense of the structure of the document, here are the first pages:
A typical page looks like this:
As always, the first step in approaching the encoding of a text is a document analysis. Here, we are dealing with a prose work consisting of chapters.

Challenge

Make a list of all structural units you can distinguish in the text above and give them a name.

Some of the significant structural elements to be distinguished are these:
  • The document
  • The title page
  • Document title
  • Chapters
  • Headings
  • (Sub)Divisions
  • Paragraphs
  • Quotations
  • Citations
  • Page breaks
  • Figures
  • Line groups
In addition, we are especially interested in the semantic encoding of the names of the different characters and places.
This document analysis allows us to get an idea of the phenomena we want to encode and how to express them in TEI; for a suggestion of the corresponding TEI elements we refer you to the other TBE tutorial modules or the full TEI Guidelines.
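As a rough illustration of where this analysis leads, the fragment below sketches one possible mapping of the structural units listed above onto TEI elements. It is a sketch under assumptions, not a prescribed encoding: the page number, the figure description, and the use of <name> with a type attribute for characters and places are illustrative choices, and the elisions marked [...] are ours. Citations (<cit>) and nested subdivisions (a <div> inside a <div>) are left out for brevity.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- The document -->
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Alice's Adventures in Wonderland: an encoding sketch</title>
      </titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Source details omitted in this sketch.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <front>
      <!-- The title page and the document title -->
      <titlePage>
        <docTitle>
          <titlePart>Alice's Adventures in Wonderland</titlePart>
        </docTitle>
      </titlePage>
    </front>
    <body>
      <!-- A chapter, encoded as a typed division -->
      <div type="chapter" n="1">
        <!-- Page break; the page number is hypothetical -->
        <pb n="1"/>
        <!-- Heading -->
        <head>Down the Rabbit-Hole</head>
        <!-- Figure; the description is a placeholder -->
        <figure>
          <figDesc>Illustration (description omitted in this sketch)</figDesc>
        </figure>
        <!-- Paragraphs, with semantic encoding of character and place names -->
        <p><name type="person">Alice</name> was beginning to get very tired of
          sitting by her sister on the bank [...]</p>
        <!-- Quotation (direct speech) -->
        <p>[...] <q>Please, Ma'am, is this <name type="place">New Zealand</name>
          or <name type="place">Australia</name>?</q> [...]</p>
      </div>
      <div type="chapter" n="2">
        <head>The Pool of Tears</head>
        <!-- Line group containing verse lines -->
        <lg>
          <l>How doth the little crocodile</l>
          <l>Improve his shining tail,</l>
        </lg>
      </div>
    </body>
  </text>
</TEI>
```

Whether a fragment like this is valid depends on the schema it is checked against, which is the subject of the rest of this section.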
However, even with this document analysis completed, we're not quite ready to start encoding our TEI version of Alice's Adventures in Wonderland. Unless you know TEI by heart, it will be very hard to produce a valid TEI transcription without a TEI schema to validate against.
There are two options to get a TEI schema:
  1. Pick one of the sample TEI customisations, available at http://www.tei-c.org/Guidelines/Customization/, in the schema format of your choice. TEI provides a number of basic customisations, each with its own focus on different aspects of the TEI model. Depending on your needs, these may provide the elements and attributes you need, or you may want to build on them.
  2. Create your own schema with the Roma web tool.
The existing TEI customisations in many cases provide all that's needed for encoding common textual phenomena, and studying them is an excellent way to learn about customising and modifying TEI. In this tutorial, however, we'll start from scratch. This way, all concepts can be introduced one at a time, and you will learn how to interpret existing customisations. Alongside the Roma tool itself, the most important concepts of TEI customisation will be treated, split into two strands:
  • selection and restriction of existing TEI elements and attributes
  • extension of the TEI model with new elements and attributes
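
To make the two strands concrete, here is a minimal sketch of what a customisation can look like when written down as a TEI ODD file, the format Roma reads and writes. It is an illustration under assumptions rather than the customisation this tutorial goes on to build: the identifier tbe_alice, the selection of elements, the narrator attribute, and the http://example.org/ns/alice namespace are all hypothetical.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt><title>Hypothetical ODD for an Alice encoding</title></titleStmt>
      <publicationStmt><p>Unpublished illustration.</p></publicationStmt>
      <sourceDesc><p>Written from scratch for this sketch.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <!-- ident is a hypothetical schema name -->
      <schemaSpec ident="tbe_alice" start="TEI">

        <!-- Selection: pick modules, and only some elements from each -->
        <moduleRef key="tei"/>
        <moduleRef key="header"/>
        <moduleRef key="core"
          include="p head q quote cit bibl title name pb lg l graphic"/>
        <moduleRef key="textstructure"
          include="TEI text front body div titlePage docTitle titlePart"/>
        <moduleRef key="figures" include="figure figDesc"/>

        <!-- Restriction: allow only two values for @type on <name> -->
        <elementSpec ident="name" module="core" mode="change">
          <attList>
            <attDef ident="type" mode="change" usage="req">
              <valList type="closed" mode="replace">
                <valItem ident="person"/>
                <valItem ident="place"/>
              </valList>
            </attDef>
          </attList>
        </elementSpec>

        <!-- Extension: add a hypothetical, non-TEI attribute to <div> -->
        <elementSpec ident="div" module="textstructure" mode="change">
          <attList>
            <attDef ident="narrator" mode="add"
              ns="http://example.org/ns/alice">
              <desc>Names the narrating voice of a division (illustrative only).</desc>
              <datatype><dataRef key="teidata.word"/></datatype>
            </attDef>
          </attList>
        </elementSpec>

      </schemaSpec>
    </body>
  </text>
</TEI>
```

The moduleRef elements and the closed value list narrow the TEI model down, while the added attribute goes beyond it. Placing the new attribute in its own namespace keeps it distinguishable from standard TEI.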

Summary

Encoding TEI texts with a TEI schema involves customising the TEI. You can either use one of the ready-made TEI customisations or create your own with the Roma web tool. Customisation can be roughly divided into two strands: selection and restriction of the existing TEI model, and extension of the TEI model.