Customising TEI, ODD, Roma

1. Introduction

Over its history, TEI has grown into a complex and encompassing system that lets you express your view on a text in a very flexible way, ranging from rather general statements on the textual structure to highly specific analyses of all kinds of textual phenomena. Currently, the TEI defines no fewer than 505 different elements, and it is hard to imagine a document that would need them all. On the other hand, it is much easier to imagine a document that needs exactly the one element that is missing from the set currently defined by TEI.
The TEI community anticipated such concerns and explicitly designed TEI P5 as:
  • a highly modular system, allowing users to cherry-pick the parts they need
  • an extensible system, allowing users to add new elements and attributes or modify existing ones
Put differently, TEI very much resembles a library of text concepts where you can walk in, stroll through shelves filled with TEI element and attribute definitions, and choose exactly those that suit your document analysis. When you check out at the counter, they will all be collected and put in a nice bag, reading 'schema for [your name]'s documents'. What's more, even if you have brought your own elements and attributes, they will be included in the same schema! You take your receipt, labelled 'blueprint for [your name]'s TEI schema', walk home and happily start encoding your texts with your TEI schema. In the TEI world, this library visit is called 'customising TEI'. As it is a visit you will have to repeat often, this tutorial will guide you through the most relevant steps of customising TEI schemas.
A couple of aspects of the above analogy will be the focus of this tutorial, so allow us to elaborate on them a bit more:
  • In this tutorial, the general term TEI schema will be used for any formal representation of the elements and attributes you expect in a document. TEI schemas can be expressed as RELAX NG schemas, W3C XML Schemas, or DTDs (don't worry, you ordered one of these at check-out). In general, you don't have to work on these schemas themselves; they are meant as auxiliary files that your XML editing or processing software uses to validate your document(s) and make sure they conform to the rules.
  • Even more important than a schema is the 'blueprint' for your TEI schema. This records the choices you made and makes it easier to share your schemas with others. In the TEI world, such a 'blueprint' is just another TEI document with specific elements, and is called an ODD (One Document Does it all).
  • It's important to know that, as of TEI P5, there is no 'fixed' monolithic one-size-fits-all TEI schema. Instead, you are supposed to create your own before you can start encoding TEI texts. In this sense, customisation is a built-in prerequisite for using TEI. Testimony to this centrality is the fact that TEI maintains a specific tool to ease the customisation process. It is called 'Roma', and is accessible as a user-friendly web form at http://www.tei-c.org/Roma/. Consider it an electronic librarian.
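To give a first impression of what such a 'blueprint' might look like, here is a minimal sketch of an ODD file. The identifier myTEI, the title text, and the choice of exactly these four modules are illustrative assumptions rather than anything prescribed by this tutorial; the element names (schemaSpec, moduleRef) are those TEI P5 defines for this purpose.

```xml
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <!-- Hypothetical title, chosen for illustration only -->
        <title>My TEI customisation</title>
      </titleStmt>
      <publicationStmt>
        <p>Unpublished example</p>
      </publicationStmt>
      <sourceDesc>
        <p>Written from scratch for illustration</p>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <!-- The schema specification: "myTEI" is an assumed name;
           start="TEI" makes <TEI> the root element of encoded documents -->
      <schemaSpec ident="myTEI" start="TEI">
        <!-- Each moduleRef picks one module off the 'shelves' -->
        <moduleRef key="tei"/>
        <moduleRef key="header"/>
        <moduleRef key="core"/>
        <moduleRef key="textstructure"/>
      </schemaSpec>
    </body>
  </text>
</TEI>
```

In terms of the library analogy, each moduleRef corresponds to one shelf you have selected, and the file as a whole is the receipt from which a RELAX NG schema, W3C XML Schema, or DTD can be generated.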
This tutorial won't discuss the different TEI schema formats but will instead focus on both the formal ODD way of expressing TEI customisations and the Roma tool. In doing so, it will be the odd one out: slightly more conceptual than the other tutorials, yet at the same time more concrete, since it introduces and uses the Roma web tool throughout the examples.