This module defines elements that can be used to encode the physical structure of books and manuscripts, either in order to provide a higher
level of bibliographic detail or more structured encoding of bibliographic facts than allowed by the TEI Header or the Manuscript Description module, or
in order to associate transcribed text or images of pages with an encoding of the physical structure of the book from
which the transcription or images are taken. Two kinds of tags are provided to supplement the standard provisions of the <sourceDesc>
section of the TEI header: those that allow encoding of bibliographic formulae, that is,
standard or project-specific systems of notation for the physical facts of books or manuscripts, such as the "collation formula"
refined by Fredson Bowers; and those
that permit direct encoding of the physical facts themselves. In addition, tags are provided to enable book structure to serve as the primary hierarchy
governing the encoding of the text itself. Tags and a stand-off markup strategy are also provided for users who must choose another kind of TEI hierarchy as their primary one
in order to capture the textual features of interest to them, but who also wish to encode the physical structure of the source as an aspect of their text encoding.
The <collation> element will appear within the <msDescription> or
<bookDescription> elements in the <sourceDesc> section of the TEI header. It can contain a collation formula and the other
elements that form a full
bibliographic description following the Bowers notation (or some other standard
or project-specific collation formula); or a paragraph-form description of the structure of a book
or manuscript. It can also contain a full formal representation of the structure of the book itself using <codexStructure> and other
tags defined below.
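As a hedged illustration of the paragraph-form option, the following sketch restates in prose the structure of the Aphra Behn example encoded later in this section. The use of <p> to hold the prose description is an assumption made for illustration; this chapter does not name the element that carries a paragraph-form description.

```xml
<collation>
  <!-- Assumption: <p> holds the prose description; the chapter does not specify this -->
  <p>Quarto. 49 leaves: gatherings A to M in fours, followed by a single leaf N.</p>
</collation>
```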
The <collation> element has one or more of the following components:
<gatherings>,
an indication of the total number of leaves in <totalLeaves>, and a <pagination> statement.
The <collationFormula> element is designed to encode any of the standard kinds of collation formulae, such
as the type specified by Fredson Bowers in his influential book Principles of Bibliographical Description, the kinds used by manuscript
cataloguers, and the kind employed in the Gesamtkatalog der Wiegendrucke, or to be adapted to a project-specific style of collation. It contains
the following elements, none of which is obligatory:
<gatheringRange>, plus
(optionally) a record of the alphabet used to mark signatures (in <signatureAlphabet>) and
(also optionally) a record of the leaves on which signature marks appear (in <signatureLeaves>
and <anomSignature>).
Sub-elements of <gatherings>:
Sub-element of <pagination>:
Example (showing the encoding for the formula for an Aphra Behn book given on page 471 of Fredson Bowers's Principles of Bibliographical Description):
<collation>
<format>quarto</format>
<collationFormula>
<gatherings>
<signatureAlphabet>23 letter</signatureAlphabet>
<gatheringRange signed="no">
<start>A</start>
<end>A</end>
<leaves>4</leaves>
</gatheringRange>
<gatheringRange signed="yes">
<start>B</start>
<end>L</end>
<leaves>4</leaves>
</gatheringRange>
<gatheringRange>
<start>M</start>
<end>M</end>
<leaves>4</leaves>
</gatheringRange>
<gatheringRange signed="no">
<start>N</start>
<end>N</end>
<leaves>1</leaves>
</gatheringRange>
<signatureLeaves>
<start>1</start>
<end>2</end>
</signatureLeaves>
<anomSignature type="added">
<gathering>B</gathering>
<leaf>3</leaf>
</anomSignature>
</gatherings>
<totalLeaves>49</totalLeaves>
<pagination>
<pageRange type="front matter" numbered="no">
<start>1</start>
<end>8</end>
</pageRange>
<pageRange numbered="yes">
<start>1</start>
<end>33</end>
</pageRange>
<pageRange numbered="yes">
<start>26</start>
<end>27</end>
</pageRange>
<pageRange numbered="yes">
<start>36</start>
<end>37</end>
</pageRange>
<pageRange numbered="yes">
<start>30</start>
<end>31</end>
</pageRange>
<pageRange numbered="yes">
<start>40</start>
<end>89</end>
</pageRange>
<pageRange numbered="no">
<start>90</start>
<end>90</end>
</pageRange>
<totalPages>90</totalPages>
<paginationAppears>in parens centered in hdl.</paginationAppears>
</pagination>
</collationFormula>
</collation>
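The leaf counts in this example can be cross-checked by hand. Assuming that the 23-letter signature alphabet omits J, U, and W (the usual convention, though an assumption about this particular book), the range B to L covers ten gatherings, and the leaf counts of the four ranges sum to the value recorded in <totalLeaves>. The excerpt below repeats the gathering ranges from the example with the arithmetic added as XML comments; the comments are explanatory only and are not part of the encoding.

```xml
<collationFormula>
  <gatherings>
    <signatureAlphabet>23 letter</signatureAlphabet>
    <!-- A: 1 gathering of 4 leaves = 4 -->
    <gatheringRange signed="no"><start>A</start><end>A</end><leaves>4</leaves></gatheringRange>
    <!-- B to L (B C D E F G H I K L): 10 gatherings of 4 leaves = 40 -->
    <gatheringRange signed="yes"><start>B</start><end>L</end><leaves>4</leaves></gatheringRange>
    <!-- M: 1 gathering of 4 leaves = 4 -->
    <gatheringRange><start>M</start><end>M</end><leaves>4</leaves></gatheringRange>
    <!-- N: 1 single leaf = 1 -->
    <gatheringRange signed="no"><start>N</start><end>N</end><leaves>1</leaves></gatheringRange>
  </gatherings>
  <!-- 4 + 40 + 4 + 1 = 49 -->
  <totalLeaves>49</totalLeaves>
</collationFormula>
```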
The <codexStructure> element encloses a complex of elements that together describe
the full physical form of a printed or handwritten book, such as <gathering>, <leaf>,
and <page>. In the case of multi-volume works,
<codexStructure> may be repeated for each volume.
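A minimal sketch of the multi-volume case follows. The n attribute used to label each volume and the "v1-"/"v2-" prefixes on the identifiers are assumptions made for illustration; this chapter does not say how volumes are to be distinguished.

```xml
<collation>
  <!-- Assumption: n labels the volume; the chapter does not specify a volume identifier -->
  <codexStructure n="1">
    <gathering>
      <leaf xml:id="v1-leaf1"><page xml:id="v1-p1"/><page xml:id="v1-p2"/></leaf>
    </gathering>
  </codexStructure>
  <codexStructure n="2">
    <gathering>
      <leaf xml:id="v2-leaf1"><page xml:id="v2-p1"/><page xml:id="v2-p2"/></leaf>
    </gathering>
  </codexStructure>
</collation>
```

Because xml:id values must be unique within a document, leaves and pages in different volumes need distinct identifiers even when their page numbers coincide; the volume prefixes above are one way to achieve this.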
Sub-elements of <codexStructure>:
On <leaf>, the attribute "conjunct" may be used
to refer to the xml:id of the <leaf> element representing the leaf that is conjoined to this leaf at the spine of the book. The
attribute "signature" may be used to record the signature letter/number printed or written on this leaf. The attribute "sheet" may be used
to assign the leaf to one of the sheets folded together, if more than one sheet is involved in the construction of a single gathering.
The attribute "no" may
be used to record the page number printed or written on this leaf. The attribute "cutFromN" may be used to refer to the xml:id of the page
that, before the sheet was folded into the gathering and the leaves were cut, was attached to the top (North) end of the current page; similarly "cutFromS",
"cutFromE", and "cutFromW". The attribute "W" may be used to refer to the xml:id of the page to
which the current page is attached by its leftward edge; similarly "E", and for books with the spine at the top or bottom of the pages,
"N" or "S". The attribute "sheetSide" may be used to assign
the current page to one or other of the surfaces of the printed sheet before folding into a gathering.
Example (a representation of a gathering of common octavo, folded as in the illustration in Figure 50 of Gaskell's New Introduction to Bibliography, with all relationships explicitly represented in the encoding):
<gathering>
<leaf xml:id="leaf1" conjunct="#leaf8"><page xml:id="p1" sheetSide="1" cutFromN="#p8" W="#p16"/><page xml:id="p2" sheetSide="2" cutFromN="#p7" E="#p15"/></leaf>
<leaf xml:id="leaf2" conjunct="#leaf7"><page xml:id="p3" sheetSide="2" cutFromN="#p6" W="#p14"/><page xml:id="p4" sheetSide="1" cutFromN="#p5" E="#p13"/></leaf>
<leaf xml:id="leaf3" conjunct="#leaf6"><page xml:id="p5" sheetSide="1" cutFromN="#p4" W="#p12"/><page xml:id="p6" sheetSide="2" cutFromN="#p3" E="#p11"/></leaf>
<leaf xml:id="leaf4" conjunct="#leaf5"><page xml:id="p7" sheetSide="2" cutFromN="#p2" W="#p10"/><page xml:id="p8" sheetSide="1" cutFromN="#p1" E="#p9"/></leaf>
<leaf xml:id="leaf5" conjunct="#leaf4"><page xml:id="p9" sheetSide="1" cutFromN="#p16" cutFromE="#p12" W="#p8"/><page xml:id="p10" sheetSide="2" cutFromN="#p15" cutFromW="#p11" E="#p7"/></leaf>
<leaf xml:id="leaf6" conjunct="#leaf3"><page xml:id="p11" sheetSide="2" cutFromN="#p14" cutFromE="#p10" W="#p6"/><page xml:id="p12" sheetSide="1" cutFromN="#p13" cutFromW="#p9" E="#p5"/></leaf>
<leaf xml:id="leaf7" conjunct="#leaf2"><page xml:id="p13" sheetSide="1" cutFromN="#p12" W="#p4" cutFromE="#p16"/><page xml:id="p14" sheetSide="2" cutFromN="#p11" cutFromW="#p15" E="#p3"/></leaf>
<leaf xml:id="leaf8" conjunct="#leaf1"><page xml:id="p15" sheetSide="2" cutFromN="#p10" cutFromE="#p14" W="#p2"/><page xml:id="p16" sheetSide="1" cutFromN="#p9" cutFromW="#p13" E="#p1"/></leaf>
</gathering>
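By way of comparison, a simpler case can be sketched in the same manner: a folio gathering made from a single sheet folded once, so that no edges are cut and only the spine-side links ("W" and "E") are needed. The signature value "A" and the page numbers are hypothetical, and assigning side 1 to the surface that carries the first and last pages follows the pattern of the octavo example rather than any rule stated in this chapter.

```xml
<gathering>
  <!-- Hypothetical folio gathering: one sheet, folded once, two conjugate leaves -->
  <leaf xml:id="leaf1" conjunct="#leaf2" signature="A">
    <page xml:id="p1" no="1" sheetSide="1" W="#p4"/>
    <page xml:id="p2" no="2" sheetSide="2" E="#p3"/>
  </leaf>
  <leaf xml:id="leaf2" conjunct="#leaf1">
    <page xml:id="p3" no="3" sheetSide="2" W="#p2"/>
    <page xml:id="p4" no="4" sheetSide="1" E="#p1"/>
  </leaf>
</gathering>
```

Each link is recorded from both ends: p1 names p4 as the page joined to its leftward edge, and p4 names p1 as the page joined to its rightward edge, which mirrors the reciprocal "W"/"E" pairs in the octavo encoding.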
Note: these tags replace <pb/>, <cb/>, and <lb/> tags included in previous editions of these Guidelines.
The following "milestone" tags may be used to indicate within a text the points at which the various articulations of the physical source occur:
<newColumn/> may be used once to mark the
beginning of the transcription of each column in the source document, or may be used repeatedly to signal the changes of
column within each line of a text arranged in tabular form.
<l>) begins in the source document.
The physical structure of a book can be conceptualized as a series of hierarchically organized objects, such as gatherings which contain leaves, and pages which contain lines of text. For some encoders, especially those with strong bibliographic interests and those preparing electronic transcriptions of manuscript or print materials, the physical structure hierarchy will be the primary one, and tags are provided elsewhere in this chapter to facilitate such a choice. For many other encoders, the rich resources of these Guidelines for encoding conceptual textual hierarchies such as chapters, sections, and paragraphs are important, and a primary hierarchy other than physical book structure must be chosen. Researchers who use another TEI hierarchy as their primary one so frequently also wish to encode the book structure hierarchy in the same file that special provision is made here to facilitate this, in addition to the resources offered in Chapter 31, Multiple Hierarchies.
The mechanism described here creates a kind of within-file "stand-off markup" in which information about the book structure hierarchy is kept separate
from the encoded text but is linked to the book-structure milestone tags within the encoded text. Reference from the encoded text to the elaboration of book structure
in the <codexStructure> section of <sourceDesc> is by means of pointer-like references to the xml:id attribute of instances of the
<page> element in <sourceDesc>; these references occur within the pageID attribute of the empty
milestone element <newPage/>.
The following example shows the use of this strategy:
<teiHeader> . . .
<sourceDesc> . . . <msDescription><collation><codexStructure>
. . .
<leaf xml:id="leaf4" ><page xml:id="p7" /><page xml:id="p8"/></leaf>
<leaf xml:id="leaf5" ><page xml:id="p9" /><page xml:id="p10" /></leaf> . . .
</codexStructure></collation></msDescription>
</sourceDesc> . . . </teiHeader>
<text> . . .
<newPage pageID="#p7"/>Text from page seven with associated markup.<newPage pageID="#p8"/>Text from page eight. . . .
</text>
In this example, <newPage/> tags within the text indicate the places where pages begin in the physical book. The <newPage/>
tags are milestone tags that do not contain any text and do not participate in the document hierarchy, so elements that do, such as <div>,
<p>, or <hi> can be used even if the marked sections of text cross page boundaries. The book structure hierarchy itself is specified in the
<codexStructure> section of <sourceDesc>, and the <newPage/> tags within <text>
are linked to that specification of the hierarchy by means of the pageID attribute. In effect, the <newPage/> tags specify the points at which the
book structure hierarchy specified in <codexStructure> intersects with the running text in which they are inserted. Note that only the
<newPage/> tags need to be inserted into the transcription in this example, since leaves and gatherings, composed of pages, can be fully represented
in <codexStructure>. In effect, the book structure hierarchy "stands off" from the encoded text, since it exists as a hierarchy only in
<codexStructure>.
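The point about crossing page boundaries can be made concrete with a short sketch. The transcribed wording and the surrounding <body> and <div> are hypothetical; the identifiers #p7 and #p8 are those declared in the header fragment of the example above.

```xml
<text>
  <body>
    <div>
      <newPage pageID="#p7"/>
      <!-- The paragraph below starts on page seven and ends on page eight -->
      <p>A paragraph that begins on page seven
        <newPage pageID="#p8"/>and continues without interruption on page eight.</p>
    </div>
  </body>
</text>
```

Because <newPage/> is empty, the <p> element remains well-formed even though the page break falls inside it; the page, leaf, and gathering to which each stretch of text belongs can be recovered by following pageID into <codexStructure>.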
Scholars creating book surrogates or electronic transcriptions, or those who have a strong interest in representing bibliographic structures, may wish to make book structure the primary organizing principle of their encoding of a text. The following tags are provided to permit such encoding. Users should note that in most instances the use of a book structure hierarchy will make it necessary to add other forms of TEI markup with care, either by avoiding the creation of a competing hierarchy or by employing one or more of the techniques outlined in Chapter 31, Multiple Hierarchies. To signal this need for caution, tags for recording the physical structure of the source document within the encoded text carry the prefix "phys":