Chapter 2. DocBook XML Markup Guidelines

Here are some guidelines that you may find helpful for writing in DocBook. If you have questions about what markup to use for a particular element, or whether our toolchain supports a specific type of markup, please contact .

Keep It Simple

“Keep It Simple” sounds a bit silly when referring to something as complex as DocBook, but the point is that even though DocBook offers over 400 elements, you’ll need only a fraction of them. DocBook is meant to be comprehensive across a universe of technical documentation, but we’re dealing with a very specific subset: content in an O’Reilly title. Practically speaking, you’ll use elements similar to standard HTML elements, such as itemizedlist and table. You can safely stay away from more exotic elements such as confsponsor, msgsub, and seriesvolnums.

“Common Elements” covers some of the commonly used elements in O’Reilly books.

Using Elements Correctly

For XML to be valid, it must not only be well-formed (i.e., all the start and end tags match), it must also have all the tags in the proper hierarchy according to the associated DTD (in our case, the DocBook 4.5 DTD). The tag at the top of the hierarchy is called the root element (e.g., <book>) and contains various child elements (e.g., <chapter>s or <part>s). Logical rules apply, such as the fact that a <sect3> cannot be directly nested within a <sect1>; it must be within a <sect2>. Improper nesting will result in invalid DocBook.

One very good reason to use an XML editor is that it will safeguard you from moving, adding, or deleting elements in ways that don’t follow the DTD hierarchy.

The terms tag and element are sometimes used interchangeably, but there is a distinction. For example, <chapter> is a tag that indicates the start of a chapter element. For the XML document to be well-formed, it must contain an end tag, </chapter>. Some tags are self-contained and stand alone as complete elements, without the need for separate end tags. For example, <xref linkend="foo"/> is self-contained. If you’re familiar with HTML, the rules are pretty much the same.
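To make the distinction concrete, here is a minimal sketch. The id value and the text are placeholders chosen for illustration, not taken from a real book:

```xml
<!-- <chapter ...> is the start tag and </chapter> is the end tag.
     The two tags plus everything between them form one chapter element. -->
<chapter id="example_chapter">
  <title>Example Chapter</title>
  <!-- <xref .../> is an empty element: a single self-contained tag
       with no separate end tag. -->
  <para>See <xref linkend="example_chapter"/> for details.</para>
</chapter>
```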

Sample Markup and PDFs

This document is made up of the same DocBook markup as our books, so you can use it as a model for your own manuscript. In addition to the section “Common Elements”, please check out the sample_chapter we have posted at the URL below.

https://github.com/oreillymedia/orm_book_samples/blob/master/docbook_only/

The samples directory also contains the following skeleton files, if you need them:

  • afterword.xml

  • appa.xml

  • bibliography.xml

  • ch01.xml

  • foreword.xml

  • glossary.xml

  • part1.xml

  • preface.xml (Preface—includes standard boilerplate language)

Please contact if you have any questions about using these files.

Organizing Your Files

The book.xml file that O’Reilly provides for you contains only references to the chapter files and some metadata. Each chapter file is a complete DocBook document with its own DOCTYPE declaration, which makes validation easier. Once you check out your repository, you can open the ch01.xml file and begin writing.

This section discusses the files as we structure and name them once they are submitted to Production, but while you are working on them, you can structure and name them in any way that’s convenient for you. All that really matters is that you end up with a valid book.xml, whether it’s a monolithic file you edit directly, one you generate from a custom Makefile, or something else.

Chapters

The book.xml indicates which files comprise the book and the order in which they appear. Here’s an example:

<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" 
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<book role="animal">
<title>Some Fantastic Book</title>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="bookinfo.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="dedication.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch00.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch01.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch02.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch03.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch04.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch05.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="appa.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="appb.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="appc.xml"/>
</book>

You can name your chapter files whatever you’d like,[3] and then reference and include them in the book.xml as shown above.

Chapter contributors

For books with multiple contributors, you may want an author name to appear with each chapter. Simply add the following markup above each chapter title:

<chapterinfo>
  <author>
    <firstname>Author</firstname>
    <surname>Name</surname>
  </author>
</chapterinfo>

You can also use this markup for forewords and prefaces (just use prefaceinfo instead of chapterinfo) as well as appendixes (use appendixinfo). Please note that contributor names in a foreword or preface will render after the rest of the content, right-aligned, and preceded by an em dash.
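As a rough sketch of the preface variant, the markup might look like the following. The id and the text are placeholders; the info block sits above the title, as described for chapters:

```xml
<preface id="preface_id">
  <!-- prefaceinfo takes the place of chapterinfo and goes above the title. -->
  <prefaceinfo>
    <author>
      <firstname>Author</firstname>
      <surname>Name</surname>
    </author>
  </prefaceinfo>
  <title>Preface</title>
  <para>Text goes here...</para>
</preface>
```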

Parts

If you want to group your chapters into parts, grab a skeleton part file here:

https://github.com/oreillymedia/orm_book_samples/blob/master/docbook_only/part1.xml

Then add your chapter files to the appropriate partN.xml file instead of the book.xml file.
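The skeleton file linked above is the authoritative starting point. As an illustration only, a part file that pulls in two chapters might look something like this, assuming it uses the same XInclude pattern as book.xml (the id, title, and chapter filenames are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE part PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN"
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<!-- Hypothetical part1.xml: id, title, and hrefs are placeholders. -->
<part id="part_one">
  <title>Part Title Here</title>
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch01.xml"/>
  <xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch02.xml"/>
</part>
```

Under that assumption, book.xml would then reference part1.xml where it previously referenced ch01.xml and ch02.xml.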

Sections

Each chapter is made up of sections. Please use sect1, sect2, and sect3 elements—not generic section elements—to structure your chapter. By default only sect1 and sect2 titles will appear in the TOC.

The barebones structure of a chapter is something like this:

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE chapter PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" 
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<chapter id="chapter_id">
  <title>Chapter Title Here</title>
  <sect1>
    <title>Sect1 Title Here</title>
    <para>Text goes here...</para>
    <sect2>
      <title>Sect2 Title Here</title>
      <para>Text goes here...</para>
      <sect3>
        <title>Sect3 title here</title>
        <para>Text goes here...</para>
      </sect3>
    </sect2>
  </sect1>
</chapter>

Note the paras between sections. We ask that you don’t add a section directly after the previous section’s title with no para or other element in between (though doing so is valid). For example, don’t do this:

<sect1>
  <title>Sect1 Title Here</title>
  <sect2>
    <title>Sect2 Title Here</title>
    <para>Text goes here...</para>
  </sect2>
</sect1>

For a complete list of O’Reilly’s style conventions, including proper heading and title capitalization, consult the O’Reilly Stylesheet and Word List. Also keep in mind that, other than inside code listings, you should not put blank lines or empty paras in your XML documents.

You may also use sect4 elements, although they are not very common in O’Reilly books. A sect4 title renders inline, with an autogenerated period following it, rather than as a separate heading.

Common Elements

The following sections describe common DocBook elements in O’Reilly books.

Block Versus Inline

There are two kinds of elements:

Block

Usually presented with a paragraph break before and after them, block elements may contain character data, inline elements, and possibly other block elements. Examples include paras, lists, sidebars, tables, and block quotes.

Inline

Usually distinguished by a font change rather than obvious breaks, inline elements may contain character data and possibly other inline elements, but never block elements. Examples include cross-references, filenames, commands, and URLs.

Inline Font Markup

Here are some commonly used inline elements:

[<citation>]

Used in cross-reference syntax. Authors can also use this for hardcoded cross-references to other, non-O’Reilly books. As in, “See [TITLE], published by publisher”.

<command>

An executable program, or the entry a user makes to execute a command. As in, “Compare the two documents with the diff command.”

<email>

An email address, such as . (Note that these will become hyperlinks in online versions, so for fake or example addresses, use <emphasis> instead.)

<emphasis>

Provided for use where you would traditionally use italics to emphasize a word or phrase.

<emphasis role="bold">

A general-purpose tag provided for where you would use bold type to emphasize a word or phrase. (Note that O’Reilly house style prefers italics for emphasis.)

<phrase role="roman">

Provided for use within italicized text where you would ordinarily use italics to emphasize a word or phrase.

<filename>

Used for the name of a file, directory, or path (e.g., /usr/bin).

<keycap>

The text printed on a physical key on a computer keyboard (e.g., Return).

<literal>

Any stretch of text that must appear in constant width font.

<replaceable>

Text that should be replaced with user-supplied values or by values determined by context. Appears in constant width italic.

<subscript>

A subscript character.

<superscript>

A superscript character.

<ulink url="ulink.org"/>

Several styles of ulinks and various URL markup/rendering options are supported. See “Hyperlinks” for more details. (Note that these will become hyperlinks in online versions, so for fake or example URLs, use <emphasis> or <uri> instead.)

<userinput>

Data entered by the user, typically at a prompt line. Use with <replaceable> if needed: <userinput><replaceable>...</replaceable></userinput>
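To show how several of these inline elements combine in running text, here is a short illustrative para. The wording and the filenames old_file and new_file are made up for the example:

```xml
<para>To compare two files, use the <command>diff</command> command
(typically installed in <filename>/usr/bin</filename>). At the prompt, type
<userinput>diff <replaceable>old_file</replaceable>
<replaceable>new_file</replaceable></userinput> and press
<keycap>Return</keycap>. An exit status of <literal>0</literal> means the
files are <emphasis>identical</emphasis>.</para>
```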

Cross-References

All cross-references to titled elements—figures, tables, examples, sections, chapters, parts, etc.—should be marked up using xrefs, not written in plain text. xref elements will become live hyperlinks in online versions, and they will automatically update if you move the referenced elements around while editing. There is never any need to hardcode labels (e.g., “Chapter 1”, “Figure 1”) or page numbers, as these aspects of the rendered xref are autogenerated by the stylesheets.

To insert an xref, follow these steps:

  1. Note the id of the element you are referencing. If the element does not have an id, you will need to add one. For the book to be valid, id attributes must be unique across the entire book, have no spaces, not contain a colon, and not start with a number. Here’s an example of a figure id:

    <figure id="figure_titles_written_with_underscores_make_nice_ids">

  2. Once you have the id, you can insert an xref element that references it via a linkend attribute, like so:

    <xref linkend="figure_titles_written_with_underscores_make_nice_ids" />

You cannot use the word “inherit” as an id. It won’t render properly.

The following table shows examples of xref markup and rendering for various elements.

Element to be referenced | xref markup | xref rendering
<sect1 id="keep_it_simple"> | <xref linkend="keep_it_simple"/> | “Keep It Simple”
<chapter id="picking_an_xml_editor"> | <xref linkend="picking_an_xml_editor"/> | Chapter 1
<figure id="docbook_duck_fig"> | <xref linkend="docbook_duck_fig"/> | Figure 2-1
<example id="sample_example"> | <xref linkend="sample_example"/> | Example 2-3
<table id="maximum_widths"> | <xref linkend="maximum_widths"/> | Table 2-1

Note that cross-references to terms in a Glossary use special markup, not xref. See the glossary directory on GitHub for details.

Figures

Figures have a title (aka caption), an autogenerated number, and (per O’Reilly house style) an explicit cross-reference. You do not need to number the figure in the XML; the O’Reilly stylesheets autogenerate the number in the figure label and in all xrefs to it. Unless you have a special reason for using an informal figure (e.g., if it’s impractical for the image to have a title), you should use a formal figure. See “Alt text for images” for an example of informal figure markup. It’s essentially the same as a figure, but without the id attribute or title element.

For more information on how to prepare the image files themselves, see the O’Reilly Media Illustration Guidelines.

Here’s an example of proper figure markup:

<figure id="docbook_duck_fig">
<title>The DocBook duck</title>
<mediaobject>
  <imageobject>
    <imagedata fileref="images/docbook_duck.png" format="PNG" 
               width="4.8in"/>
  </imageobject>
</mediaobject>
</figure>

If you are working on files from an earlier edition of a book, you may see the more complex figure markup we formerly used in Production (it includes a second imageobject, among other things). For any new figures you add, you can stick with the simpler markup shown here.

Figure 2-1 shows how the above markup renders.

Figure 2-1. The DocBook duck

The width attribute value in the imagedata is a quick way to make large images “fit” within the PDF page while you’re writing your manuscript. (Note that this is strictly optional, and for your own convenience; width attributes will be stripped out during Production.) See below and “Inline graphics” for more about image sizing.

Make sure to add your image files to the repo (typically in a figs/ or images/ directory). Then set the fileref and format attributes in the XML markup so that they match the image file names and types exactly. For example, if an image is named battery.png in the figs/ directory, it should be referenced in the XML as figs/battery.png, not figs/Battery.png, and the format should be PNG.

Figure floating

Figures appear exactly where you place them in the text, which can sometimes create PDF pages with a lot of white space. While it is not generally necessary, you can add an attribute of float="true" so that the text flows around the image. For example:

<figure id="docbook_duck_fig" float="true">
<title>The DocBook duck</title>
<mediaobject>
  <imageobject>
    <imagedata fileref="images/docbook_duck.png" format="PNG" 
               width="4.8in"/>
  </imageobject>
</mediaobject>
</figure>

Image sizing

When your book goes into Production, O’Reilly Illustration staff will handle processing the images you submit, including scaling them to the appropriate size. However, for the purposes of generating draft PDF documents, you can scale your images using the width attribute of the imagedata element, which scales the image proportionally to the width value supplied. For example, to set a width of 4.8 inches (the maximum width for Animal Guide books), you’d add a width attribute value of 4.8in.

Table 2-1 contains a list of maximum widths you can use to scale images to fit your book’s design.

Table 2-1. Maximum widths for figures in different book templates
Book series | Maximum width value (in inches)
Animal or Cookbook | 4.8in
Small Animal (and other books with a 6×9 inch trim size) | 4.3in
Hacks | 4.7in

For more complete under-the-hood info, see http://www.sagehill.net/docbookxsl/ImageSizing.html. (Note that not everything described there will work for O’Reilly’s toolchain.)

Inline graphics

If you need to add an inline graphic (e.g., a small icon that is part of the text), use an inlinemediaobject:

<inlinemediaobject>
   <imageobject>
      <imagedata fileref="images/icons_0501.png" width="0.12in"/>
   </imageobject>
</inlinemediaobject>

A width is required for an inlinemediaobject so that the processor knows how much space to allocate for it. The value 0.12in works well. You can also find the width of the graphic using a web browser, Adobe Acrobat, or any other program that shows you an image’s dimensions.

ASCII art

ASCII art may be usable, but it does create ambiguities for Tools staff who perform an “intake” on your files when they come into Production, as well as the illustrators. Please contact for detailed guidelines before proceeding.

Alt text for images

O’Reilly is committed to making electronic formats of its books accessible to visually impaired readers. EPUB versions of our titles contain alternative text descriptions for images (in the alt attribute of img elements) whenever possible.

By default, for figure elements, we use the contents of the title element as the alt text. However, you can supply your own custom alt text for a figure by adding a textobject element as a child of the figure’s mediaobject, and enclosing the alt text in a phrase element. Here’s an example of the markup to use:

<figure id="figure_with_custom_alt_text">
  <title>Figure image with custom alt text</title>

  <mediaobject>
    <imageobject>
      <imagedata fileref="images/universal_design_for_web_applications_cover.png"/>
    </imageobject>

    <textobject>
      <phrase>Universal Design for Web Applications Cover</phrase>
    </textobject>
  </mediaobject>
</figure>

For images you include in your book that do not have title elements (e.g., informalfigures and inlinemediaobjects), we highly encourage you to supply your own custom alt text in textobjects. (By default, we use the text “image with no caption” as the alt text for informalfigures and leave alt attributes empty for inlinemediaobjects.) Example 2-1 and Example 2-2 show examples of the markup for an informalfigure and inlinemediaobject with custom alt text.

Example 2-1. informalfigure with textobject
<informalfigure id="informalfigure_with_custom_alt_text">
  <mediaobject>
    <imageobject>
      <imagedata fileref="images/universal_design_for_web_applications_cover.png"
                    width="2.4in"/>
    </imageobject>

    <textobject>
      <phrase>Universal Design for Web Applications Cover</phrase>
    </textobject>
 </mediaobject>
</informalfigure>

Example 2-2. inlinemediaobject with textobject
<inlinemediaobject>
  <imageobject>
    <imagedata fileref="images/oreilly_logo.png" width="0.12in"/>
  </imageobject>
  <textobject>
    <phrase>O’Reilly Media, Inc. logo</phrase>
  </textobject>
</inlinemediaobject>

For some tips on writing good alt text, O’Reilly’s Universal Design for Web Applications is a great resource. In particular, see the section, “Keys to Writing Good Text Alternatives,” which is available on Safari. We’d also be happy to supply you with a PDF or EPUB of the book on request.

Tables

If your table requires a description, you expect to refer to it later elsewhere in the text, or it’s especially complex, you probably want to use a table element. Otherwise, consider using an informaltable.

Formal tables

Here’s the markup for a table with a header:

<table id="example_table">
<title>Example formal table</title>
  <tgroup cols="2">
    <thead>
      <row>
        <entry>Heading1</entry>
        <entry>Heading2</entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry>Text1</entry>
        <entry>Text2</entry>
      </row>
      <row>
        <entry>Text3</entry>
        <entry>Text4</entry>
      </row>
    </tbody>
  </tgroup>
</table>

Table 2-2 shows how it renders.

Table 2-2. Example formal table
Heading1 Heading2
Text1 Text2
Text3 Text4

Note that the title describes the entire table, while the header contains information about each column. A formal table does not always need to have a header.

Tables can get much more complex than this example. See http://www.docbook.org/tdg/en/html/table.html for details, though please note that not everything discussed there will work with our toolchain or conform to O’Reilly’s style (check with your editor about the latter).

Informal tables

The markup of an informaltable is similar to that of a table, but it does not have a title or need an id. Here’s an example.

Text1 Text2
Text3 Text4

This particular informal table doesn’t have a header (no thead), but it would be valid to add one. Also, the bottom rule has been suppressed with the use of a frame="none" attribute; see the next section for details on table borders.
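The markup for an informal table like the one described above would look roughly like this. It is a sketch reconstructed from the description (no title, no id, no thead, and frame="none"), not the original listing:

```xml
<informaltable frame="none">
  <tgroup cols="2">
    <tbody>
      <row>
        <entry>Text1</entry>
        <entry>Text2</entry>
      </row>
      <row>
        <entry>Text3</entry>
        <entry>Text4</entry>
      </row>
    </tbody>
  </tgroup>
</informaltable>
```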

Please check with your editor about O’Reilly house style before overriding table defaults, as table markup can be quite labor-intensive for you or Production staff to change later on.

Frames and borders

You can adjust the appearance of the frames and borders in table elements. To control the lines surrounding the <table> element itself, you can set the frame attribute:

frame

A value of all means a rule will be drawn on all sides of the table. A value of none means the bottom rule will be suppressed.

To control the interior cell borders, you can set colsep and rowsep attributes on various elements inside the table:

colsep

A value of 1 draws a rule to the right of the element. A value of 0 suppresses the rule.

rowsep

A value of 1 draws a rule below the element. A value of 0 suppresses the rule.
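As a sketch of how these attributes fit together, the following table sets a full frame, turns on interior rules for the whole tgroup, and then suppresses the rule below the last row. The id and title are placeholders, and the exact rendering depends on the stylesheets, so check a draft PDF before relying on it:

```xml
<table id="bordered_table" frame="all">
  <title>Table with explicit borders</title>
  <!-- colsep and rowsep set here apply to the whole tgroup. -->
  <tgroup cols="2" colsep="1" rowsep="1">
    <thead>
      <row>
        <entry>Heading1</entry>
        <entry>Heading2</entry>
      </row>
    </thead>
    <tbody>
      <row>
        <entry>Text1</entry>
        <entry>Text2</entry>
      </row>
      <!-- rowsep="0" on this row is meant to suppress the rule below it. -->
      <row rowsep="0">
        <entry>Text3</entry>
        <entry>Text4</entry>
      </row>
    </tbody>
  </tgroup>
</table>
```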

Lists

There are four common types of lists. The O’Reilly Stylesheet and Word List has more details about when to use them, but here’s the markup and an example of each.

Simple lists

Markup:

<simplelist>
  <member>This is a list of several short items.</member>
  <member>Usually one or a few words each.</member>
</simplelist>

Rendering:

This is a list of several short items.
Usually one or a few words each.

Bulleted (aka itemized) lists

Markup:

<itemizedlist>
  <listitem><para>This is a list.</para></listitem>
  <listitem><para>With bullets.</para></listitem>
</itemizedlist>

Rendering:

  • This is a list.

  • With bullets.

In the case of an itemizedlist nested in an itemizedlist, the child list will use em dashes in place of bullets. If you want to use symbols other than em dashes or bullets, you can set the symbol for an entire itemizedlist by using the mark attribute, or for a single listitem by using the override attribute. For instance, <itemizedlist mark="emdash"> causes an entire list to render with the mark “—” instead of standard bullets. Other options include endash (–), square (■), circle (○), and whitesquare (□).

Again, check with your editor about O’Reilly house style before changing the defaults.
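For illustration, here is a minimal sketch that combines both attributes, using two of the mark values listed above (the item text is made up):

```xml
<itemizedlist mark="square">
  <!-- Uses the list-wide mark set on the itemizedlist. -->
  <listitem><para>This item uses the square mark.</para></listitem>
  <!-- override changes the mark for this one item only. -->
  <listitem override="circle"><para>This item uses a circle.</para></listitem>
</itemizedlist>
```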

Numbered (aka ordered) lists

Markup:

<orderedlist>
  <listitem><para>This list uses numbers.</para></listitem>
  <listitem><para>Instead of bullets.</para></listitem>
</orderedlist>

Rendering:

  1. This list uses numbers.

  2. Instead of bullets.

To continue the numbering of an orderedlist from a previous list, use a continuation attribute with a value of continues:

<orderedlist continuation="continues">

The default is continuation="restarts". This causes the numbering to begin at 1.

If an orderedlist has other lists nested within it, <orderedlist continuation="continues"> may cause the continued list to start at the wrong number. In these cases, you can add an override attribute to the incorrectly numbered listitem, set to the number at which you’d like it to start. Your continued orderedlist will then begin at that number.
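A minimal sketch of a continued list follows; the step text is invented for the example. The override attribute is shown only to illustrate where it goes, and is needed only if the item would otherwise render with the wrong number:

```xml
<orderedlist>
  <listitem><para>First step.</para></listitem>
  <listitem><para>Second step.</para></listitem>
</orderedlist>
<para>Some explanatory text interrupts the list.</para>
<orderedlist continuation="continues">
  <!-- Add override only if this item is numbered incorrectly. -->
  <listitem override="3"><para>Third step.</para></listitem>
  <listitem><para>Fourth step.</para></listitem>
</orderedlist>
```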

Labeled (aka variable or term-definition) lists

A variable list is made up of pairs of items.

Markup:

<variablelist>
  <varlistentry>
    <term>The first part could be a term</term>
    <listitem><para>Followed by a definition.</para></listitem>
  </varlistentry>
  <varlistentry>
    <term>Or a name</term>
    <listitem><para>Followed by a description. Etc.</para></listitem>
  </varlistentry>
</variablelist>

Rendering:

The first part could be a term

Followed by a definition.

Or a name

Followed by a description. Etc.

By default, the list term will render in italics. To remove the italics, add a role attribute of plain:

<term role="plain">Variable list term</term>

Notes, Warnings, and Sidebars

You may use these block elements for adding supplemental information or warnings to the reader.

The tip element will render the same as a note.

The caution element will render the same as a warning.

Notes and warnings may contain paras, code blocks, and lists. They should not contain figures, tables, or examples.
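The markup for these elements is not shown above, so here is a minimal sketch of each. The text and the sidebar title are placeholders:

```xml
<note>
  <para>This is a note. A <literal>tip</literal> element renders the
  same way.</para>
</note>

<warning>
  <para>This is a warning. A <literal>caution</literal> element renders
  the same way.</para>
</warning>

<sidebar>
  <title>Sidebar Title Here</title>
  <para>Supplemental text goes here...</para>
</sidebar>
```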

Footnotes

A footnote generates a superscript number wherever it is placed in the text, and the body of the footnote appears at the bottom of the page.[4] Table footnotes are lettered and appear directly after the table (not at the bottom of the page). For example:

Here is some text.[a] A bit more text.
This is text. You get the idea.[b]

[a] Here’s a table footnote.

[b] Here’s another.

Footnotes should generally be inserted after punctuation. See the O’Reilly Stylesheet and Word List for guidelines.

We may be able to support the use of symbols instead of numbers for non-table footnotes via a stylesheet customization. Please check with your editor whether this is appropriate for your book.
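In the markup, a footnote is placed inline at the point where the marker should appear. The sketch below shows one in a para and one in a table entry; the footnote text is placeholder wording:

```xml
<!-- Footnote in body text: the marker appears right after the period. -->
<para>Here is a sentence with a footnote.<footnote><para>The body of the
footnote goes here.</para></footnote> The paragraph then continues.</para>

<!-- Footnote inside a table cell: same markup, placed within an entry. -->
<entry>Here is some text.<footnote><para>Here’s a table
footnote.</para></footnote> A bit more text.</entry>
```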

Indexing

O’Reilly provides professional indexing as a normal part of Production, but if for some reason you’d like to add index markers yourself, this section covers the proper markup. See the O’Reilly Indexing Guidelines for complete details.

To include an index in your PDF, add a line that says <index/> to your book.xml file before the closing </book> tag, like so:

<?xml version="1.0"?>
<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" 
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd">
<book role="animal">
<title>Some Fantastic Book</title>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="bookinfo.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="dedication.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch00.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch01.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch02.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch03.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch04.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="ch05.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="appa.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="appb.xml"/>
<xi:include xmlns:xi="http://www.w3.org/2001/XInclude" href="appc.xml"/>
<index/>
</book>

It’s also helpful if you add a remark element somewhere explaining to Production whether you’re merely adding a few terms that you’d like a professional indexer to incorporate, or whether you’re creating a complete index. Discuss these options with your editor first.
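One possible way to leave such a note is sketched below. The wording is only an example, and where you place the remark is up to you:

```xml
<para><remark>Note to Production: the indexterms in this chapter are a
partial list. Please have the indexer incorporate them into the full
index.</remark></para>
```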

Here’s the basic index entry markup:

<indexterm><primary>index entry syntax, level 1</primary></indexterm>

Secondary entry (subentry) markup:

<indexterm>
    <primary>index entry syntax</primary>
    <secondary>for a subentry</secondary>
</indexterm>

Tertiary entry (sub-subentry) markup:

<indexterm>
    <primary>index entry syntax</primary>
    <secondary>for a subentry</secondary>
    <tertiary>with a subentry</tertiary>
</indexterm>

Index entry with a range markup:

This book is full of geeky text with DocBook XML markup, which starts here:
<indexterm class="startofrange" id="geekytext">
<primary>geeky DocBook XML text</primary></indexterm>blah blah blah Ajax
blah blah blah Ruby on Rails
...
and ends here<indexterm class="endofrange" startref="geekytext"/>.

The closing indexterm tag does not contain a primary or secondary entry, just a startref attribute that references the starting indexterm entry. Do not place the closing tag on its own line.

Code

In general, put your code snippets in programlisting or screen elements. These render exactly the same—the choice is yours. Here’s a very simple one:

Hello World

If you have larger blocks of code that you want to have a title, a number, and a cross-reference, use an example element. Example 2-3 shows a basic one.

Example 2-3. Sample example
Hello World

These elements are verbatim environments, which means whitespace is preserved in rendered versions. You must either escape all characters that have special meaning in XML (such as < and >—these characters obviously come up quite a bit in code) or use a CDATA block.[5]
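The two approaches can be sketched as follows. The code line, id, and title are invented for the example; note that the ampersand also needs escaping outside a CDATA block:

```xml
<!-- Option 1: escape each special character. -->
<programlisting>if (a &lt; b &amp;&amp; b &gt; c) { swap(a, c); }</programlisting>

<!-- Option 2: wrap the code in a CDATA block and leave it unescaped. -->
<example id="cdata_example">
  <title>Code inside a CDATA block</title>
  <programlisting><![CDATA[if (a < b && b > c) { swap(a, c); }]]></programlisting>
</example>
```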

External code files

If you want to manage your code in separate files from the manuscript, you can use <xi:include> tags to point to your code (more on XIncludes in “Organizing Your Files”). If you do this, you must include a parse="text" attribute so that the parser doesn’t try to interpret the included code as XML:

<programlisting>
<xi:include 
  xmlns:xi="http://www.w3.org/2001/XInclude" 
  parse="text" href="hello.c" />
</programlisting>

Please be aware that once the book is in Production, we will run a script that pulls all included code into the chapter files, and the XInclude links will be gone.

Caveats

Although inline markup and newlines within verbatim environments[6] are valid DocBook, we ask that you follow these guidelines to prevent rendering problems downstream.

Long Code Lines

The allowed number of characters per line of code varies depending on the book series and where the code is positioned in the markup. The following table lists some common cases.[7]

Please keep in mind that these are just the maximum characters allowed. You should review your PDFs and make your own judgments about the best way to present code to the reader.

Series | Body (top-level code) | Examples | Lists | Sidebars/notes/warnings
Animal or Cookbook | 85 | 90 | 80 | 80
Small Animal (6x9) | 76 | 80 | 72 | 70
Theory in Practice | 85 | 90 | 80 | 72
Nutshell | 76 | 80 | 72 | 66
Pocket Reference | 58 | 62 | 53 | 48
Hacks | 76 | 80 | 72 | 70

Please rebreak any code lines that exceed the max number of characters; otherwise, the code will run into the margin in your PDFs, which is unacceptable for print. It’s best to fix long code lines in the manuscript stage, while you still have access to the source. Making such edits during Production is much more cumbersome for everyone involved.

Tabs

Please don’t use tabs in code blocks, as tabs don’t necessarily translate to the same amount of space on different systems. To align or indent within your code, use spaces.

Inline markup on multiple lines

When using inline markup on multiple lines of code (e.g., <emphasis role="bold">), please close the tag at the end of each line and open a new one on the next line. For example, instead of this:

<programlisting><emphasis role="bold">GLuint m_gridTexture;
IResourceManager* m_resourceManager;
</emphasis>};</programlisting>

do this:

<programlisting><emphasis role="bold">GLuint m_gridTexture;</emphasis>
<emphasis role="bold">IResourceManager* m_resourceManager;</emphasis>
};</programlisting>

Failing to do this can cause headaches and delays in Production.

Newlines

Be careful not to add newlines to the beginning or end of code blocks. Because all line breaks are preserved in verbatim blocks, newlines can result in excess whitespace. For example, the following listing will render with unwanted blank lines at the top and bottom due to the line breaks after the opening <programlisting> tag and before the closing </programlisting> tag:

<programlisting>
if ([CLLocationManager locationServicesEnabled]) {
    CLLocationManager *locationManager = [[CLLocationManager alloc] init];
    locationManager.delegate = self;
    [locationManager startUpdatingLocation];
} else {
    NSLog(@"Location services not enabled.");
}
</programlisting>

Although we have tools to remove the extraneous whitespace once the files are in Production, we prefer not to run global changes on code content, so it’s best if you avoid adding it in the first place. Do this instead:

<programlisting>if ([CLLocationManager locationServicesEnabled]) {
    CLLocationManager *locationManager = [[CLLocationManager alloc] init];
    locationManager.delegate = self;
    [locationManager startUpdatingLocation];
} else {
    NSLog(@"Location services not enabled.");
}</programlisting>

Callouts

If you want to have cross-references to specific lines of code, you can use callouts. Just put a co element at the end of each line you want to reference—these will generate callout markers. Then create a calloutlist element after the code block. This list contains callout items that discuss or explain each referenced line.

Here’s how a code block with callouts renders:

<programlisting> 1
<xi:include 2 
  xmlns:xi="http://www.w3.org/2001/XInclude" 
  parse="text" href="hello.c" />
</programlisting> 3

1

The opening tag for a programlisting element.

2

An XInclude.

3

The closing tag for a programlisting element.

Each co element in the code block includes an optional linkends attribute that points to the callout elements that refer to it, forming a link between the marker and the callout. Conversely, each callout element requires an arearefs attribute that points to co elements, forming a link between the callout and the marker. The markers will be rendered as clickable bidirectional cross-references if you use this markup.

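Because the rendered example itself displays XML tags, its source wraps the listing in a screen element and escapes the displayed angle brackets as &lt; and &gt;. The markup for the above looks like this: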

<screen>&lt;programlisting&gt; <co id="opening_tag_co" 
  linkends="opening_tag"/>
&lt;xi:include <co id="xinclude_co" linkends="xinclude"/> 
  xmlns:xi="http://www.w3.org/2001/XInclude" 
  parse="text" href="hello.c" /&gt;
&lt;/programlisting&gt; <co id="closing_tag_co" linkends="closing_tag"/>
</screen>

<calloutlist>
<callout arearefs="opening_tag_co" id="opening_tag">
<para>The opening tag for a <literal>programlisting</literal>
element.</para>
</callout>

<callout arearefs="xinclude_co" id="xinclude">
<para>An <literal>XInclude</literal>.</para>
</callout>

<callout arearefs="closing_tag_co" id="closing_tag">
<para>The closing tag for a <literal>programlisting</literal>
element.</para>
</callout>
</calloutlist>

For more information on DocBook callout markup, see http://www.sagehill.net/docbookxsl/AnnotateListing.html#Callouts. Please note that our toolchain does not support areaspec/area/areaset elements to specify callout regions.

Although DocBook has markup for adding line numbers and annotations directly to code, O’Reilly’s toolchain doesn’t support these options. Line numbers don’t allow for good cross-referencing and can potentially cause problems if code is revised and line numbers change. If you want to cross-reference code blocks by number, we recommend using callouts instead; they are autonumbered and will adjust automatically if you shift code around.

Syntax Highlighting

The Atlas toolchain now supports syntax highlighting via Pygments, and we recommend that all authors use it when possible. All you need to do is add a language attribute to each code block that should include syntax highlighting, and specify the language of the code. For example:

<programlisting language="java">int radius = 40;
float x = 110;
float speed = 0.5f;
int direction = 1;</programlisting>

Here’s how it renders:

int radius = 40;
float x = 110;
float speed = 0.5f;
int direction = 1;

Pygments supports a wide variety of languages that can be used in the language attribute; see the full list at http://pygments.org/docs/lexers/. Ebook readers that do not have color screens will still display the highlighting, but in more subtle shades of gray.
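As a further illustration, the same attribute could be applied to a listing in another language. This is only a sketch: the function and values are invented for the example, and python is assumed here to be the Pygments short name that matches your code. The listing also follows the other rules in this chapter: the code starts directly after the opening tag, and the < character inside the code is escaped as &lt;.

```xml
<!-- Hypothetical listing; only the language attribute differs from the Java example -->
<programlisting language="python">def area(radius):
    return 3.14159 * radius ** 2

if area(2) &lt; 13:
    print("small circle")</programlisting>
```

If a language name is not recognized, it is safest to check it against the Pygments lexer list before submitting your files.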

Please note the following caveats:

  • The color scheme is consistent across books and cannot be changed at this time.

  • This feature is supported in EPUB, ebook PDFs, and KF8 for the Kindle Fire. Syntax highlighting is not supported in print books unless the book is printed in color.

If you would like to do something that’s not currently supported, please contact us and we’ll do our best to work with you on incorporating it.

Unicode for Special Characters

For nonstandard characters, use Unicode. The following table provides the values for some common characters; for all others, use the Unicode Character Search (but keep in mind that our default fonts don’t have glyphs for every exotic character; contact us if you have questions about this). If you’re using XXE with the ORM customizations file, most of the characters below have keyboard shortcuts.

To add a Unicode character directly to XML in a text editor, use a numeric character reference of the form &#xCODEPOINT;, where CODEPOINT is the hexadecimal number after U+, typically four digits (e.g., for U+20AC, enter &#x20AC;). Letters that are part of the codepoint may be entered as either upper- or lowercase (i.e., &#x03bb; is the same as &#x03BB;), but the x between the # symbol and the codepoint must be lowercase.

Character Unicode value (hexadecimal codepoint)
— (Em Dash) U+2014
– (En Dash) U+2013
“ (Curly Left Double Quotation Mark) U+201C
” (Curly Right Double Quotation Mark) U+201D
‘ (Curly Left Single Quotation Mark) U+2018
’ (Curly Right Single Quotation Mark) U+2019
× (MathMultiplier) U+00D7
→ (CharMenuDelim) U+2192
€ (Euro Currency Symbol) U+20AC
✓ (Check Mark) U+2713
✗ (Ballot X) U+2717
⌘ (Place Of Interest Sign) U+2318
↵ (Carriage Return Arrow) U+21B5
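
To illustrate the numeric character references described above, here is a hypothetical paragraph (the sentence itself is invented for the example) that uses three characters from the table:

```xml
<!-- &#x201C; and &#x201D; are the curly double quotation marks; &#x2192; is the menu arrow -->
<para>Choose &#x201C;File &#x2192; Save As&#x201D; to keep a copy.</para>
```

An XML parser resolves each reference to the corresponding character, so this is equivalent to typing the characters directly into the file.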

Comments and Remarks

You have two options for adding comments to your manuscript: standard XML comments (<!--foo-->) and remark elements.

XML comments are useful for commenting out large blocks of text—for example, text that is under review, or text that you don’t currently want to include in your manuscript. In the following example, the entire paragraph is commented out:

<!-- O’Reilly’s mission statement. 
<para>O’Reilly Media spreads the knowledge of innovators through its books, 
online services, magazines, research, and conferences. Since 1978, O’Reilly 
has been a chronicler and catalyst of leading-edge development, homing in 
on the technology trends that really matter and galvanizing their adoption 
by amplifying “faint signals” from the alpha geeks who are creating the future. 
An active participant in the technology community, the company has a long 
history of advocacy, meme-making, and evangelism.</para> -->

remark elements are better for directing specific comments to the editor or Production. For example:

<remark>PRODUCTION: Please stet grammatical errors in the following</remark>

<para>I can haz cheezburger, plz?</para>

If you have comments for Production staff, we would appreciate you formatting them as remark elements and starting them with “PRODUCTION,” as shown above. This is helpful for distinguishing comments that need to be addressed during Production from comments directed toward editorial staff or coauthors.

By default, comments are not displayed in your PDF builds. If you’d like them to appear while you’re working on the book, let us know.

Quotes and Epigraphs

To add a quote anywhere in your book, use the blockquote element. Since it’ll be set apart from the text, there’s no need to put quotation marks around it. Here’s some example markup—a quote attributed to Benjamin Disraeli (by Wilfred Meynell, according to Frank Muir):

<blockquote>
  <attribution>Wilfred Meynell</attribution>

  <para>Many thanks; I shall lose no time in reading it.</para>
</blockquote>

Here’s how it renders:

 

Many thanks; I shall lose no time in reading it.

--Wilfred Meynell

If you want to add a quote at the beginning of your chapters (or sections, parts, etc.), use the epigraph element. Here’s some example markup:

<epigraph>
  <attribution>Robert Benchley</attribution>

  <para>There are two kinds of people in the world: those who believe 
  there are two kinds of people in the world, and those who don't.</para>
</epigraph>

And here’s how it renders:

There are two kinds of people in the world: those who believe there are two kinds of people in the world, and those who don’t.

Robert Benchley

Math in DocBook

For complex math, the Atlas system now supports LaTeX and MathML—details on these below. However, it’s best to use regular text whenever possible for writing simple equations and expressions.[8]

Simple Math

Use regular text set in equation elements for any math you can type using the following:

  • Standard keyboard characters

  • Superscripts and/or subscripts

  • Greek letters (e.g., Σ)

  • Operators or other special characters (e.g., ∫—see “Unicode for Special Characters”)

For example, write the Pythagorean theorem in text, but not the quadratic formula.

Here are some examples:

Titled formal equation
<equation id="pythagorean">
<title>Pythagorean Theorem</title>
  <mathphrase>
    <emphasis>a</emphasis><superscript>2</superscript> + 
    <emphasis>b</emphasis><superscript>2</superscript> = 
    <emphasis>c</emphasis><superscript>2</superscript>
  </mathphrase>
</equation>
Equation 2-1. Pythagorean theorem
a² + b² = c²
Block, untitled informal equation
<informalequation>
  <mathphrase>
    <emphasis>a</emphasis><superscript>2</superscript> + 
    <emphasis>b</emphasis><superscript>2</superscript> = 
    <emphasis>c</emphasis><superscript>2</superscript>
  </mathphrase>
</informalequation>
a² + b² = c²
Inline equation

Simply write the math inline with the regular text, styling with emphasis, superscript, and subscript as needed. (An inlineequation element does exist, but it’s not necessary for standard rendering in our toolchain.)

<para>Here is a simple text inline equation: <emphasis>a</emphasis>
<superscript>2</superscript> + <emphasis>b</emphasis><superscript>2
</superscript> = <emphasis>c</emphasis><superscript>2</superscript> 
(yes, it's the Pythagorean theorem).</para>

Here is a simple text inline equation: a² + b² = c² (yes, it’s the Pythagorean theorem).
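
Subscripts and Greek letters can be handled the same way. The following sketch (an invented example, not taken from a real book) marks up the statement that two values sum to twice their mean, using subscript for the indices and a numeric character reference for the Greek letter mu:

```xml
<!-- Hypothetical example: x1 + x2 = 2 * mu, where &#x03BC; is Greek small letter mu -->
<informalequation>
  <mathphrase>
    <emphasis>x</emphasis><subscript>1</subscript> +
    <emphasis>x</emphasis><subscript>2</subscript> =
    2<emphasis>&#x03BC;</emphasis>
  </mathphrase>
</informalequation>
```

Everything here can be typed as text, subscripts, and a single special character, so it stays within the simple math guidelines and needs neither LaTeX nor MathML.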

Complex Math

You have three options for representing more complex math in your book: LaTeX, MathML, or images (we prefer the first two).

LaTeX

All LaTeX equations in DocBook source should be tagged as mathphrase with role="tex" set on the element. LaTeX mathphrase elements should be wrapped in equation for titled block equations, informalequation for untitled block equations, and inlineequation for inline equations.

Here are some markup examples:

Titled formal equation
<equation>
  <title>Derivative</title>
  <mathphrase role="tex">\begin{equation}
  {\frac{dy}{dx} = 2x}
  \end{equation}</mathphrase>
</equation>
Equation 2-2. Derivative
\begin{equation} {\frac{dy}{dx} = 2x} \end{equation}
Block, untitled informal equation
<informalequation>
  <mathphrase role="tex">\begin{equation}
  {\int_{-10}^{10}x^2\,dx}
  \end{equation}</mathphrase>
</informalequation>
\begin{equation} {\int_{-10}^{10}x^2\,dx} \end{equation}
Inline equation
<para>The volume of a sphere can be calculated with the formula 
<inlineequation><mathphrase role="tex">$\frac{4}{3} \pi r^3$</mathphrase>
</inlineequation>.</para>

The volume of a sphere can be calculated with the formula $\frac{4}{3} \pi r^3$.
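
As one more illustration, the quadratic formula (mentioned earlier as too complex for plain text) might be marked up along the same lines. This is a sketch that follows the pattern of the examples above; the id value quadratic_tex is a hypothetical name chosen for this example:

```xml
<!-- Hypothetical id; the LaTeX body follows the same equation-environment pattern as above -->
<equation id="quadratic_tex">
  <title>Quadratic Formula</title>
  <mathphrase role="tex">\begin{equation}
  {x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}}
  \end{equation}</mathphrase>
</equation>
```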

Here are a couple of helpful resources:

MathML

If you’d like to use MathML markup, please make sure to include this modified DOCTYPE declaration at the top of your XML files:

<!DOCTYPE book PUBLIC "-//OASIS//DTD DocBook XML V4.5//EN" 
"http://www.oasis-open.org/docbook/xml/4.5/docbookx.dtd" [
<!ENTITY % MATHML.prefixed "INCLUDE">
<!ENTITY % MATHML.prefix "mml">
<!ENTITY % equation.content "(alt?, (graphic+|mediaobject+|mml:math))">
<!ENTITY % inlineequation.content "(alt?, (inlinegraphic+|inlinemediaobject+|
mml:math))">
<!ENTITY % mathml PUBLIC "-//W3C//DTD MathML 2.0//EN" 
"http://www.w3.org/Math/DTD/mathml2/mathml2.dtd">
%mathml;
]>
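
With that declaration in place, MathML elements carry the mml prefix and sit directly inside equation, informalequation, or inlineequation. The following is a minimal sketch of the quadratic formula, assuming the prefixed MathML 2.0 DTD shown above; the explicit xmlns:mml declaration is included for clarity and may be redundant in your setup:

```xml
<!-- Sketch only: x = (-b ± sqrt(b^2 - 4ac)) / 2a, with &#x00B1; as the plus-minus sign -->
<informalequation>
  <mml:math xmlns:mml="http://www.w3.org/1998/Math/MathML">
    <mml:mi>x</mml:mi>
    <mml:mo>=</mml:mo>
    <mml:mfrac>
      <mml:mrow>
        <mml:mo>-</mml:mo><mml:mi>b</mml:mi>
        <mml:mo>&#x00B1;</mml:mo>
        <mml:msqrt>
          <mml:msup><mml:mi>b</mml:mi><mml:mn>2</mml:mn></mml:msup>
          <mml:mo>-</mml:mo>
          <mml:mn>4</mml:mn><mml:mi>a</mml:mi><mml:mi>c</mml:mi>
        </mml:msqrt>
      </mml:mrow>
      <mml:mrow>
        <mml:mn>2</mml:mn><mml:mi>a</mml:mi>
      </mml:mrow>
    </mml:mfrac>
  </mml:math>
</informalequation>
```

Here mml:mfrac takes exactly two children (numerator and denominator), which is why each is grouped in an mml:mrow.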

MathML markup is very complex, and it would be difficult to include a primer in the space here. Here are some helpful resources:

Please contact us with any questions.

Images

If you don’t want to use LaTeX or MathML, you can create expressions using an equation editor and save them as images. Here are a few, but feel free to try others:

MathType

Use this application to create equations and save them as images. Comes with a free trial.

Word’s built-in Equation Editor

If you have Word 2007, do not create equations with the default equation editor; use the previous Equation Editor, which you can access from the Insert ribbon. Once you have created an equation, you can capture it as a screenshot.

Here are some markup examples:

Titled formal equation
<equation id="quadratic">
<title>Quadratic Formula</title>
  <mediaobject>
    <imageobject>
      <imagedata fileref="images/quadratic.png" format="PNG" />
    </imageobject>
  </mediaobject>
</equation>
Equation 2-3. Quadratic formula
Block, untitled informal equation
<informalequation>
  <mediaobject>
    <imageobject>
      <imagedata fileref="images/quadratic.png" format="PNG" />
    </imageobject>
  </mediaobject>
</informalequation>
Inline equation

None. Avoid using images for inline equations. Sizing inline images poses difficulties in downstream formats, as some ereaders may render the images much larger or smaller than the surrounding text (which can lead to customer complaints). If you need to use an image, set it apart from the text as an informalequation as described above. If you have other requests, please discuss them with us.



[3] O’Reilly’s filenaming convention is ch00, ch01, etc. When your chapters come into Production, we will rename them per this convention so that they flow through our Production workflows easily.

[4] Like this.

[5] You can use a CDATA section as long as you don’t need inline markup within the code (see “Caveats”). In a CDATA section, any text between <![CDATA[ and ]]> is treated as literal character data rather than parsed as markup. You can’t use CDATA sections if you’re using XXE, but on the other hand, you also won’t need to worry about escaping special characters (as XXE takes care of that for you), which is probably the better end of the bargain.

[6] E.g., programlistings or screens, used for code blocks where line breaks and spaces need to be preserved.

[7] These values apply to Atlas-generated PDFs only. Please see the old guidelines for max characters in SVN-generated PDFs.

[8] For any platforms or devices where LaTeX/MathML is not supported, our toolchain automatically converts to images.

[9] This is a free tool (also usable online via the Chrome browser) that’s good for creating LaTeX equations via a WYSIWYG interface.

[10] Not free, but we can potentially provide you with a license.