Getting Started

Unless you're working on one of the few books that will continue to use the old Subversion toolchain, please see the new guidelines geared toward the Atlas platform. If you are using Atlas, you should consider the guidelines you are reading to be deprecated.

DocBook is an XML vocabulary, maintained as an OASIS standard, that is well suited to long technical documents with complex structure and cross-references. Because its tagging is semantic, a single DocBook source can be rendered as PDF for printing, HTML, man pages, audio, or even Braille. This versatility and reusability make DocBook 4.5 XML the preferred format for O’Reilly books.

Not only does this document explain how to start writing in DocBook, but it was created using the same markup and toolchain used for a typical O’Reilly book, so you can treat it as a model for your own manuscript. Chapter 3 contains information about markup.

We’ve got nothing against LaTeX, troff, or any other formatting markup system. But typesetting markup like LaTeX is inherently focused on formatting: font size, margins, alignment, and so on. We’d rather you spend your time on the semantic structure of your book (how sections are divided, which information goes in a sidebar versus a note, and so on), and that’s where DocBook shines. Just as most well-designed websites keep content in XHTML and formatting in CSS, DocBook lets us separate formatting decisions from content decisions.
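As a minimal sketch of what semantic markup looks like in practice, the DocBook 4.5 fragment below tags a section, a note, and a sidebar by what they are rather than by how they should look. The `id` value and all of the text are hypothetical, chosen only for illustration.

```xml
<sect1 id="installing-widgets">
  <title>Installing Widgets</title>
  <para>Run the installer, then restart the service.</para>

  <!-- A note: a short aside the reader should not miss. -->
  <note>
    <para>Administrator rights are required.</para>
  </note>

  <!-- A sidebar: longer supplementary material set apart from the main flow. -->
  <sidebar>
    <title>A Brief History of Widgets</title>
    <para>Background material that supplements the main text.</para>
  </sidebar>
</sect1>
```

Nothing in the fragment mentions fonts, margins, or alignment. Those decisions are left to whichever stylesheet renders the `note` and `sidebar` elements for a given output format.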

As O’Reilly expands its efforts to provide content in multiple formats and at multiple stages of the content’s life cycle, it becomes critical to generate multiple, distinct output formats cost-effectively from the same source document, even though doing so means giving up some granular control over the output.

What does that mean for your book? Throughout the authoring process, you’ll be able to view drafts formatted much as they will be for print. And once your book is finished, it will go live almost immediately on Safari Books Online and other digital and ebook retail channels, rather than first needing to be converted into DocBook from another format, which can take a week or longer.

Will What I See in the XML Editor Mirror the PDF? Other Electronic Formats?

DocBook markup specifies the structure and semantics of your document, but not its appearance. DocBook isn’t a WYSIWYG format, so your content will look different in your XML editor than it will after rendering to PDF and other downstream formats. This means not only that fonts and formatting will differ, but also that line breaks may fall in different places.

The O’Reilly toolchains that transform your DocBook into its final form (both print and downstream electronic formats) rely on the semantic XML tags that you apply to the elements of your text. For PDFs, those tags are processed with customized XSL-FO stylesheets and a commercial FO-to-PDF processor. See “Triggering PDF Builds of Your Book” for more information. Downstream digital channels (such as Safari Books Online and oreilly.com) use the XML tags analogously, but via different toolchains, to transform or render your content for their own presentation formats.
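To illustrate how a stylesheet can turn a semantic tag into formatting, here is a minimal XSLT 1.0 sketch that maps a DocBook `note` element to an XSL-FO block. It is not taken from the O’Reilly stylesheets: the indentation, italics, and spacing values are assumptions chosen for illustration, and a complete stylesheet would also need to generate the `fo:root` and page-layout elements that an FO processor requires.

```xml
<?xml version="1.0"?>
<xsl:stylesheet version="1.0"
    xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
    xmlns:fo="http://www.w3.org/1999/XSL/Format">

  <!-- Hypothetical rule: render every <note> as an indented, italic block. -->
  <xsl:template match="note">
    <fo:block margin-left="2em" font-style="italic" space-before="6pt">
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

  <!-- Each paragraph inside a note becomes its own nested block. -->
  <xsl:template match="note/para">
    <fo:block>
      <xsl:apply-templates/>
    </fo:block>
  </xsl:template>

</xsl:stylesheet>
```

Because the appearance lives in the stylesheet, changing how every note in a book looks means editing one template rather than touching the manuscript.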

For More on DocBook

This guide is tailored to help authors get the most out of O’Reilly’s DocBook authoring toolchain. For more general information on DocBook XML, see Norm Walsh’s excellent DocBook: The Definitive Guide. You’ll find reference pages for each DocBook element, including a content model, a description of purpose, a list of parents that can contain it and children that can be nested in it, and allowed attributes. Other resources that may be helpful include: