During my time at Ooligan, I have been told by multiple people that XML coding is the portion of the Ooligan workflow that they are most unfamiliar with and therefore most anxious about volunteering for. It’s not hard to see why XML seems daunting or confusing: the work is done by the editorial department, but it requires coding tags one would expect to see in the digital department, and the product is used exclusively by the design department. It’s easy to get lost in all of that. If the work requires coding, why isn’t it done in the digital department? If the product is only used by the design department, why don’t they do the work? To help clarify, here’s a crash course in XML.

First, what is XML coding? XML stands for “extensible markup language,” and true to its name, XML is a way of marking up a text. Think of it like color coding for manuscripts: you highlight standard paragraph text in yellow, chapter titles in blue, poems in orange, block quotes in green, and italics in red. Except instead of using colors, you enclose the text in a set of tags that identifies what type of material it is, like so (note that the single quotation marks you see in the tags below are not normally present; they have been added here to ensure that the web page doesn’t read them as HTML tags):

<‘para’>text<‘/para’>

<‘chaptitle’>Chapter Name<‘/chaptitle’>

<‘poem’>text<‘/poem’>

<‘blockquote’>text<‘/blockquote’>

<‘ital’>text<‘/ital’>

The opening tag tells the reader where that type of text begins, and the closing tag (with the forward slash) says where it ends. By the time you are finished, every word of the manuscript should be inside some sort of tag. Most of the time, that will be the paragraph tag. But every time you come across text that either has special formatting or is not a normal paragraph, you will give it its own tag to identify what it is. This means that sometimes you must put tags inside of each other; these are called “nested tags.” Italicized text is a perfect example of this, since often a single word or phrase within a paragraph is set in italics. Thus you will end up with something like this:

<‘para’>Paragraph text <‘ital’>text<‘/ital’> rest of the paragraph text.<‘/para’>
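If you would like to see how a computer reads nested tags, here is a minimal sketch in Python using the standard library's xml.etree.ElementTree module. It is only an illustration, not part of the Ooligan workflow: the helper name is_well_formed is made up for this example, and the tags are written without the protective single quotation marks used elsewhere in this post.

```python
import xml.etree.ElementTree as ET

# A paragraph with an italics tag nested inside it (tag names from this post).
snippet = "<para>Paragraph text <ital>text</ital> rest of the paragraph text.</para>"

def is_well_formed(xml_text):
    """Hypothetical helper: True if every tag is closed and properly nested."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False

para = ET.fromstring(snippet)
ital = para.find("ital")  # the nested italics element

# Joining all text pieces gives the paragraph with the tags stripped out.
full_text = "".join(para.itertext())

print(para.tag)   # para
print(ital.text)  # text
print(full_text)  # Paragraph text text rest of the paragraph text.

# Tags that overlap instead of nesting are rejected by the parser.
print(is_well_formed("<para>a<ital>b</para></ital>"))  # False
```

The last line shows why nesting matters: the inner tag has to close before the outer one does.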

The great thing about XML is that, unlike with other forms of coding such as HTML, there isn’t a set list of tags that you must use. You are welcome to make up tags to your heart’s content. Most publishing houses that XML code their manuscripts will create a style sheet over time to ensure consistency from book to book. But whether your house chooses to use <‘blockquote’> or <‘bq’> doesn’t matter in the long run, as long as you use the same tag consistently throughout. Similarly, if you come across a style that isn’t in your style sheet, such as song lyrics or another uncommon type of content, then you can invent a tag to use.
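Consistency is easy to check by machine. The sketch below assumes a tiny made-up manuscript and a made-up style sheet (the STYLE_SHEET set, the book wrapper tag, and the lyrics tag are all hypothetical), and lists every tag that appears in the manuscript but not in the style sheet, so you can decide whether it is a new style or a slip.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Hypothetical house style sheet: the tags this press has agreed to use.
STYLE_SHEET = {"book", "para", "chaptitle", "poem", "blockquote", "ital"}

# Hypothetical manuscript that mixes in two tags not on the style sheet.
manuscript = """<book>
<chaptitle>Chapter One</chaptitle>
<para>Some <ital>emphasized</ital> text.</para>
<bq>A quotation tagged with a different name.</bq>
<lyrics>La la la</lyrics>
</book>"""

def tag_counts(xml_text):
    """Count how many times each tag name appears in the document."""
    root = ET.fromstring(xml_text)
    return Counter(element.tag for element in root.iter())

def unlisted_tags(xml_text, style_sheet):
    """Return, in alphabetical order, the tags missing from the style sheet."""
    return sorted(set(tag_counts(xml_text)) - style_sheet)

print(unlisted_tags(manuscript, STYLE_SHEET))  # ['bq', 'lyrics']
```

Here bq is probably an inconsistency (the style sheet already has blockquote), while lyrics is the kind of invented tag you would add to the style sheet going forward.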

This brings me to the reason why we XML code manuscripts. The purpose is to make the book designer’s life easier by clearly marking where they should begin and end formatting. It gives them a way to quickly navigate the document through the “find” command. They can search the document for the <‘chaptitle’> tag and quickly format all chapter names with the style they have created. Similarly, by searching for the <‘ital’> tag they can make certain that no buried italicized text was missed.
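The same kind of search can be sketched in a few lines of Python. This is not how a designer actually works in layout software; it is just an illustration, on a made-up manuscript with a hypothetical book wrapper tag, of how tags make every chapter title and every italic phrase findable in one pass.

```python
import xml.etree.ElementTree as ET

# Hypothetical two-chapter manuscript using the tag names from this post.
manuscript = """<book>
<chaptitle>The Beginning</chaptitle>
<para>She mentioned <ital>a borrowed title</ital> in passing.</para>
<chaptitle>The End</chaptitle>
<para>Nothing in italics here.</para>
</book>"""

root = ET.fromstring(manuscript)

# Collect every chapter title, wherever it sits in the document.
chapter_titles = [element.text for element in root.iter("chaptitle")]

# Collect every italic run, including ones buried mid-paragraph.
italic_runs = [element.text for element in root.iter("ital")]

print(chapter_titles)  # ['The Beginning', 'The End']
print(italic_runs)     # ['a borrowed title']
```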

So why is XML coding done by the editorial department? The answer is that they are the ones who are most familiar with the manuscript. They’ve already spent hours poring over the text during copyediting, and so they are best equipped to identify which tags should be used and where. It would be possible for another department (such as digital or design) to handle the XML coding of a manuscript, but it would take more time because they lack the editorial department’s level of familiarity with the text.

XML is easy to learn, in large part because the vocabulary is yours to define. When you code ebooks, you must use a fixed set of tags, or the file won’t work properly. With XML, you can choose as many or as few tags as you need; your only requirements are that every opening tag has a matching closing tag, spelled and capitalized the same way, and that you are consistent throughout a manuscript. XML coding occupies its current place in Ooligan’s workflow in order to allow the people who know the text best to do the work, and also to facilitate greater efficiency in the design process. This creates a unique intersection between the digital, editorial, and design departments, but it need not be a daunting one.

Interested in learning more about XML? You can find some good examples (along with much more in-depth information) at W3Schools.
