US Blog
XML Workflow and the Holy Grail
Michael Jensen | 03/25/2009 | Digitization
At publishing conferences, there has been of late a lot of pressure to shift to an “XML workflow,” pressure mostly exerted by futurists, technologists, and geeks. They imply that a publisher should be embarrassed if they don’t have XML from the very beginning of the publishing process. Usually, they’re also trying to sell XML editorial systems of one kind or another.
Well, as a futurist, a technologist, and a sometime geek, let me take a few minutes to differ on that score. There are places where an XML workflow makes a lot of sense: journals, newspapers, and other “throughput” publishers. But for book publishers, there may be a different equation.
A few hundred dollars to an offshore vendor can transform a final PDF of a book into an archival-quality TEI-Lite XML form, an .epub ebook format, and almost any other format you’d want. It is worth stopping to do some math before going too far down the path of revolutionizing your workflow processes.
At the National Academies Press, we publish about 180 books a year. While we have been revisiting this “XML workflow” question every 18 months or so for about five years, we have yet to come to a conclusion that it makes sense. We currently send our PDFs to an offshore service upon release, and in a few weeks receive back several versions of the book in XML (and HTML) formats.
The full costs of changing to “XML workflow” include the cost of training (not just compositors, but editors) in a new process and workflow; the cost of transformation and disruption (it’s not a switch that can be clicked, but rather a fundamental process change); the cost of frustration (which is never free); not to mention the substantial cost of new software.
The touted benefits of a full XML workflow generally fall into a few categories: “better chunks” (being able to sell chapters, repurpose content, or license items more easily); editorial participation via mark-up early in the process; quasi-composition ahead of final composition; and integration with “content management systems” that allow easier handling of many documents.
But unless you’re a book publisher who produces more than 300 books a year, or who has a real market in real-time licensing of content upon release, or who can justify the cost and disruption in other ways, it’s probably best to just wait for a while: wait for the XML workflow to make sense financially, and wait a bit for the XML and the epub versions of the books.
Two to three weeks of waiting, and a few hundred dollars per title, is not a dramatic penalty to pay for avoiding the pain of a learning curve whose benefits may be small.
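The math above can be sketched in a few lines of Python. Treat this as a rough illustration under stated assumptions, not National Academies Press accounting: only the 180 titles a year comes from this post, while the per-title fee, the one-time switching cost, and the annual running cost are hypothetical placeholders to replace with your own quotes. The function names are also just illustrative.

```python
# Rough break-even sketch: outsourced PDF-to-XML conversion versus an in-house
# XML workflow. Every dollar figure used below is a hypothetical placeholder.

def outsourcing_cost(titles_per_year, fee_per_title, years):
    """Total paid to a conversion vendor over the period."""
    return titles_per_year * fee_per_title * years


def xml_workflow_cost(one_time_cost, annual_cost, years):
    """One-time switching costs (software, training, disruption) plus yearly upkeep."""
    return one_time_cost + annual_cost * years


def break_even_years(titles_per_year, fee_per_title, one_time_cost, annual_cost):
    """Years until the in-house workflow becomes cheaper, or None if it never does."""
    yearly_saving = titles_per_year * fee_per_title - annual_cost
    if yearly_saving <= 0:
        return None  # upkeep alone costs as much as outsourcing
    return one_time_cost / yearly_saving


if __name__ == "__main__":
    titles = 180          # from the post
    fee = 300.0           # assumed: "a few hundred dollars" per title
    one_time = 250_000.0  # assumed: software, training, disruption
    annual = 30_000.0     # assumed: licences and support
    print("5 years outsourced:  ", outsourcing_cost(titles, fee, 5))
    print("5 years XML workflow:", xml_workflow_cost(one_time, annual, 5))
    print("break-even (years):  ", break_even_years(titles, fee, one_time, annual))
```

With these placeholder inputs, five years of outsourcing comes to $270,000 against $400,000 for the workflow change. The sketch deliberately ignores any revenue from "better chunks" or faster licensing, so a publisher with a real market for those would need to add that side of the ledger.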
Standards, Exceptions, and Perfection
Michael Jensen | 03/24/2009 | Digitization
Publishing is all about “exceptions,” I’ve learned over the years. Book X is just like Book Y, except for the parts that aren’t. Book A needs to have call-outs, while Book B is a standard monograph (except for the section with that sideways table, and the chapter-open drop-caps that the designer feels are important). That book has the standard royalty, except for sales in Eastern Europe; this book has a short discount, except when it’s sold via Amazon.
Computers hate exceptions. A software program is happiest when it knows what to expect, and gets just what it expected. Many a “blue screen of death” on a computer comes with the message “uncaught exception found” – because the computer doesn’t know what to do with the exception.
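For readers who want to see what that looks like, here is a minimal Python sketch. It is not taken from any real royalty system: the rate table, the territory names, and both function names are hypothetical. The first function knows only the expected cases and fails on anything else; the second has been told what to do with the exception.

```python
# Hypothetical royalty lookup: all rates and territory names are made up.
STANDARD_RATES = {"US": 0.10, "UK": 0.10}


def royalty_strict(territory, net_sales):
    """Knows only the expected cases; an unlisted territory raises KeyError."""
    return net_sales * STANDARD_RATES[territory]


def royalty_with_exceptions(territory, net_sales, overrides, default_rate=0.10):
    """Checks per-book overrides first, then standard rates, then a default."""
    rate = overrides.get(territory, STANDARD_RATES.get(territory, default_rate))
    return net_sales * rate


if __name__ == "__main__":
    try:
        royalty_strict("Eastern Europe", 1000.0)
    except KeyError as err:
        # Without this handler the program would simply stop here.
        print("no rule for territory:", err)
    print(royalty_with_exceptions("Eastern Europe", 1000.0, {"Eastern Europe": 0.05}))
```

The second version works only because someone anticipated the Eastern Europe case and wrote it down. Every publishing exception needs the same treatment, which is where the cost comes from.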
What does that have to do with publishing? Well, it has affected our databases (because database designers tend to presume a constrained list of options), affected our designs (because we want our books to “stand out” in some way, and *not* be template-limited), and affected our thinking about XML and presentation.
The challenge of fitting our exception-riddled publications into a relatively standardized model of presentation (in .epub format, for example) is substantial—yet we can’t afford to have a second form of composition to “design just for the ebook.”
Some of our constraints also come from our “expectation of perfection.” Because book publishing is a capital-heavy process, and because a book, once printed, can’t be changed, we developed a culture of perfection: no typos, no grammatical errors, no going-back-and-fixing.
The digital world is, and is likely to remain, pretty forgiving. Yes, if I pay for an ebook, I expect high quality—but if I have to scroll around to view that table, or see ugly whitespace around a drop-cap, it’s okay.
So finding ways of accepting imperfection in our digital representations of our books, and of recognizing that we *can* go back and fix something that was missed the first time, may mean the difference between an affordable production process and one that is too expensive. As they say, 99% of the cost of perfection lies in achieving the last 1%.
I’m not saying that we should be sloppy – but if we philosophically accept that our ebooks will not look like our p-books, and that those ebooks are more about the content than the presentation, we may save ourselves some heartburn, and a few bucks as well.
Note: not every book makes a good ebook, and design-heavy publications in particular are unlikely to be satisfactory in an ebook format. But shifting the scale of what variance is acceptable may mean that we can reach more readers in the digital environment.
What is it about DRM that makes us so crazy?
Michael Jensen | 03/19/2009 | Digitization
Because really, it’s so 20th century.
Putting Digital Rights Management systems on ebook files is not unlike printing our books in non-photo-blue, so that nobody can photocopy them. Or having unrippable shrinkwraps on our books in bookstores. Or having some RFID system on every page that prevents skipping ahead….
It’s asking for misery—misery with customer service, misery because it angers our customers, misery because it relegates our publications to a previous mindset, treating a book as an isolated object that only exists in the form we control.
Publishing is much more than just producing a book, as we all know; and today, it’s much more than just producing a static object. As publishers, we have been conditioned to think about a book as being that static object that will be just as “fresh” in 2020 as it is now.
But the reality is that our marketplaces are increasingly fragile, and angering our customers is no way to build brand or market share. The reality is that our readers increasingly expect to start a book on the laptop, and finish it on the iPhone. The reality is that there *is no DRM that has not been broken by the pirate community.*
And further—for $400 or less, one can buy a scanner that produces a searchable PDF in about an hour, from a printed book.
So we need a new paradigm of thought, in which we recognize that our job is to distinguish our premium content from the welter of mediocre content; it’s to encourage readers (whether pirate or paying) to come buy another book from us, via links, promotion, and encouragement; to develop means of making the “new” or “updated” version available to paying readers for free, and pirate readers for a small fee; to find new and innovative ways to *take advantage* of superdistribution and viral distribution, rather than try to put tin-can roadblocks into the path of our readers’ 4x4s.
It’s hard, making this shift. But if publishers don’t, we may well ensure that we will find ourselves without a buying public at all.
