New posts on book production

Over at Electric Book Works, we’re often deep in the plumbing of the Electric Book workflow, the software we created for making books in multiple formats. If you’re curious about that work, I have written a few pieces there that may be interesting.

‘Producing The Economy with the Electric Book workflow’ is a case study in multi-format book production, explaining how we built a huge, open-access economics textbook for web and print publication. It’s a great example of the symbiosis of print and digital publishing:

… the practical matter of skills has framed the evolution of publishing as ‘print vs digital’, when of course the conversation should be about print and digital. Not just because we’re stuck with a multi-format world whether we like it or not, but because print and digital formats are symbiotic. …

Print books generate instant credibility. They carry a sense of permanence and authority that digital formats cannot muster.

… Web publications struggle to muster the authority of a printed book, but they scale instantly and allow for a range of funding models.

So, when a book needs to make an impact, it simply must be in print and digital formats. It cannot have impact without the authority of print. And it cannot have impact without the scale of the web.

In ‘Book production with CSS Paged Media’ I explain in more detail how we create print books using HTML and CSS. It has lots of pictures, and it also explains:

In our team, we dedicate a significant piece of everyone’s time to technical skills development – both editors and designers – to reduce dependency on developers. And, as our technical lead, I have to spend at least half my time learning or training others. A commitment to digital-first publishing is a commitment to a serious learning curve.

And in ‘Publishing research in useful formats’ I explain how and why it’s important not to bury research publications in inaccessible PDFs; and the exponential value to be had from publishing as web pages in particular. Ultimately, if you can publish in multiple formats at once, you get a bunch of value, since different people will have different needs and preferences:

  • It can be very powerful to hand a high-quality printed book to an influential person. A beautiful printed book lends a project real credibility.

  • Many people like to download and print out PDFs to read on paper, or to use PDFs in PDF-annotation apps on tablets. Those PDFs must be optimised for use on screens, including clickable navigation and reasonable image sizes. (That is, they are not the same as PDFs for book printing.)

  • Many people do their reading on their phones today. Mobile-friendly web pages are much easier to read and bookmark on a phone.

  • Many people like to read long-form content on an ereader, like Amazon Kindle. A key feature of this is the ability to highlight and annotate as you read, and to see what others are highlighting. If your research is available in the Kindle store (or other stores like iBooks), it’s easy for people to find and annotate it like this.

  • Web pages are easy to share on social media. Today, we get many of our recommendations from contacts sharing links on social media. So website versions of research can be critical for getting others to share your work in this way.

  • Search engines likely rate web pages higher than PDFs, for various reasons. So research published as web-page content (as opposed to PDFs for download) will be far more visible and popular in search results.

  • You can get in-depth analytics from website publications. Using a service like Google Analytics you can see what people read most, what they search for, and where they are, right down to city level.

  • Well-constructed web pages are better for accessibility: for instance, for read-aloud screen readers and high-contrast displays for the visually impaired. And good accessibility has the added advantage of being useful to voice-driven services, such as Google Assistant, which can read out web pages in response to a user’s voice requests.

  • Website versions can be updated instantly, should information change.

 

Markdown vs HTML – hoedown or showdown?

On a mailing list recently, a friend asked: should my workflow use an HTML editor or markdown? There is, of course, no easy answer. It depends what trade-offs you want to make. At Fire and Lion we use markdown for book production, and I know very smart people who think that’s crazy. They’d pick an HTML editor any day.

What worries me about working in an HTML editor is that it must make assumptions about what the user intends. That is, HTML editors abstract what you see from what you’re storing. What you see is what you hope you get. The real source format is what the user types, but HTML editors effectively discard it, jumping straight to rendered output and hiding its structures from view. Non-technical users often have no idea what they’re actually storing, and very little control over it.

And those assumptions are the root of all editor evil: they inevitably lead to a hidden mess of legacy markup. As users edit, and especially as they paste from other sources, their editing software has to make guesses about what formatting and what HTML elements the user wants to keep and what it can discard. You only have to glance at the source of a heavily edited WordPress page to find a teeming mass of unnecessary spans, redundant attributes and inline CSS.
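One rough way to put a number on that mess is to count the spans and inline styles in a fragment. The sketch below uses only Python’s standard library; the `MarkupAudit` class, the `audit` function and the sample fragments are all hypothetical, made up for illustration rather than taken from any real page.

```python
from html.parser import HTMLParser


class MarkupAudit(HTMLParser):
    """Count <span> tags and inline style attributes in an HTML fragment."""

    def __init__(self):
        super().__init__()
        self.spans = 0
        self.inline_styles = 0

    def handle_starttag(self, tag, attrs):
        if tag == "span":
            self.spans += 1
        # attrs arrives as a list of (name, value) pairs.
        if any(name == "style" for name, _ in attrs):
            self.inline_styles += 1


def audit(html_text):
    """Return (number of spans, number of elements with inline styles)."""
    parser = MarkupAudit()
    parser.feed(html_text)
    parser.close()
    return parser.spans, parser.inline_styles


# Hypothetical fragments: the kind of markup a WYSIWYG editor can leave
# behind, and the same content written by hand.
messy = (
    '<p><span style="font-weight: 400;"><span>Hello</span></span> '
    '<span style="color: #000000;">world</span></p>'
)
clean = "<p>Hello world</p>"

print(audit(messy))  # (3, 2)
print(audit(clean))  # (0, 0)
```

Both fragments render the same two words. Only the first carries baggage that a conversion tool later has to guess about.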

Over time, the HTML gets messier, and that mess is swept under the rug of the WYSIWYG view. And as it gets messier, it becomes less portable, and conversion tools become less useful. Suddenly I can’t just reuse my HTML somewhere else without unpredictable results. For any given reuse I lose hours to cleaning up my HTML, effectively creating a whole new fork of my project, and losing the ‘single-source master’ feature of my workflow. I’m sure every new HTML editor aims to solve that problem, but I haven’t found one that’s solved it yet.

So that’s why I remain a champion of markdown-based workflows: there is no abstraction in the editor, because I’m only ever working in plain text. The simplicity of plain text means my content stays clean as I go, because there is no rug to sweep a mess under.

The bare bones of markdown have other spin-offs, too:

  • Markdown is more portable. By ‘portable’ I mean between people and between machines. For non-technical people, markdown is more open than HTML: it’s instantly readable and copy-pastable. A format that’s useful without a developer in the room is exponentially cheaper to work with, especially when you have to move it between machines.
  • The constraints of markdown force us to keep document structures simpler, sticking to fewer, standardised elements.
  • And diffs of plain-text markdown (in Git especially) are easy to read. We can use them in editorial workflows as is.
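To illustrate that last point, here is a minimal sketch of a plain-text diff using Python’s standard `difflib` module. It isn’t Git, but it produces the same unified-diff style that Git shows. The file labels and the sample text are invented for the example.

```python
import difflib

# Two versions of a hypothetical markdown file, as lists of lines.
before = """# Chapter one

It was a dark and stormy night.
The rain fell in torrents.
""".splitlines(keepends=True)

after = """# Chapter one

It was a dark and stormy night.
The rain fell in sheets.
""".splitlines(keepends=True)


def markdown_diff(old_lines, new_lines):
    """Return a unified diff of two versions of a plain-text file."""
    return "".join(
        difflib.unified_diff(
            old_lines,
            new_lines,
            fromfile="chapter.md (before)",
            tofile="chapter.md (after)",
        )
    )


print(markdown_diff(before, after))
```

The output marks the old sentence with a minus and the new one with a plus, with no markup in the way. An editor can review that change without any special tooling.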

However! Those who prefer HTML editors are right that markdown has serious constraints. Or, rather, that HTML5 (like many markup languages) provides more features than markdown can provide natively. For instance, markdown can’t produce tables with merged cells, create plain divs and spans, or manage nested snippets for things like figures. They’d argue rightly that the markdown editing experience can be clunky, especially to those accustomed to Word-like UIs. And that non-technical users mostly don’t share my concerns about messy underlying markup: they just want an editor that looks great and is easy to use.

Like tabs versus spaces, I don’t expect this debate will ever be resolved. What matters is that we each pick our own trade-offs, and respect the trade-offs others make.

 

I love you, InDesign, but it’s time to let you go

I love you, InDesign, but it’s time to let you go. We just can’t be together in a multi-format world.

InDesign is expensive, so I can’t have my whole team working in it. It’s so powerful that it takes years of experience to use it without making a mess. And it’s fundamentally incapable of producing both print PDF and ready-to-use HTML from a single master file, despite some amazing hacks.

Adobe has tried valiantly to turn this page-based, hot-lead-replacement into a multi-format tool, but its roots in print are just too deep. Making books in InDesign and converting them to high-quality ebooks and websites is a rocky journey that leaves even the smartest typesetters bloodied and broke.

At Fire and Lion we make a lot of books for screen and paper (mostly for publishing companies and non-profits). To our clients, what matters most is that the books are well-crafted in every format, and that working with us is problem-free. Behind the scenes, we have to do something special to make that possible.

So the first thing we do is avoid using InDesign for setting anything but the most heavily illustrated books. We don’t do page-by-page layout and convert to HTML later. In fact, we do exactly the opposite: we make each book as a little website, and then output to PDF.

To put it another way: Fire and Lion makes responsive websites that respond not only to screen sizes but to the pages of a book. And we do it so well that, looking at the finished product, you can’t tell the difference between our books and those you’d get from a typesetter working in InDesign.

We’ve been lucky to work with clients who’ve let us make their books with this cutting-edge toolset. You have to be brave to accept a GitHub repository as your open files, rather than an InDesign package; but it’s brave people like that who move our industry forward.

Nothing we’re doing is a secret: our workflow is open. So when we’re not making books, we’ll be talking about how we make them. If you’re working with similar tools, or curious about ours, let us know.

Three things every editor should know about digital publishing

A while ago I gave this talk at the Cape Town Professional Editors Group. Here are my speaking notes.

Today, every passage you edit will sooner or later be read on screen. This digital world desperately needs our craft and high standards, but what does that mean for our daily work? In this talk I’ll pick out three big, important issues, and talk about some of the tools we’re using to tackle them. The first is text-only editing. That is, the end of word processing as we know it. The second is real-time, collaborative editing. And third, automagical pagination: how do we edit when there’s no such thing as ‘page two’?

So what does this digitisation thing really mean for editors? I think, basically, it means you’re editing text that will be read on a screen. Importantly, you’re editing text that will be read on a screen and on paper.

Now if you’re going to edit for the screen, the single most important thing is to actually read on a screen yourself. If you aren’t reading on screen, you simply cannot edit for the screen. Just like you can’t fix a car if you’ve never ridden in one.

That said, we are all busy people and there is an infinite amount to learn about computers: the rate at which the technium evolves far outstrips the rate at which we can understand it. Even the greatest minds in computing readily admit that the Internet is now bigger and more complex than any one person understands.

So the trick is to not try too hard to learn it. Rather, just start using web- and screen-oriented tools and the learning will come when you need it. No one went to a whole seminar on how to use email before they sent an email.

In the next thirty minutes or so I’ll pick out three big, important developments and talk about some of the tools we’re using to tackle them. This is basically show and tell.

The first is text-only editing. That is, the end of word processing as we know it.

The second is real-time, collaborative editing.

And third, I’ll talk a little about automagical pagination: how do we edit when there’s no such thing as ‘page two’?

Text-only editing

First, what is text-only editing? Text-only editing is editing in plain-text files. When you do this, you’ll probably be using a particular writing structure called Markdown. For instance, let’s use Stackedit to write plain-text markdown. Type on the left, and on the right Stackedit turns our plain text into formatted HTML.

On the left, I type plain text in a markdown structure. On the right, formatted HTML.

What we’re seeing here is the separation of content (which is structured text and image-references only) from formatting and design.
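For the curious, here is a toy sketch of what happens between the left pane and the right. It is not how Stackedit works, and it handles only a tiny subset of markdown (headings, paragraphs, emphasis and strong text). The function names `inline` and `toy_markdown` are made up for this example.

```python
import html
import re


def inline(text):
    """Escape HTML characters, then convert **strong** and *emphasis*."""
    text = html.escape(text, quote=False)
    text = re.sub(r"\*\*(.+?)\*\*", r"<strong>\1</strong>", text)
    text = re.sub(r"\*(.+?)\*", r"<em>\1</em>", text)
    return text


def toy_markdown(source):
    """Convert a tiny subset of markdown to HTML, block by block."""
    blocks = []
    # Blocks are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", source.strip()):
        heading = re.fullmatch(r"(#{1,6})[ \t]+(.*)", block)
        if heading:
            level = len(heading.group(1))
            blocks.append(f"<h{level}>{inline(heading.group(2))}</h{level}>")
        else:
            # Join hard-wrapped lines into a single paragraph.
            blocks.append(f"<p>{inline(' '.join(block.splitlines()))}</p>")
    return "\n".join(blocks)


plain_text = "# Title\n\nSome *plain* and **bold** text."
print(toy_markdown(plain_text))
```

The plain text on the left holds only structure: this is a heading, this is emphasised. How a heading looks is decided elsewhere, by whatever styles the HTML on the right.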

What are the big advantages of text-only editing?

  1. Smaller, faster files.
  2. Computers need perfect consistency (the digital age is a wonderful place for obsessive copy editors). Here the tools force our hand, and we learn to be less sloppy.
  3. Text-only means fewer copy-paste messes (when you copy and paste into a new document and the fonts go all weird), because I’m getting only and exactly what I’m seeing: plain text. We do have to learn some new tricks, like unicode glyphs (there is no ‘Insert symbol’ font, and there are no formatting gimmicks like superscripting an o for a degree symbol). This is actually a good thing, even if it seems like more work at first while we learn its tricks.
  4. Less file corruption, because there is simply less going on – less code to go wrong.
  5. Better version control, especially if you learn to use a tool like Git.
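On the unicode point in the list above: in plain text, a lookalike character is a different character, even if it looks right on screen. Here is a small sketch, using Python’s standard `unicodedata` module, that names every non-ASCII character in a string. The `describe` function and the sample strings are invented for illustration.

```python
import unicodedata


def describe(text):
    """List the code point and official name of each non-ASCII character."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch, "UNNAMED"))
        for ch in text
        if ord(ch) > 127
    ]


# A real degree sign, and a lookalike that often sneaks in instead.
print(describe("25 °C"))  # [('U+00B0', 'DEGREE SIGN')]
print(describe("25 ºC"))  # [('U+00BA', 'MASCULINE ORDINAL INDICATOR')]
```

The two strings look almost identical in most fonts, but a search, a spellchecker or a screen reader will treat them differently.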

Collaborative editing

Collaborative editing has literally changed the way I write, edit and deliver documents.

What is collaborative editing? In short, me and someone else editing the same online document at the same time. The biggest tool for this is Google Docs.

What are the major pros of collaborative editing?

  1. It lets others watch while you work. And you can watch while others work. Publishing is weird because it’s always been a team sport played by lonely freelancers from their own home offices. Collaborative editing instantly makes the team aspect real and useful.
  2. You can use commenting for feedback and discussion. Track changes just isn’t the same as actual live annotation. No more emailing documents with increasing repetitions of the word ‘final’ in the file name. (Also, see Hypothesis.)
  3. Instant delivery of work and real-time review. As soon as you’re ready for your client to check something, share the doc and the ball’s in their court. So much editing is problem solving, and collaborative editing means the publisher-editor-designer are basically always in the room together at the same time.

I cannot believe that Google Docs has been around for years and people are still editing in MS Word. I promise, promise, promise you want to move all your writing and editing into Google Docs. (You could also use something similarly cloud-based with live collaborative editing but, for better or for worse, most people are familiar with Google and already have Google accounts.)

Automagical pagination

Lastly, what is automagical pagination? Well, on screen, our software and screen size are going to decide how much text is on the ‘page’, the visible area in front of us. On screens we might refer to this as the ‘viewport’. If you’ve used Kindle, iBooks, or Google Play Books you probably know what this looks like.

There are a few key issues that arise when text flows into a viewport. And very importantly, when you’re editing the same text for both that viewport and also print output.

  1. Hyphenation and non-breaking spaces. Of course you never want to put a hard hyphen into a line because that line will be made and remade in countless different lengths in its life, and you don’t want your hyphens turning up in the middle of a line. You also don’t want the space in a number like 100 000 breaking over a line, so you need to learn how to insert a non-breaking space. And there are several other glyphs that have similar complications, like ellipses and en dashes.
  2. Cross-references. That is, referring to other places in the document. On screen, you can’t say ‘see page twenty’, because ‘page twenty’ is completely different on my computer and on my phone. You can’t say ‘Click here to go to the figure’, because in print there is nothing to click. And you can’t say, ‘in the figure below’, because on screen the figure might shift position. Common solutions are to introduce numbering systems for sections and figures, or to completely rephrase cross references. (Some smart digital-first workflows let you insert a variable that becomes a page reference in print, and is a hyperlink on screen.)
  3. Elements that appear on screen but not in print. For instance, let’s say you want to include a YouTube clip in an ebook, but you can’t have the clip in the print version. In some systems, it is possible to mark certain elements to appear in one version but be completely hidden in another.
  4. Minimalist courage. For maximum compatibility with unknown reading systems, you have to use fewer, more carefully chosen features. You can’t have ten different variations on headings or boxes. Pick very few features, and treat them consistently with the same styling rules. Make no single-instance exceptions (e.g. never say “I’ll make this one heading smaller because otherwise it’ll look funny here”).
  5. Strict content hierarchies. You have to place every feature of the book in a hierarchy, as if your whole book was a tree of trunk, branches and leaves. Computers need hierarchy.
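Two of the issues above, non-breaking spaces and cross-references, can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular workflow does it. The function names, the `fig-3` identifier and the page-number lookup are all hypothetical, and the number pattern is deliberately naive: it would also join two unrelated numbers that happen to sit side by side.

```python
import re

NBSP = "\u00a0"  # the no-break space character


def protect_number_spaces(text):
    """Swap the space between digit groups (as in 100 000) for a no-break space."""
    return re.sub(r"(?<=\d) (?=\d{3}\b)", NBSP, text)


def render_cross_reference(label, target_id, page_numbers, output):
    """Render one cross-reference for 'print' or 'screen' output.

    page_numbers is a hypothetical lookup from element id to page number,
    which only exists once the print edition has been paginated.
    """
    if output == "print":
        return f"{label} (page{NBSP}{page_numbers[target_id]})"
    if output == "screen":
        return f'<a href="#{target_id}">{label}</a>'
    raise ValueError(f"unknown output format: {output}")


pages = {"fig-3": 20}
print(protect_number_spaces("About 100 000 copies"))
print(render_cross_reference("Figure 3", "fig-3", pages, "print"))
print(render_cross_reference("Figure 3", "fig-3", pages, "screen"))
```

The editor writes the reference once, by label and target. Each output format then decides whether it becomes a page number or a link.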

There is lots more we could go into here but there isn’t enough time and we’d bore half the room. And as I suggested at the start, it doesn’t matter how much you try to stuff in your head now, when the only way to make it useful and make it stick is to deal with issues as they come up in your work.

I hope that you have some concrete questions, though, so we can spend some time dealing with those real issues that you’ve already come up against.

Uncapped: Can humans handle a blank cheque for data?

I remember the first time I used uncapped Internet. I was sitting in a friend’s coffee shop in Bristol, on a trip from South Africa, and my brain exploded. Till then, every time I’d clicked a link I’d worried about how much data it would use. I’d had to think carefully about every video, every download, every large page. And that anxiety was a constant force against browsing freely. That was just the way the web worked: you had a data allowance and, no matter how big it got, you had to use it wisely.

So uncapped Internet was not just fun, it was a revelation. Revolution, even. Suddenly nothing stood between me and anything my heart desired. I could indulge my curiosity at will. Games, video, porn, software, music, and learning. So. Much. Learning. That hour in a cafe in Bristol literally changed my life.

And now so many of us — almost anyone at the wealthier end of middle class — just take it for granted. Once we cross from limited to unlimited — from finite to infinite — we easily forget what it’s like to manage with limited resources. Our behaviour on one side of that line is very different from our behaviour on the other.

Doug Hoernle, founder of mobile-education business Rethink Education, tells me their young, low-income users are relentlessly careful about the data they consume on their phones. When a class is doing research, one student will open Wikipedia, screenshot the page, and WhatsApp the image to everyone else. This isn’t just to save data, but to ration it: they don’t know how big the Wikipedia page will be, but they can tell exactly how big a screenshot is before they download it. Similarly, before they decide to download a free app, they’ll check its size and calculate its cost in data: its real price. And even then, they’ll weigh up carefully whether that app’s size in their cheap phone’s memory is worth all the photos they could save in its place. For them, there is no such thing as ‘free’.

As much as I love unlimited data, it has a real danger: without a budgetary constraint on our browsing, there is far less pressure to choose carefully. We just open the Internet firehose and let it run. We’ll curate later, we think, and then we complain that there’s so much crap on the web and we never have any time for ourselves or our work. And then, perversely, we turn that firehose back on the web and upload our own stuff — often half-baked — for others to deal with.

Humans have a lot to learn about managing the firehose. I may learn a few tricks in my lifetime, and I’ll pass them on to my son, who’ll learn a few more. It’ll take generations for us to be comfortable, confident, happy dealing with an unlimited supply of information.

For some of those lessons, we should look to those who’re still capped, for whom every byte counts. How do they make their decisions? Who influences them? What sites and apps really matter? Are their constraints teaching them — at least till they cross that great divide — how to be more discerning people? Maybe they have something to teach us about priorities.

(This post was first published on Medium.)