Ebook formats suck.

So here’s another dense-but-informative Mike Shatzkin post, this one on the economics of book (and ebook) retailing. This passage in particular caught my eye:

In fact, the mobi format that Kindle uses today was developed at the time as a bridging format, able to be read on both Microsoft and Palm devices. This was before the creation of the epub format used by everybody except Kindle today. When Amazon bought Mobi, it was apparently to prevent any other retailer from building a real ebook business selling to what was then the “entire” ebook market. B&N’s one-time exit from ebooks was because they could sell only to Microsoft and not to Palm devices, which meant they had the smaller piece of what was a very small market. Amazon apparently figured then that they’d enter the market when they were ready, but they wanted to prevent B&N from building a foothold in it before then.

This is what’s called vendor lock-in, and it’s the sort of vendor lock-in only a tech company could have leveraged. Pretty much everyone can get at least some utility out of a physical book, even if it’s only as a doorstop or display item. “Lock-in”, such as it exists, in the physical book market happens at the supply end, i.e. in the relationship between authors and publishers, and between publishers and distributors. I honestly can’t even imagine how it would work on the demand end (i.e. between readers and distributors), and, I assume, neither could booksellers, which is why they suck so hard at trying it.

Tech companies, on the other hand, are very good at leveraging lock-in; the entire tech industry is built on this, and if you think it’s not, try running an iOS app on your Windows machine then come back to me.1

Amazon is a tech company, not a bookseller or a publisher or a retailer. It learnt the lessons of its forebears, investing in both the tech and the content it would need to grab customers and shut out competitors. Which it did, with massive success.

But there’s a side-effect. Bear with me for a moment while I rant…


So here’s the thing: Amazon’s mobi lock-in is, if you ask me, the single biggest contributing factor to why there’s been basically fuck-all innovation in the ebook markup space in the last decade. Ebook markup is basically a modified version of HTML/XML, i.e. what internet sites are rendered with, and ereaders are essentially specialised web browsers. There are some beautiful websites out there–ones with all sorts of reflowable content, drool-worthy typography, and rich content–and yet, for the most part, ebooks still look like they’re being formatted by cheap-ass 1970s print typesetters worried about page counts and binding thickness. Why, for godssakes?
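To make the "ebooks are basically web pages" point concrete, here's a rough sketch in Python. An EPUB is a zip archive whose chapters are XHTML files, so ordinary zip and XML tooling can read them. The file names, the chapter text, and the helper functions below are all made up for illustration, and the archive is deliberately incomplete: a valid EPUB also needs a META-INF/container.xml and a package (.opf) file, so a real ereader would reject this toy.

```python
import io
import zipfile
import xml.etree.ElementTree as ET

XHTML_NS = "http://www.w3.org/1999/xhtml"

# A hypothetical one-paragraph chapter. Note that it is just XHTML.
CHAPTER = f"""<?xml version="1.0" encoding="utf-8"?>
<html xmlns="{XHTML_NS}">
  <head><title>Chapter One</title></head>
  <body>
    <h1>Chapter One</h1>
    <p>It was a dark and <em>stormy</em> night.</p>
  </body>
</html>
"""


def build_toy_epub() -> bytes:
    """Zip up a toy, incomplete EPUB-like archive in memory."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        # The mimetype entry comes first and is stored uncompressed.
        zf.writestr("mimetype", "application/epub+zip",
                    compress_type=zipfile.ZIP_STORED)
        zf.writestr("OEBPS/chapter1.xhtml", CHAPTER,
                    compress_type=zipfile.ZIP_DEFLATED)
    return buf.getvalue()


def chapter_paragraphs(epub_bytes: bytes) -> list[str]:
    """Pull the paragraph text out of the chapter with a plain XML parser."""
    with zipfile.ZipFile(io.BytesIO(epub_bytes)) as zf:
        root = ET.fromstring(zf.read("OEBPS/chapter1.xhtml"))
    ns = {"x": XHTML_NS}
    return ["".join(p.itertext()) for p in root.findall(".//x:p", ns)]


if __name__ == "__main__":
    print(chapter_paragraphs(build_toy_epub()))
```

Nothing in there is ebook-specific: the same parser would read a web page written as XHTML, which is why the gap between what browsers do and what ereaders do is so frustrating.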

There’s a whole host of what one would assume would be super-basic, mid-2000s era “web 2.0” stuff that could be included in ebooks but, for whatever reason, isn’t. Like, why can’t I define a custom dictionary/glossary? You know when you highlight a word in iBooks and hit “define”, and it gives you the dictionary and/or Wikipedia lookup? Why can’t the book itself define something to appear in that context? Like, would that not be fucking awesome? (Genre authors I’m looking at you in particular.) I would totally love that. Why is it not done? Because no-one’s invested in the tech, I guess.

Or what about social reading? Amazon’s shared annotations are kind of a start, but imagine something like a cross between that and, say, commenting in Google Docs. You could have various privacy level settings; a user-invited “reading circle” only; all friends/contacts; everyone. This one’s a little tougher because it would be reader-dependent–i.e. Kindle readers could only see the annotations of other Kindle users, iBooks readers only of other iBooks users, and so on–but… actually, it’s worth examining that for a second, too. Universal, platform-independent commenting systems do work; think of website comments, for example.2 But the systems that drive websites–specifically the ideologies behind HTML and web browsers–come from an earlier age, one anathema to our current, highly commercialised, vendor-driven world. (The early internet comes from academic and military research communities, not the skunkworks divisions of companies looking for new revenue streams.)

Conceptually, there’s not that much difference between a browser and an ereader; they’re both software products designed to render markup for the display of texts produced by third parties. Architecturally, however, the distinctions are massive.

Most people don’t realise this, of course, because ereaders and ebook markup are so stilted compared to HTML and browser technologies. And, to me, ebook formatting is eerily reminiscent of where HTML was back in about the early-00s, when Microsoft Internet Explorer dominated the space. At the pinnacle of IE’s success, in 2002, for example, only 4% of web users were using a browser that wasn’t IE.

IE’s domination of the browser market had two side effects. The first was that Microsoft essentially stopped releasing new versions of it; IE6 lasted about five years, from 2001 to 2006; in contrast, Microsoft has released a new version of IE about every year since IE8 came out in 2009. The second side effect was that very little innovation happened in the HTML space. This might be odd to think about for the non-technical, but the markup that powers websites does change,3 with new features added to it over time. In a nutshell, this is why webpages in 2014 are richer and more dynamic–they’re more interactive, have more animation and media, and so on–than they were in 2004. For a timeline, HTML 2 was released in 1995, HTML 3.2 in 1997, then HTML 4 a little later that year. Then HTML5, whose first public draft appeared in 2008.

Yeah. You read that right: it took over ten years for HTML to move from version 4 to version 5. It’s not super-coincidental that the Dark Ages of HTML coincided with the end of the First Browser Wars and the victory of Microsoft. Because the working group that formalises changes to the HTML spec–essentially saying what HTML can officially “do” versus what it can’t–is dependent on browser makers to implement its changes. When Microsoft was the only player in the space, why would it bother? (Hint: it didn’t.)

So what changed? Firefox, Chrome, and the iPhone, basically. Browsers are a tough market to crack because they’re free, so it’s tough for a company to just be a “browser maker” (some companies experimented with for-pay browsers, though it never really went anywhere). Nonetheless, Mozilla managed it with Firefox in 2004. Firefox quickly started gobbling up market share from the languishing IE by introducing new features like tabbed browsing, spell-checking, plug-ins/extensions, and far fewer security holes. We take these things for granted nowadays, but when Firefox first came out, this stuff was huge.

Firefox was soon followed by Google Chrome (2008), which offered a similarly rich feature set, plus things like cross-device syncing. This latter innovation became A Big Deal because, in 2007, the iPhone launched and introduced everyone to a whole new world of off-desktop browsing.

In other words, by 2008 the Browser Wars were very definitely back on, and this time they were playing on a whole new mobile device battlefield. Not to mention people remembered the sins of Internet Explorer. All things considered, IE nowadays is actually a pretty decent browser. Except, well. This is its current market share (in blue, Chrome is green, Firefox yellow):

Most used web browser by country.

The point is that, with competition back in the market, browser-makers had a reason, once again, to differentiate their products by adopting new features. Some of this was the user-facing, in-browser stuff, like the tabs and extensions. But it was also about adopting and implementing universal standards for backend technologies like HTML. And, surprise surprise, suddenly when there was more than one player in the market, the innovations came rolling in.

That’s a bit of a detour, but the point is this: Amazon’s mobi format for ebooks feels like what the internet would’ve been like if early-00s-era Microsoft had managed to implement its own proprietary version of HTML (it did, occasionally, try to do this, thankfully without success). The actual ebook equivalent of HTML isn’t even mobi; it’s EPUB, which is used by basically every reader except Kindle.

There is a reason this is so. Kindle has dominance in the market, and dominating that market with a single, closed ebook format is a big part of Amazon’s strategy. By only supporting mobi, Amazon not only ensures its ebooks can’t be used on any platform other than Kindle, but that books bought outside of its ecosystem can’t be read on Kindle. Obviously, these systems aren’t “perfect”–you can read Kindle books on non-Kindles and vice versa, if you want to do the work for it–but they don’t actually need to be. Most people will take the path of least resistance as far as their digital media libraries are concerned; a lesson Amazon learnt well from Apple’s adventures in the music business.

Unlike HTML and EPUB, however, mobi is controlled by one single company (Mobipocket, owned by Amazon since ’05). The reality is that the IDPF can add all the awesome features it wants to EPUB, and vendors can even implement them in their readers, but because Amazon owns so much of the ebook market, publishers will drift towards formatting ebooks in the most “cross-platform” compatible way, which means the features included in mobi become the de facto baseline standard. Why bother, after all, spending the extra time adding extra, EPUB-only features to versions of your books that will only be read by a tiny fraction of the market?
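The lowest-common-denominator logic can be sketched in a few lines of Python. The feature names and which format supports what are invented purely for illustration (they are not a claim about what mobi or EPUB actually support), and `cross_platform_baseline` is a hypothetical helper. The point is only that what a publisher can rely on everywhere is the intersection of the feature sets, so whichever format has the fewest features sets the ceiling.

```python
# Hypothetical feature sets, invented for illustration only.
FORMAT_FEATURES = {
    "mobi": {"basic_text", "images"},
    "epub": {"basic_text", "images", "fancy_typography",
             "custom_glossary", "shared_annotations"},
}


def cross_platform_baseline(features_by_format: dict[str, set[str]]) -> set[str]:
    """Return the features available in every format a publisher ships."""
    feature_sets = list(features_by_format.values())
    if not feature_sets:
        return set()
    return set.intersection(*feature_sets)


baseline = cross_platform_baseline(FORMAT_FEATURES)

# Everything EPUB offers beyond the baseline is extra work that only
# pays off for readers outside the dominant ecosystem.
epub_only = FORMAT_FEATURES["epub"] - baseline

if __name__ == "__main__":
    print(sorted(baseline))
    print(sorted(epub_only))
```

Adding features to the "epub" set never changes `baseline`; only the most limited format can raise it.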

In other words, Amazon has effectively engineered a market where competition can’t occur based on format innovations. Being the ereader with the “best” implementation of EPUB means fuck all when no-one is leveraging that, either on the publishing or the consuming side. (Apple, for the record, has sort of tried this with iBooks Author, which creates very fancy-looking content with a proprietary implementation of EPUB… and which is used by almost nobody.)

Net result? Ebooks look shit. Like, sorry, formatting departments, but they do. And it’s not even your fault, since you’re doing the best you can with a bunch of extremely broken toys.

As to what the solution to this all is? Support EPUB. Except doing so means effectively shutting oneself out of the Kindle ecosystem, which is more of a sacrifice than most consumers are willing to make.

For the rest of us? Welcome to the ebook wars.

  1. Yes, I know this is technically possible. But, a) it ain’t easy, and b) it ain’t, strictly speaking, “legal”, either.
  2. I’m talking “work” here in a technical sense of “is achievable” rather than in a moral sense of “do we really want to do this?”.
  3. For the technical: I’m super-simplifying here, so you can assume when I say “HTML” I mean “HTML, CSS, JavaScript, and associated technologies”. Though not vendor-owned stuff like Flash, et al.
1st December, 2014 | Tags: amazon, ebooks, epub, mobi, xp