The end of growth in eBooks
Many types of DRM technologies were offered to the publishing industry over the past decade, but few of them caught on. For the past six years or so, DRM has largely meant only one thing in the book-publishing world: eBooks. The eBook market emerged and rapidly consolidated during the Internet bubble of 1999-2000, and never really measured up to the hype that surrounded it (cf. Bohn 2005).

Two signposts for the end of growth in the eBook market appeared recently: first, in November 2004, Adobe quietly announced plans to withdraw its market-leading eBook DRM packaging software (Adobe Content Server) from the market and shift its focus to corporate document security (cf. Rosenblatt 2004); second, the trade association Open eBook Forum (OeBF) changed its name to the International Digital Publishing Forum in April 2005.

The usual reason given for eBooks' lack of success is that most consumers don't like reading books on PCs or dedicated hardware devices such as those from Gemstar and Franklin. But an equally important reason is that publishers only really accepted eBooks as digital facsimiles of print books that were cheaper to manufacture and distribute. Publishers did little to explore the potential of eBooks to implement new business models or new ways of distributing content – not even in markets that seemed especially promising, such as professional and higher education publishing.

The lack of innovation around eBooks can be largely attributed to publishers' reluctance to disrupt their existing supply chains, which, after all, they have cultivated carefully over a period of centuries.

Google Book Search
Google's Google Book Search program, which emerged in July 2005 (and which was called Google Print until mid-November), represents the biggest threat to those supply chains in a long time. Google has been scanning, digitizing, and indexing tens of thousands of print books, mostly borrowed from prominent university libraries, and making their texts searchable online.

The Association of American Publishers (AAP) organized a lawsuit against Google in October 2005, on behalf of five major US-based book publishers, alleging that Google infringed copyright when it scanned and indexed books without publishers' permission (cf. AAP 2005). But that allegation was something of a subterfuge: supply chain worries are the real reason for the publishing industry's alarm about Google.

The truth is that Internet search engines like those of Google, Yahoo, and MSN have the potential to radically change the business landscape for book content, because they capture consumers' interest at the primary point of "discoverability" for content online. A search engine has the power to expose content as the result of a user's Internet search, direct her to any other resource on the Internet to find the full content… and potentially make money on the referral.

In the publishers' lawsuit (and a similar one brought by the Authors Guild; cf. Authors Guild 2005), Google is arguing that its use of the print books is legal according to US copyright law (17 U.S.C. 107), which judges "fair use" of content based on four factors. One of those factors is the effect that the use has on the market for the content; Google claims that because it is helping more consumers purchase more content, its effect on the market is positive for publishers.

However, another of the four factors is the purpose and character of the use, including whether such use is of a commercial nature. In addition to the revenue that Google currently garners from ads that it displays alongside book content in search results, the potential number of content transactions from which Google could directly benefit financially is staggering.

To put the potential impact into some perspective: the technology that may currently be the largest source of online referrals to copyrighted text works is Amazon's affiliate marketing program, Amazon Associates. Amazon Associates' websites contain specially encoded links that lead users to purchase pages on Amazon; if the user makes the purchase, the Associate earns a commission. Although there are over a million Amazon Associates, the impact of Google's ability to lead consumers to copyrighted material has the potential to dwarf that of the Amazon Associates program: bear in mind that any Google search can lead a user to book content, whereas users must click on special URLs to find book content through Amazon Associates.

DRM and the discoverability paradox
Discoverability of copyrighted works online has been a stumbling block to the growth of the market for online content. It is a paradox: many copyrighted works – those generally judged to be the most valuable – are the hardest to find on the Internet. Publishers are concerned about piracy of their valuable works (as opposed, say, to copyrighted works judged less valuable, such as ephemeral news stories), so they don't expose them online, which means that users of search engines can only find them through more limited means, such as summaries, abstracts, and metadata.

DRM provides a way out of this paradox – and not just in theory. Perhaps the cleverest application of DRM to making copyrighted works discoverable was a technology called eLuminator, which appeared around 1999. It was the product of MediaDNA, a DRM startup that originated in Sweden and subsequently moved to the United States.

The eLuminator technology worked by extracting all of the nontrivial words from a document – a typical step in search engines' text indexing techniques – and placing them on a web page as invisible meta-tags. Search engines would then index that page, so that users searching for words included in the text would find the page in search results. The visible portion of the page would contain an offer to purchase a version of the document that was packaged (encrypted) with MediaDNA's proprietary DRM.
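To make the mechanism concrete, here is a minimal sketch of the approach as described above. MediaDNA's actual implementation is not public; the function names, stopword list, and purchase URL are all illustrative assumptions.

```python
import re

# Sketch of the eLuminator-style approach described above: harvest the
# document's nontrivial vocabulary into invisible meta-tags, while the
# visible page only offers to sell the DRM-protected version.
# (Illustrative only; not MediaDNA's actual code.)

STOPWORDS = {"the", "a", "an", "and", "or", "of", "to", "in", "is", "it", "that"}

def nontrivial_words(text):
    """Extract a document's indexable vocabulary, dropping trivial words."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({w for w in words if w not in STOPWORDS and len(w) > 2})

def make_teaser_page(text, purchase_url):
    """Build an HTML page whose meta tags carry the full vocabulary for
    search engines to index, while the body shows only a purchase offer."""
    keywords = ",".join(nontrivial_words(text))
    return (
        "<html><head>\n"
        f'<meta name="keywords" content="{keywords}">\n'
        "</head><body>\n"
        f'<p><a href="{purchase_url}">Purchase the full, DRM-protected document</a></p>\n'
        "</body></html>"
    )

page = make_teaser_page("The quick brown fox jumps over the lazy dog",
                        "https://example.com/buy")
```

A search engine crawling such a page would index the hidden keywords, so a query for any word in the protected document would surface the teaser page and its purchase link.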

In other words, eLuminator was a fancy, automated version of what we now call search engine optimization (SEO): the art and science of tweaking web pages so that the major search engines are more likely to rank them favourably in search results. Unfortunately, eLuminator did not catch on with publishers beyond a handful of pilot projects. MediaDNA ceased operations, sold eLuminator to Inceptor (an SEO technology company), and sold its core DRM technology to Macrovision – all in late 2001.

With Google Book Search, Google is, in a way, taking the eLuminator concept to the next level. It indexes the text of copyrighted works and makes them available for viewing, but only a few lines at a time – just enough to provide context around search results. This is really just a form of access control, i.e., DRM.

Once a user sees book text in Google search results, Google could then offer to sell the user a DRM-protected document itself; but instead – at least for now – it provides links to other websites, such as Amazon, Barnes & Noble, BookSense, and publishers' own websites, for purchase of physical products. (It could just as easily refer users to purchase opportunities for other versions of the content, such as eBooks at eReader.com or OverDrive, or audiobooks at Audible.com.)

More recently, Google has been holding discussions with book publishers about supporting a weekly rental model, somewhat like a cross between a public library and an online video-rental service like MovieLink. The discussions are very preliminary at this point, but one thing is for sure: Google will need to adopt full-blown DRM technology in order to make that model work. Although Adobe's Content Server technology might be available for acquisition, one suspects that, given its history, Google will design its own.

Amazon and the Open Content Alliance
Amazon itself announced plans in November 2005 to take the concept of online rendering a step further (cf. Amazon.com 2005). Amazon already offers "Search Inside the Book," a feature that makes a small number of pages in books available for online viewing in a streaming-style page rendering format that inherently deters piracy. It intends to extend this in two ways: Amazon Pages, which will enable users to purchase content by the page rather than by the book, and Amazon Upgrade, which will enable purchasers of print books to view their contents online for an additional fee.

Both of these programs will build on the technology from Search Inside the Book. It is unclear whether the increased amount of digitized text that Amazon will create as a result of these new programs will enable it to make that text discoverable by search engines.

It is worth noting that Amazon quietly purchased a French eBook technology company called Mobipocket in March 2005. Mobipocket's eBook platform for a variety of handheld devices is fairly popular in professional and technical publishing, an analog to eReader's platform for trade eBooks (cf. Rosenblatt 2005a). Amazon has done nothing (publicly) with Mobipocket's technology, which suggests that Amazon, too, sees the future of online publishing as direct Internet rendering rather than downloads to closed devices.

An organization called the Open Content Alliance (OCA) formed in October 2005, shortly before the publishers filed their suit against Google. Yahoo and Internet Archive were the co-founders; now the membership also includes Microsoft's MSN, O'Reilly Media (a publisher of IT-related technical works and prominent open source advocate), and several archives and libraries. The intent of OCA is similar to Google Book Search, with one important difference: while Google Book Search has had an "opt out" policy toward publishers (i.e., publishers must notify Google if they do not wish their books to be scanned), OCA is "opt in" (publishers must give the OCA permission upfront to scan and digitize their books).

It is possible to view all of these initiatives as implementations of DRM or DRM-like mechanisms that are built for specific, narrow purposes. Google Book Search indexes the full text of books, controls access to the text by only exposing it a few sentences at a time, and facilitates commerce in rights to the text by passing users along to others via links.

Amazon's Search Inside the Book technology, meanwhile, controls access to text by only exposing it a page at a time. A precedent for this is ebrary, an online library service that was founded in 1999 with backing from Adobe and three major book publishers, and that now serves both schools and public libraries; ebrary lets users query large repositories of book content and view text, through a browser interface, a page at a time.

Amazon Pages, using the Search Inside the Book technology, facilitates commerce in rights internally by allowing users to purchase access to ranges of pages. Time will tell what kinds of mechanisms Yahoo, MSN, and other OCA members will use to provide controlled access to copyrighted content.

Publishers are effectively at the mercy of these narrow technologies and thus of the business models that they enable. Of course, it works both ways: these technology companies cannot offer online content without publishers' blessings. In Amazon's case, Amazon Pages arose out of a decision by Random House – a division of Bertelsmann AG and the world's largest trade publisher – to support page-at-a-time access rights via micropayments.

Publishers' responses
Internet-based discoverability and content display can be powerful forces for publishers if they harness them appropriately rather than simply letting technology companies take the reins. Two initiatives in Germany, announced during this year's Frankfurter Buchmesse (Frankfurt Book Fair) in October, represent attempts to do this. One comes from the publisher Verlagsgruppe Georg von Holtzbrinck; the other from the Börsenverein des Deutschen Buchhandels (German booksellers' trade association).

Holtzbrinck is developing a system it calls BookStore, which it will use for its own publishers' content but also offer to other publishers. BookStore will be an online text repository with its own e-commerce capabilities as well as the ability to make text available to search engines for indexing. BookStore is being developed by MPS Technologies, a subsidiary of Holtzbrinck's Macmillan unit based in the UK and India (cf. MPS Technologies 2005).

The Börsenverein is working on something similar, which it calls "Volltextsuche Online" (Full Text Search Online): a text repository that publishers can use for their own material and that enables searching across the repositories of all publishers that use the system. Search engines like Google and Yahoo would be able to search those repositories directly instead of scanning content into their own infrastructure, and the Börsenverein is in talks with search companies about this type of arrangement (cf. Börsenverein 2005).

The main difference between Holtzbrinck's BookStore and the Börsenverein's Volltextsuche Online is that the latter is oriented toward "federating" search for book content, so that companies like Google and Yahoo do not end up with monolithic collections of copyrighted material. BookStore is really more like an incremental improvement on online eBook retail system providers such as OverDrive, the improvement mainly being the system's ability to release full text to search engines for discoverability purposes, instead of just making abstracts and metadata available (as Amazon and its ilk do today).

It's about the rights
Unfortunately, both of these proposals miss the point. Once copyrighted content exists "somewhere" on the Internet, it's no longer about the content – it's about the rights. If publishers want to maintain control over their own rights and supply chains in the Internet age, then they will need to take control of their "rights" and how they make them available to distributors and retailers like Google, Amazon, Yahoo, and MSN. Then the content can be served up from wherever it is.

Right now, publishers grant or deny certain rights to online distributors in ad-hoc ways. In the case of Amazon and its new initiatives, the rights are bounded and well understood. But in the cases of Google Book Search and the OCA, the rights effectively pass out of publishers' control once they give the service provider the right to scan and index the content; their only recourse is contractual.

At this point, Google can simply provide links to other sites that presumably already have rights to sell publishers' product in pre-existing forms. The true power and flexibility of the Internet emerge once publishers can supply companies like Google with rights to digital content, which can be realized through interfaces to all kinds of devices and services.

In effect, this means that publishers should be supplying rights descriptions to online distributors in forms that they can handle – i.e., in machine-readable form. The publishing industry (at least in the US) started to look at this issue in the context of bundling rights with eBooks. In 2003, the OeBF Rights and Rules Working Group (RRWG) defined a rights expression language (REL) standard (cf. IDPF 2003) based on the ISO standard MPEG REL (cf. sources). The UK-based publishing industry e-commerce standards organization EDItEUR has also been working on developing rights-related standards for book content, with library usages particularly in mind.

The MPEG REL is a reasonable starting point, but it is not really designed for this purpose. It is designed to convey descriptions of rights and their attributes (e.g., identities of grantors, payment terms, identities of grantees) to end-users through their hardware devices or software. The language is not intended to automate rights aspects of distributor relationships.

Another standards initiative called the Content Reference Forum (cf. sources) is not only intended to address this particular problem but is also intended to be compatible with (and complementary to) MPEG REL. The CRF, which arose primarily out of the music industry, was created to automate rights processing aspects of multi-tiered content distribution networks. Its most important work product has been the Contract Expression Language (CEL), a machine-readable language that expresses distribution relationships along with rights, obligations, financial terms, and so on. Unfortunately, neither the OeBF RRWG nor the CRF has seen much activity since the end of 2003.
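The kinds of terms such a contract expression language conveys can be sketched informally. The following is NOT the actual CEL schema – it is a hypothetical model, with invented field names, of the information the text above describes (grantor, grantee, rights, obligations, financial terms):

```python
# Hypothetical illustration of a machine-readable distribution agreement.
# Field names and values are invented for this sketch; the real CEL is an
# XML language defined by the Content Reference Forum.

distribution_agreement = {
    "grantor": "Example Publisher GmbH",
    "grantee": "Example Online Retailer Inc.",
    "work": {"isbn": "9780123456789"},
    "rights": [
        {"type": "display", "scope": "page-at-a-time", "territory": ["US", "DE"]},
        {"type": "sell", "format": "ebook", "territory": ["US"]},
    ],
    "obligations": {"royalty_rate": 0.50, "reporting": "monthly"},
    "term": {"start": "2006-01-01", "end": "2008-12-31"},
}

def permits(agreement, right_type, territory):
    """Check a requested use against the rights the agreement grants."""
    return any(r["type"] == right_type and territory in r.get("territory", [])
               for r in agreement["rights"])
```

The point of machine readability is exactly this kind of automated check: a distributor's system can decide, without human intervention, whether a requested use falls within the granted rights.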

The publishing industry could revisit standards initiatives like the OeBF MPEG REL extensions, the CRF, and some of those cited in Brian Green's recent INDICARE article on EDItEUR initiatives (Green 2005). Holtzbrinck, for example, could then build standard rights and distribution terms expressions into its BookStore system.

There is an important precedent for this type of standards-based supply chain automation in the publishing industry: the ONIX standard for book product metadata (cf. sources), which many publishers use to feed product information to Amazon and other distributors and retailers, and which has substantially improved the efficiency of this process. ONIX contains fields for such things as book identifiers (e.g., ISBN, UPC, DOI), product metadata (e.g., price, minimum order quantity), physical characteristics (e.g., size, weight), and descriptions of content. The AAP steered the development of ONIX, and it is maintained by EDItEUR along with the Book Industry Study Group in the US. There is also a version of ONIX for serials content, such as academic journals.
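To give a flavour of what such a feed contains, here is a simplified ONIX-style product record built with Python's standard XML library. Element names loosely follow ONIX for Books 2.1 (e.g., ProductIDType code 15 denotes an ISBN-13), but this is not a complete or valid ONIX message; the EDItEUR specification defines the real schema.

```python
import xml.etree.ElementTree as ET

# Simplified illustration of an ONIX-style product record. Not a valid
# ONIX message; element names loosely follow ONIX for Books 2.1.

product = ET.Element("Product")
ET.SubElement(product, "RecordReference").text = "com.example.9780123456789"

ident = ET.SubElement(product, "ProductIdentifier")
ET.SubElement(ident, "ProductIDType").text = "15"   # code 15 = ISBN-13
ET.SubElement(ident, "IDValue").text = "9780123456789"

title = ET.SubElement(product, "Title")
ET.SubElement(title, "TitleText").text = "An Example Book"

xml_record = ET.tostring(product, encoding="unicode")
```

A rights-aware analogue of this record – carrying grants and terms alongside identifiers and prices – is essentially what the preceding paragraphs argue the industry still lacks.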

At the same time, just defining standards for communicating rights information to online distributors is not enough. Publishers must be able to define and manage those rights for themselves first, so that they can express them mechanically in a REL or similar technology. Yet few publishers have viable internal databases of the rights that they are entitled (e.g., by contract) to offer; solving this problem can involve large-scale system development, process rationalization, and (in many cases) integration with legacy systems. Publishers must also think seriously about what rights they are willing to offer to online distributors, of the ones that they are able to offer. Random House's decision to offer per-page rights through Amazon is only a small step in that direction.

Bottom line
Throughout the development of the Internet, publishers have had various opportunities to take control and make the most of this enormously influential new medium as it moves from physical commerce facilitator to content distribution and rendering medium. Developments like Google Book Search show that technology companies have the potential to force dramatic changes to publishers' business models and supply chains. Publishers must realize that once content is out there on the Internet, control over rights is the key to control over their industry's future. If they do not act soon, then Internet technology companies will take over their supply chains, publishers will be marginalized in the content world, or both.

Sources

About the author: Bill Rosenblatt is president of GiantSteps Media Technology Strategies, a consultancy based in New York, USA that focuses on digital content technology issues for content providers and technology vendors. He is editor of the newsletter DRM Watch (http://www.drmwatch.com) and author of "Digital Rights Management: Business and Technology" (John Wiley & Sons, 2001).

Status: first posted 21/11/05; licensed under Creative Commons
URL: http://www.indicare.org/tiki-read_article.php?articleId=152