Tuesday, March 31, 2009

The Google Book Search Settlement

I have been thinking about the impact of the impending Google Book Search settlement on libraries like Rollins.

There have been a number of good summaries of the settlement, including Jonathan Band's for the ARL. There is also good commentary from Robert Darnton (and Paul Courant's response), Lawrence Lessig, Siva Vaidhyanathan, and Mike Madison (who has links to others). The ALA et al. have also decided to submit comments to the court. But here is a really short summary of the 200-plus-page settlement. I would be delighted to hear any comments and corrections.

  1. The settlement will end the class action suit brought by publishers and authors for infringement of their rights in copyrighted works digitized by Google. It does not affect materials digitized by Google that are in the public domain.
  2. Google will continue to digitize in-copyright books and will enable users to search the entire contents of these digitized books.
  3. Google will generate revenue from advertising and by selling the ability to see the full text of digitized books.
  4. Rightsholders can set the price of books; if they do not, Google will set the price in a range between $1.99 and $29.99. 51% will be priced at or below $5.99. These prices can be changed based on sales data. The pricing structure will be renegotiated at regular intervals.
  5. A Book Rights Registry (BRR) will be created to distribute payments from Google to rightsholders when Google displays more than the "snippets" it currently displays for works in copyright.
  6. Google will retain 37% of the revenue; 63% will go to the BRR. Initially Google will pay $45,000,000 to the BRR for copyrighted materials scanned before January 2009.
  7. The settlement excludes periodicals, musical notation and lyrics, unpublished papers, books published in the U.S. that are not registered with the Copyright Office, and books published after January 5th, 2009.
  8. The settlement is organized in terms of three groups of books: those in the public domain, those in copyright but not commercially available (out of print), and those in copyright and commercially available (in print).
  9. Rightsholders can opt out of the settlement, remove specific books from the Google system, or vary the basic rules of the settlement for specific titles. Band, for one, expects that publishers will do this for their in-print titles, so the settlement will basically cover out-of-print, in-copyright books. Google estimates this at about 70% of published works. However, the University of Michigan has received an IMLS grant that aims to clarify the copyright status of books published after 1923 and before 1963, because they think a large proportion of these works are orphan works or have fallen out of copyright because rights holders did not complete all the necessary formalities to secure or renew copyright.
  10. Rightsholders of inserts (forewords, essays, tables, illustrations, etc.) in other works can choose not to have those displayed, though they cannot choose to have the whole book removed from the Google Book Project.
  11. The settlement is also organized in terms of three user groups: all users, public libraries and universities, and institutions.
  12. The settlement recognizes various types of libraries that have contributed books from their collections to the project: fully participating libraries, cooperating libraries, public domain libraries, and other libraries. Since Rollins has not contributed any books to the project, none of these apply to us.
  13. The settlement also includes some non-exclusivity terms that mean that libraries and rights holders (publishers) can continue to digitize outside of the Google Book Project.
  14. The fully participating libraries will also be allowed to create two centers that will host the "research corpus." The Research Corpus is the entire contents of the Google Book Project and can be used -- on- and off-site -- for non-consumptive research (image analysis and text extraction, textual or linguistic analysis, and research on automated translation, indexing, and search). Basically, this is research that does not involve understanding the intellectual content of the text.
  15. Finally, Google has agreed to provide free search, the permitted displays, the Public Access Services (PAS) to public and academic libraries, and institutional subscriptions for 85% of the in-copyright but out-of-print digitized books in the project within five years. If it does not, the fully participating libraries or the BRR can find someone else to do it.

For details of what various users will be able to do and under what circumstances, see this table. Many people are interested in research libraries, the libraries that have provided the copies of the books Google is scanning. But what does this mean for Rollins and liberal arts college libraries like ours if this settlement goes through?

There will be one PAS terminal in the Olin Library. If the price is right, which is by no means certain, we will be able to subscribe so that students, faculty, and staff can access the full text of the books in the database on campus. The price is likely to be somewhat similar to the price of other large-scale e-book vendors' collections like NetLibrary, ebrary, etc., but since the collection will be so much larger, it might well be totally unreasonable. We will know this before December 2009. It is not at all clear whether we would be able to offer remote access.

If we are able to subscribe, it is unlikely that we will simply load the records of such books into our OPAC, because these millions of titles would overwhelm our collection of about 300,000 books. But we might well index the Google Book Search titles in a meta discovery tool like the upcoming Summon (from SerialsSolutions) so that users can easily find titles in Google Books while searching a wide array of sources. There will be some ability to link to titles from Blackboard.

It will have a radical impact on interlibrary loan of books. In some cases it will aid discovery and thus increase interlibrary loan; in others it will let users get what they need from a book without borrowing it, and thus decrease interlibrary loan. But I don't think we will be able to ILL books from Google to users elsewhere. It will also probably be an aid for us in weeding the collection, increasing the cases in which a little-used book can be withdrawn because we can rely on the digitized copy in the Book Project for a low level of use. It probably will not have too much impact on new book purchases, since digital access to in-copyright and in-print books will be severely limited.

It could be a great opportunity for statewide cooperation. I wonder if Google would be prepared (or be able to persuade publishers) to negotiate access for every academic library in Florida or, better yet, every resident of Florida via the State Library? Publishers have shown a distinct lack of foresight in this regard so far, so I am not sure that is going to happen.

I have to end on a note of caution. I am glad to see the non-exclusivity terms of the agreement, which mean that other digitization projects can continue. Still, this 800 lb gorilla in the room will definitely dampen others' enthusiasm for embarking on mass digitization projects, though as Paul Courant of Michigan has pointed out, those haven't got too far anyway. If this agreement goes through, Google will have established a very strong market position as an early entrant; that, combined with its dominance of search, will make it tough to beat in the oh-so-unprofitable world of book distribution. There's the rub: we have created a monopoly player that will dominate one of the most important expressions of our cultural heritage and knowledge base -- books.

Imagine if we created just one library that had almost every book published before 2009 and that library was a publicly traded company that made money through advertising to its users and selling services associated with its collections. Well, we just did.

The immediate impacts on our work described above are just the beginning. I think this settlement, and Google Book Search in general, will have huge, long-term, and hard-to-imagine effects on libraries and publishing.

3 comments:

Paul G. said...

Will the PAS terminal in Olin be a new machine, or one of the reference computers?

Jonathan said...

We don't know at this stage, but my guess would be a designated existing machine. This is a frustrating aspect of the settlement: yet another example of publishers restricting access in an artificial way to protect their intellectual property, which users will perceive (a la NetLibrary) as poor service from the library. It takes us all the way back to the 1980s and the OCLC terminal that some of us old fogeys may remember. The PAS will be moot if we can subscribe, in which case students and faculty will be able to access the full database from any computer behind our firewall.

kWikiBlog said...

I like the idea of “reasonableness” over never-ending IP rights for books. As an author, I’m comfortable that after my death, the rights to my works last only one more generation, until the death of my children. Authors write to share their ideas with others. The IP they create is theirs and should be protected, earn compensation if valuable, etc. But not forever.