By now most people have heard of the infamous Google Books Project and Google Library Project, as well as the subsequent, and still ongoing, lawsuits (the first, by the Association of American Publishers (AAP) and the Authors Guild against Google in September 2005; and the second, by the Authors Guild against HathiTrust in September 2011). These two projects have started a whirlwind of debate over current United States copyright law and the doctrine of fair use. For those of us who want to learn more or need a quick refresher, here are the basics.


The Google Books Project is an effort by Google to digitize the world’s books and index them, making them searchable by anyone, anywhere. Many of these books are out of copyright and therefore in the public domain. For those still under copyright, Google either reaches an agreement with the copyright holder on how much of the text, if any, may be previewed or, in the case of orphan works (those for which the copyright holder cannot be found), operates under the assumption that if the copyright holder doth not protest (being missing or deceased), it’s okay to display the text in its entirety.


The Google Library Project is an extension of the Google Books Project. Google offered its scanning capability to HathiTrust, a coalition of university libraries that were attempting to digitize their collections, in return for retaining a copy of the scanned work for its own use in the Google Books Project. HathiTrust, according to its “Partnership Community” page, describes itself as “an international community of research libraries committed to the long-term curation and availability of the cultural record. Through their common efforts and deep commitment to the public good, the libraries support the teaching and learning activities of the faculty, students or researchers at their home institutions, and the scholarly needs of the broader public as well.” (On that page you can also find the list of over sixty participating institutions, which includes Columbia University, Cornell University, Harvard University, Princeton University, the University of California, and the University of Michigan.)

Google’s intentions for its Library Project are described on Google’s website:

The Library Project’s aim is simple: make it easier for people to find relevant books—specifically, books they wouldn’t find any other way such as those that are out of print—while carefully respecting authors’ and publishers’ copyrights. Our ultimate goal is to work with publishers and libraries to create a comprehensive, searchable, virtual card catalog of all books in all languages that helps users discover new books and publishers discover new readers.

And here enters the issue of fair use. In their Harvard Business Review case study, Google, Inc., authors Benjamin Edelman and Thomas Eisenmann explain, “Google argued that this scanning was fair use, not copyright infringement, both because it would make books easier to find and buy, and because Google showed only brief excerpts of in-copyright books.” Many authors and publishers argue that because the scanning creates an unauthorized copy of their work (a copy does exist somewhere in Google’s massive database, even if it isn’t being displayed publicly), it is, indeed, copyright infringement.

Perhaps the largest issue with the doctrine of fair use, as it is described in the Copyright Act of 1976, is its susceptibility to interpretation and manipulation by the interests of corporations and artists alike. According to the act, “the fair use of a copyrighted work…for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright.”

Congress deliberately wrote the doctrine so that courts could apply their own judgment on a case-by-case basis. The four criteria, if somewhat vague, that a court may use to determine fair use are as follows:

(1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;

(2) the nature of the copyrighted work;

(3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and

(4) the effect of the use upon the potential market for or value of the copyrighted work.

In the section immediately following fair use, Congress outlines the right of libraries and archives to make no more than one copy of a work for the preservation of their collections and/or for the scholarship, private study, or research of their users. There are restrictions on this copying, however: the library or archive cannot profit from the copy in any way, its collection must be open and free to the public, and the copy must carry a notice that the work is under copyright.

Under these provisions, does what Google is doing with the Google Books Project and Google Library Project fall within fair use? If libraries have the right to make a copy of a work for preservation and for the scholarly and research use of their users, what’s the big deal if Google is the agent by which they accomplish that digitization? Without Google, surely this wouldn’t be possible, as no other corporate or government entity currently has the resources to attempt such a massive scanning operation. Even if Google is infringing the copyright of works for which no copyright holder can be found by scanning the text and making it publicly available, isn’t that a better alternative than allowing a large part of contemporary American literature to fade into obscurity?


