OPINION
Since 2004, when it announced agreements with several major research libraries to digitally copy books in their collections, defendant Google Inc. (“Google”) has scanned more than twenty million books. It has delivered digital copies to participating libraries, created an electronic database of books, and made text available for online searching through the use of “snippets.” Many of the books scanned by Google, however, were under copyright, and Google did not obtain permission from the copyright holders for these usages of their copyrighted works. As a consequence, in 2005, plaintiffs brought this class action charging Google with copyright infringement.
Before the Court are the parties’ cross-motions for summary judgment with respect to Google’s defense of fair use under § 107 of the Copyright Act, 17 U.S.C. § 107. For the reasons set forth below, Google’s motion for summary judgment is granted and plaintiffs’ motion for partial summary judgment is denied. Accordingly, judgment will be entered in favor of Google dismissing the case.
BACKGROUND
A. The Facts
For purposes of this motion, the facts are not in dispute. (See 9/23/13 Tr. 10-11, 15, 25-28 (Doc. No. 1086)).
1. The Parties
Plaintiff Jim Bouton, the former pitcher for the New York Yankees, is the legal or beneficial owner of the U.S. copyright in the book Ball Four. Plaintiff Betty Miles is the legal or beneficial owner of the U.S. copyright in the book The Trouble with Thirteen. Plaintiff Joseph Goulden is the legal or beneficial owner of the U.S. copyright in the book The Superlawyers: The Small and Powerful World of the Great Washington Law Firms. (Google Resp. ¶¶ 1-3).
Google owns and operates the largest Internet search engine in the world. (Google Resp. ¶ 9). Each day, millions of people use Google’s search engine free of charge; commercial and other entities pay to display ads on Google’s websites and on other websites that contain Google ads. (Google Resp. ¶ 10). Google is a for-profit entity, and for the year ended December 31, 2011, it reported over $36.5 billion in advertising revenues. (Google Resp. ¶ 11).
2. The Google Books Project
In 2004, Google announced two digital books programs. The first, initially called “Google Print” and later renamed the “Partner Program,” involved the “hosting” and display of material provided by book publishers or other rights holders. (Google Resp. ¶¶ 13, 14). The second became known as the “Library Project,” and over time it involved the digital scanning of books in the collections of the New York Public Library, the Library of Congress, and a number of university libraries. (Clancy Decl. ¶ 5 (Doc. No. 1035); Google Resp. ¶¶ 25, 26, 27; Pl. Resp. ¶ 14).
The Partner Program and the Library Project together comprise the Google Books program (“Google Books”). (Google Resp. ¶ 15). All types of books are encompassed, including novels, biographies, children’s books, reference works, textbooks, instruction manuals, treatises, dictionaries, cookbooks, poetry books, and memoirs. (Pl. Resp. ¶ 6; Jaskiewicz Decl. ¶ 4 (Doc. No. 1041)). Some 93% of the books are non-fiction while approximately 7% are fiction.
In the Partner Program, works are displayed with permission of the rights holders. (Google Resp. ¶ 16). The Partner Program is aimed at helping publishers sell books and helping books become discovered. (Google Resp. ¶ 18). Initially, Google shared revenues from ads with publishers or other rights holders in certain circumstances. In 2011, however, Google stopped displaying ads in connection with all books. (Google Resp. ¶¶ 17,
As for the Library Project, Google has scanned more than twenty million books, in their entirety, using newly-developed scanning technology. (Google Resp. ¶¶ 28, 29). Pursuant to their agreement with Google, participating libraries can download a digital copy of each book scanned from their collections. (Google Resp. ¶ 30). Google has provided digital copies of millions of these books to the libraries, in accordance with these agreements. (Google Resp. ¶ 85). Some libraries agreed to allow Google to scan only public domain works, while others allowed Google to scan in-copyright works as well. (Google Resp. ¶ 36).
Google creates more than one copy of each book it scans from the library collections, and it maintains digital copies of each book on its servers and back-up tapes. (Google Resp. ¶¶ 40, 41). Participating libraries have downloaded digital copies of in-copyright books scanned from their collections. (Google Resp. ¶¶ 53, 54). They may not obtain a digital copy created from another library’s book. (Jaskiewicz Decl. ¶¶ 6, 8). The libraries agree to abide by the copyright laws with respect to the copies they make. (Clancy Decl. ¶ 5).
Google did not seek or obtain permission from the copyright holders to digitally copy or display verbatim expressions from in-copyright books. (Google Resp. ¶¶ 53, 54). Google has not compensated copyright holders for its copying of or displaying of verbatim expression from in-copyright books or its making available to libraries for downloading of digital copies of in-copyright books scanned from their collections. (Google Resp. ¶ 55).
3. Google Books
In scanning books for its Library Project, including in-copyright books, Google uses optical character recognition technology to generate machine-readable text, compiling a digital copy of each book. (Google Resp. ¶ 62; Pl. Resp. ¶ 18; Jaskiewicz Decl. ¶ 3). Google analyzes each scan and creates an overall index of all scanned books. The index links each word or phrase appearing in each book with all of the locations in all of the books in which that word or phrase is found. The index allows a search for a particular word or phrase to return a result that includes the most relevant books in which the word or phrase is found. (Clancy Decl. ¶ 6; Pl. Resp. ¶¶ 22-26). Because the full texts of books are digitized, a user can search the full text of all the books in the Google Books corpus. (Clancy Decl. ¶ 7; Google Resp. ¶ 42).
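By way of illustration only, an index of the kind described above (each word linked to every location at which it appears) can be sketched in a few lines of Python. Nothing in the record describes Google's actual implementation; the function names `build_index` and `search`, the two sample texts, and the ranking of books by simple occurrence count are all assumptions made for this sketch.

```python
from collections import defaultdict
import re


def build_index(books):
    """Map each word to the (book_id, word_position) pairs where it occurs."""
    index = defaultdict(list)
    for book_id, text in books.items():
        words = re.findall(r"[a-z']+", text.lower())
        for position, word in enumerate(words):
            index[word].append((book_id, position))
    return index


def search(index, word):
    """Return ids of books containing the word, most occurrences first.

    Ranking by raw count is a stand-in for relevance (an assumption).
    """
    counts = defaultdict(int)
    for book_id, _position in index.get(word.lower(), []):
        counts[book_id] += 1
    return sorted(counts, key=lambda book_id: (-counts[book_id], book_id))


# Invented sample texts, not drawn from any book in the record.
books = {
    "book_a": "The pitcher threw the ball.",
    "book_b": "A ball is round.",
}
index = build_index(books)
print(search(index, "ball"))
```

Because the index stores locations rather than passages, a search can report which books contain a term without returning the surrounding text.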
Users of Google’s search engine may conduct searches, using queries of their own design. (Pl. Resp. ¶ 10). In response to inquiries, Google returns a list of books in which the search term appears. (Clancy Decl. ¶ 8). A user can click on a particular result to be directed to an “About the Book” page, which will provide the user with information about the book in question. The page includes links to sellers of the books and/or libraries that list the book as part of their collections. No advertisements have ever appeared on any About the Book page that is part of the Library Project. (Clancy Decl. ¶ 9).
For books in “snippet view” (in contrast to “full view” books), Google divides each page into eighths — each of which is a
Google takes security measures to prevent users from viewing a complete copy of a snippet-view book. For example, a user cannot cause the system to return different sets of snippets for the same search query; the position of each snippet is fixed within the page and does not “slide” around the search term; only the first responsive snippet available on any given page will be returned in response to a query; one of the snippets on each page is “black-listed,” meaning it will not be shown; and at least one out of ten entire pages in each book is black-listed. (Google Resp. ¶¶ 48-50; Pl. Resp. ¶¶ 35, 37-40). An “attacker” who tries to obtain an entire book by using a physical copy of the book to string together words appearing in successive passages would be able to obtain at best a patchwork of snippets that would be missing at least one snippet from every page and 10% of all pages. (Pl. Resp. ¶ 41). In addition, works with text organized in short “chunks,” such as dictionaries, cookbooks, and books of haiku, are excluded from snippet view. (Pl. Resp. ¶ 42).
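The combined effect of these measures can be sketched as follows. This is a hypothetical model and not Google's code: the names `visible_snippets` and `max_visible_fraction` are invented, and the sketch assumes that a black-listed snippet is skipped in favor of the next responsive snippet on the same page, that every page has exactly eight snippets of equal size, and that exactly one page in ten is black-listed.

```python
SNIPPETS_PER_PAGE = 8  # each page is divided into eighths


def visible_snippets(pages, term, blocked_snippets, blocked_pages):
    """Return the (page_no, snippet_no) pairs shown for a search term.

    pages: list of pages, each a list of snippet strings at fixed positions.
    blocked_snippets: dict mapping page_no to the snippet never shown there.
    blocked_pages: set of page numbers that are never shown at all.
    """
    results = []
    for page_no, snippets in enumerate(pages):
        if page_no in blocked_pages:
            continue
        for snippet_no, text in enumerate(snippets):
            if snippet_no == blocked_snippets.get(page_no):
                continue
            if term.lower() in text.lower():
                results.append((page_no, snippet_no))
                break  # only the first responsive snippet on a page
    return results


def max_visible_fraction(snippets_per_page=SNIPPETS_PER_PAGE,
                         blocked_page_share=0.10):
    """Upper bound on the share of a book that could ever be displayed."""
    shown_per_page = (snippets_per_page - 1) / snippets_per_page
    return (1 - blocked_page_share) * shown_per_page


# Invented three-page example.
pages = [
    ["ball here", "filler", "ball again"] + ["filler"] * 5,
    ["filler"] * 7 + ["ball"],
    ["ball"] + ["filler"] * 7,
]
blocked_snippets = {0: 0, 1: 3, 2: 5}
blocked_pages = {2}
print(visible_snippets(pages, "ball", blocked_snippets, blocked_pages))
print(max_visible_fraction())
```

Under those assumptions the bound works out to 0.9 × 7/8, or a little under 79% of the text, which is consistent with the "patchwork" described above: however many queries are run, the black-listed snippets and pages are never returned.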
4. The Benefits of the Library Project and Google Books
The benefits of the Library Project are many. First, Google Books provides a new and efficient way for readers and researchers to find books. (See, e.g., Clancy Decl. Ex. G). It makes tens of millions of books searchable by words and phrases. It provides a searchable index linking each word in any book to all books in which that word appears. (Clancy Decl. ¶ 7). Google Books has become an essential research tool, as it helps librarians identify and find research sources, it makes the process of interlibrary lending more efficient, and it facilitates finding and checking citations. (Br. of Amici Curiae American Library Ass’n et al. at 4-7 (Doc. No. 1048)). Indeed, Google Books has become such an important tool for researchers and librarians that it has been integrated into the educational system — it is taught as part of the information literacy curriculum to students at all levels. (Id. at 7).
Second, in addition to being an important reference tool, Google Books greatly promotes a type of research referred to as “data mining” or “text mining.” (Br. of Digital Humanities and Law Scholars as Amici Curiae at 1 (Doc. No. 1052)). Google Books permits humanities scholars to analyze massive amounts of data — the literary record created by a collection of tens of millions of books. Researchers can examine word frequencies, syntactic patterns, and thematic markers to consider how literary style has changed over time. (Id. at 8-9; Clancy Decl. ¶ 15). Using Google Books, for example, researchers can track the frequency of references to the United States as a single entity (“the United States is”) versus references to the United States in the plural (“the United States are”) and how that usage has changed over time. (Id. at 7). The ability to determine how often different words or phrases appear in books at different times “can provide insights about fields as diverse as lexicography, the evolution of grammar, collective memory, the adoption of technology, the pursuit of fame, censorship, and historical epidemiology.” Jean-Baptiste Michel et al., Quantitative Analy
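The kind of frequency comparison described in this paragraph can be illustrated with a short Python sketch. It is a toy example under stated assumptions: the helper `phrase_counts_by_year` is hypothetical, the two dated sample texts are invented, and actual research of this kind would normalize counts against the size of each year's corpus rather than report raw totals.

```python
from collections import Counter


def phrase_counts_by_year(corpus, phrase):
    """Count occurrences of a phrase per publication year.

    corpus: iterable of (year, text) pairs.
    """
    counts = Counter()
    for year, text in corpus:
        counts[year] += text.lower().count(phrase.lower())
    return dict(sorted(counts.items()))


# Invented sample data for illustration only.
corpus = [
    (1850, "The United States are a union."),
    (1950, "The United States is a nation. The United States is large."),
]
print(phrase_counts_by_year(corpus, "the United States is"))
print(phrase_counts_by_year(corpus, "the United States are"))
```

The output is a set of counts about the texts rather than the texts themselves, which is the sense in which such research treats book text as data.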
Third, Google Books expands access to books. In particular, traditionally under-served populations will benefit as they gain knowledge of and access to far more books. Google Books provides print-disabled individuals with the potential to search for books and read them in a format that is compatible with text enlargement software, text-to-speech screen access software, and Braille devices. Digitization facilitates the conversion of books to audio and tactile formats, increasing access for individuals with disabilities. (Letter from Marc Maurer, President of the National Federation for the Blind, to J. Michael McMahon, Office of the Clerk (Jan. 19, 2010) (Doc. No. 858)). Google Books facilitates the identification and access of materials for remote and underfunded libraries that need to make efficient decisions as to which resources to procure for their own collections or through interlibrary loans. (Br. of Amici Curiae American Library Ass’n at 5-6).
Fourth, Google Books helps to preserve books and give them new life. Older books, many of which are out-of-print books that are falling apart buried in library stacks, are being scanned and saved. See Authors Guild v. Google Inc.,
Finally, by helping readers and researchers identify books, Google Books benefits authors and publishers. When a user clicks on a search result and is directed to an “About the Book” page, the page will offer links to sellers of the book and/or libraries listing the book as part of their collections. (Clancy Decl. ¶ 9). The About the Book page for Ball Four, for example, provides links to Amazon.com, Barnes & Noble.com, Books-A-Million, and Indie-Bound. (See Def. Mem. at 9). A user could simply click on any of these links to be directed to a website where she could purchase the book. Hence, Google Books will generate new audiences and create new sources of income.
As amici observe: “Thanks to ... [Google Books], librarians can identify and efficiently sift through possible research sources, amateur historians have access to a wealth of previously obscure material, and everyday readers and researchers can find books that were once buried in research library archives.” (Br. of Amici Curiae American Library Ass’n at 3).
B. Procedural History
Plaintiffs commenced this action on September 20, 2005, alleging, inter alia, that Google committed copyright infringement by scanning copyrighted books and making them available for search without permission of the copyright holders. From the outset, Google’s principal defense was fair use under § 107 of the Copyright Act, 17 U.S.C. § 107.
After extensive negotiations, the parties entered into a proposed settlement resolving plaintiffs’ claims on a class-wide basis. On March 22, 2011, I issued an opinion rejecting the proposed settlement on the grounds that it was not fair, adequate, and reasonable. Authors Guild v. Google Inc.,
Thereafter, the parties engaged in further settlement discussions, but they were unable to reach agreement. The parties proposed and I accepted a schedule that called for the filing of plaintiffs’ class certification motion, the completion of discovery, and then the filing of summary judgment motions. (See 9/16/11 Order (Doc.
Plaintiffs filed their class certification motion and Google filed its motion to dismiss the Authors Guild’s claims. On May 31, 2012, I issued an opinion denying Google’s motion to dismiss and granting the individual plaintiffs’ motion for class certification. Authors Guild v. Google Inc.,
On June 9, 2012, I issued an order resetting the briefing schedule for the summary judgment motions. (6/19/12 Order (Doc. No. 1028)). The parties thereafter filed the instant cross-motions for summary judgment. Before the motions were fully submitted, however, the Second Circuit issued an order on September 17, 2012, staying these proceedings pending an interlocutory appeal by Google from my decision granting class certification. (9/17/12 Order (Doc. No. 1063)).
On July 1, 2013, without deciding the merits of the appeal, the Second Circuit vacated my class certification decision, concluding that “resolution of Google’s fair use defense in the first instance will necessarily inform and perhaps moot our analysis of many class certification issues.” Authors Guild, Inc. v. Google Inc.,
On remand, the parties completed the briefing of the summary judgment motions. I heard oral argument on September 23, 2013. I now rule on the motions.
DISCUSSION
For purposes of these motions, I assume that plaintiffs have established a prima facie case of copyright infringement against Google under 17 U.S.C. § 106. See Feist Publ’ns, Inc. v. Rural Tel. Serv. Co.,
A. Applicable Law
Fair use is a defense to a claim of copyright infringement. The doctrine permits the fair use of copyrighted works “to fulfill copyright’s very purpose, ‘[t]o promote the Progress of Science and useful Arts.’ ” Campbell v. Acuff-Rose Music, Inc.,
The fair use doctrine is codified in § 107 of the Copyright Act, which provides in relevant part as follows:
[T]he fair use of a copyrighted work, ... for purposes such as criticism, comment, news reporting, teaching (including multiple copies for classroom use), scholarship, or research, is not an infringement of copyright. In determining whether the use made of a work in any particular case is a fair use the factors to be considered shall include—
(1) the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes;
(2) the nature of the copyrighted work;
(3) the amount and substantiality of the portion used in relation to the copyrighted work as a whole; and
(4) the effect of the use upon the potential market for or value of the copyrighted work.
17 U.S.C. § 107.
The determination of fair use is “an open-ended and context-sensitive inquiry,” Blanch v. Koons,
A key consideration is whether, as part of the inquiry into the first factor, the use of the copyrighted work is “transformative,” that is, whether the new work merely “supersedes” or “supplants” the original creation, or whether it:
instead adds something new, with a further purpose or different character, altering the first with new expression, meaning, or message; it asks, in other words, whether and to what extent the new work is “transformative.”
Campbell,
B. Application
I discuss each of the four factors separately, and I then weigh them together.
1. Purpose and Character of Use
The first factor is “the purpose and character of the use, including whether such use is of a commercial nature or is for nonprofit educational purposes.” 17 U.S.C. § 107(1).
Google’s use of the copyrighted works is highly transformative. Google Books digitizes books and transforms expressive text into a comprehensive word index that helps readers, scholars, researchers, and others find books. Google Books has become an important tool for libraries and librarians and cite-checkers as it helps to identify and find books. The use of book text to facilitate search through the display of snippets is transformative. See Perfect 10, Inc. v. Amazon.com, Inc.,
Similarly, Google Books is also transformative in the sense that it has transformed book text into data for purposes of substantive research, including data mining and text mining in new areas, thereby opening up new fields of research. Words in books are being used in a way they have not been used before. Google Books has created something new in the use of book text — the frequency of words and trends in their usage provide substantive information.
Google Books does not supersede or supplant books because it is not a tool to be used to read books. Instead, it “adds value to the original” and allows for “the creation of new information, new aesthetics, new insights and understandings.” Leval, Toward a Fair Use Standard, 103 Harv. L.Rev. at 1111. Hence, the use is transformative.
It is true, of course, as plaintiffs argue, that Google is a for-profit entity and Google Books is largely a commercial enterprise. The fact that a use is commercial “tends to weigh against a finding of fair use.” Harper & Row,
Accordingly, I conclude that the first factor strongly favors a finding of fair use.
2. Nature of Copyrighted Works
The second factor is “the nature of the copyrighted work.” 17 U.S.C. § 107(2).
3. Amount and Substantiality of Portion Used
The third factor is “the amount and substantiality of the portion used in relation to the copyrighted work as a whole.” 17 U.S.C. § 107(3). Google scans the full text of books — the entire books — and it copies verbatim expression. On the other hand, courts have held that copying the entirety of a work may still be fair use. See, e.g., Sony Corp. of Am. v. Universal City Studios, Inc.,
On balance, I conclude that the third factor weighs slightly against a finding of fair use.
4. Effect of Use Upon Potential Market or Value
The fourth factor is “the effect of the use upon the potential market for or value of the copyrighted work.” 17 U.S.C. § 107(4). Here, plaintiffs argue that Google Books will negatively impact the market for books and that Google’s scans will serve as a “market replacement” for books. (Pl. Mem. at 41). They also argue that users could put in multiple searches, varying slightly the search terms, to access an entire book. (9/23/13 Tr. at 6).
Neither suggestion makes sense. Google does not sell its scans, and the scans do
To the contrary, a reasonable factfinder could only find that Google Books enhances the sales of books to the benefit of copyright holders. An important factor in the success of an individual title is whether it is discovered — whether potential readers learn of its existence. (Harris Decl. ¶ 7 (Doc. No. 1039)). Google Books provides a way for authors’ works to become noticed, much like traditional in-store book displays. (Id. at ¶¶ 14-15). Indeed, both librarians and their patrons use Google Books to identify books to purchase. (Br. of Amici Curiae American Library Ass’n at 8). Many authors have noted that online browsing in general and Google Books in particular help readers find their work, thus increasing their audiences. Further, Google provides convenient links to booksellers to make it easy for a reader to order a book. In this day and age of online shopping, there can be no doubt but that Google Books improves book sales.
Hence, I conclude that the fourth factor weighs strongly in favor of a finding of fair use.
5. Overall Assessment
Finally, the various non-exclusive statutory factors are to be weighed together, along with any other relevant considerations, in light of the purposes of the copyright laws.
In my view, Google Books provides significant public benefits. It advances the progress of the arts and sciences, while maintaining respectful consideration for the rights of authors and other creative individuals, and without adversely impacting the rights of copyright holders. It has become an invaluable research tool that permits students, teachers, librarians, and others to more efficiently identify and locate books. It has given scholars the ability, for the first time, to conduct full-text searches of tens of millions of books. It preserves books, in particular out-of-print and old books that have been forgotten in the bowels of libraries, and it gives them new life. It facilitates access to books for print-disabled and remote or underserved populations. It generates new audiences and creates new sources of income for authors and publishers. Indeed, all society benefits.
Similarly, Google is entitled to summary judgment with respect to plaintiffs’ claims based on the copies of scanned books made available to libraries. Even assuming plaintiffs have demonstrated a prima facie case of copyright infringement, Google’s actions constitute fair use here as well. Google provides the libraries with the technological means to make digital copies of books that they already own. The purpose of the library copies is to advance the libraries’ lawful uses of the digitized books consistent with the copyright law. The libraries then use these digital copies in transformative ways. They create their own full-text searchable indices of books, maintain copies for purposes of preservation, and make copies available to print-disabled individuals, expanding access for them in unprecedented ways. Google’s ac
To the extent plaintiffs are asserting a theory of secondary liability against Google, the theory fails because the libraries’ actions are protected by the fair use doctrine. Indeed, in the HathiTrust case, Judge Baer held that the libraries’ conduct was fair use. See Authors Guild, Inc. v. HathiTrust,
CONCLUSION
For the reasons set forth above, plaintiffs’ motion for partial summary judgment is denied and Google’s motion for summary judgment is granted. Judgment will be entered in favor of Google dismissing the Complaint. Google shall submit a proposed judgment, on notice, within five business days hereof.
SO ORDERED.
Notes
When pressed at oral argument to identify any factual issues that would preclude the award of summary judgment, plaintiffs’ counsel was unable to do so. (Id. at 25-26).
“Google Resp.” refers to Google’s Responses and Objections to plaintiffs’ Statement of Undisputed Facts in Support of Their Motion for Partial Summary Judgment (Doc. No. 1077). “Pl. Resp.” refers to plaintiffs’ Response to Google’s Local Rule 56.1 Statement (Doc. No. 1071). I have relied on the parties’ responses to the statements of undisputed facts only to the extent that factual statements were not controverted.
These estimates are based on studies of the contents of the libraries involved. (Def. Mem. at 7 (Doc. No. 1032) (citing Brian Lavoie and Lorcan Dempsey, Beyond 1923: Characteristics of Potentially In-Copyright Print Books in Library Collections, 15 D-Lib 11/12 (2009), available at http://www.dlib.org/dlib/november09/lavoie/11lavoie.html (last visited November 12, 2013)). The numbers are not disputed. (See 9/23/2013 Tr. at 26).
The parties agree that the second factor plays little role in the ultimate fair use determination. (Pl. Mem. at 36 n. 18 (Doc. No. 1050); Def. Mem. at 25). See On Davis v. Gap, Inc.,
