The Twitter commentariat lined up to share their views, pitting authors on one side against internet libertarians on the other, with the latter equating the lawsuit to burning down the Great Library of Alexandria, yelling about rentiership, and insisting that all information, in any case, wants to be free.
The lawsuit started in 2020 but has surfaced recently into the rather febrile social media discourse because, after two years of wrangling, both sides have now filed for summary judgment. As is often the case, there is a chasm between the discourse and the underlying reality. The filing from the publishers is here, and from the Internet Archive is here.
So what is it about? Who is involved, and what's at stake, online and off?
A consortium of four large publishing houses has brought the action: Hachette Book Group, HarperCollins Publishers, John Wiley & Sons, and Penguin Random House. They're big boys of the publishing world, and you almost certainly know their names. Their legal action is supported by the Association of American Publishers and the Authors Guild.
You've probably also heard of the Internet Archive (IA). They run the Wayback Machine (this lawsuit isn't about that, though the IA says that if they lose the lawsuit, the financial viability of the IA itself and, consequently, of the Wayback Machine may be at risk). Their legal defense is being led by the Electronic Frontier Foundation (known as the EFF), who you may also have heard of. They're the Amnesty International of the internet regarding privacy, free speech, and the legal protection of open-source projects.
The lawsuit concerns the Open Library, an IA project founded in 2010, and specifically its use of a process known as Controlled Digital Lending (CDL). The principle behind CDL is that a digital copy is made of a book in a library collection (generally by scanning). That copy is then lent out as a time-limited, DRM-protected digital download on a one-to-one basis in place of some or all of the library's print copies. If a library owns five print copies of a title, no more than five copies, print and digital combined, should be in use at any one time. If five digital copies are out on loan, all five print copies should be unavailable to consult or borrow.
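For the technically minded, the one-to-one rule described above can be sketched in a few lines of Python. This is only an illustration of the principle, not a description of how the Open Library's systems work: the `Title` class, its field names, and its methods are all hypothetical, and real CDL lending also involves DRM and loan expiry, which are left out here.

```python
from dataclasses import dataclass


@dataclass
class Title:
    """Hypothetical record for one title in a CDL-style collection."""
    owned_print_copies: int
    print_in_use: int = 0      # print copies on loan or being consulted
    digital_on_loan: int = 0   # active, time-limited digital loans

    def copies_in_use(self) -> int:
        # Print and digital copies count against the same owned total.
        return self.print_in_use + self.digital_on_loan

    def lend_digital(self) -> bool:
        """Start a digital loan only if an owned copy is free to back it."""
        if self.copies_in_use() >= self.owned_print_copies:
            return False
        self.digital_on_loan += 1
        return True

    def return_digital(self) -> None:
        """End a digital loan, freeing one owned copy."""
        if self.digital_on_loan > 0:
            self.digital_on_loan -= 1


# A library owning five print copies tries to make six digital loans.
title = Title(owned_print_copies=5)
results = [title.lend_digital() for _ in range(6)]
```

With five owned copies, the first five loan requests succeed and the sixth is refused until a copy is returned. The whole scheme depends on `owned_print_copies` being an honest count of books that really are held and really are off the shelf.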
Library lending in the US rests on a legal principle called the First Sale Doctrine. This establishes that once you have bought a copy of a copyrighted work (a book, for example), the copyright holder can't impose any further restrictions on what you do with the physical object you now own. You can sell it to someone else, put it on display, cut it up to make Christmas decorations, or set fire to it to show your disapproval of its contents, and none of these is copyright infringement (though burning books may make you an idiot). Having bought a book, a library can loan it out without any additional liabilities or restraint by the copyright holder.
The First Sale Doctrine does not entitle you to reproduce the copyrighted contents, sell copies of them, or create derivative works based on them. For example, you can't buy a book, translate it, and publish your translation, or record it as an audiobook and stick it on Audible, even for free. Both cases amount to creating an infringing derivative work. The IA's summary judgment filing claims instead that their digitization and archiving create a transformative work, and that transformative work is inherently protected as fair use.
The argument underpinning CDL is that lending digital books "as print" in this way (even if the process of scanning the contents to create a digital copy does create an unlicensed derivative work) should be considered fair use because, in effect, it's a "no harm, no foul" action. Superficially, this is a compelling argument. CDL has been widely adopted by public and institutional libraries, frequently through digitization partnerships with the Open Library; in fact, both its invention and its popularization can be attributed to the OL.
This case is the first legal test of the principle. A white paper and position statement on CDL, published in 2018 and co-authored by the Internet Archive's Policy Counsel Lila Bailey, concluded that while there are legal gray areas, it's likely that CDL would be considered fair use, particularly where it is used to allow access to books that are out of copyright, out of print, or so-called "orphan works" whose legal copyright owner cannot be identified. In these cases, the fact that the copyright holder isn't actively attempting to exploit the work themselves would make it very hard to claim that someone lending out a digital version of dubious legality on a limited basis was causing them any financial loss.
The white paper notes that such lending would most likely not be legal if it was generating income (via rental charges or advertising, for example), if the use of appropriate DRM didn't robustly prevent ongoing dissemination of the digital copies, or if digital copies were being loaned of books not owned (or not owned in sufficient number) as print copies by the lending library. It concedes that the fair-use, public-benefit arguments for loaning recently published, in-copyright works of fiction in this way may well be weaker than for non-fiction and reference titles.
The plaintiffs (we'll call them the publishers, to avoid this article reading like a Law & Order script) complain that the Open Library's digitization and lending operations amount to copyright infringement on an industrial scale. If we accept the definition of CDL above, that sounds like a hyperbolic claim. Meanwhile, the Open Library's defenders make an equally hyperbolic counter-claim that the lawsuit "aims to criminalize library lending" itself.
So why has the lawsuit been brought now, and what about the Open Library specifically led to the publishers taking this step?
Digging through the court filings and the Internet Archive's blog, there would seem to be several aspects of the Open Library operation which might reasonably raise questions about the way it has been operating. Some are inherent to CDL, some were already part of the OL "ecosystem" before 2020, and some specifically have to do with the OL's actions in connection with its National Emergency Library during the first COVID-19 lockdowns.
It's worth noting the publishers don't like CDL, full stop (it allows libraries to avoid paying separately to license official eBooks for digital loans), and would ideally like it struck down. But they also argue that even if the court decides that CDL, in principle, is fair use, the version implemented by the OL isn't.
There are ways in which CDL is inherently not "like" loaning print copies, even if it's done with scrupulous care. A single digital copy of a popular book can be passed almost instantly from one borrower to the next, with no downtime, which means a CDL library can achieve the same number of loans from fewer owned copies than a print library can. Print library books suffer significant wear and tear, and popular titles must be replaced regularly with fresh copies as they wear out or become unacceptably annotated. In contrast, those used as collateral for CDL loans could sit indefinitely undamaged in the stacks. (According to the CDL FAQ, they can even be destroyed, though not resold or donated.) Libraries often purchase more expensive hardcover editions of books, whereas books held as CDL collateral might as well be paperbacks. All these factors mean that a library lending on a CDL basis buys fewer, cheaper copies of books from the publisher than one lending the same books in print.
Do those things represent sufficient harm to justify declaring CDL unlawful and unprotected by fair use? We should find out reasonably soon, and that decision has implications beyond the Open Library and the Internet Archive itself.
Is it true, as the EFF asserts, that the Open Library's activity is "fundamentally the same as traditional library lending, and poses no new harm to authors or the publishing industry"?
You might imagine that if the OL offered, for example, one hundred copies of Harry Potter and the Goblet of Fire, it must have one hundred crisp, clean copies all stored tidily in a warehouse somewhere. But you would likely be wrong. The first clue is in how the EFF describes the activities of the Open Library, saying that it "only permits patrons to check out as many copies as the Archive and its partner libraries physically own" (emphasis mine). The IA's website is not especially forthcoming on who these "partner libraries" are. Either way, the OL is using as collateral not just books in its own collection but also those on the shelves and in the stacks of partner organizations. How many of those hundred notional books are we talking about? How certain can the OL be that those books are not physically available at any given time? How reasonable is it for the OL to lend a digital copy using a different organization's holdings as collateral?
But at least the books the IA does own have been bought in the same way a library would buy a print copy for lending? That is not the case, either. You only need to look at a handful of previews of books on the Open Library website to stumble upon one digitized from a withdrawn library book (library stamps don't lie). In 2019, the IA trumpeted a commercial link-up with Better World Books. There's a load of buzzword-heavy guff about "newfound synergies." The details of the commercial arrangement are murky, but reading between the lines: Better World Books will be supplying secondhand books, many of them at the end of their library lifespan, to the Open Library for digitization and, it's fair to assume, as a free or very low-cost source of literal van-loads of "collateral" copies.
Does it make a legal difference if the print copy held as collateral is not in lendable condition? What about a copy acquired in secondhand condition and then shredded (since, after all, destruction of the physical copy is permitted)? It seems possible the OL could be lending titles it doesn't own, having had a single copy pass temporarily through its hands for digitization, based on copies it believes partner libraries hold and on copies whose ownership transferred briefly to the OL from Better World Books on their way to landfill.
In 2020, the Open Library launched the National Emergency Library in response to the first COVID-19 lockdowns. Claiming that the unprecedented circumstances justified lifting their lending controls, they uncoupled from the fixed one-to-one owned-to-loaned ratio inherent to CDL, lending unlimited copies of any given title at once. The publishers argue there was no legal justification for this action and that it represents large-scale copyright infringement. Other than some hand-waving that all of the books locked up in closed public libraries should somehow be counted against their digital lending and an appeal to the exceptional circumstances of the pandemic amounting to fair use, IA does not have any defense to this complaint. The National Emergency Library ceased operating earlier than IA had intended in response to this lawsuit being filed.
While the Open Library presents itself as a non-profit David lining up against the mighty big-publishing Goliath, the digitization services it provided to libraries between 2011 and 2020 yielded over $30 million in revenue. Founder Brewster Kahle is also reported to have earned hundreds of millions of dollars personally from licensing the scanning technologies developed by the Open Library digitization project to businesses such as Amazon.
The claim by the publishers against the Internet Archive is for damages totaling $19 million. However, how much the IA may have to pay if found at fault will depend significantly on whether they can convince the court they acted in good faith and believed their actions were legal. The Open Library homepage solicits donations from visitors in connection with the lawsuit, claiming "the right for libraries to lend books is being threatened."