Exemption to Section 1201 Liability for Text and Data Mining

On behalf of Authors Alliance, the Clinic successfully petitioned the Copyright Office for an exemption to Section 1201 liability for researchers engaging in text and data mining of literary works and films. As part of a triennial proceeding, the Clinic drafted a petition, responded to opponents, and appeared before the Copyright Office to make the case for such research. The Copyright Office largely granted the proposed exemption in October 2021.

Section 1201 of the Digital Millennium Copyright Act (“DMCA”) prohibits circumventing technological protection measures on copyrighted works in digital form. As an outlet for fair use, Congress created a triennial review process by which people can petition for an exemption from liability under 1201 for a specific use and class of work. This requires a showing that, among other things, the proposed use is likely to be lawful and that the digital lock makes the lawful use impractical.

Text and data mining (“TDM”) is a shorthand phrase for numerous computational techniques that can be used on datasets to make inferences about the underlying works. For example, TDM could be used to determine the frequency of a particular gender in all popular novels in the 18th century, or to analyze the use of color in a particular director’s films. Relying on Authors Guild v. HathiTrust and Authors Guild v. Google, the Clinic argued that such research projects are likely to be fair use. Although they involve copying entire works, their purpose is transformative: the goal is to gather information about the works rather than to make use of what the works contain.

In addition, the technological protection measures, combined with the threat of liability under 1201, are the cause of such research going undone. Researchers could theoretically scan thousands of physical books and turn them into text files, but the costs associated with such work are often prohibitive for humanities research. Additionally, existing collections of works that are available for data mining research are either incomplete or lack important research tools. This leaves circumvention as the only true option for many researchers.

Despite opposition from the content industry, the Office largely recommended adoption of the proposed exemption. It includes some limitations, however, on who may perform such research and on the data security precautions researchers must take.

Administrative and Regulatory Filings

Mar 10, 2021
Dec 14, 2020
Sep 08, 2020