Google Ngram and “Information Privacy”



Google Ngram is a database that permits statistical analysis of how frequently specific words and phrases are used in books. The database draws on nearly 5.2 million books, published between 1500 and 2000 A.D., that have been digitized by the Google Library Project. Using the web-based Ngram Viewer, one can then create a graphical, year-by-year representation of how often a phrase has appeared in books.
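For readers curious about what such a year-by-year curve represents, the following is a minimal sketch in Python, not Google's actual implementation. It assumes that each point on the graph is the number of times a phrase appears in a given year divided by the total number of phrases of the same length counted for that year. The function name and all of the counts are hypothetical and chosen only for illustration.

```python
def relative_frequency(phrase_counts, total_counts):
    """Return {year: share of all counted phrases} for each year with data.

    phrase_counts: {year: occurrences of the phrase of interest}
    total_counts:  {year: occurrences of all phrases of the same length}
    Both inputs are hypothetical stand-ins for a real corpus.
    """
    return {
        year: phrase_counts.get(year, 0) / total
        for year, total in sorted(total_counts.items())
        if total > 0  # skip years with no digitized text
    }


# Made-up counts, for illustration only.
phrase_counts = {1970: 2, 1975: 6, 1980: 3, 1985: 12}
total_counts = {1970: 1000, 1975: 2000, 1980: 1500, 1985: 3000}

series = relative_frequency(phrase_counts, total_counts)

# The year in which the phrase makes up the largest share of the text.
peak_year = max(series, key=series.get)
```

Dividing by the yearly total matters because far more books were published and digitized for later years; raw counts alone would overstate growth in attention to a term.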

In our recent work, The PII Problem, we drew on the Ngram Viewer to gain a sense of the peaks and valleys in policymakers’ attention to “information privacy” from 1950 to 2000.

In this article, we find that this graphic analysis of references to “information privacy” largely correlates with our sense of the development of this area of law. Early use of the term was driven by concern about mainframe computers and their ability to change how data could be organized, accessed and searched.

How did this story then develop during the latter part of the 1970s? After interest in privacy declined following enactment of the Privacy Act of 1974, a renewed societal focus on information privacy emerged in the United States in the early 1980s. Part of this attention was driven, in turn, by the arrival of George Orwell’s titular year, 1984. A flurry of media reports and articles marked the occasion with analyses of new threats to privacy.