By Chris Hoofnagle, The Berkeley Blog
Neither the users of bMail nor the campus itself has ever received a clear
answer to a simple question: Is Google subjecting data in Google Apps
for Education to data analysis or mining for purposes unnecessary for
the technical provision of the service?
A recently filed lawsuit suggests that Google is indeed applying
analysis to our messages, but masking this behavior by not showing users
advertising. In Fread v. Google, students from the University of Hawaii and the University of the Pacific allege:
“…Google does not serve targeted content-based
advertising to Google Apps EDU users. Google nonetheless extracts the
content and meaning from Plaintiffs’ and Class Members’ Sent and
Received e-mail messages and uses that content for various purposes and …
From this reading, Google collects, extracts, and/or generates
metadata consisting of “PHIL Clusters” (Probabilistic Hierarchical
Inferential Learner). “PHIL Clusters” represent the meaning inferred
from particular words or phrases in Plaintiffs’ and the Class’ Received
and Sent e-mails.
Systems such as PHIL, or similar systems, learn “concepts” by learning an explanatory model of text.
Thus, Google’s use of PHIL’s concepts are designed and supposed to
model the actual ideas that occur in Plaintiffs and Class members’ mind
in creating e-mail content.”
Why should we believe these plaintiffs? For one, they are represented by attorneys who are litigating another case concerning Gmail, Dunbar v. Google.
In the discovery process, these attorneys could have learned about this
data analysis. We, the users of bMail, however, will remain in the
dark, because Google has sealed much of the record in these cases.
As educational institutions, we are under a duty to supervise and control Google’s maintenance and use of educational records. We cannot do this without having clear answers about Google’s processes. The Fread
lawsuit gives us an opportunity to get those answers: Google’s filings with the
court are under oath and scrutinized by the plaintiffs’ expert lawyers in
the case. As a System, we could call upon Google to provide us with
unredacted versions of these materials.