The Challenges of Governing AI


Artificial intelligence, or AI, is both one of today’s hottest technologies and a significant challenge for lawmakers and regulators. As AI-based applications continue to proliferate, where are guardrails needed, and where might a hands-off approach be smarter? And how can legal scholars impact the discourse while teaching the next generation of lawyers about this important innovation?

In this episode, UC Berkeley Law Dean Erwin Chemerinsky is joined by Stanford Law Professor Daniel Ho and UC Berkeley Law Professors Pamela Samuelson and Colleen Chien.

Ho is the William Benjamin Scott and Luna M. Scott Professor of Law at Stanford Law, a senior fellow at the Stanford Institute for Human-Centered Artificial Intelligence and the Stanford Institute for Economic Policy Research, and director of the Regulation, Evaluation, and Governance Lab (RegLab).

Samuelson is the Richard M. Sherman Distinguished Professor of Law and a professor at the UC Berkeley School of Information. She’s recognized as a pioneer in digital copyright law, intellectual property, cyberlaw, and information policy. She is co-founder and chair of the board of Authors Alliance, a nonprofit organization that promotes the public interest in access to knowledge. She also serves on the board of directors of the Electronic Frontier Foundation, as well as on the advisory boards for the Electronic Privacy Information Center, the Center for Democracy & Technology, and Public Knowledge.

Chien is a UC Berkeley Law alumna and studies a wide range of topics, including innovation, intellectual property, and the criminal justice system. The faculty adviser of the Criminal Law & Justice Center, she also founded and directs two grant-funded research initiatives: the Innovator Diversity Pilots Initiative, which develops rigorous evidence to boost inclusion in innovation, and the Paper Prisons Initiative, which conducts research to address and advance economic and racial justice.

Samuelson and Chien are faculty co-directors of the Berkeley Center for Law & Technology, the law school’s hub for technology law.

 

About:

“More Just” from UC Berkeley Law is a podcast about how law schools can and must play a role in solving society’s most difficult problems.

The rule of law — and the role of the law — has never been more important. In these difficult times, law schools can, and must, play an active role in finding solutions. But how? Each episode of More Just starts with a problem, then explores potential solutions, featuring Dean Erwin Chemerinsky along with other deans, professors, students, and advocates discussing how they’re making law schools matter.

Have a question about teaching or studying law, or a topic you’d like Dean Chemerinsky to explore? Email us at morejust@berkeley.edu and tell us what’s on your mind.

 

Production by Yellow Armadillo Studios.

 


Episode Transcript

[MUSIC PLAYING] ERWIN CHEMERINSKY: Hello, listeners. I’m Erwin Chemerinsky, Dean of Berkeley Law. And this is More Just, a podcast about how law schools can help solve society’s most difficult problems. Artificial intelligence, or AI, is both one of today’s hottest technologies and also a significant challenge for lawmakers and regulators. As AI-based applications continue to proliferate, where are guardrails needed? And where might a hands-off approach be smarter? And how can legal scholars impact the discourse while teaching the next generation of lawyers about this important innovation?

I’m joined by three terrific scholars for this episode of the podcast. I’m enormously grateful to them for joining us. I’m joined today by Stanford Law Professor Daniel Ho. He’s the William Benjamin Scott and Luna M. Scott Professor of Law at Stanford. He’s a Senior Fellow at the Stanford Institute for Human-Centered Artificial Intelligence and the Stanford Institute for Economic Policy Research. He’s Director of the Regulation, Evaluation, and Governance Lab, the RegLab.

Professor Pamela Samuelson is the Richard M. Sherman Distinguished Professor of Law, and she’s also a Professor at the UC Berkeley School of Information. And Professor Colleen Chien is a Professor at UC Berkeley Law School. She’s the Faculty Advisor to the Criminal Law & Justice Center here. And both Professor Samuelson and Professor Chien are Faculty Co-Directors of the Berkeley Center for Law & Technology, the law school’s hub for technology law. Again, my enormous thanks to each of you for taking time for this conversation.

Colleen, if I could start with you, could you just define what is AI? We hear it mentioned so often, but I don’t hear it defined nearly as much.

COLLEEN CHIEN: So I think of AI as a set of technologies that allow computers to perform tasks that basically mimic human intelligence. So things like seeing and understanding, in the case of processing visual or other information, and recognizing, for example, that that’s a picture of a dog or a cat. Or, in the case of machine vision for autonomous driving, taking data that represents oncoming traffic and taking action based on it.

Other things, like analyzing data and finding patterns or inferences that might not otherwise be obvious, or making recommendations. So, for example, if you like this show or clip or piece of content, you might like this next piece of content. Those are AI-powered algorithms, and they’re a lot of what powers these recommendation engines on Netflix or YouTube or any of the other social media platforms. Across all of these, learning and adapting is what is at the heart of machine learning. As the model is fed new data, better data, it gets better at all these tasks.

The two major types of AI that we focus on are, first, what I’d call good old-fashioned machine learning, which refers to traditional algorithms that primarily analyze existing data to make predictions or classifications, but may do so in a way that discriminates or targets in ways we don’t want it to. But what’s been more revolutionary in the last few years is the rise of generative AI, the application of neural nets to language, using that same basic technique to predict the next word and generate new text. That’s where we’re seeing AI really have a big impact right now.

ERWIN CHEMERINSKY: Let me take an example. Probably everybody listening has heard of ChatGPT. Can you explain it as a form of AI?

COLLEEN CHIEN: Sure. So ChatGPT is probably the most prominent example of what we call a generative pre-trained transformer technology, which basically uses a neural net to take a lot of data, in this case having scoured the entire internet, take all that information, and then use it to make a statistical prediction about what the response to a prompt should be. So when a prompt is entered into ChatGPT, even though it looks like it’s thinking, all ChatGPT is doing is essentially predicting what it thinks the response should be, based on having read, or ingested, the corpus of information from the internet and from these other sources.

And so it is based on this underlying neural network technology, which has been out there for some time. But using the generative pre-trained transformer, the GPT technology, it has been able to get better and better at predicting what the next word in a sequence will be, and to come up with what are quite amazing results in terms of being something that humans can understand and relate to.
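To make the next-word prediction Chien describes concrete, here is a minimal sketch in Python. It is only a toy illustration under simplified assumptions: real systems like ChatGPT use transformer neural networks trained on enormous corpora, while this sketch just counts which word follows which in a single made-up sentence.

```python
# A toy illustration of "predict the next word" from data. This is not how
# ChatGPT is built; transformers learn far richer statistics. The sketch only
# shows the basic loop of turning observed text into next-word probabilities
# and then generating text one predicted word at a time.
from collections import Counter, defaultdict
import random

corpus = "the court held that the statute was valid and the court affirmed".split()

# Count which word follows which (a simple bigram table).
following = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    following[prev][nxt] += 1

def predict_next(word: str) -> str:
    """Sample the next word in proportion to how often it followed `word`."""
    counts = following.get(word)
    if not counts:
        return "<end>"
    words, weights = zip(*counts.items())
    return random.choices(words, weights=weights)[0]

# Generate a short continuation, one predicted word at a time.
current, output = "the", ["the"]
for _ in range(5):
    current = predict_next(current)
    output.append(current)
print(" ".join(output))
```

Swap the bigram table for a neural network estimating those probabilities from billions of documents and you have, in outline, the generation loop behind the tools discussed in this episode.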

ERWIN CHEMERINSKY: Well, let me have Pam take what Colleen has just defined and ask you, what are some of the concerns about managing AI from the policy perspective?

PAMELA SAMUELSON: Well, I think the issues that have been given the most attention so far have been safety issues, to the extent that AI systems are being used, for example, to manage critical infrastructure, where they actually could cause some harms to the infrastructure and to people. And so, the European Union has adopted what they call the AI Act, and it imposes a lot of obligations on developers of artificial intelligence systems based on the amount of risk their systems are understood to pose, and it puts a bunch of responsibilities on them to do safety assessments and reporting to minimize the harms that may come.

Now, that’s only one of dozens of things that people are worried about right now. But it is also what caught the attention of the California legislature this summer, because SB 1047 was legislation aimed at regulating the largest of the frontier models for safety purposes, and it identified a set of critical harms and imposed obligations to assess those risks and to install safety devices. The governor just vetoed it, but that’s been the main concern.

The Colorado legislation actually focused more on discrimination. So one of the things that people are worried about is that some of these AI systems can be used in a manner that discriminates against people in ways that we wouldn’t want to have happen, that could be based on their gender, or based on race, or based on other characteristics. And so, it’s important to understand that that’s our big concern too. And so trying to impose ethical obligations, as well as safety obligations, I think is another big thrust of the policy discussion so far.

ERWIN CHEMERINSKY: Could I ask you, Pam, to give an example of what the safety concerns are, whether in the European Union or what was behind SB 1047?

PAMELA SAMUELSON: Well, risks. I think I mentioned the critical infrastructure. One of the things we have to keep in mind is that critical infrastructure, if it goes wrong, could cause damage, property damage, and it could cost people’s lives. So we want to make sure that safety is a priority.

ERWIN CHEMERINSKY: Let me make sure I understand, when you think of critical infrastructure, would this be like a power plant?

PAMELA SAMUELSON: Yeah.

ERWIN CHEMERINSKY: It would be that AI could direct the power plant in a way that’s unsafe?

PAMELA SAMUELSON: Yes.

ERWIN CHEMERINSKY: And how would the regulation, either from the European Union or SB 1047, have dealt with that?

PAMELA SAMUELSON: Well, one requirement has been to do self-assessments. And another requirement has been to hire third-party independent auditors to come in and do some safety testing to ensure that things are OK, and then to submit compliance reports to government officials if and when government officials think that’s necessary. And those are the main things; the disclosure requirements are, I think, a big part of the perceived solution space.

ERWIN CHEMERINSKY: Well, Dan, I know you’re involved very much with AI governance decisions. Are there other proposals that you’re hearing about in addition to the ones that Pam was just mentioning? Are there others that we should be aware of?

DANIEL HO: Yeah. Well, first of all, thanks so much for having me on this podcast, Erwin. It’s just a delight to be able to chat with you all. Maybe I could actually start right where Pam was leading us. One of the concerns that has really been animating a lot of the national AI policy dialogue has been this concern around forms of catastrophic risk. A number of months ago, actually, a CEO of one of these major tech companies testified in Congress about the potential to use something like ChatGPT, which Colleen described, to actually create, for instance, a bioweapon. That spurred a whole lot of dialogue and actually led to the longest section in the AI Executive Order by the Biden-Harris administration, which was really focused on forms of bio-risk.

And after that, we actually got a series of studies that more or less tended to show that when you compared teams that had access to a large language model or ChatGPT versus folks that just had access to internet sources, precisely because of the training process that Colleen described, the incremental value of the AI system, at least in those studies, was to give you no more than what was widely available on the internet. And as a result, those studies really didn’t corroborate that risk of ChatGPT proliferating bioweapons.

And that, I think, has animated some of the discussions around how we ground better AI policy. We should not live in a world where one CEO can take information really only available to the large developers of these systems and drive so much of the AI policy dialogue. Another example of that is when the OpenAI CEO, Sam Altman, testified in front of Congress. He famously noted that this technology may be so dangerous that we should actually be granted licenses, so that only a small number of players would be able to practice this technology. And, of course, those of us who study the history of regulation can see lots of examples of instances where licenses have actually been used for anti-competitive conduct.

And so maybe the last proposal I’ll note, one that I’m quite fond of, is this: it is a bad state that we live in when a small number of private actors can drive so much of this AI regulatory dialogue. And so, I’ve become really a fan of forms of adverse event reporting, which is what we use in cybersecurity and what we use for adverse drug events, really, to drive down the information asymmetry that might exist between a government and the private parties who, at this point, control and have the most knowledge about the potential risks of this technology.

COLLEEN CHIEN: I would add that there is, I think, a large role for the university and for academics in the capacity that Dan just mentioned. Dan, I’m not sure if you were a part of the more recent initiative, but I know Berkeley folks were, in terms of calling for evidence-based policymaking around AI safety. Because, as Dan mentioned, having the narrative controlled by certain powerful figures, who have obviously strong interests in saying that safety is such a big issue that only the largest players can really handle it well, does not serve the interests of those who are entering and starting up.

But I do think this question of what that would look like, whether it’s adverse event reporting, or studies, or other types of engagement, still needs to be fleshed out quite a bit. But I’m hopeful that there will be a movement within academia to really focus on safety in a very rigorous way.

PAMELA SAMUELSON: Let me actually add one dimension to it. I mean, one of the questions that I think Dan put to us is why there are so few players in the space of these large language models and frontier models. And part of it is that it’s really, really super expensive to do the training of these models. There are various stages to the development process, but the amount of computing power that’s required means that the ordinary person who could start a software company can’t just start doing model development. So that’s an important dimension to it.

A related set of issues has to do with whether models need to be proprietary and closed, or whether they can be open. Many people in the academic and research communities want open models to be available, both because they will be able to do more model development themselves and because you can take an existing model, if it’s open, and do some fine-tuning for a particular type of application. And so, the research community really does have an interest in this open versus closed question. And I know that’s been discussed not only at the state level, but also at the federal level.

DANIEL HO: Can I add something to bring both of these points together? I think Colleen is really rightly stating the important role of academic vetting of what’s happening in this space. And that’s particularly challenging in light of what Pam has noted, which is the greater concentration and the way in which a lot of the research capacity has actually migrated out of the university engineering hallways into the private hallways of these companies, which has made research in this space particularly difficult.

But let me give you one example of why that’s really important, closer maybe to the typical stuff that we teach within law schools, which is the direct assessment of the risks of these kinds of tools when it comes to legal research. Chief Justice Roberts, in his annual report on the state of the judiciary, singled out the problem of hallucinations with these AI tools. We all probably know of what the New York Times dubbed the ChatGPT Lawyer, who submitted a brief with a bunch of bogus cases and citations. And one of the things that’s really striking is that this technology has been taken up quite rapidly by legal technology providers. Think of the Lexises and Westlaws of the world.

And one of the things that’s been really difficult, because of the closed nature of those systems, we don’t know how they’re built or how they’re designed, is how to actually assess claims about the reliability of these systems. So we found in one audit of these tools that, even though the providers advertise hallucination-free citations, the hallucination rate in our benchmark data set is actually between one out of five and one out of three queries.

And what’s the reason for that? The reason is that they had a very narrow definition of what it meant to hallucinate: a citation is hallucination-free as long as the citation exists. And what that meant is, when we queried the system about a completely fictional judge who does not exist, name me some notable opinions by Luther A. Wilgarten, not a judge who ever sat on the federal bench, it very promptly spit out a bunch of opinions that were real opinions, but certainly not written by that judge. And those are the kinds of things that, as we get more transparency and auditing of these kinds of systems, are really important for responsible integration.
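The gap between those two definitions of hallucination is easy to see in code. Below is a minimal sketch of that kind of audit check; the case names, judge, and reference data are hypothetical, and this is an illustration of the idea rather than the actual benchmark the RegLab built.

```python
# A citation can "exist" and still be hallucinated if it is attributed to the
# wrong author. This sketch contrasts the narrow test (does the cited case
# exist?) with a broader one (is the claimed authorship actually correct?).

# Hypothetical reference data: real-looking opinions and their actual authors.
known_opinions = {
    "Smith v. Jones, 123 F.3d 456": "Judge A. Realperson",
    "Doe v. Roe, 789 F.2d 1011": "Judge B. Actualjurist",
}

def audit_citation(cited_case: str, claimed_author: str) -> str:
    """Classify a tool's output under two definitions of hallucination."""
    if cited_case not in known_opinions:
        return "hallucination: citation does not exist"
    if known_opinions[cited_case] != claimed_author:
        # Passes the narrow "citation exists" test, but the attribution is false.
        return "hallucination: real case, wrong judge"
    return "supported"

# A fictional judge should never be credited with real opinions.
print(audit_citation("Smith v. Jones, 123 F.3d 456", "Judge Luther A. Wilgarten"))
```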

ERWIN CHEMERINSKY: Let me ask you each a follow-up question. How worried should we be about AI? Pam, when you talk about infrastructure, I think about power plants. Or, Dan, you talk about AI constructing bioweapons. How much is there a reason to be concerned about AI? And I’d also be interested in each of you saying if there’s a single proposal that you think would be most important to be adopted at this point in time, what would it be? Colleen, can I start with you?

COLLEEN CHIEN: Well, I just want to dovetail a little bit on what Dan just said about hallucinations and the reference to Chief Justice Roberts’ end-of-year letter. I think here is an area where hallucinations are very present. And I do agree with Dan very much that there’s a lot of hyping of AI products. The FTC just went after a company called DoNotPay that alleged it would have the first AI lawyer. So there is this danger of overreliance on the technology. But at the same time, I think there’s also a danger of under-reliance on the technology.

In some work that I’ve done with legal aid lawyers, we gave them tools and did a randomized controlled trial to see how they could navigate the limitations of those tools. Even though people were cognizant that there were challenges, with respect to hallucinations but also privacy, they still found the tools to be incredibly useful. And they found ways to manage the risk.

So rather than relying on ChatGPT or some of the other tools we provided, like CoCounsel, to provide a final answer, a tool could still provide a first draft of research or a first draft of a set of questions. And the efficiency gains were worth it. In many of these cases, most of the people we gave access to the technologies, who didn’t necessarily have it before, indicated that they wanted to continue using them. So I think there is both the risk of overreliance, which is very important to flag, but also a risk of under-reliance, especially if we don’t focus on these use cases where lower cost could really be beneficial and could really help increase access to justice or be otherwise socially beneficial.

ERWIN CHEMERINSKY: If I could also ask you how worried should people be about AI? And if there’s one thing that you could do by way of policy change, what would it be?

COLLEEN CHIEN: I mean, in terms of worries, I think there are a number of them. I think on the legal side, we are more concerned about what could go wrong than what could go right. And so, if you’re asking the question with respect to legal policy and lawmaking, I would say that a lot of the direction the federal government has been going in, I’d like to see that momentum continue. We’re recording this right after the election, and I’d like to see the continued use of existing authorities to address harms. So I would really like to see the momentum of the Biden Executive Order and the different procurement and other policies continue. That’s, I think, the policy I’d like to see continue.

In terms of my greatest concern, I think it really is very domain specific. But I do want to keep an eye on the positive benefits and the beneficial uses of AI, and on thinking about the workforce impacts of them. I think there hasn’t been as much emphasis on that right now. There’s been more concern around, as Pam mentioned, risk, and safety, and discrimination, and some of these other harms that we’re experiencing. But I think we do need to think about the transition in the workforce and think about how, more generally, we live with AI and regulate it in a way that is going to support the development of our workforce harmoniously going into the future.

ERWIN CHEMERINSKY: Thank you. Pam, if I could ask you the same questions about how worried should we be and what would you want us to do now?

PAMELA SAMUELSON: Well, one reason why there’s a lot of worry, and here’s the connection to the open versus closed issue, is that the companies that are developing these models today are not designing them in order to develop bioweapons or other kinds of really harmful things. But if the AI system and its weights are open, it means that somebody who’s a bad guy could get a hold of the model and then use it for that purpose, because the models have so much information.

Like Colleen, I think that there are many, many ways in which AI systems, whether in government or in the private sector, can be very beneficial, providing more services to the public, for example, or getting faster feedback and the like. But this open versus closed issue is one of the big policy areas, and the National Telecommunications and Information Administration was tasked with writing a report about open versus closed models and the risks there. And they’ve decided that there isn’t enough evidence right now to say that models shouldn’t be open, but that it’s important to continue to watch what’s happening in the industry.

Because part of what makes regulation right now difficult is that the industry is in early stages and things evolve very rapidly in this space. And so, I think the Biden administration’s policy has been to try to understand a range of possible risks and to then keep an eye on what’s going on. And then if need be, to regulate as harmful things happen. So, that’s not one thing. But I think that flexible, adaptable, let’s watch what’s going on, and then respond appropriately, that seems to me to be a pretty good way of going.

ERWIN CHEMERINSKY: Very helpful. And, Dan, your answer to the questions that I posed to Colleen and Pam, how worried should people be and what would you most want to see us do now?

DANIEL HO: Well, I will second both Pam and Colleen’s notion that we shouldn’t be exclusively focused on risk, because there are also tremendous benefits. In one example, we did a collaboration with the Santa Clara County Recorder to help them use a language model to sift through five million deed records to identify racially restrictive covenants, something that would have taken them decades to do through manual review and that they were required to do under a California law.
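As a rough picture of what that kind of document triage involves, here is a minimal Python sketch that flags deed text for human review. The phrases and records are made up, and the real Santa Clara County project used a language model rather than keyword matching; the sketch only shows the screening idea of narrowing millions of records down to the few a person actually needs to read.

```python
# A toy screening pass over deed records: route anything that looks like it may
# contain restrictive covenant language to a human reviewer. Hypothetical
# phrases and records; a language model would catch wording a fixed list misses.

SUSPECT_PHRASES = [
    "shall not be sold to",
    "shall not be occupied by any person other than",
    "no person of",
]

def needs_review(deed_text: str) -> bool:
    """Return True if the deed text matches any covenant-style phrase."""
    lowered = deed_text.lower()
    return any(phrase in lowered for phrase in SUSPECT_PHRASES)

records = [
    "The premises shall not be sold to or occupied by persons other than ...",
    "Standard utility easement along the northern boundary of the parcel.",
]

flagged = [text for text in records if needs_review(text)]
print(f"{len(flagged)} of {len(records)} records flagged for human review")
```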

As to risks, I think, as Pam noted, there is such uncertainty around both the general risk and the relative risk of open versus closed. And cybersecurity is a really good example, where large language models might, if openly available, make it easier for adversarial actors to identify vulnerabilities. But the net effect on what cybersecurity experts would call the offense-defense balance is actually unclear, in the sense that what we don’t have enough information about is the use of large language models to actually patch vulnerabilities, which is a real rate-limiting constraint for most companies. You get way more bug reports than you can possibly handle. And to the extent that these models can disproportionately help with the latter, the open availability of models may actually tilt that offense-defense balance in favor of defense.

Which I think is what leads me to a basic answer to your question as to what I most favor: we just know so little about each of these different threat vectors. We’re having one national conversation, but people are simultaneously talking about the potential for bias, labor displacement, catastrophic risk, environmental harms, economic concentration, misinformation. And that is why, in my mind, one of the things that really has to happen is what I mentioned earlier, a form of adverse event reporting, so that we can actually have a better knowledge and evidence base about the kinds of materialized harms that are occurring out in the real world.

ERWIN CHEMERINSKY: Thank you. I’d really like to hear what each of you is doing and working on right now. Colleen, for example, I know that some of your work is focused on the potential benefits of AI for people who need legal advice and representation. Could you talk a bit about that?

COLLEEN CHIEN: Sure, I can give you two examples. One drawn from my work in the criminal justice system and one in the patent system more briefly. And I think the key for me here is that while AI is part of the solution, it’s more important to focus on the legal problem and the legal promise that remains unmet and how it can be addressed not only with AI, but also with data and automation.

So in the criminal justice realm, one in three adults has a criminal record, leading to diminished opportunities, and many could get these records expunged, but they don’t because of challenges in the legal process. So this leads to a big gap between eligibility and delivery of relief, what I call the second chance gap. Legal tech that leverages AI and data has been able to serve a lot more people than the standard petition-based process.

One app in Utah, called Rasa, served about 10,000 people in its first year, as compared to 400 people who were able to get relief through legal aid. So this is an example where a tool can really help close the access to justice gap, or at least address it. But I think the more powerful intervention, which students and I, as part of something called the Paper Prisons Initiative, as well as others in the Clean Slate movement, have pushed for, is to fundamentally change the way that expungement is done and make it automated, meaning that states will automatically implement the expungements rather than individuals having to file petitions.

So we’ve started seeing 12 states pass Clean Slate laws, and here it’s data and automation, as well as data-informed advocacy, not AI per se, that’s changing the game, though AI can be part of the process. So I think for AI to fundamentally address these justice gaps is going to require these kinds of paradigm changes, not just doing the same things incrementally more efficiently, but thinking about a world in which legal tech used by consumers, and also by the courts, plays a much bigger role. And in the former case, rethinking laws around things like the unauthorized practice of law is, I think, going to be required.
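The automation Chien describes is, at bottom, an eligibility rules engine run over criminal history data in bulk rather than a petition-by-petition process. Here is a minimal sketch with entirely hypothetical criteria; real Clean Slate rules vary by state and are much more detailed.

```python
# A toy automated-expungement screen: apply simple eligibility rules to records
# in bulk, instead of waiting for each person to file a petition.
# The criteria below (offense level, years since completion) are hypothetical.
from dataclasses import dataclass
from datetime import date

@dataclass
class Record:
    person_id: str
    offense_level: str          # e.g. "misdemeanor" or "felony"
    sentence_completed: date
    new_convictions_since: int

def eligible(rec: Record, today: date = date(2024, 11, 15)) -> bool:
    """Hypothetical rule: misdemeanors with 5+ clean years since completion."""
    years_clear = (today - rec.sentence_completed).days / 365.25
    return (
        rec.offense_level == "misdemeanor"
        and years_clear >= 5
        and rec.new_convictions_since == 0
    )

records = [
    Record("A-001", "misdemeanor", date(2015, 3, 1), 0),
    Record("A-002", "felony", date(2010, 6, 1), 0),
    Record("A-003", "misdemeanor", date(2022, 1, 1), 0),
]

for rec in records:
    print(rec.person_id, "auto-expunge" if eligible(rec) else "not eligible under this rule")
```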

On the patenting side, we know that patenting catalyzes economic opportunity, hiring, and funding for young and new companies. But again, getting that patent can be very expensive and out of reach. So I’ve talked about how I think providing AI-assisted patent quality tools can increase equity and quality by reducing the cost to the system. So those are just a couple of examples.

ERWIN CHEMERINSKY: Great examples, thank you. Pam, you’re one of the leading experts in the world with regard to copyright law and intellectual property. How is AI related to the work that you’re doing now in those fields?

PAMELA SAMUELSON: Well, I’ve been pretty busy in the last couple of years giving talks and writing about the generative AI copyright lawsuits. The biggest issue in those cases is whether or not making copies of in-copyright works, let’s say ones that are found on the internet, and using them as training data is copyright infringement. There are now 30 lawsuits pending in federal courts, and they’re all in pretty early stages, but more than a dozen of them are class action lawsuits.

There are also lawsuits involving the New York Times and a number of other companies, and pretty much every one of the big generative AI companies has been sued. So it’s not just OpenAI, although OpenAI has definitely had more lawsuits against it than any of the other defendants. I wrote an article in Science magazine trying to explain what the lawsuits are about, what legal precedents there are, and what the arguments are for and against. And again, these cases are in early stages. But that’s the $64 billion question when it comes to the generative AI companies.

And so, I’ve written law review articles, the Science piece, and pieces for computing professionals, because people in the computing field tend to ask, why are these lawsuits happening? They can’t really believe that there’s some sort of problem here. So I have to very patiently go through here’s what copyright law is, here’s why there’s a plausible argument that this is infringement, and here’s the plausible argument that it is fair use. And fair use is a limitation on the scope of copyrights. So I could go on, but I think that’s the thing I’m working on the most.

ERWIN CHEMERINSKY: You give a great description of it. Have you taken an overall position or does it all just depend on the context and circumstances?

PAMELA SAMUELSON: Well, I think there are some lawsuits that have a greater chance of success than others. I think that some of the lawsuits may well settle. For example, Getty Images sued a company called Stability. And Getty Images says, “Hey, I have 12 million images on my sites, and you basically took them, Stability, to train the Stable Diffusion model. And I think that actually you should have gotten a license from me.” So one of the big considerations in fair use cases is, what’s the effect of the challenged use on the market for and value of the copyrighted work?

And so, if Getty actually has a licensing program for training data, that makes the market harm argument pretty strong. Whereas some of the class action lawsuits are ones in which people say, “Hey, you didn’t compensate me, and you didn’t give me any credit, or anything.” But there’s no place you could go to get a license from all the book authors in the world. You just can’t do that. So the chance of some sort of market harm there is different. And again, part of the thing is that there are cases involving visual art. There are cases involving books. There are cases involving software. There are cases about recorded music. So you can make these models based on lots of different kinds of data. And everybody seems to want to have a lawsuit on these issues.

ERWIN CHEMERINSKY: Dan, let me ask you, the RegLab that you direct is partnering with some government agencies to do AI demonstration projects. What are some examples? What are you learning from this?

DANIEL HO: Thanks for the question, Erwin. We founded the RegLab really in recognition of the really significant talent gap that exists between government and universities, for instance, where a lot of basic AI research is happening. Just one statistic on that front, based on the most recent AI Index: less than 1% of AI PhDs go into government, around 60% of them go into industry, and about a quarter go into the academy.

And government is not going to be able to get these questions of AI governance right if government cannot understand AI. And so we engage in a range of collaborations, including with the US Department of Labor and Santa Clara County, as I mentioned. But I’ll give you one example, which is with the IRS, a project that we started a number of years ago, really around exploring the use of machine learning to cut down on the really significant tax gap, the difference between taxes owed and taxes paid, which is annually around $500 billion.

And we built out a set of prototypes of how to really modernize the detection of forms of tax evasion. But at the same time, everyone was mindful of the potential risk that AI systems might exacerbate disparities across different types of taxpayers. And so, we built out a framework for how the IRS could actually monitor for the emergence of those disparities. And that turned out to be quite difficult to do. We spent over a year doing this, because the IRS doesn’t collect basic information about demographics, like race and ethnicity, on the 1040. And, in fact, it’s statutorily prohibited from just going directly to the census to be able to link those kinds of demographic attributes.

And what we then discovered was actually that it was in the legacy systems, not the new AI systems, where there was a really disturbing racial disparity: Black taxpayers were audited at about three to five times the rate of non-Black taxpayers, not explained by differences in underreporting, but really having to do with the auditing around the earned income tax credit and eligibility for that tax credit, which is claimed by lower-income taxpayers with dependents, over 20 million annually.

We’re really pleased that when that study was released, the IRS commissioner was asked multiple times about it in his confirmation hearing, and last year announced an overhaul of how the IRS was going to audit around these EITC issues. And really, that leads me to two broader points. One is that we often have anxiety around the risks of AI systems themselves. In this instance, it was actually the prototyping of AI systems that led to a form of transparency that we didn’t have before around legacy systems.

And the second is that there are really important questions that we as a legal community still really need to think about in terms of modernizing our civil rights laws to address forms of algorithmic decision-making. Because so much of the premise has been around notions of intentional discrimination and the like, and how do we think about that as these systems start to displace human judgment.
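The heart of the disparity monitoring Ho describes above is a simple comparison: audit rates across demographic groups that the agency cannot observe directly and has to impute. Here is a minimal sketch of that comparison with made-up data and generic group labels; the actual IRS study involved millions of returns and far more careful statistical imputation of race and ethnicity.

```python
# A toy disparity check: compare audit rates across (imputed) demographic
# groups and report the rate ratio. The records are made up; in practice the
# hard part, as Ho notes, is imputing the groups at all, since race and
# ethnicity are not collected on the 1040.
from collections import defaultdict

# (imputed_group, was_audited) pairs for a handful of hypothetical taxpayers.
records = [
    ("group_a", True), ("group_a", False), ("group_a", False),
    ("group_b", True), ("group_b", False), ("group_b", False),
    ("group_b", False), ("group_b", False), ("group_b", False),
]

audits = defaultdict(int)
totals = defaultdict(int)
for group, was_audited in records:
    totals[group] += 1
    audits[group] += int(was_audited)

rates = {group: audits[group] / totals[group] for group in totals}
for group, rate in rates.items():
    print(f"{group}: audit rate {rate:.1%}")

# The statistic of interest is the ratio of audit rates between groups.
print("rate ratio:", round(rates["group_a"] / rates["group_b"], 2))
```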

ERWIN CHEMERINSKY: To wrap up, let me ask you each to address what’s coming next and what should law schools be doing about this?

COLLEEN CHIEN: Well, I’ll focus on the law school question, because I’ve been focused on putting together our AI Law and Governance class, and also our new AI Law and Regulation certificate. So we’re seeing just a lot of demand from students who want to know about AI, how they can practice AI law, and how they can be ready for it when they leave. And I think there are a couple of things that are important to emphasize there.

And one is that AI regulation at this stage is about, as Dan mentioned, governance and not just about law. The data you choose for your model, the assessments you do before you deploy your system, then the continuous monitoring after the fact. Lawyers have a very important role to play in managing that risk and ensuring that responsible AI principles, like transparency, equity, and fairness are embedded into this pipeline. But being able to be flexible, be able to come up to speed on technology and the trade-offs is going to be really important. And so, ensuring that students have that fluency and comfort with technology, as well as engaging in an area where we don’t have a lot of settled law, is really important.

I think it’s also good, though, for students to understand that the policy and regulatory space is very much in flux. But the core first principles and frameworks afforded by the law of consumer protection, anti-discrimination law, privacy, and due process are still very much essential, so that it’s less about committing to memory any single new law and more about being able to recognize the issues when they arise and to call upon the existing bodies of law that can be relevant. That’s what I’ve been thinking about in teaching and in designing this new program that we’re offering.

ERWIN CHEMERINSKY: That’s terrific, thank you. Pam, what’s on the horizon? What should the law schools be doing?

PAMELA SAMUELSON: Well, one thing that’s on the horizon is these lawsuits against the big developers. ROSS Intelligence used some thousands of Westlaw headnotes to try to train its AI system, and Thomson Reuters, which happens to own Westlaw, has sued for copyright infringement. The whole case is about training data. And that case got started before anybody we know knew anything about generative AI. So it’s been pending for, I think, three years or so. But it is now scheduled to go to trial sometime in the early part of 2025. And so, we’ll find out pretty soon, at least, what the initial response from the law is.

There are also some judges in the Northern District of California who seem to want to push the cases along a little bit faster, so that they can get to summary judgment and rule on whether the training data issue is infringement or fair use. And so, I think in the next year, we’re going to see some things.

Now, this makes it exciting for my students, because there’s a lot of interest in it. And one of the things that I do when I’m trying to teach is to tell the students, look, I don’t have a technical background myself. But if you are going to practice in the area of AI law or just technology law generally, you really have to try to learn to understand the technology to some degree, so that you can think about the legal issues in a really careful way.

As hard as it has been for courts to understand software, this is another level and level and level of complexity. And the way that these things are trained, how they operate, what a model really is is just a mystery. But we’re all going to have to get smarter together. And so, it’s a real lesson that you really have to spend the time to learn something about the technology, because you can’t regulate something well, if you don’t understand it.

ERWIN CHEMERINSKY: Thank you. And, Dan, what do you see on the horizon, and what should law schools be doing?

DANIEL HO: Yeah, I’ll give you three things that I think law school should be doing. One is really building right off of what Pam said, which is– I’ve been in so many of these conversations where technologists propose things that would be flatly illegal or policymakers propose things that are completely out of the horizon of what is even technologically feasible these days. And so, what we as law schools have to be doing is we have to be convening and training our students to work together with engineers and technologists, really to get these issues right.

And let me give you just one quick anecdote on this, which is a faculty lunch attended by two of my colleagues, one an international human rights lawyer and the other an IP scholar. They had several minutes of discussion about pirates where they started to realize that they just weren’t seeing eye to eye on the issues, only to realize minutes into the discussion that the international human rights lawyer was talking about Somali pirates and the IP lawyer was talking about software pirates. And we have too many rooms where that kind of miscommunication is happening, particularly across the law and technology spectrum.

Two other quick things that I think are really important for law schools and universities more broadly. One is that one of the really big structural challenges is that, as so much of the locus of AI research has migrated out of universities into the private sector, there is a real reflection happening around the future of the research mission of universities and their role in this ecosystem. And I’m partial to a proposal that came out of Stanford’s Institute for Human-Centered AI for a public option for AI called the National AI Research Resource, which has been proposed under the CREATE AI Act, really to provide the data and computing infrastructure so that it’s not exclusively a small set of large, well-funded companies that can understand what’s happening with these models.

And then the last is really similar to the kind of work that Colleen has articulated and that we here at the RegLab engage in, which is that one way to really drive down that information asymmetry is to build collaborative partnerships between academics and government agencies. I’m really drawn to a historical example where, in 1946, a lawyer wrote a three-page memorandum that became the framework by which every VA hospital could partner with an academic medical school. That means that annually there are 20,000 medical students and 40,000 medical residents who rotate through the VA.

And if we had a similar kind of ecosystem for academic-agency collaborations around law and technology, imagine how transformative that could be for modernizing some of these government systems that in many ways have not been updated for many, many decades.

ERWIN CHEMERINSKY: Thank you all so much for this wonderful conversation. Alas, we’re out of time now. I’ve been talking with and am very grateful to Professor Daniel Ho from Stanford Law School, Professor Pamela Samuelson from Berkeley Law, and Colleen Chien, Professor at Berkeley Law.

Thank you all for listening today. Listeners, I hope you enjoyed this episode of More Just. And be sure to subscribe wherever you get your podcasts. If you have a question about the law or a topic you’d like us to cover, send an email to morejust@berkeley.edu, just one word, and share your thoughts. Until next time, I’m Berkeley Law Dean Erwin Chemerinsky.

[MUSIC PLAYING]