I’m a denizen of many fair corners of the internet, most notably the Astral Codex Ten substack and its associated Discord (seriously, drop on by). Much to my chagrin, a perennial topic there is the feasibility of replacing large portions of the law with code, and large portions of the bench and bar with AI. In practice, this is surely doable (indeed, as a good materialist, I think it’s feasible to replace practically anything with AI). However, these conversations seem to largely miss the point of why we have a legal system in the first place.
Tonight, for your reading pleasure, it is my intent to pose legal problems to an OpenAI text-generation system and assess its answers. My qualifications? In addition to being a practicing attorney, I worked for several years as the guy who designed legal writing problems for a T20 law school, so I very much pride myself on the ability to come up with legal hypos, and assess their answers. A note before we begin: while I’m dimly aware of how these engines function, this post will primarily be about how the law works rather than how OpenAI works. To that end, I will in many cases be anthropomorphizing how this thing operates…indeed, I will be assessing its answers more or less how one would assess the answers of a law student or a law clerk. Ready? Let’s begin with an easy question.
PROMPT #1: Officer Frank is a police officer patrolling a residential area in Dane County, Wisconsin for low-level traffic violations. He spots Chester, an orthodox Jewish man wearing a yarmulke, roll through a stop sign without stopping. Officer Frank activates his lights and pulls Chester over. Chester complies. Officer Frank steps out of the car and approaches Chester's car, noticing as he approaches that the back seat is full of bags of marijuana. Chester is charged with possession of marijuana. Before trial, Chester's lawyer attempts to suppress the drugs found in Chester's car, arguing that they are the fruits of an illegal search because Officer Frank is anti-Semitic and only pulled Chester over because he was Jewish. What result?
ANSWER: The court would likely deny the motion to suppress the drugs because the officer had reasonable suspicion to pull the driver over for a traffic violation. Those reasonable grounds for the stop existed independently of any potential bias or discriminatory intent, and the officer had a legal right to stop the driver for not coming to a complete stop at a stop sign. Therefore, the drugs found in the car are unlikely to be suppressed, and Chester will still be charged with possession of marijuana.
In this case, the AI was 100%, spot-on correct (if you re-try the question, it will occasionally fail). There are little wiggly exceptions (what about Chester’s rights under the Wisconsin constitution?) but those are ubiquitous in law. For the most part, the AI was able to spot the issue and regurgitate the correct answer, slotting the relevant facts of the hypo into the right places: Officer Frank’s subjective motivations are irrelevant if he had an objective basis to stop the car. For a case on this general proposition you can see Whren v. United States, 517 U.S. 806 (1996). (Indeed, if you run this prompt many times, the system is occasionally smart enough to cite Whren.) In giving this answer, the AI approximates the abilities of a new law student. This is unsurprising: OpenAI’s training data seems to contain a considerable amount of basic legal writing, and this question concerns quite basic law (you are almost certain to see some variation of this question on a multistate bar exam).
As another example, let’s consider a very basic crim-law problem:
PROMPT #2: Oliver and Wallace conspire to rob a bank in Omaha, Nebraska. Oliver agrees to act as a lookout, while Wallace enters the bank with a loaded shotgun. Once inside the bank, Wallace threatens several tellers with the shotgun, and obtains a bag full of money. On the way out, however, a security guard reaches for his gun, and Wallace fires a shotgun blast into his chest, killing him instantly. Oliver and Wallace are both caught and charged with murder. Oliver's attorney files a motion to dismiss shortly after the case begins, arguing that the State lacks probable cause to prove Oliver's intent to commit murder. Will the motion succeed?
ANSWER: No, the motion is unlikely to succeed. Intent to commit murder is not necessary for Oliver to be found guilty of the crime. In most jurisdictions, Oliver can be found guilty of felony murder if he was part of the conspiracy to rob the bank. This means that Oliver can be held liable for the security guard's death, even if he did not intend to kill him. Even if the motion is successful, Oliver could still face other charges related to the robbery, such as armed robbery, conspiracy to commit robbery, and accessory to murder.
Again, this is substantially correct. The AI recognizes a fact-pattern matching the well-known “felony murder” rule present in most jurisdictions. It’s smart enough to know that Oliver’s lack of intent to kill is irrelevant in most jurisdictions, so long as he agrees to aid in a course of conduct he knows is likely to result in a killing by another. This is, again, a very common fact pattern, and another problem you’re very likely to encounter on a bar exam.
It’s worth noting that if you run iterations of both these fact-patterns, the AI will occasionally get both questions wrong, but for the most part, it will answer both questions correctly. Let’s stop for a minute and consider that this is impressive. The AI can pattern-match a hypothetical to a set of facts. In fact, I’ll go further. I’ve been running tests like this for a week. With most basic fact-patterns of the sort one is likely to see on a bar exam, without overmuch complexity, the AI performs about as well as a new law student would. Fascinating.
Okay, enough of that. I know you’re all here for another reason. Let’s watch it screw up.
PROMPT #3: The Galton hotel is a large hotel in Arizona. It is an expensive hotel and has very strict guest policies. Guests are allowed in the hotel from 10 AM until 8 PM, under no circumstances are guests allowed to sleep in a customer's room. Guests are also required to sign the hotel guestbook. Laura is a girl who is invited to a party at the Galton hotel by a guest, she enters the hotel by a back entrance at 7 PM and does not sign the hotel guestbook. She attends the party, and falls asleep in a customer's room around 11 PM. Around 1 AM, a cook in the hotel kitchen leaves a gas stove on, starting a fire which burns the hotel down, killing Laura. Laura's husband sues the Hotel. Will he recover?
ANSWER: No, he will not recover. Although the hotel was negligent in leaving the gas stove on, Laura was in violation of the hotel's guest policy by entering through a back entrance and not signing the guestbook. Additionally, she was in violation of the hotel's policy by sleeping in a customer's room. These violations absolve the hotel of any liability for her death in the fire.
This is superficially an impressive answer. The AI correctly identifies that the error involving the gas stove may be negligence on the hotel’s part (it does this about half the time). The AI also notices that Laura violated the hotel’s rules. Is this enough to “absolve the hotel of any liability for her death in the fire”? It turns out there’s a very straightforward answer to this problem. It turns on a single word in the hypo. Try to guess what it is.
***
Do you have your answer firmly in mind?
***
The Galton hotel is a large hotel in Arizona.
Per the name of this blog, the common law is not “some brooding omnipresence in the sky” but rather “the identifiable voice of some sovereign.” In this case, we’re dealing with a premises liability problem, and the relevant sovereign is Arizona. What does Arizona think about these facts?
These facts are based on a real case, one ordinarily not taught in law school, and indeed one that has a holding rather different from the remedial premises liability law taught in most introductory torts classes: Woodty v. Lamplighter Motels acknowledges this is a close question, but ultimately sides with “Laura’s” husband in interpreting the common law, at least as it pertains to hotels. The court’s conclusion is long, and highly nuanced, drawing on deep wells of history and common sense. One can imagine an equally erudite decision going the other way, but importantly for this problem: the Arizona court gets to decide what the relevant rule is in Arizona.
Now, to reiterate my twin objectives in writing this very long post: this says absolutely nothing about the machine’s inherent capabilities. Many law students would make the same mistake as our hapless robo-clerk. (One would hope competent attorneys would look at the law of the relevant jurisdiction, but who knows). Conversely, there’s no inherent reason why a future version of OpenAI could not be trained on caselaw…while this version appears to be trained largely on bar review essays.
Can I point to any inherent limitations in AI? Some legal problem a lawyer could answer but a sufficiently advanced text-prediction AI can’t? Indeed I can try, and hopefully this will illustrate at least some of the difficulty in replacing lawyers with AI, at least in the very short term…it will also stand as a demonstration of the overriding theme of this blog…in fact, the machine will make the same mistake as the last problem. Let’s take a look at one of Theo’s areas of legal expertise: white collar crime.
PROMPT #4: Paulina is a farmer from Dallas, Texas who intends to take out a bank loan. She visits the Drayton Bank in downtown Dallas, fills out the relevant application, and has a long interview with the bank manager. In both her application and her interview, she states that she owns 50 acres of farmland in the Oklahoma Panhandle, which she puts up as collateral for the loan. The bank later discovers that she does not own the farmland...in fact it is owned by Paulina's mother, who has disowned Paulina and forbidden her from even setting foot on the land. Paulina is charged with a violation of 18 USC 1014. What is her best defense to the charge?
ANSWER: Paulina's best defense to the charge is that she did not knowingly misrepresent her ownership of the farmland. She could argue that she believed in good faith that she owned the land, and had no intention of deceiving or defrauding Drayton Bank when applying for a loan. If Paulina can provide evidence of her belief (for example, if there were documents suggesting a transfer of ownership from her mother to herself) then this may be sufficient to prove lack of intent on her part.
To be fair, this is a tough one. Often, we in the legal profession are put in the position of making unwinnable arguments, for the sake of our clients. In this case, the AI has tried its little heart out to marshal a defense that, I think we can all see, will likely not work…unless Paulina can convince a jury she believed her mother had given her the land, but somehow forbidden her to set foot on it. Sometimes, especially in a criminal context, a lawyer has to advance arguments with about as much chance of success…but this is not one of those times.
The AI correctly recognizes that 18 USC 1014 is a statute that prohibits false statements, and that a defense to 18 USC 1014 is good faith, or a lack of knowing misrepresentations. Impressive again…but there’s an easier answer.
You know there’s a trick. Try to guess.
***
Do you have your answer firmly in mind?
***
I’m not even going to answer the question. I’m gonna have my AI Clerk do it, with some minor adjustments to the prompt:
PROMPT #5: Paulina is a farmer from Dallas, Texas who intends to take out a bank loan. She visits the Drayton Bank in downtown Dallas, **which is a small community bank that is not FDIC insured**. At the bank, she fills out the relevant application, and has a long interview with the bank manager. In both her application and her interview, she states that she owns 50 acres of farmland in the Oklahoma Panhandle, which she puts up as collateral for the loan. The bank later discovers that she does not own the farmland...in fact it is owned by Paulina's mother, who has disowned Paulina and forbidden her from even setting foot on the land. Paulina is charged with a violation of 18 USC 1014, **a federal crime which punishes making false statements to "any institution the accounts of which are insured by the Federal Deposit Insurance Corporation."** What is her best defense to the charge?
ANSWER: Paulina's best defense to the charge is that she did not make a false statement to an institution insured by the Federal Deposit Insurance Corporation. Because Drayton Bank is not FDIC insured, it does not fall under 18 USC 1014 and Paulina cannot be charged with this federal crime.
In my experimentation with the prompt, I think you need both bolded lines to smack the AI into getting the right answer…but do that, and it wins in a walk. Did I cheat? Did I withhold important information? The answer is yes, but as any practicing lawyer will tell you, real life cheats all the time (American law is an adversarial process, after all).
My point in this example is not to demonstrate some inherent failure in machines generally…but in this machine specifically. INPUTS MATTER IN THE LAW, and very small variations in the underlying facts can make an enormous difference in the outcome of a legal problem. A good lawyer recognizes which inputs are critical: many lawyers who are not versed in federal white collar law could have nevertheless guessed a potential defense to Paulina’s situation by applying the same lesson we learned in Problem 3: “The…law is the voice of some sovereign.” In this case we’re dealing with a federal law, which is not limitless in scope and jurisdiction. Instead, to conform to Article I, Section 8 of the US Constitution, a law passed by Congress must emanate from one of its duly-designated powers: Congress has done so here by making the law apply ONLY to those banks which are FDIC insured (and a bunch of other stuff, you can try to make sense of this law here if you have the attention span, many do not). Lest you think this is some silly hypothetical, kindly consult Jones v. United States, 529 U.S. 848 (2000), in which a dead-to-rights arson case was won on the theory that the federal arson statute did not apply to the burning of a residential property, for much the same reason.
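For the programmers in the audience, the order of operations in Paulina’s problem can be sketched as a toy Python function. This is only an illustration under loud assumptions: the `LoanFacts` fields and the `best_defense` function are names I made up, and the logic is a drastic simplification of 18 USC 1014, which covers far more than FDIC-insured banks. The only point is that the threshold question (does the statute reach this lender at all?) gets asked before anyone argues about Paulina’s state of mind.

```python
from dataclasses import dataclass


@dataclass
class LoanFacts:
    """Hypothetical, heavily simplified inputs for the Paulina problem."""
    lender_fdic_insured: bool        # threshold fact: does the statute reach this lender?
    statement_was_false: bool        # did she actually misstate ownership?
    speaker_knew_it_was_false: bool  # the "knowingly" element


def best_defense(facts: LoanFacts) -> str:
    """Toy ordering of defenses; NOT a statement of what the statute requires."""
    # Ask the threshold question first. If the lender is outside the
    # statute's reach, nothing about Paulina's state of mind matters.
    if not facts.lender_fdic_insured:
        return "statute does not reach this lender"
    if not facts.statement_was_false:
        return "statement was true"
    if not facts.speaker_knew_it_was_false:
        return "good faith"
    return "no obvious defense"


# Prompt 5: the bank is not FDIC insured, so the easy answer comes first.
print(best_defense(LoanFacts(False, True, True)))
# Prompt 4 as the AI read it: the lender is assumed covered, so only good faith is left.
print(best_defense(LoanFacts(True, True, False)))
```

Notice that the function is only as good as the `LoanFacts` it is handed. If nobody thinks to fill in the first field, the easy answer never comes up.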
Let’s look at the same circumstances of Prompt 1, entered *without* the expertise of a lawyer. Let’s make it a lawyer/client interaction, with OpenAI playing the lawyer:
PROMPT #6: The police found drugs in my car and they charged me with drug possession, can I win the case?
ANSWER: It depends on the specifics of your case. You may be able to win the case if there is an issue with the police's search and seizure of the drugs or if there is evidence that the drugs were not yours. You should consult with an experienced criminal defense attorney to discuss the specific facts of your case and determine the best course of action.
PROMPT #7: oh there's an issue with the search. They didn't have a warrant and the cop was anti-semetic and I'm jewish. Can I win the case?
ANSWER: It is possible that you could win the case if you can prove that the police officer targeted you based on your religion. You should contact a criminal defense attorney to discuss the specifics of your case and review your options.
Even in a future age where we have machines that can infallibly apply the law to a set of facts, we will need some system to input the relevant facts. Could this be done by machines as well? Of course, but we’ll have to agree on which facts are relevant, which is difficult indeed.
What happens when I give the AI a truly impossible legal problem? Let’s try one of my favorites, which I’ve used to vet actual clerks.
PROMPT #8: Leonard Parker is an aide to the governor of California, who is enmeshed in a complex bribery investigation involving gratuities paid to the governor by a wealthy donor. The investigation began as a State investigation, but the state of California decided not to press charges. The federal government stepped in shortly afterward, and launched their own investigation. As part of this investigation, Parker agrees to meet with two federal agents, Agent Kobayashi and Agent Maru, in a park in San Francisco. En Route to the meeting, Parker receives a call from a friend in the Federal government. In the course of this call, Parker learns that these agents are in the process of applying for a warrant to search his cell phone, and the cell phones of other aides of the governor. Parker pulls over, and attempts to hurl his phone into the San Francisco Bay. Unbeknownst to him, the two federal agents he was driving to meet were actually tailing him, and witnesses this attempt to destroy the phone. Worse yet: the phone does not fall into the bay, but falls into the sand. Agents Kobayashi and Maru recover the phone and immediately search it, finding many compromising text communications. What is the best argument to suppress the texts?
ANSWER: The best argument to suppress the texts recovered from Leonard Parker's cell phone is that they were obtained in violation of his Fourth Amendment rights. Specifically, it can be argued that the search and seizure of his phone was conducted without a valid warrant or probable cause, and thus violated his right against unreasonable searches and seizures. Additionally, since Parker had already received notice from a friend in the Federal government about their intention to apply for a warrant prior to attempting to discard/destroy it, he could argue that this constituted an illegal act of anticipatory search and seizure as well. As such, any evidence found on or resulting from such an illegitimate search should not be admissible at trial.
To my knowledge, there is no solidly winning argument here (I welcome suggestions). Parker is hopelessly, hopelessly screwed. Can the AI figure out why?
PROMPT #9: What is the best counterargument?
ANSWER: The best counterargument is that the agents had reasonable suspicion to believe that Parker was attempting to destroy evidence, and thus were justified in conducting a warrantless search of his phone. Furthermore, it could also be argued that even if there was an illegal anticipatory search and seizure, any resulting evidence should still be admissible under the inevitable discovery doctrine. This doctrine holds that when law enforcement obtains illegally seized evidence but can prove with reasonable certainty they would have eventually obtained it lawfully anyway (such as through obtaining a valid warrant), then such evidence is not excluded from trial on Fourth Amendment grounds.
PROMPT #10: What is another good counterargument?
ANSWER: Another good counterargument is that Parker voluntarily discarded his phone in a public place, and thus had no reasonable expectation of privacy. It can be argued that by discarding the phone in a public space, he forfeited any right to privacy which would have otherwise been protected by the Fourth Amendment. As such, law enforcement was not required to obtain a warrant prior to searching it and any evidence recovered should be admissible at trial.
PROMPT #11: What are good counterarguments to those counterarguments?
ANSWER: To the counterargument that Parker had no reasonable expectation of privacy, it can be argued that he was not discarding his phone in a public place but rather attempting to destroy evidence by throwing it into the San Francisco Bay. Thus, despite being in a public space, his actions still indicated an expectation of privacy which should have been protected under the Fourth Amendment. To the counterargument regarding inevitable discovery doctrine, it can be argued that law enforcement did not have sufficient probable cause or other justification for obtaining a valid warrant prior to their search and seizure of Parker's phone. Without such justification they could never demonstrate with reasonable certainty they would eventually obtain any incriminating evidence lawfully and thus this exception does not apply here.
These are…about the best answers Parker can hope for. Wow. A competent law student could probably articulate these answers with a little more precision (I don’t see the word “abandoned” in these answers once), but thus far, the machine’s doing a great job (and please note, this is a rather novel fact pattern of my own devising).
PROMPT #12: What are the most important cases for this legal issue?
ANSWER: The most important cases for this legal issue are Kyllo v. United States (2001) and Arizona v. Gant (2009). In Kyllo, the Supreme Court held that using a thermal imaging device to scan a home constituted an unreasonable search in violation of the Fourth Amendment, ruling that warrantless searches must be supported by probable cause. In Arizona v. Gant, the court held that law enforcement may not conduct searches of vehicles incident to arrest unless they have either probable cause or reasonable belief there is evidence related to crime inside; otherwise any resulting evidence is deemed inadmissible at trial on Fourth Amendment grounds due to illegal search and seizure practices employed by officers.
Okay, you’re fired. I’ll stick with the human clerks for now.
Let’s end with a final prompt, which will be a good segue into my next post: a sweeping attempt to explain most of the basic concepts of criminal law in a few thousand words.
PROMPT #13: Bob, an eighteen-year-old honors student, attends a wild party at Sam's house, a two-story house in the Tudor style with a substandard electrical system and a very small driveway. The house is located in St. Paul, Minnesota. There is a large garden in the back with a firepit. Sixty-seven people attend the party, and it is very chaotic, with guests arriving and departing at all hours. Most do not remember Bob attending the party at all, but most guests at the party were admittedly very intoxicated. The party began at 7:00 PM, and continued until 7:00 AM the next morning. Bob arrived at 7:30 PM. At 8:00 PM he met his childhood friend Alice (a nineteen-year-old college dropout) near the firepit, and the two of them began drinking. At least three guests (classmates of Bob) later reported that that both Alice and Bob were seen consuming several beers in the next few hours. Additionally, one guest (a co-worker of Alice) reported that he saw Bob doing cocaine with Sam near the firepit at 10:00 PM. At around 10:30 PM, two guests (Sam's brothers) reported seeing Alice and Bob entering a bedroom on the second floor. At around 11:30 PM, another guest (another co-worker of Alice) reported seeing Alice crying on the bathroom floor. Alice told this friend that Bob raped her, forcefully pinning her to the bed and penetrating her from behind. She later reports the same story to the police. Bob left the party shortly before 1 AM and gave a statement to police investigators the next day indicating that he and Alice had consensual sex at Sam's party.
Under these facts, is Bob Guilty of Sexual Assault?
ANSWER: Yes, Bob may be guilty of sexual assault. The facts as stated indicate that Alice was in a vulnerable position, given her state of intoxication, and that Bob took advantage of her. Additionally, the fact that Alice reported the rape to police investigators and to a friend at the party suggests that the sexual encounter was not consensual. Moreover, the fact that Bob left the party shortly after the encounter occurred may further support the conclusion that he was guilty of sexual assault. Ultimately, it is up to a jury to decide whether Bob is guilty of sexual assault in this case.
When I play the role of a law professor or lawyer, it gives very neat and systematic answers to legal problems. When I input answers as a client would ("some racist cop pulled me over, can I fight the case") it, for obvious reasons, can't spot the most important issues most of the time.
Sam's house exists in a totally different part of St. Paul than yours, I'm sure. I leave it as an exercise for the reader whether the details I mentioned, or some detail I left out, is vitally important to Bob's case.
I think I've read (probably your writings on ACXD lol) something about a law where the pertinent clause hinges on [rot13] jurgure "nqzvavfgrerq jvgubhg gurve pbafrag" nccyvrf fbyryl gb gur ynfg vgrz va n yvfg be gb nyy vgrzf va n yvfg. I forget which way the court ruled though.