A recent English case, Ayinde v London Borough of Haringey,1 discussed the use of generative AI in litigation and, most particularly, in submissions to the court where lawyers had relied on AI that had “hallucinated” convincing-sounding cases and references. Ayinde has since been cited approvingly in May v Costa,2 a decision of the New South Wales Court of Appeal, in which Bell CJ excerpted paragraphs [5]-[9] of the judgment, where Dame Victoria Sharp acknowledged that AI would have a role in litigation (for example, in discovery) but continued:
This comes with an important proviso however. Artificial intelligence is a tool that carries with it risks as well as opportunities. Its use must take place therefore with an appropriate degree of oversight, and within a regulatory framework that ensures compliance with well-established professional and ethical standards if public confidence in the administration of justice is to be maintained. As Dias J said when referring the case of Al-Haroun to this court, the administration of justice depends upon the court being able to rely without question on the integrity of those who appear before it and on their professionalism in only making submissions which can properly be supported.
In the context of legal research, the risks of using artificial intelligence are now well known. Freely available generative artificial intelligence tools, trained on a large language model such as ChatGPT are not capable of conducting reliable legal research. Such tools can produce apparently coherent and plausible responses to prompts, but those coherent and plausible responses may turn out to be entirely incorrect. The responses may make confident assertions that are simply untrue. They may cite sources that do not exist. They may purport to quote passages from a genuine source that do not appear in that source.
Those who use artificial intelligence to conduct legal research notwithstanding these risks have a professional duty therefore to check the accuracy of such research by reference to authoritative sources, before using it in the course of their professional work (to advise clients or before a court, for example). Authoritative sources include the Government’s database of legislation, the National Archives database of court judgments, the official Law Reports published by the Incorporated Council of Law Reporting for England and Wales and the databases of reputable legal publishers.
This duty rests on lawyers who use artificial intelligence to conduct research themselves or rely on the work of others who have done so. This is no different from the responsibility of a lawyer who relies on the work of a trainee solicitor or a pupil barrister for example, or on information obtained from an internet search.
We would go further however. There are serious implications for the administration of justice and public confidence in the justice system if artificial intelligence is misused. In those circumstances, practical and effective measures must now be taken by those within the legal profession with individual leadership responsibilities (such as heads of chambers and managing partners) and by those with the responsibility for regulating the provision of legal services. Those measures must ensure that every individual currently providing legal services within this jurisdiction (whenever and wherever they were qualified to do so) understands and complies with their professional and ethical obligations and their duties to the court if using artificial intelligence. For the future, in Hamid hearings such as these, the profession can expect the court to inquire whether those leadership responsibilities have been fulfilled. (my emphasis added)
My understanding is that Large Language Models (the programs behind these AI tools) generate plausible-sounding text rather than retrieve verified facts, so they will inevitably produce hallucinations: they put 2 and 2 together and make 5. The problem is that the misinformation they produce looks very convincing. And the consequences of misusing AI in a professional context are very grave. At worst, as Dame Victoria notes, such conduct may (rarely) constitute the crime of perverting the course of justice. It may also constitute contempt of court, and result in referral to the regulator, costs orders against the lawyers, and public admonishment by the court.
I made what I thought was a fairly uncontroversial point on a social media platform, as follows:
I plead with students not to use AI for tasks such as essay writing. Not only does the person fail to learn how to do important tasks themselves—the point of a degree is not to get good marks, rather to learn how to research and write—but there is also the risk that AI makes stuff up (“hallucinates”). It seems the King’s Bench is even less impressed. Present nonsense before a court at your peril!
Of course, I was immediately attacked by some non-lawyers as a fuddy-duddy and a technophobe, and told that my role was to teach students how to use AI. It swiftly turned into a dispute between lawyers and non-lawyers. I think part of the problem was that the non-lawyers believed the law could be easily described and encapsulated. No one who has worked in the law would believe this.
Fundamentally, the difficulties AIs have in describing the law reflect something important about human norms and laws. It is this: the law is made by humans. The law is not an algorithm. Law is an art not a science. It is something distinctly human: often contradictory, often confusing, and very difficult to capture, except with great practice. Very experienced lawyers will have different ideas of what the right answer is to a problem.
I am not denying that AI might be useful in some contexts, for some tasks that lawyers formerly undertook. As Dame Victoria observes, it might be useful in discovery, formerly the bane of young litigators and paralegals who had to inspect roomfuls of documents to work out what was relevant. (Yes, I have been one of those unfortunates whose role it was to inspect documents.)
As it happens, I have been chided for distrusting AI before. My distrust stems from the time a few years ago when I asked an early version of ChatGPT to write me a bio for a conference (I hate writing those things) and it billed me as Australia’s first female Asian-Australian private law professor. I could not stop laughing. I am not Asian: it’s one of the continents from which I have no discernible recent genetic heritage.
It is true to say, however, that I have written on the law of the Asia Pacific region, including not only Australia but also New Zealand, the Cook Islands, Hong Kong, China, Singapore, Malaysia and India. I suspect that the algorithm looked at the papers I’d written and the conferences I’d attended, added 2 and 2 and made 5. She writes on Asian law, she goes to conferences in Asia, therefore she is Asian.
I was told that AI had improved greatly from those early iterations, and that I should try it again. I asked it to write my bio again, and this time it no longer hallucinated my ethnicity, and it dropped the earlier hallucination that I had won a Clarendon scholarship to attend Oxford (sadly, that had not happened in reality). ChatGPT had certainly improved, it was true. It now produced a usable bio.
A few weeks ago, I decided to test several AIs on a legal topic. I will give the AIs this: they correctly identified the case I needed, the court, and the judge. However, when I then read the case, I discovered that each had misrepresented, in different ways, what the court had actually found: sometimes fundamentally (in some instances saying that the judge had found the opposite of what he had actually found), sometimes in subtle ways that would only be obvious to an experienced lawyer (no, the description was not quite right, even if it sounded really convincing). The case was long and detailed, and yes, it was a slog to get through. But I was glad I had done so.
I’ve said above that law is not an algorithm. To be sure, I prepare flowcharts for my students of the issues they might want to consider. But as I tell the students, these flowcharts must be used with care. First, they must be caveated with exceptions to the rule. Secondly, an experienced lawyer will not give equal weight to every matter. Thirdly, courts sometimes make new law which does not fit the rubric or the existing precedent, because the law has to adapt to new circumstances.
There is a reason it is called the practice of law. I have a sense of the emphasis to be put on issues, gained from years of teaching, reading cases, statutes, textbooks, and articles, and some years of legal practice as a litigator, a court clerk and a consultant. I realised the other day that I have been admitted to practice for almost twenty-five years. Goodness, where did that time go?
I also write my own works. As I have said before on this blog, I think and learn in part by writing. It is only by writing things down that I realise how things might fit together or where there may be inconsistencies. Part of the reason I write this blog is to think my thoughts through to the end. If I get someone else (or something else) to do my writing for me, I lose a really important part of how I think and learn. I also lose that hard-won expertise I have gained through practice. Therefore, for me as a legal academic, AI will often only be useful as a glorified search engine, which I will then follow up by checking its results thoroughly.
The point of setting essays for students is for them to learn how to write in a particular format, and to learn the law while doing so. While some students seem to think that the main point of a university degree is to get good grades so that they get a good job, my main hope is that they learn something, so that they can undertake their job in a professional manner. An undue focus on grades as a measure of learning risks falling foul of Goodhart’s law: namely, “when a measure becomes a target, it ceases to be a good measure.” I regard my role as a professor as being to teach students how to think legally and to give them a scaffold for approaching problems in practice, whether they become lawyers or not. I want students to have a fuller understanding of the law when they finish my subject. They might not learn immediately; the usefulness of what I have taught may only become apparent later.
If a student uses an LLM to lay out the structure or even the substance of an essay, then they outsource that part of their learning to something else. Let’s say (contrary to my experience with AI thus far, and conceding that maybe I don’t know how to prompt AI properly) the AI does a really good job of giving them the basics of a law essay. The student might get a good grade, but they do not learn. They have not learned how to structure an argument, how to refine their writing, or how to research. They have not engaged in the practice of being a lawyer. Consequently, while their grades might be good, they have not learned, and they will not be a good lawyer in practice when faced with the messiness of real human situations, some of which are so strange you couldn’t make those facts up.
True expertise involves doing something many times, even if that thing is boring. For example, I could simply have had an AI find the relevant passages in the judgment it identified, meaning that I would not have had to bother reading the entire case, some of which was irrelevant for my current purposes. However, my understanding of the case and its circumstances would have been much poorer.
Even experts make mistakes. I still make mistakes to this very day; the important thing is that I try to learn from them and correct them. When I got AI to write that bio, it really startled me how much LLMs are set up to be “people pleasers”. The AI tries to give you what you want, and to affirm you, even if what it gives you is utterly false. Weirdly, when you question it and say, “Did you make that up?”, it freely admits it.
Yesterday, one of my kid’s friends showed me a disturbing trend: people who think they are in relationships with LLMs, or who fall into psychosis because of LLMs’ constant affirmation of their perspectives or desires. Actually, humans should not be constantly affirmed. We need people to tell us when we are wrong and to provide boundaries, not to pretend that everything is great and to please us in all regards.
Moreover, I think Dame Victoria makes a really important point. If lawyers outsource their thinking, writing and research to something else, they render their presence nugatory. If the work the LLM produces is wrong and untrustworthy, and results in a client losing a case or being hit with a costs order, lawyers risk ruining their reputations. The point of a client seeking advice is to rely on that human expertise, gained with practice, something that can’t easily be conveyed or searched for on the internet. The duty of a lawyer, ethically, is to check that the advice they have given is accurate, a duty owed both to the client and to the court. To fail to do so is a dereliction of our professional role. If you’re going to use an LLM, it’s incumbent upon you not to accept its output trustingly, but to double-check it. In which case… well, might it not be easier to do the work yourself?
So, thanks but no thanks. I’ll stick to my guns and do my research and writing myself.
1. [2025] EWHC 1383 (Admin).
2. [2025] NSWCA 178.
I have found another problem. A great many LLMs write as if we are all part of a shared legal tradition based on English common law. But people working in a different legal tradition have different legal norms and precedents, and, to give an example, what counts as 'negligence' here in Sweden is a very different matter from what it is in the UK. But this may just mean I have an easier time detecting foreign legal imports than somebody in Australia who ends up inadvertently building a case on American judicial norms. It's not just a matter of getting the legal references from the correct corpus, because the problem is often a matter of 'whether this is considered reasonable or not'. This varies not only over time but between nations. People who do not have a deep understanding of their own legal traditions write briefs that are often silly, silly to the point where, at least in Sweden, we would call them negligent.
But 'get an LLM to write some briefs and hand them to students with the instruction "shred this"' can be great fun and good for learning. People who are naturally overly agreeable have an easier time overcoming this tendency in themselves when it isn't a fellow student who is being humiliated.
Agree with all of this, especially on essay writing as an essential part of learning.
Re. degree grades, what I tell my students is that getting a good grade is nothing more than a foot in the door to getting a job interview; if they want to actually impress the interviewer and get the pupillage/training contract, they will need to show that they earned the grade and can speak intelligently about the law. So even in the best-case scenario, where the LLM produces an essay that is better than what they could come up with themselves, all that will mean is that they increase the number of interviews in which they make a tit of themselves with poor legal reasoning and knowing nothing about the subject they supposedly got a first in. I don't know how many, if any, pay attention to me; based on last year's marking, I'd estimate that around half the essays were written with AI.