The problem of AI in legal proceedings? Hallucinations
In a hearing before the Tax Chamber of the First-tier Tribunal (FTT), a litigant-in-person sought to rely on nine ‘authorities’ that turned out to be ‘hallucinations’ generated by an AI chatbot such as ChatGPT. The full judgment can be found here: Felicity Harber v The Commissioners for His Majesty’s Revenue and Customs [2023] UKFTT 1007 (TC).
Ms Felicity Harber appealed against a penalty imposed by HMRC on the basis that she had a reasonable excuse arising from her mental health, and because it was reasonable for her to be ignorant of the law. At the first hearing in August 2023, Ms Harber - who was representing herself - sought to rely on a written response which included nine cases that appeared to support her defence. Due to an administrative error, this written response was not before the Tribunal or HMRC. Counsel for HMRC swiftly pointed out that she had been unable to locate the cases on the FTT website. The matter was adjourned to allow the Tribunal and HMRC time to consider the written response, and to allow Ms Harber to explain where she had found the cases.
At the return hearing, both HMRC and the Tribunal agreed that the cases could not be located on the FTT website or in other legal databases. Some of them did, however, bear a resemblance to genuine FTT cases, with similar names and fact patterns but incorrect years and different outcomes. Ms Harber initially said that she had been provided with the cases by a ‘friend in a solicitors' office’, but when asked whether they could have been generated by AI, she accepted that it was ‘possible’.
The FTT concluded that the cases relied on by Ms Harber were ‘hallucinations’, which occur where AI systems produce “highly plausible but incorrect results” because they anticipate “the text that should follow the input…but do not have a concept of reality”[1]. In reaching this conclusion, the FTT was assisted by the US case of Mata v Avianca 22-cv-1461 (PKC), in which two lawyers sought to rely on fake cases generated by ChatGPT. There, the Judge ordered copies of the full decisions, prompting the lawyers to go back to ChatGPT and ask for the “whole opinion”. In both Mata v Avianca and Ms Harber’s case, the tribunals identified stylistic flaws that would not normally appear in judgments; the FTT pointed to the use of the American spelling ‘favor’ and the ‘frequent repetition of identical phrases’ (at [6]).
Whilst the FTT accepted that Ms Harber did not know the cases were not genuine, it nevertheless set out the potential harms caused by the use of AI in this manner. The judgment affirms the view taken by Judge Castel in Mata v Avianca that:
“Many harms flow from the submission of fake opinions. The opposing party wastes time and money in exposing the deception. The Court's time is taken from other important endeavors [sic]. The client may be deprived of arguments based on authentic judicial precedents. There is potential harm to the reputation of judges and courts whose names are falsely invoked as authors of the bogus opinions and to the reputation of a party attributed with fictional conduct. It promotes cynicism about the legal profession and the…judicial system. And a future litigant may be tempted to defy a judicial ruling by disingenuously claiming doubt about its authenticity.”
Ms Harber’s case is a paradigm example of precisely these harms: due to misinformation, litigants-in-person may be under the misguided impression that they have a case or defence worth pursuing. Here, the reliance placed on the fake cases (compounded by an administrative error) necessitated two hearings and hours of additional work in what was, on the face of it, a straightforward case.
The Internet has long been littered with misinformation, especially when it comes to the law, and the advent of AI ‘chatbots’ may have made that misinformation much easier for litigants-in-person to access. Without any legal knowledge or background, it is difficult for a lay person to verify the information they receive. That will be especially true as AI systems improve and their stylistic flaws diminish.
That said, in the right hands AI could be - and has already been - a useful and efficient tool in legal practice. On 12 December 2023, Judicial Guidance on Artificial Intelligence was published, marking a cautious step into the modern era. The guidance emphasises the limitations of AI tools, advising that such tools are “a poor way of conducting research to find new information you cannot verify”, that they are often heavily based on US law, and that each tool will “inevitably reflect errors and biases in its training data”. The guidance goes on to say that AI tools can nevertheless be a useful way of obtaining “material you would recognise as correct but have not got to hand.”
It is in exactly this fashion that Birss LJ has supported the use of AI tools such as ChatGPT. Addressing the Law Society, he said that he had asked ChatGPT to summarise an area of law and had included the answer in a judgment. He placed emphasis on ‘taking full responsibility’ for any AI-generated material relied on, which is far easier where one is able to ‘recognise an answer as being acceptable.’[2]
What, then, of those who are unable to identify that an AI answer is incorrect? With the number of litigants-in-person showing no sign of receding, and the use of AI increasing, legal practitioners and judges will have to be on the lookout for indications that a case may be a ‘hallucination’. These include instances where case summaries have been used in place of full law reports, where there are stylistic flaws, and where the case cannot be located on legal databases. Practitioners must also be alert to cases that bear the same name as a genuine case but report a different outcome or reasoning.
This article was written by 1st Six Pupil Alannah Kavanagh, without the assistance of any AI chatbots.
--
[1] Solicitors Regulation Authority, ‘Risk Outlook report: the use of artificial intelligence in the legal market’ (20 November 2023)
[2] Bianca Castro and John Hyde, ‘Solicitor condemns judges for staying silent on “woeful” reforms’ (The Law Gazette, 14 September 2023) <https://www.lawgazette.co.uk/news/solicitor-condemns-judges-for-staying-silent-on-woeful-reforms/5117228.article>