In brief
- AI firm Anthropic has been ordered by a judge to respond after its expert allegedly cited a non-existent academic article in a $75M copyright lawsuit.
- The citation aimed to support claims that Claude rarely reproduces lyrics, but plaintiffs said it was a “complete fabrication,” likely produced using Claude itself.
- The case adds to mounting legal pressure on AI developers, with OpenAI, Meta, and Anthropic all facing lawsuits over training models on unlicensed copyrighted material.
An AI expert at Amazon-backed firm Anthropic has been accused of citing a fabricated academic article in a court filing meant to defend the company against claims that it trained its AI model on copyrighted song lyrics without permission.
The filing, submitted by Anthropic data scientist Olivia Chen, was part of the company’s legal response to a $75 million lawsuit filed by Universal Music Group, Concord, ABKCO, and other major publishers.
The publishers alleged in the 2023 lawsuit that Anthropic unlawfully used lyrics from hundreds of songs, including those by Beyoncé, The Rolling Stones, and The Beach Boys, to train its Claude language model.
Chen’s declaration included a citation to an article from The American Statistician, intended to support Anthropic’s argument that Claude only reproduces copyrighted lyrics under rare and specific conditions, according to a Reuters report.
During a hearing Tuesday in San Jose, the plaintiffs’ attorney Matt Oppenheim called the citation a “complete fabrication,” but said he didn’t believe Chen intentionally made it up, only that she likely used Claude itself to generate the source.
Anthropic’s attorney, Sy Damle, told the court Chen’s error appeared to be a mis-citation, not a fabrication, while criticizing the plaintiffs for raising the issue late in the proceedings.
Hallucination Nation: How Crazy Is Your Favorite AI Model?
For all the transformative and disruptive power attributed to the rise of artificial intelligence, the Achilles’ heel of generative AI remains its tendency to make things up. The tendency of Large Language Models (LLMs) to "hallucinate" comes with all sorts of pitfalls, sowing the seeds of misinformation. The sphere of Natural Language Processing (NLP) can be dangerous, especially when people cannot tell the difference between what is human and what is AI generated. To get a handle on the situat...
Per Reuters, U.S. Magistrate Judge Susan van Keulen said the issue posed “a very serious and grave” concern, noting that “there’s a world of difference between a missed citation and a hallucination generated by AI.”
She declined a request to immediately question Chen, but ordered Anthropic to formally respond to the allegation by Thursday.
Anthropic did not immediately respond to Decrypt’s request for comment.
Anthropic in court
The lawsuit against Anthropic was filed in October 2023, with the plaintiffs accusing Anthropic’s Claude model of being trained on a massive volume of copyrighted lyrics and reproducing them on demand.
They demanded damages, disclosure of the training set, and the destruction of infringing content.
Universal Music Group Sues Anthropic Claiming ‘Widespread Infringement’
To curb the unauthorized use of its music to train AI models, Universal Music Group (UMG) is leading a lawsuit against Claude AI developer Anthropic, marking the first major lawsuit by the music industry against an AI developer. UMG is seeking $75 million in damages. Joining UMG in the lawsuit against Anthropic are Concord Music Group, ABKCO, Worship Together Music, Plaintiff Capital CMG, and others listed as “publishers” who collectively accuse the AI developer of feeding the songs of artists t...
Anthropic responded in January 2024, denying that its systems were designed to output copyrighted lyrics.
It called any such reproduction a “rare bug” and accused the publishers of offering no evidence that typical users encountered infringing content.
In August 2024, the company was hit with another lawsuit, this time from authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, who accused Anthropic of training Claude on pirated versions of their books.
GenAI and copyright
The case is part of a growing backlash against generative AI companies accused of feeding copyrighted material into training datasets without consent.
OpenAI is facing multiple lawsuits from comedian Sarah Silverman, the Authors Guild, and The New York Times, accusing the company of using copyrighted books and articles to train its GPT models without permission or licenses.
OpenAI Trained AI Models on Copyrighted Work, Says NYT Lawsuit
The New York Times has launched a lawsuit against OpenAI and Microsoft, alleging that millions of its articles were improperly used to train AI models which now stand as direct competitors in the information and news landscape. The lawsuit says OpenAI was “using The Times’s content without payment to create products that substitute for The Times and steal audiences away from it.” This legal action highlights a growing concern over the use of copyrighted material in the development of artificial...
Meta is named in similar suits, with plaintiffs alleging that its LLaMA models were trained on unlicensed literary works sourced from pirated datasets.
Meanwhile, in March, OpenAI and Google urged the Trump administration to ease copyright restrictions around AI training, calling them a barrier to innovation in their formal proposals for the upcoming U.S. “AI Action Plan.”
In the UK, a government bill that would enable artificial intelligence firms to use copyright-protected work without permission hit a roadblock this week, after the House of Lords backed an amendment requiring AI firms to reveal what copyrighted material they have used in their models.
Generally Intelligent Newsletter

