There had been much concern about how courts would view AI companies’ use of large amounts of copyrighted material to train their models, but judges, thus far, seem to be ruling that it amounts to fair use.
Days after Anthropic won a lawsuit against three authors over the use of their books to train its AI models, Meta has scored a similar victory. A total of 13 authors, including Sarah Silverman and Ta-Nehisi Coates, had argued that the company had breached copyright law by using their books without permission to train its AI system. But the judge dismissed the lawsuit, saying that Meta hadn’t broken the law through such use.


US district judge Vince Chhabria, in San Francisco, said that the authors had not presented enough evidence that Meta’s Llama AI would cause “market dilution” by flooding the market with work similar to theirs. As a consequence, Meta’s use of their work was judged a “fair use” – a legal doctrine that allows use of copyright-protected work without permission in certain circumstances – and no copyright liability applied.
The judge also said that the authors hadn’t made the right arguments in their case against Meta. The authors had claimed that users of Llama could reproduce text from their books, and that Meta’s copying harmed the market for licensing copyrighted materials to companies for AI training. The judge said the licensing argument didn’t hold because authors are not entitled to monopolize the market for licensing books for AI training. However, the judge said that had the authors argued and presented evidence that Meta’s Llama AI models risked rapidly flooding their markets with competing AI-generated books that could indirectly harm sales, they’d have had a stronger case.
Earlier this week, another judge had ruled that Anthropic hadn’t broken the law when it trained its Claude model on over 7 million books. The judge, however, had taken issue with Anthropic’s use of pirated copies of the books to train its models, and said that the company would stand trial for using these illegitimate copies.
These two victories indicate which way the wind is blowing on questions of copyright in the training of AI models. Most AI companies have not only used nearly all of the material on the internet to train their models, but have also used vast numbers of books and research papers. There had been concerns that AI companies might need to pay stiff penalties, or a share of their profits, for using these books without compensating the authors in any way. But given how the outputs of AI models differ substantially from the books they’re trained on, and are in essence a combination of the millions of books that the models have ingested, it appears that, at least for the moment, judges are siding with AI companies in these crucial judgements.