AI vs. Copyright: Meta's Legal Battle Over Training Data

Court rules Meta used copyrighted books legally for AI training

A federal judge granted summary judgment to Meta in a landmark copyright case involving book authors and LLM training data. The ruling, issued on June 25, 2025, by Judge Vince Chhabria of the Northern District of California in Case 3:23-cv-03417-VC, determined that Meta's use of copyrighted books to train its Llama large language models constituted fair use.

Background of the Case

The case, Kadrey v. Meta Platforms Inc., originated when authors including Sarah Silverman, Junot Díaz, and Andrew Sean Greer sued Meta for allegedly downloading their books from shadow libraries without permission. Meta acquired the books via BitTorrent between October 2022 and early 2024 after unsuccessful licensing negotiations with publishers.

Key Findings and Analysis

Judge Chhabria found Meta's use of 666 books from thirteen authors to be substantial and commercial. However, he determined that the transformative nature of training artificial intelligence models outweighed concerns about potential market harm. The ruling applies exclusively to the specific plaintiffs involved and does not shield Meta from future copyright claims by other authors.

The court's decision was based on the four-factor fair use test, with Judge Chhabria emphasizing the transformative nature of Meta's use. The judge highlighted the fundamental difference between human reading and AI training, noting that Meta's Llama models ingest text to learn statistical patterns, distinct from human consumption of literature.

Implications and Future Considerations

The ruling has significant implications for AI development and copyright law. It underscores the importance of transformative use in fair use determinations and is likely to serve as a reference point in future AI copyright cases. Meta's victory was substantial, but because the ruling is specific to the named plaintiffs, other authors remain free to pursue their own claims.

Industry analysts have noted that the ruling could impact how digital marketing platforms approach AI development and training data acquisition. It also signals a shift towards market dilution arguments in AI copyright cases, with a focus on the impact of AI-generated content on original works.

As technology companies navigate the intersection of AI and intellectual property law, the ruling provides guidance on fair use analysis but also highlights the need for continued vigilance and proactive licensing arrangements with content creators. The evolving legal landscape surrounding AI and copyright suggests ongoing uncertainty for developers training models on copyrighted material.