Meta vs. Copyright: The Debate Over Pirated Books for AI

Published On Tue Feb 11 2025
Meta accused of pirating 82 terabytes of books from 'shadow libraries' for AI training

Sophisticated AI chatbots have to be trained on large swaths of data. However, the legality of how the companies behind these chatbots obtain that data has come into question.

Accusations of Copyright Infringement

AI companies have been accused of using copyrighted data without permission to train their models, making those models more capable and, in turn, raising the price the companies can charge for access. Obtaining datasets legally typically involves paying licensing fees and complying with the terms set by the rights holders. However, some companies are alleged to have pirated datasets instead of going through the legal process.

OpenAI and Meta are among the companies facing copyright lawsuits for alleged illegal data acquisition practices. Recently, Meta, led by Mark Zuckerberg, came under fire for reportedly obtaining 82 terabytes of books from 'shadow libraries' for AI training purposes.

Meta's Controversial Actions

The lawsuit against Meta alleges that the company downloaded content from 'shadow libraries' such as Anna's Archive, Z-Library, and LibGen without proper authorization. One Meta researcher expressed concerns about using pirated material, citing ethical considerations.

Details from Meta employees that emerged in a court hearing suggest they discussed concealing their IP addresses through VPNs when torrenting data. Mark Zuckerberg's statements during a January 2023 meeting allude to a push for advancing AI technology, possibly at the cost of ethical boundaries.

Ethical Concerns and Legal Ramifications

Meta's alleged copyright infringement has sparked debate within the AI research community. Some researchers argue against the use of pirated material, emphasizing the importance of ethical standards in AI development.

As legal battles unfold, companies like Meta face scrutiny for their data acquisition practices. The implications of using pirated material extend beyond legal repercussions, impacting the integrity and credibility of AI technology.
