Unveiling the Use of Public Data by Meta and OpenAI

Published On Sat May 24 2025

The Use of Public Data by Meta and OpenAI

Ireland's Data Protection Commission (DPC) has cleared Meta's plan to use public data to train its AI models. Starting May 27, 2025, Meta intends to use the public data of its platform users, including posts on Instagram and Facebook, for AI training. Users can object to this use until May 26, although it remains unclear how far such an objection reaches.

If users do not object, their public posts will be used for AI training. Meta says it will not access or use private messages to train its AI models: WhatsApp messages and Instagram direct messages will remain end-to-end encrypted and private. Conversations with Meta's AI chatbot, however, will be used for training.

Concerns and Actions

Consumer protection organizations and the privacy advocacy group noyb are taking action over Meta's use of public data. Despite Meta's assurance that private messages will not be accessed, these organizations remain wary of the implications of using public posts for AI training.

Other tech giants, including Google and OpenAI, also use public data to train their AI models. OpenAI, for instance, emphasizes that it uses freely available content from the internet for its models, including videos from platforms like Instagram and YouTube.

Regulatory Framework and Challenges

The General Data Protection Regulation (GDPR) sets out individuals' rights over their data, including the rights to access and rectify their personal information. Current AI models, however, make these rights difficult to implement effectively.

Meta argues that data from EU citizens is essential for building AI models that reflect local languages and cultures. Without such data, it says, AI models may not serve the diverse needs of EU citizens effectively.

Conclusion

As the debate over using public data for AI training continues, users face a choice between benefiting from AI advancements and safeguarding their data privacy. The discussion extends beyond Meta to other major players in the AI industry, such as OpenAI and Google.