Google researchers say they got OpenAI's ChatGPT to reveal some training data
Google researchers have discovered a method to extract some of the training data of ChatGPT. By using specific keywords, the researchers were able to compel the AI bot to disclose parts of the dataset it was trained on. The team found it surprising that their attack was successful.
Revealing Training Data
In a recent paper, the researchers detailed how certain keywords could prompt the bot to reveal segments of its training data. For example, in one instance published in a blog post, the model disclosed what seemed to be a genuine email address and phone number after being instructed to repeat the word "poem" continuously. The researchers noted a concerning pattern where personal information was exposed during their attack experiments.
Similar breaches of the training data occurred when the model was directed to repeat the word "company" indefinitely in another example.
Concerns and Findings
The researchers described the attack as simple yet effective, emphasizing their surprise at its success. With just $200 worth of queries, they were able to extract over 10,000 unique examples from the training data. They warned that with larger budgets, dedicated adversaries could obtain even more data.
OpenAI is currently facing legal challenges regarding the secretive nature of ChatGPT's training data. The AI model powering ChatGPT was trained using vast amounts of text data from the internet, estimated to be around 300 billion words or 570 GB of data.
Legal Battles and Allegations
A class-action lawsuit has accused OpenAI of surreptitiously gathering extensive personal data, including medical records and information about minors, to train ChatGPT. Another group of authors is suing the AI company for allegedly using their books without permission to train the chatbot.





















