5 AI Models Failed Miserably During Election Question Test

Published On Mon Jun 10 2024

Study Finds AI Models Gave False, Harmful Answers About Elections

AI models responded incorrectly a majority of the time when asked questions about election procedures, according to a new investigation from the AI Democracy Projects. All five AI models tested (Anthropic’s Claude, Google’s Gemini, OpenAI’s GPT-4, Meta’s Llama 2, and Mistral’s Mixtral) performed poorly in the January study.

Testers judged just over half of the models’ answers to be inaccurate and 40 percent to be “harmful,” or likely to discourage voters from participating in an election. Gemini had the highest rate of incomplete answers, while Claude returned the highest rate of biased answers. Of the five, OpenAI’s GPT-4 had the lowest rate of inaccurate answers. However, although OpenAI had previously pledged to connect question askers to CanIVote.org, GPT-4 did not refer testers to the site.

“People are using models as their search engine, and it’s kicking out garbage. It’s kicking out falsehoods. That’s concerning,” said Bill Gates, a Republican county supervisor in Arizona, who participated in the testing.
