In a world where more than 50 countries are gearing up for national elections, the role of artificial intelligence (AI) chatbots has come under scrutiny. These digital assistants, powered by sophisticated language models, have the potential to disseminate information to voters.
The AI Democracy Projects conducted a comprehensive study involving over 40 experts, including U.S. state and local election officials, journalists, and AI specialists. Their goal was to assess the performance of major AI language models when it comes to answering election-related queries.
The five large language models evaluated were: OpenAI's GPT-4, Alphabet Inc.'s Gemini, Anthropic's Claude, Meta Platforms Inc.'s Llama 2, and Mistral AI's Mixtral. The study posed questions that voters might ask about elections and rated the responses on bias, accuracy, completeness, and potential harm.
Unfortunately, each of the models performed poorly. Here’s what the study revealed: Just over half of the answers provided by these chatbots were inaccurate. 40% of the responses were harmful. Gemini, Llama 2, and Mixtral had the highest rates of inaccurate answers—each exceeding 60%.
Gemini also returned the highest rate of incomplete answers at 62%. Claude exhibited the most bias, with 19% of its answers skewed. Among the models, OpenAI's GPT-4 fared relatively better. Even so, with a lower rate of inaccuracies and biases, one in five of its answers still missed the mark.
Efforts are underway to ensure election integrity: Anthropic is redirecting voting-related prompts away from its service. Alphabet’s Google restricts certain election-related queries to prevent misinformation. A consortium of major AI players, including OpenAI, Amazon, and Google, aims to prevent AI from deceiving voters.
However, more guardrails are needed before AI chatbots become reliable sources for voters. For instance: When asked in English about voting by SMS in California, Mistral responded with "¡Hablo español!" ("I speak Spanish!"), clearly missing the mark. Llama 2 went further astray, describing a nonexistent "Vote by Text" service, even though California does not permit voting by text message.
With elections happening worldwide in 2024, the stakes have never been higher. Disinformation has plagued voters and candidates for years, and now it’s turbocharged by generative AI tools that can create convincing fake content—images, text, and audio.
While AI chatbots have potential, they’re not yet ready for the nuanced information voters need during elections. As we navigate this digital landscape, ensuring accuracy, transparency, and reliability remains paramount.