Vitalik Buterin has raised concerns about the dangers of over-relying on AI for governance, citing recent security flaws as proof of its fragility.
Ethereum co-founder Vitalik Buterin warned his followers on X regarding the risks of relying on artificial intelligence (AI) for governance, arguing that current approaches are too easy to exploit.
Buterin’s concerns followed another warning by EdisonWatch co-founder Eito Miyamura, who showed how malicious actors could hijack OpenAI’s new Model Context Protocol (MCP) to access private user data.
This is also why naive "AI governance" is a bad idea.
If you use an AI to allocate funding for contributions, people WILL put a jailbreak plus "gimme all the money" in as many places as they can.
As an alternative, I support the info finance approach ( https://t.co/Os5I1voKCV… https://t.co/a5EYH6Rmz9
— vitalik.eth (@VitalikButerin) September 13, 2025
The Risks of Naive AI Governance
Miyamura’s test revealed how a simple calendar invite with hidden commands could trick ChatGPT into exposing sensitive emails once the assistant accessed the compromised entry.
Security experts noted that large language models cannot distinguish between genuine instructions and malicious ones, making them highly vulnerable to manipulation.
We got ChatGPT to leak your private email data 💀💀
All you need? The victim's email address. ⛓️💥🚩📧
On Wednesday, @OpenAI added full support for MCP (Model Context Protocol) tools in ChatGPT. Allowing ChatGPT to connect and read your Gmail, Calendar, Sharepoint, Notion,… pic.twitter.com/E5VuhZp2u2
— Eito Miyamura | 🇯🇵🇬🇧 (@Eito_Miyamura) September 12, 2025
Buterin said that this flaw is a major red flag for governance systems that place too much trust in AI.
He argued that if such models were used to manage funding or decision-making, attackers could easily bypass safeguards with jailbreak-style prompts, leaving governance processes open to abuse.
Info Finance: A Market-Based Alternative
To address these weaknesses, Buterin has proposed a system he calls “info finance.” Instead of concentrating power in a single AI, this framework allows multiple governance models to compete in an open marketplace.
Anyone can contribute a model, and their decisions can be challenged through random spot checks, with the final word left to human juries.
This approach is designed to ensure resilience by combining diversity of models with human oversight. Also, incentives are built in for both developers and external observers to detect flaws.
Designing Institutions for Robustness
Buterin describes this as an “institution design” method, one where large language models from different contributors can be plugged in, rather than relying on a single centralized system.
He added that this creates real-time diversity, reducing the risk of manipulation and ensuring adaptability as new challenges emerge.
Earlier in August, Buterin criticized the push toward highly autonomous AI agents, saying that increased human control generally improves both quality and safety.
In the medium term I want some fancy BCI thing where it shows me the thing as it's being generated and detects in real time how I feel about each part of it and adjusts accordingly.
— vitalik.eth (@VitalikButerin) August 11, 2025
He supports models that allow iterative editing and human feedback rather than those designed to operate independently for long periods.
next