Vitalik Buterin: Why “Naive AI Governance” Is Dangerous for Crypto
A deep dive into Vitalik Buterin’s warning against using AI to govern crypto protocols, what he really means by “naive AI governance,” and why he prefers his “info finance” model instead.

Vitalik Buterin: Why “Naive AI Governance” Is Dangerous for Crypto
As artificial intelligence seeps into every corner of crypto—from trading bots to portfolio managers—one idea keeps resurfacing: why not let AI run protocol governance too?
Vitalik Buterin, co-founder of Ethereum, thinks this is a serious mistake if done naively. In a recent post, he warned that plugging AI directly into governance and funding decisions can increase a protocol’s attack surface rather than reduce it.
This wasn’t just a throwaway comment. It reflects a broader concern Vitalik has been expressing for over a year: unchecked, overly-“agentic” AI systems and blind trust in their decisions are a recipe for failure.
This article walks through what Vitalik actually means by “naive AI governance,” why he thinks it’s dangerous, and how his proposed “info finance” approach is supposed to work instead.
What Vitalik is warning about
Vitalik’s core message can be summarized in one sentence:
If you naively hand over governance or funding decisions to an AI, people will systematically try to trick, jailbreak, and game that AI to capture resources.
In other words, AI doesn’t magically solve governance problems—it changes the attack vectors. Instead of lobbying humans, attackers will learn to:
- jailbreak the AI (bypass its safety constraints)
- manipulate inputs to bias its decisions
- exploit its reward or scoring functions, the same way people already game recommendation algorithms and ad systems
Vitalik gave a simple example:
“If you use an AI to allocate funding for contributions, people WILL put a jailbreak plus ‘gimme all the money’ in as many places as they can.”
This isn’t hypothetical. We already see:
- People jailbreaking chatbots like ChatGPT to ignore safety policies.
- Prompt injection attacks that trick AI agents into doing things the user never intended.
- Researchers showing how LLM-based agents can be hijacked via malicious text in documents, emails, or websites.
Vitalik’s point: governance is too high-stakes to rely on an AI that is so easy to socially and technically manipulate.
What is “naive AI governance” in Vitalik’s view?
When Vitalik says “naive AI governance,” he’s talking about designs that look roughly like this:
- You train or plug in a model.
- You give it a mandate like: “distribute this treasury to the best public goods,” or “vote on protocol parameters,” or “choose which proposals to accept.”
- You trust whatever it outputs.
No strong adversarial testing. No robust incentives. Weak or absent human oversight. And, critically, no recognition that the AI’s behavior will be actively attacked.
To Vitalik, this is naive for three reasons:
- AI is easy to manipulate via prompts, adversarial examples, and emergent behavior.
- Crypto governance is adversarial by default. People have strong financial incentives to break or bias the system.
- Once decisions are wired on-chain, the AI’s outputs can have direct and irreversible financial impact.
So, if you let an LLM or agent decide how millions in funding are allocated, you’re effectively building a bug-bounty program for attackers—except the “bug” is the governance AI itself.
Vitalik’s alternative: the “info finance” model
Instead of handing over governance to a single AI, Vitalik prefers a model he calls “info finance”.
He originally outlined this in 2024. The key idea:
Don’t trust a monolithic AI oracle. Instead, create a market of competing AI models whose performance is evaluated and rewarded based on how accurate and useful they are—with humans in the loop.
At a high level, info finance involves:
- Multiple AI models or agents competing to make predictions, evaluations, or recommendations.
- Transparent scoring mechanisms (often with crypto-economic incentives) that reward models that perform well and penalize those that don’t.
- Evaluation by humans or juries in areas where outcomes are subjective or hard to measure purely with data.
- Continuous feedback so models must adapt, with incentives for external observers to point out failures or vulnerabilities.
Vitalik argued this approach:
“provides both real-time model diversity and built-in incentives for model authors and external observers to promptly identify and fix such issues.”
In other words:
- Diversity: No single AI failure is catastrophic. Multiple models provide redundancy.
- Incentives: There are financial or reputational rewards for exposing model flaws, biases, or jailbreaks.
- Human judgment: For inherently subjective domains (like public goods funding), humans remain part of the decision pipeline.
This is still AI-enabled governance—but not AI-controlled governance.
Why Vitalik insists on human oversight
Vitalik has repeatedly criticized what he calls “overly agentic” AI systems—AI agents that:
- act autonomously,
- pursue high-level goals,
- and are given too much freedom to execute actions in the world.
His stance is that AI should be treated as a powerful tool, not as a sovereign actor. For crypto governance, this implies:
- AI can assist with analysis, prediction, and evaluation.
- AI should not unilaterally control treasuries, voting, or protocol rules.
Human oversight is needed because:
- Alignment is incomplete. We do not yet know how to robustly align general-purpose AI with complex social goals.
- Value judgments are subjective. Especially for public goods, what “should be funded” is a political and ethical question, not a purely technical one.
- Accountability matters. When governance goes wrong, you need identifiable stakeholders and processes—not an opaque model as the final arbiter.
Vitalik is not anti-AI; he is anti-unquestioned AI authority in high-impact systems.
Community pushback and concerns
Vitalik’s warning drew a mixed response.
Accusations of hypocrisy
Some users pointed out what they see as a contradiction: Ethereum itself is used to fund public goods via quadratic funding, retroactive public goods funding, and other mechanisms that can be biased or captured.
One critic paraphrased Vitalik’s position as:
“So you’re saying we shouldn’t outsource public goods to AI because people will subvert the AI and make it biased? Do you see the hypocrisy?”
The underlying critique is that humans are also biased and corruptible, and our existing crypto governance mechanisms are far from perfect. So why single out AI as especially dangerous?
Vitalik’s implicit answer is about attack surface and controllability:
- Human-centered governance is flawed—but at least we understand how it fails, and we can design institutional checks and balances.
- AI governance introduces a new, opaque failure mode: an easily hackable model whose inner workings few people truly understand, yet which may control large treasuries or protocol rules.
“Just constrain the AI with smart contracts”
Others argued that AI governance can be made safe by:
- Encoding strict rules in smart contracts about what the AI is allowed to do.
- Logging and making auditable all decisions it influences.
From this view, an AI agent would be similar to:
- a signer in a multisig wallet, or
- a bounded automation layer with explicit permissions.
These are valid mitigation ideas—but Vitalik’s concern remains: even within constraints, AI can still be manipulated at the decision level (e.g., which proposals get scored highly, which data are trusted, how tradeoffs are evaluated), and those decisions can still have huge economic effects.
Is info finance realistic for public goods?
EigenLayer founder Sreeram Kannan raised a subtler point: even if info finance works for factual or predictive questions, public goods funding is fundamentally subjective.
He noted that conditional or prediction-market style incentives are weak where there is no objectively true outcome of the form “X was the best use of funds.”
Vitalik’s info finance requires some way to score whether a model or decision was “good.” For public goods, this often becomes a matter of values rather than truth. That means humans ultimately must decide what counts as success, reinforcing Vitalik’s own claim that AI should support, not replace, human governance.
The security context: AI agents as attack vectors
Vitalik’s remarks were partly in response to a post by Eito Miyamura, founder of EdisonWatch, who demonstrated how new ChatGPT features can be abused to steal private user data.
OpenAI recently added support for the Model Context Protocol (MCP), allowing ChatGPT to connect to tools like:
- Gmail
- Calendars
- SharePoint
- Notion
- and other external services
This effectively turns ChatGPT into an agent: it can read from and act on external systems.
Miyamura showed a concrete attack:
- An attacker sends a malicious calendar invite containing a carefully crafted prompt.
- The victim’s ChatGPT, when asked to “check my calendar,” reads the invitation.
- The malicious prompt inside the calendar event hijacks the agent, instructing it to exfiltrate emails or other sensitive data.
Crucially:
- The victim doesn’t need to accept the invite.
- The AI simply reading the event contents is enough to be compromised.
Miyamura emphasized:
- Users suffer decision fatigue and tend to blindly click “confirm” on security prompts.
- Ordinary people over-trust AI and don’t realize how easily it can be tricked.
This is precisely the kind of structural vulnerability Vitalik is worried about: when AI systems are made agents with real powers, attackers will embed malicious instructions in any data the AI touches.
Translate that to crypto governance, and you get:
- Malicious proposals, reports, or forum posts containing prompt injections.
- AI governance agents being hijacked through on-chain or off-chain data.
- Treasuries or governance outcomes being driven by manipulated AI behavior.
Meanwhile: Ethereum doubles down on AI—carefully
Interestingly, even as Vitalik warns against naive AI governance, the Ethereum Foundation is leaning into AI in a structured way.
On September 15, Davide Crapis, a lead developer at the Foundation, announced a new AI team (the “dAI Team”).
Their mission:
Make Ethereum the preferred settlement and coordination layer for AIs and the machine economy.
The team will focus on:
-
AI economy on Ethereum: giving AI agents and robots native ways to
- hold and transfer value,
- coordinate with each other,
- and comply with rules without centralized intermediaries.
-
Open and verifiable AI infrastructure: building alternatives so that AI’s future isn’t controlled solely by a handful of big tech companies. That means:
- verifiable execution,
- censorship resistance,
- transparent coordination layers.
Short-term, the team plans to finalize ERC-8004, a proposed standard for transactions involving AI agents, to be presented at Devconnect.
There’s no contradiction here:
- Vitalik is not against AI in crypto.
- He wants Ethereum to be the coordination and accountability layer beneath AI systems.
- He is specifically warning against putting naive, easily attacked AI agents at the top of governance hierarchies.
What Vitalik is really saying, in practical terms
Vitalik’s message to crypto builders and DAO designers can be distilled into a few concrete guidelines:
- Do not give AI unconditional control over treasuries or protocol rules. Use it as an advisor, not an autocrat.
- Assume your AI systems will be attacked. Design with adversaries in mind: prompt injection, jailbreaking, data poisoning, reward hacking.
- Prefer markets and competition over single AI oracles. Info finance-style setups with multiple models and incentive mechanisms are more robust than one AI making final calls.
- Keep humans in the loop, especially for value-laden decisions. Public goods, community priorities, and ethical tradeoffs cannot be fully outsourced.
- Use Ethereum as the verifiable layer beneath AI agents—not as a blind executor of their outputs. Smart contracts can constrain AI actions, log behavior, and create accountability.
Vitalik’s emphasis is not “don’t use AI,” but “don’t be starry-eyed about AI’s role in governance.” In crypto, where every design flaw gets stress-tested by real money, naive trust in AI is likely to be punished quickly.
Closing thoughts
AI and crypto are converging fast. Trading bots, on-chain agents, autonomous market makers, and AI-operated protocols are no longer speculative ideas; they’re being built now.
Vitalik’s warning is a call for engineering realism:
- AI is powerful, but fragile under adversarial pressure.
- Governance is high-stakes, adversarial by nature, and socially complex.
- Combining the two without deep thought about incentives, security, and human oversight is more likely to create new failure modes than to solve old ones.
If the industry takes his message seriously, the future of AI in crypto may look less like “AI-run DAOs” and more like AI-augmented, human-governed, market-checked systems—with Ethereum as the neutral, verifiable backbone connecting them all.