Last month, a Chief Risk Officer at a major bank asked me a question that should terrify every financial institution deploying AI agents:
“How many AI agents are running in production across our organization right now?”
Nobody in the room could answer.
Not approximately. Not within an order of magnitude. The honest answer was: we don’t know.
This bank had agents running on AWS Bedrock for customer service. Azure AI for document processing. Internal teams had deployed LangChain agents on Kubernetes. A vendor had embedded agents in their SaaS product. The data science team was experimenting with CrewAI.
Each deployment worked. Each was approved through its own process. Each had its own monitoring. None talked to the others.
Welcome to the fragmentation tax.
The Multi-Everything Reality
Let’s be honest about where we are. Enterprise AI in 2026 is not single-cloud, single-framework, single-vendor. It’s multi-everything:
- AWS: Bedrock Agents, SageMaker
- Azure: AI Agent Service, OpenAI Service
- GCP: Vertex AI Agents, Gemini
- Open Source: LangChain/LangGraph, CrewAI/AutoGen
- Vendor Embedded: SaaS Copilots, Platform Agents
- Internal: Custom Agents, Research POCs
This isn’t poor planning. It’s rational behavior.
Business units choose tools that solve their problems. AWS shop? Bedrock is the path of least resistance. Microsoft ecosystem? Azure AI integrates seamlessly. Data science team comfortable with Python? LangChain it is.
Each decision makes local sense. The aggregate makes governance nearly impossible.
The Five Costs of Fragmentation
The fragmentation tax isn’t one cost. It’s five, compounding:
1. Visibility Cost
You can’t govern what you can’t see. When agents span multiple platforms:
- No unified inventory. How many agents? Doing what? Who owns them?
- No cross-platform metrics. Each system has its own dashboards, its own definitions of “success.”
- No aggregate risk view. Risk in one system might be acceptable. Risk across all systems? Unknown.
I’ve seen banks spend months just trying to catalog their AI deployments. The catalog is outdated before it’s finished.
2. Compliance Cost
Regulators don’t care about your multi-cloud strategy. They care about outcomes:
- Can you explain how a decision was made?
- Can you demonstrate the agent was tested?
- Can you prove it’s being monitored?
- Can you show the audit trail?
When each platform has its own logging format, its own retention policy, its own access controls - answering these questions becomes a manual, error-prone process.
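To make the format problem concrete, here’s a minimal sketch of what normalizing two platforms’ audit logs into one trail might look like. Everything in it is an assumption for illustration: the `AuditEvent` schema, the two "platforms", and their field names (`agentId`, `ts`, `event_type`, and so on) are invented and don’t reflect the real log format of any cloud provider.

```python
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class AuditEvent:
    """Common audit record. The schema is an illustrative assumption."""
    platform: str
    agent_id: str
    action: str
    timestamp: datetime  # always timezone-aware, in UTC


def from_platform_a(raw: dict) -> AuditEvent:
    # Hypothetical platform A: epoch milliseconds under "ts".
    return AuditEvent(
        platform="platform_a",
        agent_id=raw["agentId"],
        action=raw["operation"],
        timestamp=datetime.fromtimestamp(raw["ts"] / 1000, tz=timezone.utc),
    )


def from_platform_b(raw: dict) -> AuditEvent:
    # Hypothetical platform B: ISO 8601 string with an explicit UTC offset.
    parsed = datetime.fromisoformat(raw["time"])
    return AuditEvent(
        platform="platform_b",
        agent_id=raw["agent"]["name"],
        action=raw["event_type"],
        timestamp=parsed.astimezone(timezone.utc),
    )


def merged_trail(events: list[AuditEvent]) -> list[AuditEvent]:
    """Return a single chronological audit trail across platforms."""
    return sorted(events, key=lambda e: e.timestamp)
```

Two adapters are easy. The cost shows up when there are six of them, each tracking a format that changes on someone else’s release schedule.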
The EU AI Act imposes transparency and documentation obligations on high-risk AI systems. U.S. model risk management guidance (the Federal Reserve’s SR 11-7, adopted by the OCC as Bulletin 2011-12) expects a comprehensive model inventory and independent validation. Neither was written for a world where “models” are actually autonomous agents spread across six platforms.
3. Security Cost
Each agent deployment is an attack surface:
- Prompt injection - Can external inputs manipulate agent behavior?
- Data exfiltration - What can agents access? What can they leak?
- Privilege escalation - Can agents acquire capabilities beyond their design?
- Supply chain - What about the models and frameworks underneath?
Each platform has security controls. But security gaps live at the seams - where Platform A hands off to Platform B, where internal agents call external APIs, where vendor agents access internal data.
Fragmentation multiplies seams. Seams multiply risk.
4. Operational Cost
Running agents requires operational capability:
- Monitoring for failures, anomalies, drift
- Incident response when things go wrong
- Capacity planning and cost management
- Version control and rollback
Each platform requires its own operational expertise. That’s six different monitoring systems, six incident runbooks, six cost dashboards, six deployment pipelines.
Most organizations don’t have one mature AI operations practice. They certainly don’t have six.
5. Strategic Cost
This is the cost nobody talks about: lock-in by default.
When your customer service agents run on Bedrock, your document agents on Azure, and your analytics agents on Vertex - you haven’t avoided lock-in. You’ve achieved maximum lock-in. You’re locked into everyone.
Switching costs compound. Integration debt accumulates. Each platform becomes load-bearing.
Three years from now, when a better option emerges - or when a vendor changes pricing, or when a regulator mandates change - you won’t be able to move. You’ll be paying the fragmentation tax forever.
Why CSP-Native Solutions Don’t Solve This
Every cloud service provider (CSP) now offers agent governance capabilities. AWS has Bedrock Guardrails. Azure has AI Content Safety. GCP has Vertex AI’s responsible AI toolkit.
These are useful. They’re also insufficient. Here’s why:
They Only See Their Own Platform
AWS guardrails don’t monitor your Azure agents. Azure content safety doesn’t see your LangChain deployments. Each vendor’s solution creates another silo.
Their Incentives Are Misaligned
Let’s be direct: Cloud providers make money when you use their services. Their governance tools are designed to make you comfortable using more of their platform, not to help you govern across platforms or migrate away.
This isn’t malicious. It’s just business. But it means CSP governance tools will never optimize for your ability to leave - which is exactly what genuine governance requires.
They Don’t Address the Architectural Gap
The real governance challenge isn’t within platforms. It’s between them:
- When a Bedrock agent calls an Azure API, who’s monitoring that interaction?
- When an internal agent uses a vendor’s embedded copilot, where’s the audit trail?
- When policies differ between platforms, which one applies?
CSP tools are a control plane for their own data plane. You need a control plane for all your data planes.
The Watchtower Imperative
What banks actually need is something different: a unified agent control plane that sits above individual platforms.
We call this the Watchtower architecture. Not because it’s clever branding, but because it describes the function: a high vantage point with visibility across the entire landscape.
The Watchtower doesn’t replace platform-specific tools. It orchestrates them. Think of it as the governance layer that CSPs can’t build because they have the wrong incentives, and most enterprises won’t build because they don’t realize they need it - until they do.
What a Watchtower Actually Does
A proper agent governance layer provides four core capabilities:
1. Universal Observability
Every agent, every platform, one view:
- Agent registry. What agents exist? Where do they run? Who owns them?
- Behavioral telemetry. What are agents doing? What decisions are they making?
- Cross-platform tracing. When Agent A calls Agent B on a different platform, follow the thread.
- Anomaly detection. Not just logging - active monitoring for drift, sandbagging (deliberate underperformance), unexpected behavior.
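The registry is the piece that answers the CRO’s question, so here’s a rough sketch of the minimum it would need. The class names, fields, and platform labels are assumptions for illustration, not a reference design; a real registry would sit on a shared datastore and be fed by platform discovery rather than manual registration.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Optional


@dataclass
class AgentRecord:
    agent_id: str
    platform: str          # e.g. "bedrock", "azure", "kubernetes"
    purpose: str
    owner: Optional[str]   # None means nobody has claimed the agent


class AgentRegistry:
    """In-memory inventory of agents across platforms (illustrative only)."""

    def __init__(self) -> None:
        self._agents: dict[str, AgentRecord] = {}

    def register(self, record: AgentRecord) -> None:
        # Registering an existing ID overwrites the earlier record.
        self._agents[record.agent_id] = record

    def count(self) -> int:
        # "How many agents are running right now?"
        return len(self._agents)

    def by_platform(self) -> Counter:
        # Agents per platform, for the cross-platform view.
        return Counter(a.platform for a in self._agents.values())

    def unowned(self) -> list[AgentRecord]:
        # Agents with no accountable owner: the first governance gap to close.
        return [a for a in self._agents.values() if a.owner is None]
```

Even this toy version can answer three questions the room couldn’t: how many agents, where they run, and which ones nobody owns.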
2. Policy Enforcement
Consistent rules regardless of platform:
- Behavioral boundaries. What can agents do? What’s forbidden?
- Data access controls. What data can agents see? What can they emit?
- Escalation rules. When must agents defer to humans?
- Policy inheritance. Enterprise policies cascade to business units, to individual agents.
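Policy inheritance is easier to reason about with a concrete rule. One possible rule, sketched below, is that lower levels may only tighten what they inherit: allowed action sets intersect, and escalation thresholds take the minimum. The `Policy` fields, the `resolve` and `decide` helpers, and the deny-by-default behavior are all assumptions made for this example.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Policy:
    # None means "no opinion at this level; inherit from the parent".
    allowed_actions: Optional[frozenset] = None
    escalate_above: Optional[float] = None  # amount that needs human sign-off


def resolve(*levels: Policy) -> Policy:
    """Merge policies from broadest (enterprise) to narrowest (agent).

    Lower levels can only tighten: action sets intersect and
    escalation thresholds take the minimum.
    """
    actions = None
    threshold = None
    for level in levels:
        if level.allowed_actions is not None:
            actions = (level.allowed_actions if actions is None
                       else actions & level.allowed_actions)
        if level.escalate_above is not None:
            threshold = (level.escalate_above if threshold is None
                         else min(threshold, level.escalate_above))
    return Policy(actions, threshold)


def decide(policy: Policy, action: str, amount: float) -> str:
    # Deny by default: an action must be explicitly allowed at some level.
    if policy.allowed_actions is None or action not in policy.allowed_actions:
        return "deny"
    if policy.escalate_above is not None and amount > policy.escalate_above:
        return "escalate"
    return "allow"
```

With an enterprise policy allowing `read`, `refund`, and `transfer`, a business unit that drops `transfer`, and an agent capped at 500, the resolved policy allows small refunds, escalates large ones, and denies transfers outright.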
3. Trust Scoring
Not all agents deserve equal autonomy:
- Continuous evaluation. Agents earn trust through consistent, correct behavior.
- Dynamic permissions. High-trust agents get more autonomy. Low-trust agents get tighter bounds.
- Regression detection. When an agent starts behaving differently, trust adjusts automatically.
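There are many ways to compute a trust score. As one illustration, the sketch below uses an exponential moving average of pass/fail outcomes, so recent behavior counts more than old behavior and a run of failures pulls the score down quickly. The starting score, the smoothing weight `alpha`, the tier names, and the 0.8 and 0.5 thresholds are arbitrary assumptions, not recommended values.

```python
from dataclasses import dataclass


@dataclass
class TrustScore:
    score: float = 0.5   # new agents start in the middle
    alpha: float = 0.1   # weight given to the newest observation

    def record(self, success: bool) -> float:
        """Fold one evaluated outcome into the running score."""
        outcome = 1.0 if success else 0.0
        self.score = (1 - self.alpha) * self.score + self.alpha * outcome
        return self.score

    def tier(self) -> str:
        """Map the score to an autonomy tier (thresholds are illustrative)."""
        if self.score >= 0.8:
            return "autonomous"
        if self.score >= 0.5:
            return "supervised"
        return "restricted"
```

Under these assumed parameters, an agent needs a sustained run of correct outcomes to reach the top tier, and a comparable run of failures drops it back to restricted without anyone filing a ticket.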
4. Runtime Control
Governance isn’t just observation. It’s intervention:
- Kill switches. Immediately halt a misbehaving agent, anywhere.
- Behavior steering. Adjust agent behavior without redeployment.
- Gradual rollout. New agents start constrained, expand as they prove reliable.
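At its simplest, a kill switch is a flag that every agent action has to check before it runs. The sketch below shows that pattern in a single process; `KillSwitch`, `AgentHalted`, and `guarded` are hypothetical names. A cross-platform version would presumably need a shared store and an enforcement hook on each platform, which is where the real work lies.

```python
import threading


class AgentHalted(Exception):
    """Raised when a halted agent attempts an action."""


class KillSwitch:
    """Thread-safe set of halted agent IDs (single-process illustration)."""

    def __init__(self) -> None:
        self._halted: set[str] = set()
        self._lock = threading.Lock()

    def halt(self, agent_id: str) -> None:
        with self._lock:
            self._halted.add(agent_id)

    def resume(self, agent_id: str) -> None:
        with self._lock:
            self._halted.discard(agent_id)

    def check(self, agent_id: str) -> None:
        # Raise before the action runs rather than after the damage is done.
        with self._lock:
            if agent_id in self._halted:
                raise AgentHalted(agent_id)


def guarded(switch: KillSwitch, agent_id: str, action):
    """Run `action` only if the agent has not been halted."""
    switch.check(agent_id)
    return action()
```

The design choice that matters is that the check sits in the call path. A kill switch that only updates a dashboard is observation, not intervention.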
The Tension Triangle
Here’s where it gets hard. Building a Watchtower forces you to confront a fundamental tension:
Agility: Business units want to deploy agents fast. They want autonomy. They don’t want to wait for committee approval.
Governance: Risk and compliance need oversight. They need to prove control. They need audit trails.
Cost: Finance needs efficiency. They can’t justify three parallel governance teams for three cloud platforms.
Most organizations optimize for one corner and suffer on the other two:
- Optimize for agility → shadow AI everywhere, compliance scrambling to catch up
- Optimize for governance → innovation bottleneck, business units route around controls
- Optimize for cost → understaffed, incidents waiting to happen
The Watchtower’s job is to find the balance point - and it’s different for every organization, every use case, every risk appetite.
Why Banks Need to Act Now
If you’re in financial services, you’re facing a specific set of pressures:
Regulatory scrutiny is intensifying. The OCC, Fed, and FDIC are asking harder questions about AI governance. The EU AI Act creates explicit obligations for high-risk AI systems. Regulators aren’t waiting for you to figure out multi-cloud agent governance. They expect you to have figured it out.
Agent proliferation is accelerating. Every vendor is embedding agents. Every business unit wants them. The number of agents in your organization is growing faster than your ability to govern them.
The window is closing. Right now, you might have 20 agents. In two years, you could have 200. In five years, 2,000. Building governance infrastructure when you have 20 agents is hard. Building it when you have 2,000 is nearly impossible.
What Comes Next
This is Part 1 of a five-part series on building agent governance for financial services. In the coming posts, we’ll cover:
- Part 2: Anatomy of an Agent Control Plane - The technical architecture. What to build, what to buy, what to avoid.
- Part 3: The Autonomy Spectrum - How to balance business unit freedom with enterprise governance. (Hint: it’s not a binary.)
- Part 4: Economics of Agent Operations - Cost cascading, trust-based routing, and ROI-driven governance.
- Part 5: Reference Architecture - A concrete, implementable design. Diagrams, integration patterns, decision framework.
The fragmentation tax is real, and it’s compounding daily. The question isn’t whether you’ll pay it - you already are. The question is whether you’ll keep paying it, or start building the infrastructure to escape it.
The Watchtower won’t build itself. And multi-cloud agent sprawl won’t wait for you to catch up.