We’ve covered the why (Part 1), the technical pillars (Part 2), the governance model (Part 3), and the economics (Part 4). Now let’s make it concrete.
This post provides a reference architecture for enterprise agent governance. Not concepts - specifications. The goal: something you can actually build.
Architecture Overview
The Agent Watchtower consists of five core layers:
- Platform Adapters - Connect to AWS, Azure, GCP, OSS frameworks
- Core Services - Registry, policy engine, trust scoring
- Observability Pipeline - Telemetry collection, processing, storage
- Control Plane - Runtime enforcement, intervention capabilities
- Interface Layer - APIs, dashboards, integrations
Each layer is independent and can be implemented incrementally.
Layer 1: Platform Adapters
Adapters translate between platform-specific APIs and the unified control plane.
Adapter Responsibilities
Each adapter must implement the following (a minimal interface sketch appears after this list):
- Discovery: Find agents on the platform
- Registration: Sync agent metadata to registry
- Telemetry: Collect and forward observability data
- Policy: Translate and apply policies
- Control: Execute runtime interventions
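To make that contract concrete, here is a minimal sketch of a platform adapter interface in Python. The class and method names are illustrative, not part of any vendor SDK; each concrete adapter (Bedrock, Azure AI, OSS) would implement these five responsibilities.

```python
from abc import ABC, abstractmethod
from typing import Iterable


class PlatformAdapter(ABC):
    """Hypothetical base contract that each platform adapter would implement."""

    @abstractmethod
    def discover_agents(self) -> Iterable[dict]:
        """Find agents on the platform and return their raw metadata."""

    @abstractmethod
    def register(self, agent_metadata: dict) -> str:
        """Sync agent metadata to the central registry; return the registry ID."""

    @abstractmethod
    def collect_telemetry(self) -> Iterable[dict]:
        """Collect platform telemetry for the observability pipeline."""

    @abstractmethod
    def apply_policy(self, agent_id: str, policy: dict) -> None:
        """Translate a Watchtower policy into platform-native controls."""

    @abstractmethod
    def intervene(self, agent_id: str, action: str) -> None:
        """Execute a runtime intervention (e.g. suspend, throttle, kill)."""
```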
AWS Bedrock Adapter
Discovery:
- List Bedrock agents via AWS SDK
- Poll for changes (or use EventBridge)
- Extract agent configuration, guardrails
Telemetry:
- Enable Bedrock tracing
- Forward to observability pipeline
- Parse Bedrock-specific trace format
Policy:
- Map Watchtower policies to Bedrock guardrails
- Configure content filters, denied topics
- Set up CloudWatch alarms
Control:
- Invoke UpdateAgent for config changes
- Use CloudWatch for alerts
- Lambda for kill switch execution
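As a hedged illustration of the discovery step, the boto3 bedrock-agent client exposes a list_agents operation that can seed the registry. Field names and pagination handling are simplified here and should be verified against your SDK version.

```python
import boto3


def discover_bedrock_agents(region: str = "us-east-1"):
    """Sketch: enumerate Bedrock agents so they can be synced into the registry.

    Assumes standard AWS credentials are configured; response field names
    should be checked against the installed boto3 version.
    """
    client = boto3.client("bedrock-agent", region_name=region)
    next_token = None
    while True:
        kwargs = {"nextToken": next_token} if next_token else {}
        page = client.list_agents(**kwargs)
        for summary in page.get("agentSummaries", []):
            yield {
                "external_id": summary["agentId"],
                "platform": "AWS",
                "name": summary.get("agentName"),
                "status": summary.get("agentStatus"),
            }
        next_token = page.get("nextToken")
        if not next_token:
            break
```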
Azure AI Adapter
Discovery:
- List AI deployments via Azure SDK
- Monitor via Azure Resource Graph
- Extract deployment configuration
Telemetry:
- Enable Azure AI tracing
- Forward via Event Hubs
- Parse Azure-specific format
Policy:
- Map to Azure AI Content Safety
- Configure responsible AI settings
- Integrate with Azure Policy
Control:
- Azure SDK for deployment updates
- Azure Monitor for alerts
- Azure Functions for interventions
Open Source Adapter (LangChain/LangGraph)
Discovery:
- Service mesh integration (Kubernetes)
- Process registration on startup
- Configuration from environment
Telemetry:
- LangChain callbacks/LangSmith
- OpenTelemetry instrumentation
- Custom middleware for traces
Policy:
- SDK-level policy enforcement
- Proxy for pre/post processing
- Custom guardrail implementations
Control:
- Kubernetes for deployments
- Feature flags for behavior
- Service mesh for traffic control
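For the open source path, a minimal telemetry hook can sit on LangChain's callback interface. The forward_event function below is a placeholder for the adapter's pipeline producer, and the event fields loosely mirror the telemetry schema in Layer 3.

```python
import time

from langchain_core.callbacks import BaseCallbackHandler


def forward_event(event: dict) -> None:
    """Placeholder: send the event to the Watchtower observability pipeline."""
    print(event)  # replace with a Kafka/Kinesis producer in practice


class WatchtowerCallback(BaseCallbackHandler):
    """Sketch of a LangChain callback that emits AgentEvent-style telemetry."""

    def __init__(self, agent_id: str):
        self.agent_id = agent_id
        self._started_at = None

    def on_llm_start(self, serialized, prompts, **kwargs):
        self._started_at = time.time()

    def on_llm_end(self, response, **kwargs):
        latency_ms = (
            int((time.time() - self._started_at) * 1000) if self._started_at else None
        )
        forward_event({
            "agent_id": self.agent_id,
            "event_type": "Response",
            "latency_ms": latency_ms,
        })

    def on_tool_start(self, serialized, input_str, **kwargs):
        forward_event({
            "agent_id": self.agent_id,
            "event_type": "ToolCall",
            "tool_name": serialized.get("name"),
        })
```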
Layer 2: Core Services
Agent Registry Service
The single source of truth for all agents.
Data Model:
Agent {
id: UUID (unique across platforms)
external_id: String (platform-specific ID)
platform: Enum (AWS, Azure, GCP, OSS, Internal)
name: String
version: String
description: String
owner_team: String
owner_bu: String
contacts: [Contact]
risk_tier: Enum (Critical, High, Medium, Low)
data_classification: Enum
regulatory_scope: [String]
capabilities: [Capability]
tools: [Tool]
data_sources: [DataSource]
status: Enum (Active, Suspended, Deprecated)
autonomy_level: Enum (L1-L5)
created_at: Timestamp
updated_at: Timestamp
last_active: Timestamp
}
API Operations:
- POST /agents - Register new agent
- GET /agents/{id} - Get agent details
- PUT /agents/{id} - Update agent
- DELETE /agents/{id} - Deregister agent
- GET /agents?filters - Search/list agents
- POST /agents/{id}/suspend - Suspend agent
- POST /agents/{id}/activate - Activate agent
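For example, registering a discovered agent against this API might look like the following sketch. The hostname, token handling, and payload values are illustrative.

```python
import requests

WATCHTOWER_URL = "https://watchtower.example.internal"  # illustrative hostname


def register_agent(token: str) -> str:
    """Sketch: register a newly discovered agent with the registry service."""
    payload = {
        "external_id": "bedrock-agent-123",
        "platform": "AWS",
        "name": "claims-triage-agent",
        "owner_team": "claims-automation",
        "risk_tier": "High",
        "autonomy_level": "L2",
    }
    resp = requests.post(
        f"{WATCHTOWER_URL}/agents",
        json=payload,
        headers={"Authorization": f"Bearer {token}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()["id"]  # the registry-assigned UUID
```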
Policy Engine Service
Evaluates policies and returns decisions.
Policy Structure:
Policy {
id: UUID
name: String
scope: Enum (Enterprise, Domain, Agent)
scope_target: String (domain name or agent ID)
rules: [Rule]
priority: Integer
enabled: Boolean
created_by: String
created_at: Timestamp
updated_at: Timestamp
}
Rule {
condition: Expression
action: Enum (Allow, Deny, Escalate, Modify)
parameters: Map
}
Evaluation Flow:
- Collect applicable policies (enterprise + domain + agent)
- Order by priority
- Evaluate conditions against context
- Return the first matching rule's action; if no rule matches, deny by default (see the sketch below)
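Here is a minimal sketch of that evaluation loop, assuming rule conditions are callables over a request context. A production engine would more likely use a declarative expression language (OPA/Rego or CEL) than Python callables.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Rule:
    condition: Callable[[dict], bool]  # expression evaluated against the request context
    action: str                        # Allow | Deny | Escalate | Modify
    parameters: dict = field(default_factory=dict)


@dataclass
class Policy:
    name: str
    priority: int
    rules: list[Rule]
    enabled: bool = True


def evaluate(policies: list[Policy], context: dict) -> str:
    """Return the first matching action across applicable policies; deny by default."""
    # Assumption: lower priority numbers are evaluated first (the spec only says "order by priority").
    applicable = sorted((p for p in policies if p.enabled), key=lambda p: p.priority)
    for policy in applicable:
        for rule in policy.rules:
            if rule.condition(context):
                return rule.action
    return "Deny"  # deny-by-default when nothing matches
```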
API Operations:
- POST /policies - Create policy
- GET /policies/{id} - Get policy
- PUT /policies/{id} - Update policy
- DELETE /policies/{id} - Delete policy
- POST /evaluate - Evaluate request against policies
Trust Scoring Service
Calculates and maintains trust scores for agents and teams.
Trust Score Components:
- Behavioral score: Based on observed behavior (hallucination rate, policy compliance, escalation patterns)
- Performance score: Reliability, latency, error rates
- Compliance score: Audit findings, violation history
- Maturity score: Team certifications, operational capability
Scoring Algorithm:
trust_score = (
w1 * behavioral_score +
w2 * performance_score +
w3 * compliance_score +
w4 * maturity_score
) * decay_factor(time_since_last_incident)
Weights (default): w1=0.35, w2=0.25, w3=0.25, w4=0.15
Decay: 0.95^(weeks_since_incident)
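Translated directly into code, with the default weights and decay factor from above (component scores are assumed to be normalized to the 0-1 range):

```python
DEFAULT_WEIGHTS = {
    "behavioral": 0.35,
    "performance": 0.25,
    "compliance": 0.25,
    "maturity": 0.15,
}


def trust_score(
    behavioral: float,
    performance: float,
    compliance: float,
    maturity: float,
    weeks_since_incident: float,
    weights: dict = DEFAULT_WEIGHTS,
) -> float:
    """Weighted trust score; all component scores assumed normalized to [0, 1]."""
    base = (
        weights["behavioral"] * behavioral
        + weights["performance"] * performance
        + weights["compliance"] * compliance
        + weights["maturity"] * maturity
    )
    decay = 0.95 ** weeks_since_incident  # decay factor as specified above
    return base * decay
```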
API Operations:
- GET /trust/{agent_id} - Get agent trust score
- GET /trust/team/{team_id} - Get team trust score
- POST /trust/{agent_id}/incident - Record incident (lowers score)
- GET /trust/{agent_id}/history - Get score history
Layer 3: Observability Pipeline
Data Flow
- Collection: Adapters collect platform telemetry
- Ingestion: Kafka/Kinesis for high-throughput ingestion
- Processing: Stream processing for real-time analytics
- Storage: Time-series DB + object storage + search index
- Analysis: ML models for anomaly detection
Telemetry Schema
AgentEvent {
event_id: UUID
agent_id: UUID
timestamp: Timestamp
event_type: Enum (Request, Response, ToolCall, Error, Escalation)
request: {
input: String (masked)
input_tokens: Integer
metadata: Map
}
response: {
output: String (masked)
output_tokens: Integer
latency_ms: Integer
confidence: Float
}
tool_calls: [{
tool_name: String
parameters: Map (masked)
result: String (masked)
success: Boolean
}]
policy_evaluation: {
policies_evaluated: [String]
decision: Enum
escalated: Boolean
}
cost: {
inference_cost: Decimal
tool_cost: Decimal
total_cost: Decimal
}
}
PII Masking
All telemetry passes through PII detection and masking before storage (a simplified sketch follows this list):
- Named entity recognition for names, addresses
- Pattern matching for SSN, credit cards, etc.
- Configurable masking (hash, redact, tokenize)
- Reversible tokenization for authorized access
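A simplified sketch of the pattern-matching portion is below. The regexes are illustrative, US-centric examples; named entity recognition and reversible tokenization would layer on top of this.

```python
import hashlib
import re

# Simplified, US-centric patterns for illustration only.
PATTERNS = {
    "ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "credit_card": re.compile(r"\b(?:\d[ -]?){13,16}\b"),
    "email": re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.]+\b"),
}


def mask_pii(text: str, mode: str = "redact") -> str:
    """Mask matched patterns before telemetry is stored.

    mode: 'redact' replaces with a placeholder; 'hash' substitutes a stable
    digest that supports correlation without exposing the raw value.
    """
    def replace(match: re.Match, label: str) -> str:
        if mode == "hash":
            digest = hashlib.sha256(match.group().encode()).hexdigest()[:10]
            return f"<{label}:{digest}>"
        return f"<{label}:REDACTED>"

    for label, pattern in PATTERNS.items():
        text = pattern.sub(lambda m, l=label: replace(m, l), text)
    return text
```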
Anomaly Detection
ML models running on the telemetry stream (a simple drift-detection sketch follows this list):
- Behavioral drift: Agent responses changing over time
- Sandbagging detection: Agent performing differently under observation
- Topic clustering: Detecting out-of-scope conversations
- Confidence calibration: Are confidence scores predictive?
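As one simple illustration of behavioral drift detection, a rolling z-score over a per-agent metric (latency, confidence, cost per request) can flag deviations for review. Production systems would use richer models; the window size and threshold below are arbitrary.

```python
from collections import deque
from statistics import mean, pstdev


class DriftDetector:
    """Sketch: flag values that deviate sharply from an agent's recent baseline."""

    def __init__(self, window: int = 200, threshold: float = 3.0):
        self.history = deque(maxlen=window)
        self.threshold = threshold

    def observe(self, value: float) -> bool:
        """Return True if the new observation looks anomalous versus the window."""
        anomalous = False
        if len(self.history) >= 30:  # require a minimal baseline first
            mu, sigma = mean(self.history), pstdev(self.history)
            anomalous = sigma > 0 and abs(value - mu) / sigma > self.threshold
        self.history.append(value)
        return anomalous
```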
Layer 4: Control Plane
Runtime Enforcement
Pre-invocation checks:
- Policy evaluation (should this request proceed?)
- Rate limiting (quota check)
- Circuit breaker (is agent healthy?)
During execution:
- Tool call interception (are tools allowed?)
- Data access monitoring (what’s being accessed?)
- Timeout enforcement
Post-execution checks:
- Output validation (policy compliance)
- PII scan (no leakage)
- Confidence threshold (escalation needed?)
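Putting the checks together, here is a sketch of how the control plane might wrap an agent invocation. The injected callables and the confidence threshold are illustrative; rate limiting and circuit breaking are omitted for brevity.

```python
from typing import Callable


class PolicyViolation(Exception):
    """Raised when a request is blocked before the agent is invoked."""


def governed_invoke(
    agent_call: Callable[[dict], dict],
    context: dict,
    evaluate_policies: Callable[[dict], str],  # e.g. the policy engine sketch above
    mask: Callable[[str], str],                # e.g. the PII masking sketch above
    confidence_floor: float = 0.7,             # illustrative threshold
) -> dict:
    """Sketch: wrap an agent invocation with pre- and post-execution checks."""
    # Pre-invocation: policy check
    decision = evaluate_policies(context)
    if decision == "Deny":
        raise PolicyViolation("request blocked by policy")
    if decision == "Escalate":
        return {"status": "escalated", "reason": "policy"}

    # Execution: the platform adapter performs the actual call
    result = agent_call(context)

    # Post-execution: PII scan and confidence threshold
    result["output"] = mask(result.get("output", ""))
    if result.get("confidence", 1.0) < confidence_floor:
        return {"status": "escalated", "reason": "low_confidence", "result": result}
    return result
```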
Intervention Capabilities
Kill Switch:
- Immediately halt agent
- Options: single agent, agent type, all agents in domain
- Configurable: hard stop vs. graceful drain
Behavior Modification:
- Update confidence thresholds
- Enable/disable specific tools
- Adjust escalation rules
- Modify prompt templates
Traffic Control:
- Route to different agent versions
- Canary deployments
- A/B testing
- Gradual rollout
Emergency Response
Automated response (configurable):
| Trigger | Response |
|---|---|
| Trust score below threshold | Increase human oversight |
| Anomaly score spike | Alert on-call, reduce autonomy |
| Policy violation | Suspend agent, notify owner |
| Confidence consistently low | Escalate all requests |
Manual response:
- SOC dashboard for real-time status
- One-click kill switch per agent/domain
- Incident workflow integration
Layer 5: Interface Layer
REST API
All services expose REST APIs with:
- OpenAPI specifications
- JWT authentication
- RBAC authorization
- Rate limiting
- Audit logging
Event Streaming
Kafka/EventBridge topics for:
- Agent registration events
- Policy changes
- Trust score updates
- Anomaly alerts
- Incident notifications
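A sketch of publishing one of these events with confluent-kafka; the broker address and topic name are illustrative.

```python
import json

from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "kafka.example.internal:9092"})  # illustrative broker


def publish_registration(agent: dict) -> None:
    """Sketch: emit an agent registration event for downstream consumers."""
    producer.produce(
        "watchtower.agent-registrations",  # illustrative topic name
        key=agent["id"],
        value=json.dumps(agent).encode("utf-8"),
    )
    producer.flush()
```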
Dashboard
Executive view:
- Agent inventory summary
- Risk distribution
- Cost trends
- Incident summary
Operations view:
- Real-time agent status
- Performance metrics
- Anomaly alerts
- Intervention controls
Governance view:
- Policy compliance
- Audit trail
- Trust score trends
- Autonomy level distribution
Integrations
- SIEM: Forward security events
- ITSM: Incident creation
- IAM: Authorization sync
- CI/CD: Deployment gates
- Cost Management: Chargeback data
Implementation Roadmap
Phase 1: Foundation (Weeks 1-4)
Week 1-2:
- Deploy registry service
- Implement manual agent registration
- Basic API and authentication
Week 3-4:
- Deploy first adapter (start with most common platform)
- Automated agent discovery
- Basic telemetry collection
Deliverable: Registry of all agents with manual classification
Phase 2: Observability (Weeks 5-8)
Week 5-6:
- Deploy observability pipeline
- Telemetry storage and search
- Basic dashboards
Week 7-8:
- PII masking
- Cost attribution
- Performance metrics
Deliverable: Visibility into agent behavior and costs
Phase 3: Policy (Weeks 9-12)
Week 9-10:
- Deploy policy engine
- Define enterprise policies
- Basic policy evaluation
Week 11-12:
- Domain-specific policies
- Pre-invocation enforcement
- Policy violation alerts
Deliverable: Policy enforcement for high-risk scenarios
Phase 4: Control (Weeks 13-16)
Week 13-14:
- Runtime control plane
- Kill switch implementation
- Manual interventions
Week 15-16:
- Automated responses
- Trust scoring service
- Autonomy level enforcement
Deliverable: Full runtime control capability
Phase 5: Optimization (Weeks 17-20)
Week 17-18:
- Anomaly detection models
- Trust Cascade implementation
- Cost optimization routing
Week 19-20:
- Advanced analytics
- Self-service onboarding
- Full integration suite
Deliverable: Production-ready agent governance platform
Key Decisions
Decisions you’ll need to make during implementation:
Build vs. Buy:
- Control plane core: Build (competitive differentiation)
- Observability storage: Buy (commodity infrastructure)
- Policy engine: Consider OPA (open source, mature)
- Adapters: Build (platform-specific)
Deployment:
- Where does control plane run? (Own cloud, multi-cloud, vendor-hosted)
- Latency requirements? (Real-time vs. near-real-time)
- Data residency requirements? (Regional deployment)
Organizational:
- Who owns the platform? (AI CoE, Platform team, Security)
- Who defines policies? (Federated model recommended)
- Who operates it? (SRE, dedicated team)
The Bottom Line
This reference architecture provides a blueprint, not a prescription. Your implementation will differ based on:
- Which platforms you use
- Your existing infrastructure
- Your risk appetite
- Your team capabilities
The principles remain constant:
- Know what agents exist (Registry)
- See what they do (Observability)
- Define what they should do (Policy)
- Enforce it (Control)
Start small. Build incrementally. Optimize continuously.
The Watchtower isn’t a destination. It’s a capability that grows with your agent deployment. Build the foundation now, and you’ll be ready for whatever comes next.
This concludes The Agent Watchtower series. For implementation support, contact the Rotascale team.