April 18, 2026

Multi-Agent AI Systems in Google Cloud


In a previous blog, we broke down how our internal developer platform, V-Collab, used the Model Context Protocol (MCP) to decouple our LLMs from vendor-specific schemas. But the moment you transition from a single chat utility to a fully autonomous Multi-Agent System, the risk surface explodes.

When an AI orchestrator is given the ability to delegate tasks to specialized sub-agents—each with distinct permissions to act on your real microservices—network exfiltration and credential theft become paramount concerns. You cannot deploy an AI agent that spins up Developer Workspaces or evaluates code if that agent runs on a generic public endpoint.

The Cognitive Blueprint: V-Collab Multi-Agent Topology

A Well-Architected AI platform separates cognitive reasoning (planning) from execution (doing). In V-Collab, when an engineer asks to "Review the latest failed build and tune the infrastructure right-sizing," the request traverses a highly specialized hierarchy:

  1. A Frontend Application securely captures the prompt and user JWTs.
  2. It passes execution to the Root Coordinator (running on Vertex AI Reasoning Engine).
  3. The Coordinator decomposes the problem and spawns specialized Subagents:
    • A "Code Inspector" subagent (Task A)
    • An "Infra Tuning" subagent (Task B)
  4. These sub-agents do not talk directly to the platform database. Instead, they hit dedicated MCP Servers acting as API translation gateways to V-Collab’s core microservices.

Here is what the architecture looks like from a network perimeter standpoint:

graph TD
    subgraph "VPC Service Controls Perimeter (Google Cloud)"
        Frontend[("V-Collab Frontend<br/>Cloud Run")] --> Coordinator[("Root Coordinator<br/>Vertex AI Reasoning Engine")]
        subgraph "Isolated Subagents (Cloud Run/GKE)"
            Coordinator --> SubA["Code Inspector<br/>Subagent"]
            Coordinator --> SubB["Infra Tuning<br/>Subagent"]
        end
        subgraph "Developer Platform Tooling"
            SubA --> MCP["MCP Server<br/>(Translation Layer)"]
            SubB --> MCP
            MCP --> MS1["V-Collab Logs MS"]
            MCP --> MS2["V-Collab Deploy MS"]
        end
        subgraph "Inference Guardrails"
            Coordinator -.->|Inference Request| MA[("Model Armor<br/>Data Sanitization")]
            SubA -.->|Inference Request| MA
            SubB -.->|Inference Request| MA
            MA --> Gemini["Gemini Enterprise<br/>Vertex AI Endpoint"]
        end
    end
    User((Platform Admin)) -->|Private Connection / IAP| Frontend
    OnPrem[("On-Prem Data Center")] <-->|HA VPN / Interconnect| NCC{"Network Connectivity Center<br/>(Transit Hub)"}
    NCC -.->|Private Route| Frontend

    classDef secure fill:#0f172a,stroke:#38bdf8,stroke-width:2px,color:#fff;
    classDef agent fill:#1e1b4b,stroke:#a78bfa,stroke-width:2px,color:#fff;
    classDef mcp fill:#3f2c00,stroke:#f59e0b,stroke-width:2px,color:#fff;
    classDef shield fill:#14532d,stroke:#4ade80,stroke-width:2px,color:#fff;
    classDef transit fill:#4c1d95,stroke:#c084fc,stroke-width:2px,color:#fff;

    class Frontend,MS1,MS2 secure;
    class Coordinator,SubA,SubB agent;
    class MCP mcp;
    class Gemini,MA shield;
    class NCC,OnPrem transit;

The Well-Architected Assessment

Pillar 1: Security, Privacy, and Compliance

The core of private agent networking revolves around establishing a defense-in-depth security perimeter. Relying solely on application-layer OAuth validation isn't enough; we enforce infrastructure-layer policies.

  • Direct VPC Egress: Our subagents on Cloud Run route all outbound traffic through the VPC via Direct VPC egress, and their ingress is set to INGRESS_TRAFFIC_INTERNAL_ONLY. They are barred at the network layer from receiving commands from the public internet.
  • Network Connectivity Center (Transit Hub): What if our V-Collab agents need to query legacy databases in an on-premises data center? By deploying Google Cloud Network Connectivity Center (NCC) as a transit hub, we bridge our VPC spoke to HA VPN or Cloud Interconnect spokes. This extends the private, non-routable AI perimeter down to our physical data centers securely.
  • Private Service Connect (PSC): To route traffic from the VPC to the managed Vertex AI endpoint (the Root Coordinator) without internet translation, we utilize a PSC Network Attachment.
  • Model Armor: Imagine a malicious actor uses prompt injection to force the Code Inspector agent to regurgitate internal database connection strings. Model Armor intercepts the payload before it reaches Gemini. Our configuration automatically blocks content flagged by Sensitive Data Protection (DLP), such as auth tokens or PII.
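The diagram labels the entire environment a "VPC Service Controls Perimeter," but that boundary also has to be declared explicitly. A minimal sketch of such a perimeter follows; the access policy ID, project number, and restricted-service list are placeholder assumptions, not values from our environment:

```hcl
# Hedged sketch: a VPC Service Controls perimeter around the AI project.
# POLICY_ID and PROJECT_NUMBER are placeholders to be filled in per org.
resource "google_access_context_manager_service_perimeter" "vcollab_ai_perimeter" {
  parent = "accessPolicies/POLICY_ID"
  name   = "accessPolicies/POLICY_ID/servicePerimeters/vcollab_ai_perimeter"
  title  = "vcollab-ai-perimeter"

  status {
    # Projects whose APIs are fenced inside the perimeter
    resources = ["projects/PROJECT_NUMBER"]
    # Block data movement for these services across the perimeter boundary
    restricted_services = [
      "aiplatform.googleapis.com",
      "run.googleapis.com",
    ]
  }
}
```

Restricting aiplatform.googleapis.com is what prevents a compromised credential from reading Vertex AI resources from outside the perimeter, complementing the network-layer controls below.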

Below is a Terraform representation of this secure architecture foundation:

# 1. Provide the secure network backbone
resource "google_compute_network" "vcollab_ai_vpc" {
  name                    = "vcollab-platform-ai-vpc"
  auto_create_subnetworks = false
}

resource "google_compute_subnetwork" "agent_subnet" {
  name          = "agent-workload-subnet"
  ip_cidr_range = "10.0.1.0/24"
  region        = "us-central1"
  network       = google_compute_network.vcollab_ai_vpc.id
}

# 2. Configure Direct VPC Egress for Subagents
resource "google_cloud_run_v2_service" "code_inspector_subagent" {
  name     = "vcollab-code-inspector"
  location = "us-central1"
  ingress  = "INGRESS_TRAFFIC_INTERNAL_ONLY"

  template {
    containers {
      image = "us-central1-docker.pkg.dev/.../code-inspector:latest"
    }
    vpc_access {
      network_interfaces {
        network    = google_compute_network.vcollab_ai_vpc.name
        subnetwork = google_compute_subnetwork.agent_subnet.name
      }
      # All outbound traffic is tethered to the VPC, preventing external beaconing
      egress = "ALL_TRAFFIC"
    }
  }
}

# 3. Explicit Data Sanitization via Model Armor
resource "google_model_armor_template" "strict_sanitization" {
  template_id = "vcollab-pii-sanitization"
  location = "us-central1"
  
  filter_config {
    rai_settings {
      rai_filters {
        filter_type      = "HATE_SPEECH"
        confidence_level = "HIGH"
      }
    }
    sdp_settings {
      basic_config {
        filter_enforcement = "ENABLED"
      }
    }
    pi_and_jailbreak_filter_settings {
      filter_enforcement = "ENABLED"
      confidence_level   = "MEDIUM_AND_ABOVE"
    }
  }
}

# 3.b Hybrid Connectivity (Network Connectivity Center Route)
resource "google_network_connectivity_hub" "vcollab_transit_hub" {
  name        = "vcollab-transit-hub"
  description = "Transit gateway hub for routing on-prem tools to Vertex Agents"
}

resource "google_network_connectivity_spoke" "vpc_spoke" {
  name     = "vcollab-vpc-spoke"
  location = "global"
  hub      = google_network_connectivity_hub.vcollab_transit_hub.id

  linked_vpc_network {
    uris = [google_compute_network.vcollab_ai_vpc.self_link]
  }
}

# 4. Create the Network Attachment for the Root Agent
resource "google_compute_network_attachment" "vertex_ai_psc" {
  name                  = "vcollab-vertex-psc-attachment"
  region                = "us-central1"
  connection_preference = "ACCEPT_AUTOMATIC"
  subnetworks           = [google_compute_subnetwork.agent_subnet.self_link]
}

# 5. Deploy the V-Collab Coordinator (Reasoning Engine)
# Note: Reasoning Engine (Agent Engine) deployments are often driven by the
# Vertex AI SDK; Terraform support and attribute names vary by provider version.
resource "google_vertex_ai_reasoning_engine" "root_coordinator" {
  location     = "us-central1"
  display_name = "vcollab-platform-coordinator"
  
  deployment_spec {
    psc_interface_config {
      network_attachment = google_compute_network_attachment.vertex_ai_psc.id
    }
  }
}
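The NCC hub above only has its VPC spoke attached; the on-prem side described in Pillar 1 needs its own spoke. As a hedged sketch, HA VPN tunnels can be linked to the same hub — the `google_compute_vpn_tunnel.onprem_tunnel` reference is an assumption standing in for tunnel resources defined elsewhere in the configuration:

```hcl
# Hedged sketch: attach the on-prem HA VPN tunnels to the transit hub so
# legacy systems can reach the agent VPC privately. Assumes the VPN tunnel
# resource "onprem_tunnel" is defined elsewhere.
resource "google_network_connectivity_spoke" "onprem_vpn_spoke" {
  name     = "vcollab-onprem-vpn-spoke"
  location = "us-central1" # VPN spokes are regional, unlike the global VPC spoke
  hub      = google_network_connectivity_hub.vcollab_transit_hub.id

  linked_vpn_tunnels {
    uris                       = [google_compute_vpn_tunnel.onprem_tunnel.self_link]
    site_to_site_data_transfer = true
  }
}
```

With both spokes registered, NCC handles the transitive routing between the data center and the agent subnet without any public hop.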

Pillar 2: Reliability

Monolithic AI orchestrators are inherently brittle. If a single complex API tool times out, the entire reasoning loop collapses. In V-Collab's Multi-Agent system, the Vertex AI Reasoning Engine acts purely as the Cognitive Scheduler.

Because execution happens entirely in isolated sub-agents on Cloud Run, we benefit from granular fault isolation. If the *Code Inspector* subagent exceeds its memory limit or fails on a bad API call to the Logging MS, only that Cloud Run instance goes down; the Vertex Coordinator catches the error cleanly and can reroute the prompt or notify the user without losing the overarching conversational state.
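This fault-isolation posture can be encoded directly in each subagent's Cloud Run definition. The sketch below is illustrative: the health-check path, resource limits, and scaling caps are assumptions, not values from our production service:

```hcl
# Hedged sketch: health probes and scaling caps so a misbehaving subagent
# recycles itself without affecting the coordinator. Probe path and limits
# are illustrative assumptions.
resource "google_cloud_run_v2_service" "infra_tuning_subagent" {
  name     = "vcollab-infra-tuning"
  location = "us-central1"
  ingress  = "INGRESS_TRAFFIC_INTERNAL_ONLY"

  template {
    containers {
      image = "us-central1-docker.pkg.dev/.../infra-tuning:latest"

      # Hard memory/CPU ceilings: a runaway agent is killed, not the platform
      resources {
        limits = {
          memory = "1Gi"
          cpu    = "1"
        }
      }

      # Restart the container if the agent loop stops responding
      liveness_probe {
        http_get {
          path = "/healthz"
        }
        period_seconds    = 30
        failure_threshold = 3
      }
    }

    # Cap fan-out so a retry storm from the coordinator cannot snowball
    scaling {
      min_instance_count = 0
      max_instance_count = 5
    }
  }
}
```

The coordinator then treats a tripped instance as a recoverable tool error rather than a fatal one.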

Pillar 3: Operational Excellence

Hand-configuring multiple independent agents, Model Armor policies, and MCP routes in the Google Cloud console is impossible to replicate safely. By rigorously defining the entire agent perimeter as Infrastructure as Code (the Terraform above), V-Collab operations teams can repeatedly stamp out identical isolated environments for Staging and Production.
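One way to stamp out those environments, sketched here as an assumption rather than our actual repository layout, is to wrap the resources above in a local module and instantiate it once per environment:

```hcl
# Hedged sketch: a hypothetical "agent-perimeter" module wrapping the VPC,
# Cloud Run subagents, Model Armor template, and NCC spokes defined above,
# parameterized by environment.
module "agent_perimeter_staging" {
  source      = "./modules/agent-perimeter"
  environment = "staging"
}

module "agent_perimeter_prod" {
  source      = "./modules/agent-perimeter"
  environment = "prod"
}
```

Drift between Staging and Production then becomes a diff in `terraform plan` output instead of a console archaeology exercise.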

Closing Thoughts

A developer platform like V-Collab thrives on trust. If an engineer cannot trust the AI assistant reading their repository, the feature fails. By explicitly designing our multi-agent architecture around the Google Cloud Architecture Framework—implementing Private Service Connect, strict VPC egress, and Model Guardrails—we successfully secure the velocity of Generative AI behind the fortress of the Enterprise perimeter.


We're continuously evolving V-Collab's internal developer tooling. Have thoughts on well-architected AI systems? Get in touch or connect with me on LinkedIn.