Sub-Agent Detection

Quint detects when AI agents spawn child agents, even without explicit instrumentation. The detection system uses three independent layers that feed signals into a correlation engine, which attributes each child to its parent with a confidence score.

Detection Layers

Layer 1: Model Divergence

When an agent uses a different model than previously seen in the same tunnel, Quint infers a sub-agent spawn. The first model seen becomes the parent model; subsequent different models trigger a split. Known divergence patterns:

Parent Model	Child Model	Provider
`opus` / `sonnet`	`haiku`	Anthropic
`*gpt-4o`	`gpt-4o-mini`	OpenAI
`pro`	`flash`	Google

Any model change (not just known patterns) triggers detection. Each tunnel can only split once.
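The rule above can be sketched as a small per-tunnel state machine. This is a minimal illustration, not Quint's implementation: the `tunnelState` type and its `observe` method are hypothetical names, and the sketch assumes model names are compared as plain strings.

```go
package main

import "fmt"

// tunnelState is a hypothetical per-tunnel record; the name and fields are
// illustrative and not taken from Quint's source.
type tunnelState struct {
	parentModel string // first model observed in the tunnel
	seen        bool   // true once any model has been observed
	split       bool   // true once a sub-agent has been inferred
}

// observe records a model name seen in the tunnel and reports whether this
// observation should be treated as a sub-agent spawn.
func (t *tunnelState) observe(model string) bool {
	if !t.seen {
		t.seen = true
		t.parentModel = model // the first model seen becomes the parent model
		return false
	}
	if model == t.parentModel || t.split {
		return false // same model, or the tunnel has already split once
	}
	t.split = true
	return true
}

func main() {
	var t tunnelState
	for _, m := range []string{"opus", "opus", "haiku", "sonnet"} {
		fmt.Printf("%-6s spawn=%v\n", m, t.observe(m))
	}
}
```

In this run only `haiku` reports a spawn: the repeated `opus` matches the parent model, and `sonnet` arrives after the tunnel has already split.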

Layer 2: Concurrency Spike

Quint tracks CONNECT tunnel counts per IP address. During the first 10 seconds (stabilization window), it learns the baseline concurrency. After that, if active tunnels exceed baseline + 2, a sub-agent is detected.

Stabilization: 10 seconds
Spike threshold: baseline + 2 tunnels

Pending children are tracked until their model can be confirmed via Layer 1.
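A rough sketch of the spike check, under stated assumptions: the `ipTracker` type is hypothetical, and the baseline is taken to be the peak tunnel count observed during the stabilization window (the text does not say how the baseline is computed).

```go
package main

import (
	"fmt"
	"time"
)

const (
	stabilization = 10 * time.Second // documented stabilization window
	spikeMargin   = 2                // documented threshold: baseline + 2
)

// ipTracker is a hypothetical per-IP counter; names are illustrative.
type ipTracker struct {
	firstSeen time.Time
	baseline  int
	active    int
}

// onConnect registers a new CONNECT tunnel at time now and reports whether
// the active count exceeds baseline + 2 after the stabilization window.
func (t *ipTracker) onConnect(now time.Time) bool {
	if t.firstSeen.IsZero() {
		t.firstSeen = now
	}
	t.active++
	if now.Sub(t.firstSeen) < stabilization {
		// Assumption: the baseline is the peak concurrency seen in the window.
		if t.active > t.baseline {
			t.baseline = t.active
		}
		return false
	}
	return t.active > t.baseline+spikeMargin
}

// onClose registers a closed tunnel.
func (t *ipTracker) onClose() {
	if t.active > 0 {
		t.active--
	}
}

func main() {
	start := time.Now()
	var t ipTracker
	t.onConnect(start)                  // baseline learns 1
	t.onConnect(start.Add(time.Second)) // baseline learns 2
	late := start.Add(15 * time.Second)
	fmt.Println(t.onConnect(late)) // 3 active, not above baseline + 2
	fmt.Println(t.onConnect(late)) // 4 active, not above baseline + 2
	fmt.Println(t.onConnect(late)) // 5 active, above baseline + 2
}
```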

Layer 3: Temporal Gap

When the time between consecutive CONNECT requests from the same IP exceeds the burst window (2000ms by default), Quint evaluates:

Has parent trace (X-Quint-Trace header) → confirmed child (source: child_detect)
No trace, active tunnels exist → inferred child (source: inferred_child)
No trace, no active tunnels → new peer agent
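The decision above maps naturally onto a switch. In this sketch the `classify` function is a hypothetical helper; `child_detect` and `inferred_child` are the source values named above, while `same_burst` and `new_peer` are labels invented here for the two cases the text does not name.

```go
package main

import (
	"fmt"
	"time"
)

const burstWindow = 2000 * time.Millisecond // documented default

// classify decides how to treat a CONNECT request. gap is the time since the
// previous CONNECT from the same IP, hasTrace reports whether an
// X-Quint-Trace header was present, and activeTunnels is the number of
// tunnels currently open for that IP.
func classify(gap time.Duration, hasTrace bool, activeTunnels int) string {
	switch {
	case gap <= burstWindow:
		return "same_burst" // assumed label: request is inside the burst window
	case hasTrace:
		return "child_detect" // confirmed child
	case activeTunnels > 0:
		return "inferred_child"
	default:
		return "new_peer" // assumed label: treated as a new peer agent
	}
}

func main() {
	fmt.Println(classify(3*time.Second, true, 1))
	fmt.Println(classify(3*time.Second, false, 1))
	fmt.Println(classify(3*time.Second, false, 0))
}
```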

Correlation Engine

Detection signals from all three layers merge in the correlation engine, which maintains a relationship graph with confidence scores.

Signal Types

Signal	Constant	Base Confidence
Spawn pattern match	`spawn`	~0.85
Trace context header	`context`	~0.95
Temporal correlation	`temporal`	~0.50
HMAC-verified ticket	`signature`	1.0

Confidence Merging

Multiple signals for the same parent-child pair are merged using diminishing returns:

merged = max(existing, new) + (1 - max(existing, new)) * 0.1

For example, a spawn signal (0.85) followed by a context signal (0.95) produces:

max(0.85, 0.95) + (1 - 0.95) * 0.1 = 0.95 + 0.005 = 0.955
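The merge rule is short enough to express directly. The function name `mergeConfidence` is hypothetical; the arithmetic follows the formula and worked example above.

```go
package main

import (
	"fmt"
	"math"
)

// mergeConfidence takes the larger of the two confidences and adds 10% of
// the remaining distance to 1.0.
func mergeConfidence(existing, incoming float64) float64 {
	hi := math.Max(existing, incoming)
	return hi + (1-hi)*0.1
}

func main() {
	c := 0.85                    // spawn signal
	c = mergeConfidence(c, 0.95) // context signal arrives
	fmt.Printf("%.3f\n", c)      // prints 0.955
}
```

Because the bonus is a fraction of the remaining headroom, the merged value never exceeds 1.0 for inputs in the range 0 to 1, and a `signature` signal at 1.0 leaves the result at exactly 1.0.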

Relationship Graph

The engine tracks:

type AgentRelationship struct {
    ParentAgentID string
    ChildAgentID  string
    Confidence    float64
    Signals       []Signal   // all contributing signals
    SpawnType     string     // "direct", "delegation", "fork"
    Depth         int        // nesting level
}

Relationships are published to the agent.relationships Kafka topic for downstream consumption.
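As an illustration of how signals might accumulate on a relationship, the sketch below repeats the struct and adds two things that are assumptions rather than documented API: a `Signal` type with `Type` and `Confidence` fields, and an `AddSignal` method. The choice to let the first signal set the confidence directly is inferred from the worked example in Confidence Merging.

```go
package main

import (
	"fmt"
	"math"
)

// Signal is a hypothetical shape for a detection signal; the field names
// are assumptions made for this example.
type Signal struct {
	Type       string // "spawn", "context", "temporal", or "signature"
	Confidence float64
}

type AgentRelationship struct {
	ParentAgentID string
	ChildAgentID  string
	Confidence    float64
	Signals       []Signal // all contributing signals
	SpawnType     string   // "direct", "delegation", "fork"
	Depth         int      // nesting level
}

// AddSignal records a signal and updates the merged confidence.
// Assumption: the first signal sets the confidence as-is; later signals are
// merged with the diminishing-returns formula.
func (r *AgentRelationship) AddSignal(s Signal) {
	if len(r.Signals) == 0 {
		r.Confidence = s.Confidence
	} else {
		hi := math.Max(r.Confidence, s.Confidence)
		r.Confidence = hi + (1-hi)*0.1
	}
	r.Signals = append(r.Signals, s)
}

func main() {
	rel := AgentRelationship{ParentAgentID: "agent-a", ChildAgentID: "agent-b", SpawnType: "direct", Depth: 1}
	rel.AddSignal(Signal{Type: "spawn", Confidence: 0.85})
	rel.AddSignal(Signal{Type: "context", Confidence: 0.95})
	fmt.Printf("%s -> %s confidence=%.3f signals=%d\n",
		rel.ParentAgentID, rel.ChildAgentID, rel.Confidence, len(rel.Signals))
}
```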

Dashboard Visualization

The dashboard renders parent-child trees from the correlation engine data. Each node shows:

Agent name and provider
Model in use
Confidence score for the parent-child link
Signal types that contributed to detection

Sub-agent detection in forward proxy mode relies on heuristics (model divergence, timing, concurrency). For guaranteed detection, use spawn tickets with HMAC-SHA256 verification.
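For readers unfamiliar with HMAC verification, the general technique looks roughly like the sketch below. The ticket layout is not described in this document, so everything about it here is an assumption: the `signTicket` and `verifyTicket` helpers, the `parentID:childID` payload, and the hex encoding are all invented for illustration. Only the standard-library HMAC-SHA256 calls are real.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// signTicket returns a hex-encoded HMAC-SHA256 tag over "parentID:childID".
// The payload layout is an assumption made for this example.
func signTicket(secret []byte, parentID, childID string) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(parentID + ":" + childID))
	return hex.EncodeToString(mac.Sum(nil))
}

// verifyTicket recomputes the tag and compares it in constant time.
func verifyTicket(secret []byte, parentID, childID, tag string) bool {
	got, err := hex.DecodeString(tag)
	if err != nil {
		return false
	}
	mac := hmac.New(sha256.New, secret)
	mac.Write([]byte(parentID + ":" + childID))
	return hmac.Equal(got, mac.Sum(nil))
}

func main() {
	secret := []byte("demo-secret") // placeholder key for the example
	tag := signTicket(secret, "agent-a", "agent-b")
	fmt.Println(verifyTicket(secret, "agent-a", "agent-b", tag)) // true
	fmt.Println(verifyTicket(secret, "agent-a", "agent-x", tag)) // false
}
```

A ticket that verifies corresponds to the `signature` signal, which carries a base confidence of 1.0.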

Configuration

Sub-agent detection is enabled by default in forward proxy mode. Tuning parameters:

Parameter	Default	Description
Burst window	2000ms	Time gap threshold for temporal detection
Stabilization window	10s	Time to learn baseline concurrency
Spike threshold	baseline + 2	Tunnel count increase to trigger detection
Max body preview	8192 bytes	How much of POST body to read for model extraction
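The defaults in the table can be summarized as a struct. This document does not specify how the parameters are actually set, so `DetectionConfig`, its field names, and `DefaultDetectionConfig` are hypothetical; only the values come from the table.

```go
package main

import (
	"fmt"
	"time"
)

// DetectionConfig is a hypothetical container for the tuning parameters.
type DetectionConfig struct {
	BurstWindow         time.Duration // time gap threshold for temporal detection
	StabilizationWindow time.Duration // time to learn baseline concurrency
	SpikeMargin         int           // tunnels above baseline that trigger detection
	MaxBodyPreview      int           // bytes of POST body read for model extraction
}

// DefaultDetectionConfig returns the documented default values.
func DefaultDetectionConfig() DetectionConfig {
	return DetectionConfig{
		BurstWindow:         2000 * time.Millisecond,
		StabilizationWindow: 10 * time.Second,
		SpikeMargin:         2,
		MaxBodyPreview:      8192,
	}
}

func main() {
	fmt.Printf("%+v\n", DefaultDetectionConfig())
}
```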
