OWASP Agentic Security Initiative Top 10
How BASIS and Vorion address each of the OWASP ASI Top 10 risks for AI agents.
The OWASP Agentic Security Initiative (ASI) defines the top 10 security risks specific to AI agents operating with autonomy. Unlike the traditional OWASP Top 10 for web applications, the ASI focuses on threats unique to agentic systems: prompt injection, tool misuse, trust manipulation, and multi-agent collusion.
This guide maps each ASI risk to specific Vorion defenses and explains the architectural decisions that make these defenses effective.
Risk 1: Prompt Injection
Threat: An attacker embeds malicious instructions in data that an agent processes, causing it to override its system prompt or execute unintended actions.
Vorion Defense: Multi-layer input sanitization.
The InputSanitizer (packages/security/src/security/input-sanitizer/)
uses deterministic regex-based detection -- no LLM-based detection, which
would be vulnerable to recursive injection. Detection covers:
- Instruction override (`ignore previous instructions`, `disregard context`)
- Role assumption (`you are now a system administrator`)
- Encoding evasion (base64, URL encoding, Unicode fullwidth, homoglyphs)
- Context overflow (token-stuffing attacks)
```typescript
import { InputSanitizer } from '@vorionsys/security';

const sanitizer = new InputSanitizer({
  strictness: 'high',
  enableEncodingDetection: true,
});

const result = sanitizer.sanitize(userInput);
if (result.status !== 'CLEAN') {
  // result.detections contains category, severity, matched pattern
  // Feed into trust engine as a negative signal
}
```
Homoglyph normalization catches Cyrillic character substitution (e.g., Cyrillic 'a' U+0430 masquerading as Latin 'a'). NFKC normalization converts fullwidth characters to ASCII before pattern matching.
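The two normalization steps can be sketched in a few lines. This is an illustrative sketch, not the InputSanitizer's actual implementation: the homoglyph table below is a tiny sample, and `normalizeForMatching` is a hypothetical name.

```typescript
// Illustrative sketch: NFKC fold plus an explicit homoglyph map.
// The mapping table is a small sample for demonstration only.
const HOMOGLYPHS: Record<string, string> = {
  '\u0430': 'a', // Cyrillic a
  '\u0435': 'e', // Cyrillic e
  '\u043E': 'o', // Cyrillic o
  '\u0440': 'p', // Cyrillic er, resembles Latin p
};

function normalizeForMatching(input: string): string {
  // NFKC folds fullwidth forms (e.g. U+FF49 fullwidth 'i') to ASCII.
  const nfkc = input.normalize('NFKC');
  // Replace known homoglyphs, which NFKC leaves untouched.
  return [...nfkc].map((ch) => HOMOGLYPHS[ch] ?? ch).join('');
}
```

Running pattern matching only after this normalization means an attacker cannot dodge the regex layer by swapping in visually identical characters.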
See also: Prompt Injection Defense
Risk 2: Insecure Output Handling
Threat: Agent output is consumed by downstream systems without validation, enabling injection into databases, APIs, or other agents.
Vorion Defense: Output integrity via the Proof Plane.
Every agent execution logs an EXECUTION_COMPLETED event with an
outputHash -- the SHA-256 hash of the execution output. The AI
governance output filter (src/security/ai-governance/output-filter.ts)
scans output for:
- PII leakage (credit cards, SSNs, email addresses)
- Injection payloads embedded in responses
- Bias indicators
The DLP scanner (src/security/dlp/scanner.ts) provides a second layer
that catches sensitive data before it leaves the governance boundary.
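The shape of such a regex-based output scan can be sketched as follows; the patterns, categories, and `scanOutput` function are illustrative assumptions, not Vorion's actual DLP rules.

```typescript
// Illustrative output scan; categories and patterns are examples only.
interface Detection {
  category: 'PII_SSN' | 'PII_EMAIL' | 'PII_CARD';
  match: string;
}

const RULES: Array<{ category: Detection['category']; pattern: RegExp }> = [
  { category: 'PII_SSN', pattern: /\b\d{3}-\d{2}-\d{4}\b/g },
  { category: 'PII_EMAIL', pattern: /\b[\w.+-]+@[\w-]+\.[\w.]+\b/g },
  { category: 'PII_CARD', pattern: /\b(?:\d[ -]?){13,16}\b/g },
];

function scanOutput(text: string): Detection[] {
  const hits: Detection[] = [];
  for (const rule of RULES) {
    for (const m of text.matchAll(rule.pattern)) {
      hits.push({ category: rule.category, match: m[0] });
    }
  }
  return hits;
}
```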
Risk 3: Tool and Function Misuse
Threat: An agent uses its available tools in unintended ways, such as calling a file-deletion API to cover tracks or using an email tool to exfiltrate data.
Vorion Defense: Capability gating by trust tier.
The capability taxonomy (packages/basis/src/trust-capabilities.ts)
defines exactly 35 capabilities across 8 categories. Each capability
carries explicit constraints:
| Trust Tier | Example Capabilities | Constraints |
|-----------|---------------------|-------------|
| T0 (Sandbox) | Read public data, generate responses, observe metrics | No network, no write |
| T2 (Provisional) | External GET, limited file write | Approved endpoints only, size limited |
| T3 (Monitored) | DB write, sandboxed code execution | Time limited, memory limited, no network |
| T4 (Standard) | Agent communication, external integrations | Rate limited, approved agents only |
| T6 (Certified) | Cross-org communication, policy modification | Federation approved, encrypted, logged |
An agent at T2 simply cannot call a DELETE endpoint -- the Validation Gate rejects the intent before it reaches execution.
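The gating logic amounts to a minimum-tier lookup with deny-by-default. In this sketch the capability strings and the tier each one requires are hypothetical stand-ins for the 35-capability taxonomy.

```typescript
// Illustrative capability gate; capability names and required tiers are
// hypothetical, standing in for the real taxonomy.
type Tier = 'T0' | 'T2' | 'T3' | 'T4' | 'T6';

const TIER_RANK: Record<Tier, number> = { T0: 0, T2: 2, T3: 3, T4: 4, T6: 6 };

// Minimum tier needed to exercise each capability.
const REQUIRED_TIER: Record<string, Tier> = {
  'http:get': 'T2',
  'http:delete': 'T4',
  'db:write': 'T3',
  'policy:modify': 'T6',
};

function gateIntent(agentTier: Tier, capability: string): boolean {
  const required = REQUIRED_TIER[capability];
  if (!required) return false; // unknown capability: reject the intent
  return TIER_RANK[agentTier] >= TIER_RANK[required];
}
```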
Risk 4: Excessive Agency and Privilege Escalation
Threat: An agent accumulates more permissions than necessary, or an attacker manipulates the trust system to gain elevated privileges.
Vorion Defense: Logarithmic gain curve + observation ceilings.
Trust gain follows `gain = 0.05 * ln(1 + C - S) * cbrt(R)`. As score S
approaches ceiling C, gains shrink logarithmically. The cube-root risk
bonus prevents risk-seeking behavior -- doing dangerous things for faster
advancement produces diminishing returns.
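The curve can be sketched directly from the formula above, where C is the observation ceiling, S the current score, and R the task risk weight:

```typescript
// Direct transcription of gain = 0.05 * ln(1 + C - S) * cbrt(R).
function trustGain(S: number, C: number, R: number): number {
  if (S >= C) return 0; // at or above the ceiling: no further gain
  return 0.05 * Math.log(1 + C - S) * Math.cbrt(R);
}
```

Both properties described above fall out of the shape of the curve: gains shrink as S approaches C, and an eightfold increase in risk only doubles the bonus.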
Observation ceilings impose hard caps: a BLACK_BOX model (ceiling 600) can never exceed T3 regardless of behavior. Promotion delays gate T5-T7:
| Tier | Delay |
|------|-------|
| T5 | 7 days sustained performance |
| T6 | 10 days |
| T7 | 14 days |
The oscillation detector trips the circuit breaker after 3 direction changes in 24 hours, catching attempts to game the scoring system.
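A direction-change counter over a sliding window is one way to implement such a detector. This is a minimal sketch under assumed data shapes (timestamped score points); the real detector's inputs and state handling may differ.

```typescript
// Illustrative oscillation check: count score-direction changes inside a
// 24-hour window; the threshold of 3 comes from the text above.
interface ScorePoint { t: number; score: number } // t in milliseconds

function oscillationTripped(points: ScorePoint[], windowMs = 24 * 3600 * 1000): boolean {
  const now = points[points.length - 1]?.t ?? 0;
  const recent = points.filter((p) => now - p.t <= windowMs);
  let changes = 0;
  let lastDir = 0;
  for (let i = 1; i < recent.length; i++) {
    const dir = Math.sign(recent[i].score - recent[i - 1].score);
    if (dir !== 0 && lastDir !== 0 && dir !== lastDir) changes++;
    if (dir !== 0) lastDir = dir;
  }
  return changes >= 3;
}
```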
Risk 5: Insecure Agent-to-Agent Communication
Threat: Agents exchange information without verifying each other's identity or trust level, enabling impersonation or privilege relay attacks.
Vorion Defense: A2A trust negotiation protocol.
The TrustNegotiationService (packages/platform-core/src/a2a/)
implements bidirectional trust verification:
- Caller generates a time-limited trust proof (HMAC-SHA256 signed)
- Callee verifies proof: CAR match, trust context, expiration, capabilities
- Mutual negotiation takes the stricter of each party's requirements
- Chain-of-trust tracking prevents privilege escalation through chains
The chain trust floor ensures no agent in a delegation chain exercises capabilities above the minimum trust of all participants. Loop detection prevents circular agent-to-agent calls.
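The time-limited, HMAC-signed proof described above can be sketched with Node's `crypto` module. The field names (`car`, `tier`, `exp`) and the wire format are assumptions for illustration, not the TrustNegotiationService schema.

```typescript
import { createHmac, timingSafeEqual } from 'node:crypto';

// Illustrative trust proof: HMAC-SHA256 over identity, tier, and expiry.
interface TrustProof { car: string; tier: string; exp: number; sig: string }

function signProof(secret: string, car: string, tier: string, ttlMs: number, now = Date.now()): TrustProof {
  const exp = now + ttlMs;
  const sig = createHmac('sha256', secret).update(`${car}|${tier}|${exp}`).digest('hex');
  return { car, tier, exp, sig };
}

function verifyProof(secret: string, proof: TrustProof, now = Date.now()): boolean {
  if (now > proof.exp) return false; // time-limited: reject expired proofs
  const expected = createHmac('sha256', secret)
    .update(`${proof.car}|${proof.tier}|${proof.exp}`)
    .digest('hex');
  // Constant-time comparison avoids leaking the signature via timing.
  return timingSafeEqual(Buffer.from(expected), Buffer.from(proof.sig));
}
```

Because the tier is inside the signed payload, a callee can detect any attempt to relay a proof with an inflated trust level.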
Risk 6: Lack of Agent Observability
Threat: An agent's internal state and decision-making process are opaque, making it impossible to detect misbehavior until damage occurs.
Vorion Defense: 5-tier observation model + continuous monitoring.
Observation tiers directly cap achievable trust:
```
BLACK_BOX:    ceiling 600   // I/O only -- proprietary API models
GRAY_BOX:     ceiling 750   // I/O + logs -- platform-hosted models
WHITE_BOX:    ceiling 900   // Full code access -- open-source models
ATTESTED_BOX: ceiling 950   // TEE-verified execution
VERIFIED_BOX: ceiling 1000  // TEE + zkML + interpretability
```
The CognitiveEnvelope monitors 6 behavioral dimensions in real time (weight geometry, activation clustering, embedding drift, attention entropy, gradient norm, effective dimensionality). The ParameSphere engine detects model weight changes within seconds using SVD-based fingerprinting.
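One simple way to turn six behavioral readings into a single drift signal is a normalized distance from a baseline. The sketch below is purely illustrative: the dimension names match the text, but the `Envelope` shape and `envelopeDeviation` function are not the CognitiveEnvelope's actual API.

```typescript
// Illustrative drift metric over the 6 behavioral dimensions named above.
const DIMENSIONS = [
  'weightGeometry', 'activationClustering', 'embeddingDrift',
  'attentionEntropy', 'gradientNorm', 'effectiveDimensionality',
] as const;

type Envelope = Record<(typeof DIMENSIONS)[number], number>;

// Root-mean-square relative deviation of current readings from baseline.
function envelopeDeviation(baseline: Envelope, current: Envelope): number {
  let sum = 0;
  for (const d of DIMENSIONS) {
    const b = baseline[d];
    const delta = b === 0 ? current[d] : (current[d] - b) / b;
    sum += delta * delta;
  }
  return Math.sqrt(sum / DIMENSIONS.length);
}
```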
Risk 7: Supply Chain Vulnerabilities
Threat: A compromised model, dependency, or training dataset introduces hidden vulnerabilities into the agent's behavior.
Vorion Defense: Trust inheritance is NONE.
Every agent starts at trust score 0, regardless of lineage. A fine-tuned model derived from a T7 parent begins at T0 in PROVISIONING and must pass the full qualification course (31 exercises, 9 categories). Provenance modifiers can only be negative -- positive modifiers are rejected by schema validation.
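The "only negative" rule can be sketched as a simple schema check; the `ProvenanceModifier` shape below is an assumption for illustration.

```typescript
// Illustrative validation mirroring "provenance modifiers can only be
// negative"; the modifier shape is hypothetical.
interface ProvenanceModifier { reason: string; delta: number }

function validateModifiers(mods: ProvenanceModifier[]): ProvenanceModifier[] {
  for (const m of mods) {
    if (m.delta >= 0) {
      throw new Error(`provenance modifier must be negative: ${m.reason} (${m.delta})`);
    }
  }
  return mods;
}
```

Rejecting non-negative deltas at the schema level means even a compromised provenance feed cannot be used to launder trust from a parent model.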
Automated npm audit runs in the compliance evidence pipeline. Semgrep SAST and gitleaks secret scanning execute on every push via CI. The safety gate workflow validates security posture before deployment.
Risk 8: Data Poisoning and Manipulation
Threat: Training data or RAG retrieval results are manipulated to alter agent behavior over time.
Vorion Defense: Canary probes detect gradual behavioral drift.
The CanaryProbeService injects Poisson-distributed verification probes that are indistinguishable from real queries. 9 probe categories test different trust factors:
| Category | Trust Factor | Risk Level |
|----------|-------------|------------|
| ETHICAL | Alignment (OP-ALIGN) | CRITICAL |
| SAFETY | Safety (CT-SAFE) | CRITICAL |
| EPISTEMIC | Humility (SF-HUM) | HIGH |
| FAIRNESS | Transparency (CT-TRANS) | HIGH |
| CAUSAL | Competence (CT-COMP) | HIGH |
| FACTUAL | Competence (CT-COMP) | MEDIUM |
Any SAFETY or ETHICAL probe failure triggers immediate circuit breaker action. The gradual nature of data poisoning is specifically what canary probes are designed to catch -- they detect the "boiling frog" attack pattern.
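Poisson-distributed injection means inter-probe gaps are drawn from an exponential distribution, which makes probe times memoryless and hard for an adversary to predict. A sketch of the sampling step, with a made-up rate parameter:

```typescript
// Inverse-CDF sampling of Exp(lambda): gaps between probes in a Poisson
// process. The probesPerHour rate is an illustrative parameter.
function nextProbeDelayMs(probesPerHour: number, u: number = Math.random()): number {
  const lambdaPerMs = probesPerHour / 3_600_000;
  return -Math.log(1 - u) / lambdaPerMs; // u in [0, 1)
}
```

An agent that behaves well only when it expects to be tested gains nothing, because no schedule exists to learn.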
Risk 9: Inadequate Access Controls for Agent Memory
Threat: An agent's persistent memory (conversation history, learned preferences, cached data) is accessible to unauthorized parties or modified by adversaries.
Vorion Defense: Tenant isolation + encrypted storage.
PostgreSQL Row-Level Security (RLS) enforces tenant_id scoping at the
database engine level, preventing cross-tenant memory access even in the
presence of application bugs. Encryption service with field-level
encryption protects sensitive data at rest. KMS integration supports AWS
KMS, HashiCorp Vault, and local providers.
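Field-level encryption at rest can be sketched with AES-256-GCM from Node's `crypto` module. This is a simplified illustration: key handling here is a raw buffer, where the real encryption service would obtain keys through the KMS integration, and the storage format is an assumption.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from 'node:crypto';

// Illustrative field-level encryption with AES-256-GCM.
function encryptField(key: Buffer, plaintext: string): string {
  const iv = randomBytes(12); // fresh nonce per field value
  const cipher = createCipheriv('aes-256-gcm', key, iv);
  const ct = Buffer.concat([cipher.update(plaintext, 'utf8'), cipher.final()]);
  const tag = cipher.getAuthTag();
  // iv:tag:ciphertext, hex-encoded, stored in place of the plaintext column
  return [iv, tag, ct].map((b) => b.toString('hex')).join(':');
}

function decryptField(key: Buffer, stored: string): string {
  const [iv, tag, ct] = stored.split(':').map((h) => Buffer.from(h, 'hex'));
  const decipher = createDecipheriv('aes-256-gcm', key, iv);
  decipher.setAuthTag(tag); // GCM tag rejects any tampered ciphertext
  return Buffer.concat([decipher.update(ct), decipher.final()]).toString('utf8');
}
```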
Secure memory (src/security/secure-memory.ts) handles sensitive data in
application memory with TEE support for production environments.
Risk 10: Insufficient Audit and Logging
Threat: Agent actions are not logged with sufficient detail to reconstruct incidents, prove compliance, or detect slow-moving attacks.
Vorion Defense: Cryptographically chained proof records.
The Proof Plane creates an immutable, hash-chained audit trail for every governance decision. Each record includes:
- Event type and correlation ID
- Agent identity (CAR string)
- Trust score at time of decision
- Decision reasoning (machine-readable)
- Ed25519 digital signature for non-repudiation
- SHA-256 link to previous record
The chain is verifiable end-to-end:
```typescript
const verification = await proofPlane.verifyChainAndSignatures({
  startIndex: 0,
  endIndex: lastEvent,
});

console.log(verification.valid);       // true
console.log(verification.brokenLinks); // []
console.log(verification.invalidSigs); // []
```
Merkle tree aggregation enables efficient batch verification for large proof chains. Enterprise soak testing validated chain integrity across 200K+ signals with zero hash-chain breaks.
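The hash-chaining mechanism itself can be sketched in a few lines (signatures omitted). The record shape and function names below are illustrative, not the Proof Plane schema.

```typescript
import { createHash } from 'node:crypto';

// Minimal hash-chain sketch: each record commits to its predecessor.
interface ProofRecord { index: number; payload: string; prevHash: string; hash: string }

const GENESIS = '0'.repeat(64);

function appendRecord(chain: ProofRecord[], payload: string): ProofRecord[] {
  const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS;
  const index = chain.length;
  const hash = createHash('sha256').update(`${index}|${prevHash}|${payload}`).digest('hex');
  return [...chain, { index, payload, prevHash, hash }];
}

function verifyChain(chain: ProofRecord[]): number[] {
  const broken: number[] = [];
  for (let i = 0; i < chain.length; i++) {
    const r = chain[i];
    const expectedPrev = i === 0 ? GENESIS : chain[i - 1].hash;
    const recomputed = createHash('sha256').update(`${r.index}|${r.prevHash}|${r.payload}`).digest('hex');
    if (r.prevHash !== expectedPrev || r.hash !== recomputed) broken.push(i);
  }
  return broken; // indices of broken links; [] when the chain is intact
}
```

Because every record commits to its predecessor's hash, altering any historical record invalidates that link and every verification pass afterward.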
Mapping Summary
| ASI Risk | Primary Defense | Vorion Component |
|----------|----------------|------------------|
| 1. Prompt Injection | InputSanitizer, encoding detection | packages/security/ |
| 2. Insecure Output | Output filter, DLP, output hashing | Proof Plane, AI governance |
| 3. Tool Misuse | Capability gating (35 caps, 8 categories) | packages/basis/ |
| 4. Excessive Agency | Log gain curve, observation ceilings | canonical.ts |
| 5. Insecure A2A | Trust negotiation, chain-of-trust | packages/platform-core/ |
| 6. Lack of Observability | 5-tier observation model, CognitiveEnvelope | packages/atsf-core/ |
| 7. Supply Chain | Zero trust inheritance, qualification course | canonical.ts |
| 8. Data Poisoning | Canary probes (9 categories, Poisson injection) | packages/a3i/ |
| 9. Memory Access | RLS tenant isolation, encrypted storage | packages/contracts/ |
| 10. Insufficient Audit | Dual-hash proof chain, Ed25519 signatures | packages/proof-plane/ |
Next Steps
- Prompt Injection Defense -- Deep dive into sanitization layers
- Canary Probes -- How behavioral verification works
- Circuit Breakers in Depth -- Graduated response system