# Advanced Agent Verifier

## ⚡*<mark style="color:purple;">Responsible</mark>*. We *<mark style="color:purple;">Care</mark>*.

{% embed url="<https://climate.stripe.com/6SV9BA>" %}

{% content-ref url="/pages/jdIiEemQ6diNpNvQz0YF" %}
[Testing](/lisaiceland/platform+/active-development/testing.md)
{% endcontent-ref %}

{% content-ref url="/pages/NicqNRF1RaLZCvAmf9ds" %}
[Bias Protections](/lisaiceland/smarter-ai-learn-more/ai-safety+/bias-protections.md)
{% endcontent-ref %}

{% content-ref url="/pages/TRIi7ZophDWfdxTjp4T6" %}
[AI Safety Guardrails](/lisaiceland/smarter-ai-learn-more/ai-safety+/guardrails+/ai-safety-guardrails.md)
{% endcontent-ref %}

{% content-ref url="/pages/OssXaVNosrMvJiN35wOL" %}
[Human-in-the-Loop](/lisaiceland/platform+/active-development/human-in-the-loop.md)
{% endcontent-ref %}

![Security](https://img.shields.io/badge/agent--verifier-implemented-blue)

Last updated: March 24, 2026.

[![Tests](https://camo.githubusercontent.com/26641b5a70dea0526ad84e92b8d1dea013f3682c187ef1cac1ac09685e2c31e2/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f74657374732d39373725323070617373696e672d627269676874677265656e)](https://camo.githubusercontent.com/26641b5a70dea0526ad84e92b8d1dea013f3682c187ef1cac1ac09685e2c31e2/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f74657374732d39373725323070617373696e672d627269676874677265656e) [![Vitest](https://camo.githubusercontent.com/75fde65290dbfc7dd2b52c4aa25a9f069d8f342dd65f96e40ddf31bab03b69bc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f746573746564253230776974682d7669746573742d364539463138)](https://camo.githubusercontent.com/75fde65290dbfc7dd2b52c4aa25a9f069d8f342dd65f96e40ddf31bab03b69bc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f746573746564253230776974682d7669746573742d364539463138) [![Languages](https://camo.githubusercontent.com/526cf55b8be703ab2d413b92d1ccf65a837f02413ed34f9f3015fc0e07161bf8/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c616e6775616765732d33372d626c7565)](https://camo.githubusercontent.com/526cf55b8be703ab2d413b92d1ccf65a837f02413ed34f9f3015fc0e07161bf8/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f6c616e6775616765732d33372d626c7565) [![BYOK Providers](https://camo.githubusercontent.com/991e44d30e63e00f5d26eb53658c85234ebfbfad6a72f2854fddb9ebd5ba80d3/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f42594f4b25323070726f7669646572732d31382d6f72616e6765)](https://camo.githubusercontent.com/991e44d30e63e00f5d26eb53658c85234ebfbfad6a72f2854fddb9ebd5ba80d3/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f42594f4b25323070726f7669646572732d31382d6f72616e6765) 
[![Security](https://camo.githubusercontent.com/9848248df8f878d8a375f7a0993b27219c2ed5c209d9a2442e6001064201a7cc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f73656375726974792d68617264656e65642d637269746963616c)](https://camo.githubusercontent.com/9848248df8f878d8a375f7a0993b27219c2ed5c209d9a2442e6001064201a7cc/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f73656375726974792d68617264656e65642d637269746963616c)

![Moat](https://docs.lisaiceland.com/~gitbook/image?url=https%3A%2F%2Fcamo.githubusercontent.com%2Fcf9284fb15978bad5057ded6dd214f81c456978e71e1a84894a7c1c0203e94db%2F68747470733a2f2f696d672e736869656c64732e696f2f62616467652f636f6d70657469746976652532306d6f61742d3530253230646966666572656e746961746f72732d707572706c65\&width=300\&dpr=3\&quality=100\&sign=f6cd6274\&sv=2) [![Shipped](https://camo.githubusercontent.com/c6d47a185d7feee3e89913188f3f3f27c0dcd3c37348c167e7c76818d565e5d4/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f7368697070656425323066656174757265732d7e3332352d677265656e)](https://camo.githubusercontent.com/c6d47a185d7feee3e89913188f3f3f27c0dcd3c37348c167e7c76818d565e5d4/68747470733a2f2f696d672e736869656c64732e696f2f62616467652f7368697070656425323066656174757265732d7e3332352d677265656e)

***

## AI Voice+

### Overview

The Agent Verifier is a conceptual security framework that ensures AI agents operating within the AI Voice+ platform are trustworthy, sandboxed, and auditable. This document maps the 18 verifier concepts to our actual implementation.

***

### Implementation Status

#### ✅ Already Implemented

| Verifier Concept                  | Our Implementation                                                                                                | Code Location                                    |
| --------------------------------- | ----------------------------------------------------------------------------------------------------------------- | ------------------------------------------------ |
| **Agent Identity & Provenance**   | Org-scoped `ai_agents` table with unique IDs, API key hashing (SHA-256) for MCP                                   | `ai_agents` table, `mcp-server/index.ts`         |
| **Capability Declaration**        | `agent_one_tools` table declares per-workspace tools with type and definition                                     | `agent_one_tools` table                          |
| **Prompt & Policy Compliance**    | 10-pattern injection scanning, expanded safety preamble with bias/fairness/boundaries                             | `agent-one-chat/index.ts`, `convo-chat/index.ts` |
| **Tool & API Sandboxing**         | Per-request context isolation, org-scoped queries, daily quotas (100 msgs/day)                                    | `mcp-server/index.ts` (RequestContext class)     |
| **DLP (Data Leakage Prevention)** | 5-pattern PII redaction on input AND output (CC, SSN, email, phone, UK NINO)                                      | All chat edge functions                          |
| **Audit Logs**                    | `ai_usage_logs` table tracks moderation blocks, injection detections, and usage                                   | `ai_usage_logs` table                            |
| **Human-in-the-Loop**             | Content moderation blocks with safety refusals; fail-closed moderation                                            | `moderateContent()` in chat functions            |
| **Self-Restricting Behavior**     | SAFETY\_PREAMBLE includes: "ask for clarification rather than guessing", "may decline tasks outside capabilities" | System prompts                                   |
| **Rate Limiting**                 | IP-based (30/15min) + org-based daily quotas; fail-closed rate limiter                                            | `_shared/rate-limit.ts`                          |
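
To make the fail-closed rate limiting row concrete, here is a minimal sketch of a fixed-window limiter that allows 30 requests per 15 minutes per IP and denies the request whenever the counter store throws. It is not the contents of `_shared/rate-limit.ts`: the `CounterStore` interface, `InMemoryCounterStore`, and `checkRateLimitSketch` are hypothetical names, and the fixed-window strategy is an assumption.

```typescript
// Illustrative sketch only; names and the fixed-window strategy are assumptions.
type RateLimitResult = { allowed: boolean; remaining: number };

interface CounterStore {
  // Increments the counter for `key` within the current window and returns the new count.
  // May throw if the backing store is unavailable.
  increment(key: string, windowMs: number, now: number): number;
}

class InMemoryCounterStore implements CounterStore {
  private windows: Record<string, { start: number; count: number }> = {};

  increment(key: string, windowMs: number, now: number): number {
    const current = this.windows[key];
    if (!current || now - current.start >= windowMs) {
      // First request in a new window.
      this.windows[key] = { start: now, count: 1 };
      return 1;
    }
    current.count += 1;
    return current.count;
  }
}

const LIMIT = 30; // requests per window, from the table above
const WINDOW_MS = 15 * 60 * 1000; // 15 minutes

function checkRateLimitSketch(
  store: CounterStore,
  ip: string,
  now: number = Date.now(),
): RateLimitResult {
  try {
    const count = store.increment(`ip:${ip}`, WINDOW_MS, now);
    return { allowed: count <= LIMIT, remaining: Math.max(0, LIMIT - count) };
  } catch {
    // Fail closed: if the store cannot be reached, deny instead of letting traffic through.
    return { allowed: false, remaining: 0 };
  }
}
```

The important design choice is the `catch` branch: an outage in the limiter's storage results in denied requests rather than unlimited ones.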

#### 🔮 Planned (Future Roadmap)

| Verifier Concept                   | Status  | Notes                                                              |
| ---------------------------------- | ------- | ------------------------------------------------------------------ |
| **Multi-Agent Cross-Verification** | Planned | Requires multi-model voting system; would use agent transfer rules |
| **Agent Reputation/Trust Scores**  | Planned | Needs historical behavior data collection over time                |
| **Behavioral Drift Detection**     | Planned | Requires baseline behavior collection and comparison               |
| **Certification Badges**           | Planned | UI feature showing agent compliance status                         |
| **Version Control & Rollback**     | Planned | Agent configuration versioning with rollback capability            |
| **Automated Red-Teaming**          | Planned | Periodic injection testing against live agents                     |

***

### Verification Architecture

```
User Request
         │
         ▼
┌──────────────────┐
│  Rate Limiter    │  ← IP-based, fail-closed
│  (Layer 1)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Input Validator │  ← Length, type, UUID format
│  (Layer 2)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Sanitizer       │  ← Control char stripping
│  (Layer 3)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Injection Scan  │  ← 10 regex patterns, safe-wrapping
│  (Layer 4)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  PII Redactor    │  ← 5 PII patterns on input
│  (Layer 5)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Content Mod     │  ← AI gateway, fail-closed
│  (Layer 6)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Auth & Org      │  ← JWT verification, org membership
│  (Layer 7)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Daily Quota     │  ← Per-org message limits
│  (Layer 8)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  AI Model Call   │  ← Safety preamble + context
│  (Layer 9)       │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Output PII      │  ← PII redaction on response
│  Redaction       │
│  (Layer 10)      │
└────────┬─────────┘
         │
         ▼
┌──────────────────┐
│  Output Mod      │  ← Content moderation on response
│  (Layer 11)      │
└────────┬─────────┘
         │
         ▼
  Response to User
```
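
The layered flow in the diagram can be sketched as a chain of functions in which each layer either passes (possibly rewritten) text to the next layer or blocks the request, and any thrown error counts as a block. This is a simplified, synchronous illustration, not the platform's code: `Layer`, `runLayers`, the two example layers, and the 4,000-character limit are all hypothetical, and real layers such as moderation or JWT checks would be asynchronous.

```typescript
// Illustrative sketch only; all names and limits here are assumptions.
type LayerOutcome =
  | { status: "pass"; text: string }
  | { status: "block"; reason: string };

interface Layer {
  name: string;
  run(text: string): LayerOutcome;
}

type PipelineResult =
  | { status: "ok"; text: string }
  | { status: "blocked"; blockedBy: string; reason: string };

function runLayers(layers: Layer[], input: string): PipelineResult {
  let text = input;
  for (const layer of layers) {
    let outcome: LayerOutcome;
    try {
      outcome = layer.run(text);
    } catch {
      // Fail closed: a crashing layer blocks the request instead of being skipped.
      outcome = { status: "block", reason: "layer error" };
    }
    if (outcome.status === "block") {
      return { status: "blocked", blockedBy: layer.name, reason: outcome.reason };
    }
    text = outcome.text; // Later layers see the output of earlier ones.
  }
  return { status: "ok", text };
}

// Example layer in the spirit of "Input Validator": reject oversized input.
const validateLength: Layer = {
  name: "input-validator",
  run(text) {
    return text.length <= 4000
      ? { status: "pass", text }
      : { status: "block", reason: "input too long" };
  },
};

// Example layer in the spirit of "Sanitizer": strip control characters except tab, LF, CR.
const stripControlChars: Layer = {
  name: "sanitizer",
  run(text) {
    return {
      status: "pass",
      text: text.replace(/[\u0000-\u0008\u000B\u000C\u000E-\u001F\u007F]/g, ""),
    };
  },
};
```

Because the loop returns on the first block, a rejected request never reaches the model call, and the result names the layer that stopped it, which is useful for audit logging.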

***

### MCP Server Security

The MCP (Model Context Protocol) server uses a `RequestContext` class instead of global state to prevent cross-tenant data leaks during concurrent requests:

* Each request authenticates with an API key, which is checked by its SHA-256 hash
* Context (Supabase client, org ID, user ID) is stored per-request
* All tool handlers read from the request-scoped context
* Org-scoped queries prevent data access across tenants
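
The points above can be sketched as follows. This is a minimal illustration of per-request context and hashed key lookup, not the code in `mcp-server/index.ts`: `RequestContextSketch`, `ApiKeyRecord`, `authenticateRequest`, and `listNotes` are hypothetical, the Supabase client is omitted, and in-memory arrays stand in for database tables.

```typescript
import { createHash } from "node:crypto";

// Hypothetical shape of a stored key: only the hash is persisted, never the raw key.
interface ApiKeyRecord {
  keyHash: string;
  orgId: string;
  userId: string;
}

// One instance is created per request and passed to handlers explicitly.
// Nothing here lives in a module-level variable, so concurrent requests cannot overwrite each other.
class RequestContextSketch {
  readonly orgId: string;
  readonly userId: string;

  constructor(orgId: string, userId: string) {
    this.orgId = orgId;
    this.userId = userId;
  }
}

function hashApiKey(rawKey: string): string {
  return createHash("sha256").update(rawKey, "utf8").digest("hex");
}

function authenticateRequest(
  rawKey: string,
  keys: ApiKeyRecord[],
): RequestContextSketch | null {
  const presented = hashApiKey(rawKey);
  for (const record of keys) {
    if (record.keyHash === presented) {
      return new RequestContextSketch(record.orgId, record.userId);
    }
  }
  return null; // Unknown key: no context is created, so no tool handler can run.
}

interface NoteRow {
  orgId: string;
  title: string;
}

// A tool handler reads the org from the request-scoped context and filters by it.
function listNotes(ctx: RequestContextSketch, rows: NoteRow[]): NoteRow[] {
  return rows.filter((row) => row.orgId === ctx.orgId);
}
```

Passing the context as an argument, rather than reading it from shared state, means a handler can only ever see the tenant that authenticated the current request.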

***

### How Existing Safety Layers Map to Verifier Concepts

| Safety Layer                                  | Verifier Concept                     |
| --------------------------------------------- | ------------------------------------ |
| `INJECTION_PATTERNS` (10 patterns)            | Prompt & Policy Compliance           |
| `PII_PATTERNS` (5 patterns)                   | DLP / Data Leakage Prevention        |
| `SAFETY_PREAMBLE` (bias, boundaries, honesty) | Policy Compliance + Self-Restriction |
| `moderateContent()` (fail-closed)             | Human-in-the-Loop (automated)        |
| `RequestContext` class                        | Tool Sandboxing                      |
| `ai_usage_logs` audit entries                 | Audit Logs                           |
| `checkRateLimit()` (fail-closed)              | Rate Limiting                        |
| `BLOCKED_VOICE_PHRASES`                       | DLP for Voice                        |
| Error masking (generic messages)              | Information Disclosure Prevention    |
| `encrypt_sensitive()` / `decrypt_sensitive()` | Data Protection at Rest              |
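
As a rough illustration of how pattern lists like `PII_PATTERNS` and `INJECTION_PATTERNS` might be applied, the sketch below redacts three kinds of PII and flags two injection phrasings. The regular expressions are simplified examples written for this page, not the platform's actual 5 PII patterns or 10 injection patterns, and the names `PII_PATTERNS_SKETCH`, `INJECTION_PATTERNS_SKETCH`, `redactPii`, and `scanForInjection` are hypothetical.

```typescript
// Illustrative sketch only; these are not the platform's real patterns.
const PII_PATTERNS_SKETCH: { label: string; regex: RegExp }[] = [
  { label: "EMAIL", regex: /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g },
  { label: "SSN", regex: /\b\d{3}-\d{2}-\d{4}\b/g },
  // 13 to 19 digits, optionally separated by single spaces or hyphens.
  { label: "CARD", regex: /\b(?:\d[ -]?){12,18}\d\b/g },
];

// Replaces every match with a labeled placeholder.
// The same function can be applied to user input and to model output.
function redactPii(text: string): string {
  let redacted = text;
  for (const pattern of PII_PATTERNS_SKETCH) {
    redacted = redacted.replace(pattern.regex, `[REDACTED_${pattern.label}]`);
  }
  return redacted;
}

// No `g` flag here: `test()` on a global regex keeps state between calls.
const INJECTION_PATTERNS_SKETCH: RegExp[] = [
  /ignore (?:all )?(?:previous|prior) instructions/i,
  /reveal (?:your )?system prompt/i,
];

// Returns true if any pattern matches; the caller decides whether to block, wrap, or log.
function scanForInjection(text: string): boolean {
  for (const pattern of INJECTION_PATTERNS_SKETCH) {
    if (pattern.test(text)) {
      return true;
    }
  }
  return false;
}
```

For example, `redactPii("Mail jane@example.com")` returns `"Mail [REDACTED_EMAIL]"`. Regex-based detection like this is best treated as one layer among several, since it can miss paraphrased or obfuscated input.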

***

### 🚀 What's Next? (see Roadmap)

{% embed url="<https://future.lisaiceland.com/roadmap>" %}


