Verification tools for agent builders
Open-source CLI and SDK for multi-model adversarial verification. Catch hallucinations before your users do.
Run claims through multiple independent models — Gemini, Grok, DeepSeek, and Claude. A dedicated red-team critic challenges their verdicts. A synthesizer produces the final decision.
Verify that agent outputs are grounded in their source material. Detect fabrication, drift, and unsupported claims with trace-level granularity.
Three speed tiers — fast, standard, deep. You control the cost/confidence tradeoff per call.
Use pot-cli in your terminal and CI pipelines. Embed pot-sdk directly in your agent code. Same engine, two interfaces.
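The per-call tier choice could be wired into agent code along the following lines. This is a minimal sketch, not part of pot-cli or pot-sdk: the tier names come from the list above, while `ClaimContext`, its fields, and `pickTier` are hypothetical names introduced only for illustration.

```typescript
// Tier names as listed above; how they map onto SDK options is not specified here.
type Tier = 'fast' | 'standard' | 'deep';

// Hypothetical description of how much is at stake for one claim.
interface ClaimContext {
  userFacing: boolean; // will the claim be shown to an end user?
  highStakes: boolean; // e.g. legal, financial, or security-relevant content
}

// Spend more verification effort only where a wrong answer is costly.
function pickTier(ctx: ClaimContext): Tier {
  if (ctx.highStakes) return 'deep';
  if (ctx.userFacing) return 'standard';
  return 'fast';
}

const internalLogTier = pickTier({ userFacing: false, highStakes: false });
const auditReportTier = pickTier({ userFacing: true, highStakes: true });
```

The idea is that the tier is decided per claim rather than once per application, so cheap checks cover bulk traffic and expensive ones are reserved for risky output.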
Scan the input for adversarial patterns and prompt injection before verification begins.
N diverse models produce independent proposals. Each evaluates the claim separately — no shared context, no groupthink.
A red-team model attacks the proposals. Objections are classified by materiality and fact-checked across providers.
A dual synthesizer aggregates the proposals and the critique into a verdict. Confidence is calibrated via reasoning-based aggregation.
Return ALLOW, BLOCK, or UNCERTAIN — with per-model evidence, confidence score, and optional signed attestation.
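The five stages above can be sketched as a small pipeline with stubbed models. This is an illustrative toy under stated assumptions, not the engine's actual code: every type and function name here (`Proposal`, `Objection`, `runPipeline`, and so on) is hypothetical, the injection patterns are placeholders, and the aggregation rule (a simple mean over unchallenged proposals) merely stands in for the reasoning-based aggregation, whose details are not given here.

```typescript
type Verdict = 'ALLOW' | 'BLOCK' | 'UNCERTAIN';

interface Proposal {
  model: string;
  supported: boolean; // does this model judge the claim to be supported?
  confidence: number; // 0.0 to 1.0
}

interface Objection {
  target: string;    // model whose proposal is being challenged
  material: boolean; // would the objection change the outcome?
}

interface PipelineResult {
  verdict: Verdict;
  confidence: number;
  proposals: Proposal[]; // per-model evidence
}

type Proposer = (claim: string, trace: string) => Proposal;
type Critic = (proposals: Proposal[]) => Objection[];

// Stage 1: toy scan for prompt-injection phrases (placeholder patterns).
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all )?previous instructions/i,
  /disregard the system prompt/i,
];

function looksAdversarial(text: string): boolean {
  return INJECTION_PATTERNS.some((pattern) => pattern.test(text));
}

// Stage 4: drop proposals hit by a material objection, then aggregate the rest.
function synthesize(proposals: Proposal[], objections: Objection[]): PipelineResult {
  const challenged = new Set(
    objections.filter((o) => o.material).map((o) => o.target),
  );
  const standing = proposals.filter((p) => !challenged.has(p.model));
  if (standing.length === 0) {
    return { verdict: 'UNCERTAIN', confidence: 0, proposals };
  }
  const supporting = standing.filter((p) => p.supported).length;
  const confidence =
    standing.reduce((sum, p) => sum + p.confidence, 0) / standing.length;
  const verdict: Verdict =
    supporting === standing.length ? 'ALLOW'
    : supporting === 0 ? 'BLOCK'
    : 'UNCERTAIN';
  return { verdict, confidence, proposals };
}

function runPipeline(
  claim: string,
  trace: string,
  proposers: Proposer[],
  critic: Critic,
): PipelineResult {
  // Assumption: suspicious input is blocked outright before any model is called.
  if (looksAdversarial(claim) || looksAdversarial(trace)) {
    return { verdict: 'BLOCK', confidence: 1, proposals: [] };
  }
  // Stage 2: each proposer sees only the claim and trace, never other proposals.
  const proposals = proposers.map((propose) => propose(claim, trace));
  // Stage 3: the critic attacks the finished proposals.
  const objections = critic(proposals);
  // Stages 4 and 5: aggregate and return the verdict with per-model evidence.
  return synthesize(proposals, objections);
}

// Usage with stub models that always agree and a critic with no objections.
const agree = (model: string): Proposer => () => ({ model, supported: true, confidence: 0.9 });
const noObjections: Critic = () => [];
const demo = runPipeline('claim text', 'trace text', [agree('a'), agree('b'), agree('c')], noObjections);
```

Keeping proposers as independent functions of `(claim, trace)` mirrors the "no shared context" property: nothing one model produces can leak into another model's input before the critique stage.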
    # Install globally
    npm install -g pot-cli

    # Verify a single claim against its trace
    pot-cli verify \
      --claim "The Eiffel Tower is 330 metres tall" \
      --trace "According to the official Eiffel Tower website, the structure stands at 330 metres including the antenna."

    # Output:
    # ✅ ALLOW (confidence: 0.94)
    # 3/3 models agree. Source verified.

    # Full adversarial pipeline
    pot-cli verify \
      --claim "GPT-4 scores 90% on the bar exam" \
      --tier standard \
      --output json
    import { verify } from 'pot-sdk';

    // Agent output to verify.
    const agentOutput = `The contract implements a reentrancy guard correctly. The mutex lock is acquired before the external call and released after.`;

    // Wrapped in an async function so that `await` and the early `return` are valid.
    // fallback() is an application-defined handler, not imported from pot-sdk.
    async function verifyAgentOutput(output: string) {
      const result = await verify(output, {
        claim: 'Reentrancy guard is correctly implemented',
        mode: 'standard', // 'lite' | 'standard'
        providers: [
          { name: 'anthropic', model: 'claude-sonnet-4-6', apiKey: process.env.ANTHROPIC_API_KEY },
          { name: 'google', model: 'gemini-2.5-flash', apiKey: process.env.GOOGLE_API_KEY },
          { name: 'deepseek', model: 'deepseek-v4-flash', apiKey: process.env.DEEPSEEK_API_KEY },
        ],
      });

      if (result.verdict === 'BLOCK') {
        // Hallucination or unsupported claim detected
        console.warn('Blocked:', result.confidence);
        return fallback(result);
      }

      // result.verdict: 'ALLOW' | 'BLOCK' | 'UNCERTAIN'
      // result.confidence: 0.0 to 1.0
      // result.mdi: model disagreement index
      return result;
    }

    const outcome = await verifyAgentOutput(agentOutput);
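The SDK example handles only the BLOCK case explicitly. One way to cover all three verdicts is an exhaustive switch, sketched below. The `VerificationResult` interface is a local stand-in that mirrors only the three fields named in the comments above (`verdict`, `confidence`, `mdi`); it is not the SDK's published type, and `Action`, `routeVerdict`, and the 0.8 threshold are hypothetical choices made for illustration.

```typescript
type Verdict = 'ALLOW' | 'BLOCK' | 'UNCERTAIN';

// Local stand-in for the result shape; the real SDK type may carry more fields.
interface VerificationResult {
  verdict: Verdict;
  confidence: number; // 0.0 to 1.0
  mdi: number;        // model disagreement index (range not stated in the source)
}

// Hypothetical application-level outcomes.
type Action = 'deliver' | 'reject' | 'escalate';

function routeVerdict(result: VerificationResult, minConfidence = 0.8): Action {
  const verdict = result.verdict;
  switch (verdict) {
    case 'BLOCK':
      return 'reject';
    case 'UNCERTAIN':
      return 'escalate'; // e.g. hand off to a human reviewer
    case 'ALLOW':
      // Treat a low-confidence ALLOW as needing review rather than shipping it.
      return result.confidence >= minConfidence ? 'deliver' : 'escalate';
    default: {
      // Compile-time exhaustiveness check: fails to type-check if a verdict is added.
      const unreachable: never = verdict;
      throw new Error(`Unhandled verdict: ${String(unreachable)}`);
    }
  }
}

const action = routeVerdict({ verdict: 'ALLOW', confidence: 0.94, mdi: 0 });
```

Routing UNCERTAIN to an explicit escalation path avoids silently treating it as either a pass or a failure.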