pot

Verification tools for agent builders

Open-source CLI and SDK for multi-model adversarial verification. Catch hallucinations before your users do.

$ npm install -g pot-cli
$ npm install pot-sdk

What pot does

Multi-model adversarial verification

Run claims through Claude, GPT, Gemini, Grok, and DeepSeek simultaneously. Models challenge each other. Consensus emerges or disagreement is flagged.

Faithfulness detection

Verify that agent outputs are grounded in their source material. Detect fabrication, drift, and unsupported claims with trace-level granularity.

Configurable tiers

Quick single-model checks for development. Full adversarial pipeline for production. You control the cost/confidence tradeoff.

CLI + SDK

Use pot-cli in your terminal and CI pipelines. Embed pot-sdk directly in your agent code. Same engine, two interfaces.
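
The cost/confidence tradeoff described under Configurable tiers can be made explicit in agent code. The following is a minimal sketch, assuming the two mode names shown in the SDK quick start ('lite' and 'standard'). The `pickMode` helper, the environment variable conventions, and the mapping of 'lite' to development are illustrative assumptions, not part of pot-sdk.

```typescript
// Hypothetical helper: not part of pot-sdk. The mode names come from the
// SDK quick start; mapping 'lite' to local development is an assumption.
type Mode = 'lite' | 'standard';

function pickMode(env: { NODE_ENV?: string; CI?: string }): Mode {
  // Use the full adversarial pipeline in production and in CI runs
  if (env.NODE_ENV === 'production' || env.CI === 'true') {
    return 'standard';
  }
  // Cheaper, faster checks everywhere else
  return 'lite';
}

console.log(pickMode({ NODE_ENV: 'development' })); // lite
console.log(pickMode({ NODE_ENV: 'production' }));  // standard
```

The returned value could then be passed as the `mode` option when calling `verify`.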

How it works

01

Guard

Scan the input for adversarial patterns and prompt injection before verification begins.

02

Generate

N diverse models produce independent proposals. Each evaluates the claim separately: no shared context, no groupthink.

03

Critique

A red-team model attacks the proposals. Objections are classified by materiality and fact-checked across providers.

04

Synthesize

A dual synthesizer aggregates the proposals and the critique into a verdict. Confidence is calibrated via reasoning-based aggregation.

05

Attest

Return ALLOW, BLOCK, or UNCERTAIN, with per-model evidence, a confidence score, and an optional signed attestation.
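
The five stages above can be sketched as a toy pipeline to show how they fit together. This is not pot's internal code. Every type and function name is hypothetical, the "models" are plain functions, the injection patterns are an illustrative list, and a plain vote ratio stands in for the reasoning-based aggregation. Treating a guard failure as BLOCK is also an assumption.

```typescript
// Toy sketch of the five stages. Every name below is hypothetical and the
// "models" are plain functions; this is not pot's internal code.
type Verdict = 'ALLOW' | 'BLOCK' | 'UNCERTAIN';
type Proposal = { model: string; supported: boolean };
type Objection = { text: string; material: boolean };
type Model = { name: string; judge: (claim: string) => Promise<boolean> };
type RedTeam = (claim: string, proposals: Proposal[]) => Objection[];
type Outcome = { verdict: Verdict; confidence: number; evidence: Proposal[] };

// 01 Guard: flag inputs that match simple injection patterns (illustrative list)
const INJECTION_PATTERNS = [/ignore (all )?previous instructions/i, /reveal .*system prompt/i];
function guard(input: string): boolean {
  return !INJECTION_PATTERNS.some((pattern) => pattern.test(input));
}

// 02 Generate: each model sees only the claim, so no proposal influences another
function generate(models: Model[], claim: string): Promise<Proposal[]> {
  return Promise.all(
    models.map(async (m) => ({ model: m.name, supported: await m.judge(claim) })),
  );
}

// 04 Synthesize: plain vote ratio, held at UNCERTAIN when a material objection stands
function synthesize(proposals: Proposal[], objections: Objection[]): Outcome {
  if (proposals.length === 0) return { verdict: 'UNCERTAIN', confidence: 0, evidence: [] };
  const support = proposals.filter((p) => p.supported).length / proposals.length;
  const contested = objections.some((o) => o.material);
  let verdict: Verdict = 'UNCERTAIN';
  if (support >= 0.8 && !contested) verdict = 'ALLOW';
  else if (support <= 0.2) verdict = 'BLOCK';
  // Confidence here is just the size of the majority, between 0.5 and 1
  return { verdict, confidence: Math.max(support, 1 - support), evidence: proposals };
}

// 03 Critique and 05 Attest: run the red team on the proposals, then return the outcome
async function runPipeline(models: Model[], redTeam: RedTeam, claim: string): Promise<Outcome> {
  // Assumption: an input that fails the guard is blocked outright
  if (!guard(claim)) return { verdict: 'BLOCK', confidence: 1, evidence: [] };
  const proposals = await generate(models, claim);
  return synthesize(proposals, redTeam(claim, proposals));
}

// Usage with three stub models that all accept the claim and no objections
const yes: Model = { name: 'a', judge: async () => true };
const models = [yes, { ...yes, name: 'b' }, { ...yes, name: 'c' }];
runPipeline(models, () => [], 'The Eiffel Tower is 330 metres tall')
  .then((r) => console.log(r.verdict, r.confidence)); // ALLOW 1
```

Keeping `generate` free of shared state is what makes the proposals independent. The red team only sees them after every model has answered.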

Quick start

CLI: verify a claim (bash)
# Install globally
npm install -g pot-cli

# Verify a single claim against its trace
pot-cli verify \
  --claim "The Eiffel Tower is 330 metres tall" \
  --trace "According to the official Eiffel Tower website, the structure stands at 330 metres including the antenna."

# Output:
# ✅ ALLOW (confidence: 0.94)
# 3/3 models agree. Source verified.

# Full adversarial pipeline
pot-cli verify \
  --claim "GPT-4 scores 90% on the bar exam" \
  --tier standard \
  --output json
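
In a CI pipeline, the JSON output can be turned into a pass/fail exit code. The sketch below is hedged: it assumes that `--output json` prints a single JSON object with a `verdict` field using the same values as the SDK, which this page does not confirm. The `exitCodeFor` helper and its exit-code convention are hypothetical, so check the real output before relying on it.

```typescript
// Assumption: `pot-cli verify --output json` prints one JSON object with a
// `verdict` field matching the SDK's verdict values. Check the real output first.
function exitCodeFor(jsonOutput: string, failOnUncertain = false): number {
  let parsed: unknown;
  try {
    parsed = JSON.parse(jsonOutput);
  } catch {
    return 2; // unparseable output: fail the build loudly
  }
  const verdict = (parsed as { verdict?: unknown } | null)?.verdict;
  if (verdict === 'ALLOW') return 0;
  if (verdict === 'UNCERTAIN') return failOnUncertain ? 1 : 0;
  return 1; // BLOCK or anything unexpected
}

console.log(exitCodeFor('{"verdict":"ALLOW","confidence":0.94}')); // 0
```

Whether UNCERTAIN should fail a build is a policy choice, which is why it is a parameter here.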

SDK: embed in your agent (typescript)
import { verify } from 'pot-sdk';

const agentOutput = `The contract implements a reentrancy
guard correctly. The mutex lock is acquired before
the external call and released after.`;

const result = await verify(agentOutput, {
  claim: 'Reentrancy guard is correctly implemented',
  mode: 'standard',          // 'lite' | 'standard'
  providers: [
    { name: 'anthropic', model: 'claude-sonnet-4-6',  apiKey: process.env.ANTHROPIC_API_KEY },
    { name: 'openai',    model: 'gpt-4o',            apiKey: process.env.OPENAI_API_KEY },
    { name: 'deepseek',  model: 'deepseek-v4-flash', apiKey: process.env.DEEPSEEK_API_KEY },
  ],
});

if (result.verdict === 'BLOCK') {
  // Hallucination or unsupported claim detected
  console.warn('Blocked:', result.confidence);
  // fallback() is your own handler; the return assumes this runs inside a function
  return fallback(result);
}

// result.verdict:    'ALLOW' | 'BLOCK' | 'UNCERTAIN'
// result.confidence: 0.0 to 1.0
// result.mdi:        model disagreement index
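
The snippet above handles BLOCK only. An agent usually also needs a policy for UNCERTAIN and for a weak ALLOW. The following is a minimal sketch of such a policy. The field names follow the comments above, but the `VerifyResult` type, the assumption that `mdi` is a number where higher means more disagreement, the thresholds, and the action names are all illustrative and not taken from pot-sdk.

```typescript
// Field names follow the quick start comments; the types, thresholds,
// and action names are assumptions for illustration.
type VerifyResult = {
  verdict: 'ALLOW' | 'BLOCK' | 'UNCERTAIN';
  confidence: number; // 0.0 to 1.0
  mdi: number;        // model disagreement index, assumed numeric, higher = more disagreement
};

type Action = 'pass' | 'reject' | 'escalate';

function route(result: VerifyResult, minConfidence = 0.8, maxMdi = 0.3): Action {
  if (result.verdict === 'BLOCK') return 'reject';
  if (result.verdict === 'UNCERTAIN') return 'escalate';
  // Treat a weak or contested ALLOW like UNCERTAIN: hand it to a human or a retry
  if (result.confidence < minConfidence || result.mdi > maxMdi) return 'escalate';
  return 'pass';
}

console.log(route({ verdict: 'ALLOW', confidence: 0.94, mdi: 0.05 })); // pass
console.log(route({ verdict: 'ALLOW', confidence: 0.6, mdi: 0.05 }));  // escalate
```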

Resources