About Pramana
Crowdsourced LLM drift detection via hash-based output consistency tracking.
How It Works
- Run
pramana run --tier cheap --model gpt-4o. This sends 10 predefined prompts to the model with deterministic parameters (temperature=0,seed=42). - Each response is checked against assertions — exact_match, contains, contains_any, is_json — across 6 categories: reasoning, factual, instruction-following, coding, safety, creative.
- Each output is hashed for drift detection. Same prompt + same model + same hash = no change.
- Submit results to the shared API. The backend appends records, runs daily aggregation, and serves the dashboard at pramana.pages.dev.
- Three tiers — cheap (10), moderate (25), comprehensive (75) — cover the same 6 categories at different density.
Hash Formula
output_hash = SHA-256(model_id + "|" + prompt_id + "|" + output)
Content-addressable versioning via SHA-256 ensures everyone runs the same suite version. Only hashes and aggregate counts are stored — not raw outputs.
Detection Signals
Pass rate changes. If a prompt that used to pass now fails, the model's behavior degraded on that task.
Hash changes. If the output hash changes even though parameters are identical, the model produced different output — catching changes that assertions alone might miss.
Limitations
- Provider-dependent reproducibility. OpenAI respects temperature=0 + seed. Anthropic does not — even at temperature=0, outputs are non-deterministic. Hash-based detection is most reliable against OpenAI.
- Value scales with contributors. One user's data shows their own history. Multiple independent submitters let you distinguish "my environment changed" from "the model changed for everyone."
- Hash catches any change. Including benign ones (formatting, whitespace). It tells you that something changed, not whether it matters.
Architecture
- Stateless API — no database, no connection pools. CSV + JSON on R2.
- Append-only storage — immutable results. No edits, no deletes.
- Privacy-preserving — only hashes and aggregate counts stored, not raw outputs.
- Zero cost — Cloudflare Pages free tier + R2 free tier = $0/month.