Getting Started
From install to first submission in under 5 minutes.
1. Install the CLI
pip install pramana-ai
Requires Python 3.10+. See CLI Quick Start for full setup.
2. Write an Eval Suite
An eval suite is a directory of YAML files. Each file defines a prompt, the expected behavior, and the scorer used to compare the two. See Test Suites for the full format.
# evals/factuality/capital.yaml
prompt: "What is the capital of France?"
expected: "Paris"
scorer: exact_match
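To make the role of the scorer field concrete, here is a minimal Python sketch of what an exact-match comparison might look like. The function name exact_match_score and the whitespace stripping are assumptions made for illustration; the actual normalization rules belong to the Test Suites documentation and are not confirmed here.

```python
def exact_match_score(output: str, expected: str) -> bool:
    """Hypothetical scorer: pass only if the output equals the expected string.

    Stripping surrounding whitespace is an assumption, not confirmed behavior.
    """
    return output.strip() == expected.strip()


# The case from evals/factuality/capital.yaml, expressed as plain data.
case = {"prompt": "What is the capital of France?", "expected": "Paris"}

print(exact_match_score("Paris", case["expected"]))   # True
print(exact_match_score("Paris.", case["expected"]))  # False
```

Under this reading, exact matching is strict: a trailing period or a full-sentence answer fails, so prompts in such a suite would need to ask for a short, literal answer.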
3. Run Evals
# Run against a specific model
pramana run --model gpt-4o --suite evals/factuality/
# Run with a specific temperature
pramana run --model claude-3.5-sonnet --suite evals/ --temperature 0.0
# Run multiple models
pramana run --model gpt-4o --model claude-3.5-sonnet --suite evals/
Results are stored locally in .pramana/results/ until submitted. See CLI Reference for all options.
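If runs are launched from a script rather than typed by hand, the commands above can be assembled programmatically. The sketch below only builds the argument list and prints it; the helper name build_run_command is hypothetical, and combining repeated --model flags with --temperature in a single invocation is an assumption, since the examples above show them separately.

```python
import shlex


def build_run_command(models, suite, temperature=None):
    """Assemble a `pramana run` argument list using only the flags shown above."""
    cmd = ["pramana", "run"]
    for model in models:
        cmd += ["--model", model]  # one --model flag per model
    cmd += ["--suite", suite]
    if temperature is not None:
        cmd += ["--temperature", str(temperature)]
    return cmd


cmd = build_run_command(["gpt-4o", "claude-3.5-sonnet"], "evals/", temperature=0.0)
print(shlex.join(cmd))
```

The resulting list can be passed to subprocess.run(cmd, check=True) once the CLI is installed.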
4. Submit Results
# Submit all pending results
pramana submit
# Submit results for a specific run
pramana submit --run-id abc123
Submissions go to POST /api/submit/batch. See API Reference for the request schema.
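For readers who want to see roughly what the CLI does on submit, the sketch below builds (but does not send) a request to the batch route. The base URL is the endpoint value from the pramana.toml example; the body layout {"results": [...]}, the field names inside each result, and the Bearer header are all assumptions, so check the API Reference for the real schema before relying on any of them.

```python
import json
import urllib.request

# Base endpoint as it appears in the pramana.toml example.
ENDPOINT = "https://pramana.pages.dev/api"


def build_batch_request(results, token=None):
    """Build a POST request for the batch submission route without sending it.

    The payload shape and the Authorization header format are hypothetical.
    """
    body = json.dumps({"results": results}).encode("utf-8")
    headers = {"Content-Type": "application/json"}
    if token:
        headers["Authorization"] = f"Bearer {token}"
    return urllib.request.Request(
        f"{ENDPOINT}/submit/batch", data=body, headers=headers, method="POST"
    )


req = build_batch_request([{"model": "gpt-4o", "output": "Paris"}])
print(req.get_method(), req.full_url)
```

In normal use pramana submit handles this step, so a hand-built request is mainly useful for debugging or for integrating from another tool.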
5. Authenticate (Optional)
Anonymous submissions work but are attributed to a shared anonymous user. Authenticate to track your personal stats.
# Open browser to authenticate
pramana auth login
# Or use a CLI token from the dashboard
pramana auth token <your-token>
Authentication uses OAuth (GitHub or Google) via pramana.pages.dev/signin. The CLI stores the JWT locally in ~/.config/pramana/auth.json.
6. View the Dashboard
Open pramana.pages.dev to see aggregated drift charts:
- Submission volume per model over time
- Output consistency tracking (hash-based drift detection)
- Contribution counts and unique contributor stats
Authenticated users can view their personal stats at pramana.pages.dev/my-stats.
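The idea behind hash-based drift detection can be illustrated in a few lines of Python: hash each model output and treat a changed hash for the same prompt as a sign that the model's behavior has shifted. This is a sketch of the general technique only; the choice of SHA-256 and the function names are assumptions, not a description of how the dashboard computes its charts.

```python
import hashlib


def output_hash(text: str) -> str:
    """Return a SHA-256 hex digest of a model output (hash choice is an assumption)."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def has_drifted(previous_output: str, current_output: str) -> bool:
    """Flag drift when two outputs for the same prompt hash differently."""
    return output_hash(previous_output) != output_hash(current_output)


print(has_drifted("Paris", "Paris"))                  # False
print(has_drifted("Paris", "The capital is Paris."))  # True
```

A comparison like this is most informative when runs use temperature 0.0, as in the run examples, because sampling noise would otherwise change the hash even when the model itself has not changed.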
7. Automate with CI
# .github/workflows/eval.yml
name: LLM Evals
on:
  schedule:
    - cron: '0 6 * * *' # Daily at 6 AM UTC
  workflow_dispatch:
jobs:
  eval:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: '3.12'
      - run: pip install pramana-ai
      - run: pramana auth token ${{ secrets.PRAMANA_TOKEN }}
      - run: pramana run --model gpt-4o --suite evals/
      - run: pramana submit
Configuration
# pramana.toml
[default]
suite = "evals/"
models = ["gpt-4o", "claude-3.5-sonnet"]
temperature = 0.0
[submit]
endpoint = "https://pramana.pages.dev/api"
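To sanity-check a pramana.toml before running, the file can be parsed with Python's standard TOML reader. The sketch below parses the same content from an inline string and prints the settings for each model; it shows only how the keys are structured and makes no claim about how the CLI itself reads the file. Note that tomllib is in the standard library from Python 3.11, so on 3.10 the fallback import assumes the third-party tomli package is installed.

```python
try:
    import tomllib  # standard library from Python 3.11
except ModuleNotFoundError:
    import tomli as tomllib  # third-party backport with the same API (assumed installed)

CONFIG_TEXT = """
[default]
suite = "evals/"
models = ["gpt-4o", "claude-3.5-sonnet"]
temperature = 0.0

[submit]
endpoint = "https://pramana.pages.dev/api"
"""

config = tomllib.loads(CONFIG_TEXT)

# Each model in the list shares the same suite and temperature.
for model in config["default"]["models"]:
    print(model, config["default"]["suite"], config["default"]["temperature"])
```

To read the real file instead, open it in binary mode and call tomllib.load(f).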