Promptster API v1 in 10 Minutes: Your First Test, Compare, and Error Handling

By Promptster Team · 2026-06-05

We've published a lot about why to test prompts across providers and almost nothing about how to actually call the API to do it. This post fixes that gap. In ten minutes you'll have a working key, a single-model test call, a multi-provider comparison, and error handling that survives a rate limit.

No SDK to install. The Public API v1 is plain HTTP, so curl and Python's requests library are enough.

Step 1: Get a Key (60 seconds)

API keys live at /developer/api-keys. Create one and you get a secret that starts with pk_live_* (production) or pk_test_* (sandbox). Copy it immediately — it's shown once.

export PROMPTSTER_API_KEY="pk_live_xxxxxxxxxxxxxxxxxxxx"
export PROMPTSTER_BASE="https://www.promptster.dev/v1"

Always use www.promptster.dev. The apex domain responds with a 307 redirect, and clients such as requests drop the Authorization header when they follow a redirect to a different host, so your call arrives unauthenticated and you get a confusing 401.
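If the base URL comes from configuration, it may be worth failing fast before any request goes out. A minimal sketch, assuming the www host named above; check_base_url is a hypothetical helper, not part of the API:

```python
from urllib.parse import urlsplit

EXPECTED_HOST = "www.promptster.dev"  # the host this post says to use


def check_base_url(base: str) -> str:
    """Return base without a trailing slash, or raise if it is not the HTTPS www host."""
    parts = urlsplit(base)
    if parts.scheme != "https" or parts.hostname != EXPECTED_HOST:
        raise ValueError(f"expected https://{EXPECTED_HOST}/..., got {base!r}")
    return base.rstrip("/")
```

Calling it once at startup turns the confusing 401 into an explicit configuration error.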

Step 2: Your First `/prompts/test` Call

/prompts/test runs one prompt against one provider/model. Auth is a bearer token.

curl -s -X POST "$PROMPTSTER_BASE/prompts/test" \
  -H "Authorization: Bearer $PROMPTSTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "provider": "openai",
    "model": "gpt-5.2",
    "prompt": "Summarize the CAP theorem in two sentences.",
    "temperature": 0.2,
    "max_tokens": 200
  }'

The response carries the text plus the metadata you actually care about for cost and latency tracking:

{
  "result": "The CAP theorem states that a distributed data store can...",
  "metadata": {
    "provider": "openai",
    "model": "gpt-5.2",
    "input_tokens": 14,
    "output_tokens": 58,
    "cost_usd": 0.00072,
    "latency_ms": 940
  }
}

The metadata block is the point. Token counts, dollar cost, and latency come back on every call so you can log cost-per-call from line one instead of reverse-engineering it from provider dashboards later.
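As a rough illustration of logging cost per call, the sketch below flattens the response shown above into one log record. The field names come from that sample response; cost_record and the derived usd_per_1k_tokens field are hypothetical additions, not something the API returns.

```python
import json


def cost_record(response: dict) -> dict:
    """Flatten a /prompts/test response into a single cost and latency record."""
    meta = response["metadata"]
    total_tokens = meta["input_tokens"] + meta["output_tokens"]
    return {
        "provider": meta["provider"],
        "model": meta["model"],
        "total_tokens": total_tokens,
        "cost_usd": meta["cost_usd"],
        "latency_ms": meta["latency_ms"],
        # Derived field (not returned by the API): blended USD per 1,000 tokens.
        "usd_per_1k_tokens": (
            round(meta["cost_usd"] / total_tokens * 1000, 6) if total_tokens else None
        ),
    }


# The sample response from this post, used as test data.
sample = {
    "result": "The CAP theorem states that a distributed data store can...",
    "metadata": {
        "provider": "openai",
        "model": "gpt-5.2",
        "input_tokens": 14,
        "output_tokens": 58,
        "cost_usd": 0.00072,
        "latency_ms": 940,
    },
}

print(json.dumps(cost_record(sample)))
```

Emitting one JSON line per call keeps the records easy to aggregate later by provider or model.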

The same call in Python (against Anthropic this time)

import os, requests

BASE = "https://www.promptster.dev/v1"
HEADERS = {"Authorization": f"Bearer {os.environ['PROMPTSTER_API_KEY']}"}

resp = requests.post(
    f"{BASE}/prompts/test",
    headers=HEADERS,
    json={
        "provider": "anthropic",
        "model": "claude-opus-4-6",  # illustrative ID; confirm the current one in the docs
        "prompt": "Summarize the CAP theorem in two sentences.",
        "temperature": 0.2,
        "max_tokens": 200,
    },
    timeout=60,
)
resp.raise_for_status()
data = resp.json()
print(data["result"])
print(data["metadata"]["cost_usd"], "USD")

Step 3: Fan Out With `/prompts/compare`

The whole reason to use Promptster instead of eleven raw SDKs is comparison. /prompts/compare sends one prompt to several providers in parallel and returns an array of results in one round trip.

curl -s -X POST "$PROMPTSTER_BASE/prompts/compare" \
  -H "Authorization: Bearer $PROMPTSTER_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Write a SQL query to find the second-highest salary.",
    "temperature": 0.0,
    "targets": [
      {"provider": "openai",    "model": "gpt-5.2"},
      {"provider": "anthropic", "model": "claude-opus-4-6"},
      {"provider": "google",    "model": "gemini-3.1-pro-preview"},
      {"provider": "deepseek",  "model": "deepseek-reasoner"}
    ]
  }'

You get back one entry per target, each with its own result and metadata. That's the data behind every cost-per-quality table we publish — and the foundation for a task-aware LLM router or a full multi-model router on the API.

Field	Meaning
`results[].provider`	Which provider produced this entry
`results[].result`	The generated text (or `null` on failure)
`results[].error`	Present only if that one target failed
`results[].metadata`	Per-target tokens, `cost_usd`, `latency_ms`

Partial failure is normal and expected. If DeepSeek 500s while the other three succeed, you get three results and one error entry — not a top-level failure. Always loop and check each entry rather than assuming all-or-nothing.
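The per-entry check might look like the sketch below. It assumes the top-level key is results, as the field table suggests, and that a failed entry has result set to null, an error field, or both. The shape of the error value (a plain string here) and the sample data are illustrative assumptions.

```python
def split_results(compare_response: dict):
    """Separate successful entries from failed ones in a /prompts/compare response."""
    ok, failed = [], []
    for entry in compare_response.get("results", []):
        # Treat an entry as failed if it carries an error or has no text.
        if entry.get("error") or entry.get("result") is None:
            failed.append(entry)
        else:
            ok.append(entry)
    return ok, failed


# Illustrative response: three successes and one failed target.
example = {
    "results": [
        {"provider": "openai", "result": "SELECT ...", "metadata": {"cost_usd": 0.001}},
        {"provider": "anthropic", "result": "SELECT ...", "metadata": {"cost_usd": 0.002}},
        {"provider": "google", "result": "SELECT ...", "metadata": {"cost_usd": 0.001}},
        {"provider": "deepseek", "result": None, "error": "upstream 500"},
    ]
}

ok, failed = split_results(example)
for entry in failed:
    print(f"{entry['provider']} failed: {entry['error']}")
```

Downstream code can then score only the ok list and decide separately whether to retry the failed targets.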

Step 4: Handle Rate Limits Like You Mean It

The API rate-limits per minute, per tier:

Tier	Requests / minute
Free	5
Builder	30
Scale	120
Enterprise	500

Exceed it and you get HTTP 429. A naive script falls over here; a real one backs off and retries.

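import time, requests

# BASE and HEADERS are defined in the Python example from Step 2.

def call_with_retry(payload, path="/prompts/test", max_tries=5):
    for attempt in range(max_tries):
        r = requests.post(f"{BASE}{path}", headers=HEADERS, json=payload, timeout=60)
        if r.status_code == 429:
            # Respect a numeric Retry-After if present, else exponential backoff.
            # (Retry-After can also be an HTTP date, which float() cannot parse.)
            try:
                wait = float(r.headers["Retry-After"])
            except (KeyError, ValueError):
                wait = 2 ** attempt
            # Don't sleep after the final attempt; just fall through and raise.
            if attempt < max_tries - 1:
                time.sleep(wait)
            continue
        r.raise_for_status()
        return r.json()
    raise RuntimeError("rate limited after retries")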
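Retrying reacts to a 429 after the fact. For batch jobs it can be simpler to pace requests so the limit is rarely hit. The sketch below spaces calls evenly using the per-tier limits from the table above. Pacer is a hypothetical helper, and even spacing is a conservative assumption, since this post does not say how the API measures its one-minute window.

```python
import time

# Requests per minute by tier, copied from the table in this post.
TIER_RPM = {"free": 5, "builder": 30, "scale": 120, "enterprise": 500}


def min_interval_seconds(tier: str) -> float:
    """Smallest gap between requests that stays within the tier's per-minute limit."""
    return 60.0 / TIER_RPM[tier]


class Pacer:
    """Blocks just long enough to keep calls at least one interval apart."""

    def __init__(self, tier: str, clock=time.monotonic, sleep=time.sleep):
        self.interval = min_interval_seconds(tier)
        self._clock = clock
        self._sleep = sleep
        self._next_ok = 0.0  # earliest clock time at which the next call may go out

    def wait(self) -> None:
        now = self._clock()
        if now < self._next_ok:
            self._sleep(self._next_ok - now)
            now = self._next_ok
        self._next_ok = now + self.interval
```

Call pacer.wait() immediately before each request. The retry logic is still worth keeping, because other processes sharing the same key count against the same limit.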

The status codes worth special-casing:

Status	Meaning	What to do
`400`	Bad request (invalid provider/model enum, malformed JSON)	Fix the payload; don't retry
`401`	Bad/missing key, or apex-domain redirect ate your header	Check key + use `www.`
`402`	Out of credits	Top up or downgrade the call
`429`	Rate limit	Back off and retry
`5xx`	Upstream provider hiccup	Retry once, then fall back to another provider
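The table above can be folded into one small dispatcher so that calling code branches on an action rather than on raw status codes. A minimal sketch; the function name and action labels are hypothetical and not part of the API.

```python
def action_for_status(status: int) -> str:
    """Map an HTTP status code to the handling suggested in the table above."""
    if 200 <= status < 300:
        return "ok"
    if status == 400:
        return "fix_payload"  # invalid provider/model or malformed JSON: don't retry
    if status == 401:
        return "check_key_and_host"  # bad or missing key, or the apex-domain redirect
    if status == 402:
        return "top_up_credits"
    if status == 429:
        return "backoff_and_retry"
    if 500 <= status < 600:
        return "retry_once_then_fallback"
    return "raise"  # anything else is unexpected; surface it to the caller


for code in (200, 400, 401, 402, 429, 503):
    print(code, action_for_status(code))
```

Keeping the mapping in one place means a change in policy, such as retrying 5xx responses twice instead of once, touches a single function.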

The Real Lesson

The Public API v1 is intentionally boring: bearer auth, JSON in, JSON out, the same metadata shape everywhere. That boring consistency is what lets you build interesting things on top — routers, production eval gates, drift monitors — without maintaining a different client per provider. Start with test, graduate to compare, and instrument metadata.cost_usd from day one.

Code tested against Promptster API v1 as of 2026-06-05. Requires a pk_live_* or pk_test_* key from /developer/api-keys. Model IDs are illustrative — confirm current IDs in the docs before copy-pasting.