Panshi Sentinel · scoring method, in the open

How we grade a relay

Scoring is fully transparent and explainable, with no black box: every grade and every verdict label has a defined meaning. We keep detection internals (probes and fingerprints) secret to prevent evasion, but the scoring logic itself is fully public.

Certification grades

🛡️ Panshi Certified A

Mainstream models are verified with strong evidence (trust ≥ 90, including a cryptographic signature or official-cloud confirmation); no substitution or downgrade.

✅ Verified B

Mainstream models are genuine (trust ≥ 75); no substitution or downgrade.

🟡 Partial C

Some models unconfirmed, or same-vendor downgrade suspected.

— Not certified

At least one mainstream model failed verification. We publish only positive certifications: relays that do not pass are simply "not certified", and we make no public substitution claims.

Capping rule: a few genuine models can never mask a substituted one

Any cross-vendor mismatch (substitution) on a single mainstream model caps the whole relay at "not certified", however genuine the others are; any same-vendor downgrade caps it at C, the grade defined above for suspected downgrades. Flagship models also carry more weight, so a relay cannot inflate its score with cheap genuine models while substituting a flagship.
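
As a minimal sketch, the capping rule can be read as a function applied after a base grade has been computed. The label strings and the Grade enum are hypothetical names for illustration, and the downgrade cap is taken from the grade definitions above, where C is the grade that lists "same-vendor downgrade suspected".

```python
from enum import IntEnum


class Grade(IntEnum):
    NOT_CERTIFIED = 0
    C = 1
    B = 2
    A = 3


def capped_grade(base: Grade, verdicts: list[str]) -> Grade:
    """Apply the capping rule to a grade computed from the trust index.

    `verdicts` holds one label per mainstream model. The label strings
    are hypothetical identifiers, not the service's real ones.
    """
    if "cross_vendor_mismatch" in verdicts:
        return Grade.NOT_CERTIFIED   # one substitution voids the whole relay
    if "same_vendor_downgrade" in verdicts:
        return min(base, Grade.C)    # a downgrade caps the grade at C (assumed)
    return base
```

Under these assumptions, a relay whose base grade is A but which has one substituted mainstream model ends up "not certified", regardless of how many other models are genuine.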

Trust index (0–100)

The relay-level trust index is a weighted average of per-model trust scores (flagships like opus / gpt-5 weigh more than cheap tiers), counting only models actually tested. Models outside our reference set are excluded and labelled separately — they neither help nor hurt the score.
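
The weighted average described above can be sketched as follows. The weights (3.0 for a flagship, 1.0 for a cheap tier) and the ModelResult type are assumptions made for this example; the actual weights are not stated here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ModelResult:
    name: str
    trust: Optional[float]  # None = outside the reference set ("not yet covered")
    weight: float           # hypothetical weight; flagships get a larger value


def relay_trust_index(results: list[ModelResult]) -> Optional[float]:
    """Weighted average of per-model trust over covered models only."""
    covered = [r for r in results if r.trust is not None]
    total_weight = sum(r.weight for r in covered)
    if total_weight == 0:
        return None  # nothing was tested, so there is no index to report
    return sum(r.trust * r.weight for r in covered) / total_weight


results = [
    ModelResult("flagship-model", trust=100, weight=3.0),
    ModelResult("cheap-model", trust=75, weight=1.0),
    ModelResult("unknown-model", trust=None, weight=1.0),  # excluded from the score
]
index = relay_trust_index(results)  # (100*3 + 75*1) / (3 + 1) = 93.75
```

The uncovered model contributes to neither the numerator nor the denominator, which is what "neither help nor hurt the score" means in practice.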

Per-model verdict labels

Each model is judged independently, with a 0–100 trust score and a clearly defined label:

Trust score	Label	Meaning
100	Cryptographically verified	Native signature passes official replay; strongest evidence.
92	Genuine · official-cloud resale	Genuine model resold via Bedrock / Vertex official cloud — no native signature, but channel fingerprint + multiple signals agree.
90	Behavioral-fingerprint verified	High-confidence behavioral-fingerprint match to the official-source reference.
85	Multi-signal verified	Multiple independent signals agree (unsigned).
75	Genuine · unsigned	Behaviorally genuine, but without cryptographic-grade evidence.
50	Unconfirmed	Insufficient signal to reach a confident verdict.
30	Same-vendor downgrade suspected	Appears swapped to a cheaper same-vendor tier.
10	Failed verification	Behaves like a different (cross-vendor) model.
5	Signature rejected	Claimed signature failed official verification.
—	Not yet covered (excluded)	Outside our reference set — no verdict, excluded from the score.
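
For readers who want to work with these labels programmatically, the table can be represented as a simple lookup. The scores are copied from the table above; the snake_case keys and the function name are hypothetical, chosen only for this sketch.

```python
from typing import Optional

# Scores copied from the verdict table; keys are hypothetical identifiers.
VERDICT_SCORES: dict[str, Optional[int]] = {
    "cryptographically_verified": 100,
    "genuine_official_cloud_resale": 92,
    "behavioral_fingerprint_verified": 90,
    "multi_signal_verified": 85,
    "genuine_unsigned": 75,
    "unconfirmed": 50,
    "same_vendor_downgrade_suspected": 30,
    "failed_verification": 10,
    "signature_rejected": 5,
    "not_yet_covered": None,  # no verdict; excluded from the relay score
}


def scored_verdicts(verdicts: dict[str, str]) -> dict[str, int]:
    """Map model name -> trust score, dropping models that have no verdict."""
    scores = {}
    for model, label in verdicts.items():
        score = VERDICT_SCORES[label]
        if score is not None:
            scores[model] = score
    return scores
```

Representing "not yet covered" as None rather than 0 keeps uncovered models from being mistaken for failed ones.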

How we test

Behavioral fingerprinting

A set of probes samples the model's answer-style distribution and compares it against our official-source reference fingerprints. This identifies which model the endpoint actually behaves like, even when it has been prompted to present itself as a different model, such as Claude.

Cryptographic signature / official replay

Models supporting native signatures are verified via official replay; genuine models resold through Bedrock / Vertex official cloud lack native signatures but are cross-confirmed via channel fingerprint and multiple signals.

Multi-signal cross-check + per-model verdict

High confidence requires several independent signals (identity, latency, capability, rank tests) to agree. Each model is judged independently — one verified model never speaks for the whole relay.
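
One way to picture the agreement requirement is as a vote that only succeeds when the conclusive signals are unanimous and numerous enough. This is an illustrative sketch, not the actual detection logic: the min_agree threshold of 3 and the function name are assumptions.

```python
from collections import Counter
from typing import Optional


def agreed_model(signals: dict[str, Optional[str]], min_agree: int = 3) -> Optional[str]:
    """Return the model name that enough signals agree on, else None.

    `signals` maps a signal name (identity, latency, capability, rank)
    to the model that signal points to, or None when it is inconclusive.
    `min_agree` is a hypothetical threshold, not a published value.
    """
    votes = Counter(v for v in signals.values() if v is not None)
    if len(votes) != 1:
        return None  # no conclusive signal, or signals contradict each other
    model, count = next(iter(votes.items()))
    return model if count >= min_agree else None
```

A None result here would correspond to the "Unconfirmed" label: the signals were too few or too inconsistent to support a confident verdict.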

Why you can trust it

Per-model verdicts

Within one relay, Claude may be genuine while GPT is swapped, so each model is judged on its own.

Honest boundaries

Models we cannot cover are marked "not yet covered": no guessing, no false accusations. Same-vendor downgrade detection uses double-guard thresholds to limit false positives.

Positive-only certification

We publish only verifications and certifications. Poorly performing relays simply lack a badge, rank lower, or drop off the board; we never publish negative accusations, for brand and legal safety.

⚠️ Results are probabilistic signals, not legal proof. This certification is a point-in-time snapshot; a relay's backend may change at any time, and continuous assurance requires paid monitoring. Models outside our reference set are marked "not yet covered" with no verdict. We only publish positive verification — those that do not pass are simply "not certified".

Verify it yourself / monitor continuously

🔎 Run a free check with your own key

Paste answers or run a single-file CLI to see in seconds whether a model is genuine or substituted.

🛡️ Monitor your upstreams continuously

24/7 monitoring of every upstream channel: get alerted the moment one is swapped or degraded.
