LLM Attested

Run cards for captured model work.

Runcard turns LLM-related actions into signed receipts. The public card says what was captured, how it was captured, what counted as evidence, and where the verifier can fetch the proof.

It does not claim magic. A card proves a stream of captured events. It does not prove that no other model, account, browser, phone, or laptop was used.

[Image: example team runcard for The Esoteric Build Society]

[Image: example individual signal runcard for a solo developer]

public surfaces

One card for the team. One card for the person.

Hackathons are judged by team. Daily proof is shown by individuals. The data model supports both without changing capture infrastructure.

Team card

Aggregates members, captured events, token stats, capture methods, assurance labels, and proof links for judging.
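
A minimal sketch of that aggregation, assuming a simplified event record. The field names (`member`, `source`, `assurance`, `tokens`) and the label strings are hypothetical, not the project's schema. The point is that totals are kept per assurance label rather than merged into one number.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class CapturedEvent:
    # Hypothetical fields for illustration only.
    member: str      # who produced the event
    source: str      # capture path, e.g. "gateway" or "browser"
    assurance: str   # label attached by the capture path
    tokens: int      # token count reported for the call


def aggregate_team(events):
    """Roll events up into team-card stats, keeping assurance labels separate."""
    tokens_by_assurance = Counter()
    events_by_source = Counter()
    for e in events:
        tokens_by_assurance[e.assurance] += e.tokens
        events_by_source[e.source] += 1
    return {
        "members": sorted({e.member for e in events}),
        "event_count": len(events),
        "total_tokens": sum(e.tokens for e in events),
        "tokens_by_assurance": dict(tokens_by_assurance),
        "events_by_source": dict(events_by_source),
    }
```

A judge reading the result can see how much of the total came from each capture path instead of a single blended figure.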

Individual signal

Tracks one participant or agent session. It is suitable for a profile README because it is a plain SVG with embedded metadata.
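
One way a plain SVG can carry metadata is the standard SVG `<metadata>` element. The sketch below assumes the metadata is stored as JSON text inside that element; the actual layout used by the cards is not specified here, and the keys shown in the usage note are made up.

```python
import json
import xml.etree.ElementTree as ET

SVG_NS = "http://www.w3.org/2000/svg"


def render_signal_svg(title, metadata):
    """Build a tiny SVG with a JSON payload inside <metadata> (assumed layout)."""
    ET.register_namespace("", SVG_NS)  # serialize SVG as the default namespace
    svg = ET.Element(
        f"{{{SVG_NS}}}svg",
        {"width": "320", "height": "80", "viewBox": "0 0 320 80"},
    )
    meta = ET.SubElement(svg, f"{{{SVG_NS}}}metadata")
    meta.text = json.dumps(metadata, sort_keys=True)
    label = ET.SubElement(svg, f"{{{SVG_NS}}}text", {"x": "12", "y": "44"})
    label.text = title
    return ET.tostring(svg, encoding="unicode")


def read_metadata(svg_text):
    """Recover the JSON payload from an SVG produced by render_signal_svg."""
    root = ET.fromstring(svg_text)
    meta = root.find(f"{{{SVG_NS}}}metadata")
    return json.loads(meta.text)
```

For example, `read_metadata(render_signal_svg("solo signal", {"log_root": "abc"}))` returns the same dictionary that was embedded, so a verifier can pull proof pointers straight out of the image file.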

capture sources

Different paths, different claims.

Each capture source writes the same signed event envelope. The card labels the source and assurance instead of flattening everything into "verified."
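
A minimal sketch of such an envelope, under loud assumptions: the field names are hypothetical, and HMAC-SHA256 from the Python standard library stands in for whatever signature scheme the project actually uses (a real deployment would more likely use an asymmetric signature so verifiers do not hold the signing key). What it illustrates is that `source` and `assurance` sit inside the signed body, so a label cannot be upgraded after the fact.

```python
import hashlib
import hmac
import json


def canonical(body):
    """Serialize deterministically so signer and verifier hash the same bytes."""
    return json.dumps(body, sort_keys=True, separators=(",", ":")).encode()


def sign_event(key, source, assurance, payload):
    """Wrap a payload in an envelope; HMAC is a stand-in for a real signature."""
    body = {"source": source, "assurance": assurance, "payload": payload}
    sig = hmac.new(key, canonical(body), hashlib.sha256).hexdigest()
    return {"body": body, "sig": sig}


def verify_event(key, envelope):
    """Recompute the tag over the body and compare in constant time."""
    expected = hmac.new(key, canonical(envelope["body"]), hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, envelope["sig"])
```

If someone edits `assurance` from a browser label to a gateway label, `verify_event` returns `False` because the label is part of the signed bytes.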

Attested gateway

Infrastructure: OpenAI-compatible gateway, event signer, event sink, TEE quote verifier.

Trust: Strong for calls routed through the measured gateway.

Implication: Good for strict events. It still cannot prove that no model was used outside the gateway.

Local proxy

Infrastructure: Local OpenAI-compatible proxy near Ollama, LM Studio, oMLX, vLLM, or llama.cpp.

Trust: Participant controlled unless run inside a managed or attested environment.

Implication: Useful for local-model visibility. Not hard anti-cheat by itself.

Browser extension

Infrastructure: Extension plus local capture daemon. The daemon signs and forwards events.

Trust: Desktop browser extensions are participant controlled.

Implication: Good activity signal. Do not score it like an enforced gateway.

CLI and SDK wrapper

Infrastructure: One binary launches the agent, injects env vars, and records session events.

Trust: Proves the wrapper ran. LLM calls still need gateway or proxy capture.

Implication: Low-friction onboarding. Best as glue, not the only evidence source.
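
The wrapper pattern can be sketched in a few lines. This is not the `llmattest` implementation: `RUNCARD_SESSION_ID` is a hypothetical variable name, and pointing `OPENAI_BASE_URL` at a capture gateway is an assumption that only helps for clients that read that variable. Note that the sketch records session events only, which matches the trust note above: it shows the wrapper ran, not what the agent sent to a model.

```python
import os
import subprocess
import sys
import time


def run_wrapped(agent_cmd, gateway_url, session_id):
    """Launch an agent with injected env vars and record start/end events."""
    env = dict(os.environ)
    env["OPENAI_BASE_URL"] = gateway_url      # assumption: client honors this
    env["RUNCARD_SESSION_ID"] = session_id    # hypothetical variable name
    events = [{
        "type": "session_start",
        "session": session_id,
        "argv": list(agent_cmd),
        "ts": time.time(),
    }]
    proc = subprocess.run(agent_cmd, env=env)
    events.append({
        "type": "session_end",
        "session": session_id,
        "exit_code": proc.returncode,
        "ts": time.time(),
    })
    return proc.returncode, events
```

Calling `run_wrapped([sys.executable, "agent.py"], "http://127.0.0.1:8080/v1", "s1")` would run a hypothetical `agent.py` under the injected environment and return its exit code alongside the two session events.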

Strict workbench

Infrastructure: Managed workspace, controlled credentials, capture defaults, and limited escape hatches.

Trust: Stronger than participant-local tools, weaker than a quoted TEE boundary.

Implication: Good middle tier for events that need fairness without forcing every workflow through one model.

TEE workspace

Infrastructure: Confidential VM or enclave, quote collection, signed capture stack, remote verifier.

Trust: Strong for the measured workspace and the hooks it actually runs.

Implication: Highest assurance path. Still state what is observed; do not imply omniscience.

Provider TLS notary

Infrastructure: Retrospective proof import from provider web or API evidence.

Trust: Can support claims about what appeared in that one account or session.

Implication: Mercy lane and audit tool. It cannot prove absence across other accounts.

Manual import

Infrastructure: CLI or daemon accepts a report and emits the same signed event envelope.

Trust: Self-reported unless backed by separate evidence.

Implication: Keeps late teams from falling out of the event. Label it plainly.

infrastructure

The system is a few narrow interfaces.

Capture sources differ. The backend should not. Everything becomes a signed ContestEvent, stored in an append-only event log, then rendered into cards and proof documents.
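
As an illustration of the append-only idea, here is a minimal hash-chained log. It is one possible construction, not necessarily the one the backend uses: each entry commits to the previous entry's hash, and the latest hash serves as the log root. The `EventLog` class and the entry field names are hypothetical.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


class EventLog:
    """In-memory append-only log; every entry commits to the one before it."""

    def __init__(self):
        self._entries = []
        self._root = GENESIS

    def append(self, event):
        """Add an event and return the new log root."""
        body = json.dumps(event, sort_keys=True, separators=(",", ":"))
        digest = hashlib.sha256((self._root + body).encode()).hexdigest()
        self._entries.append({"prev": self._root, "event": event, "hash": digest})
        self._root = digest
        return digest

    @property
    def root(self):
        """Hash of the latest entry; commits to the full history."""
        return self._root

    def entries(self):
        """Read-only snapshot of the entries, oldest first."""
        return tuple(self._entries)
```

Because each hash covers the previous one, changing, reordering, or dropping an earlier event changes the root, which is what lets a card's stats be tied to a specific set of receipts.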

Hosted or self-hosted event service

Creates events, team payloads, join URLs, dashboards, runcards, SVGs, and proof endpoints.

One participant binary

Starts the local companion, probes the machine, selects a capture path, and wraps agent commands.

Capture adapters

Gateway, local proxy, browser, workbench, TEE workspace, TLS notary, and manual import all emit the same envelope.

Proof endpoints

Cards link to JSON, receipts, credential documents, and event log roots. SVGs embed compact metadata.

trust assumptions

What a card means.

Claims we can make

  • The card was derived from signed captured events.
  • The event log root commits to the receipts used for the stats.
  • Assurance labels describe the capture path that produced each event.
  • Remote attestation proves a measured component when the path has real quote evidence.

Claims we must not make

  • It does not prove a participant never used another model.
  • It does not make browser or local capture equal to a TEE gateway.
  • It does not make provider token numbers independently true.
  • It does not turn a voluntary universal card into an anti-cheat system.

current path

Start with the participant binary.

llmattest start --team-payload team.json -- <agent command>

The same event service can also run hosted, or self-hosted inside a TEE. The page labels the difference; the verifier checks the receipt chain.
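
A sketch of what checking a receipt chain could look like, assuming receipts are hash-linked records with hypothetical `prev`, `event`, and `hash` fields and that the card publishes the final hash as its root. The real receipt format may differ.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for the first receipt's "prev"


def link_hash(prev, event):
    """Hash an event together with the previous receipt's hash."""
    body = json.dumps(event, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev + body).encode()).hexdigest()


def verify_chain(receipts, published_root):
    """Return True only if every link is intact and the chain ends at the root."""
    prev = GENESIS
    for receipt in receipts:
        if receipt["prev"] != prev:
            return False  # receipt missing or out of order
        if receipt["hash"] != link_hash(prev, receipt["event"]):
            return False  # event content was altered
        prev = receipt["hash"]
    return prev == published_root
```

The check is the same regardless of where the service ran. Hosting location changes the assurance label on the page, not the verification arithmetic.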

repo map

Implementation pointers.