Pi Coding Experience
Pi is Hordago's governed terminal science agent — a coding agent you drive from your shell, wrapped in the phase-FSM governance, evidence-locking, and deterministic gates that turn an interactive session into a sealed, auditable scientific run.
A vanilla Pi session is a generic terminal coding agent: a fast, general-purpose runtime substrate. Hordago keeps those ergonomics and adds the parts raw Pi does not have: a phase state machine, evidence contracts, code-computed gates, manifest sealing, cross-vendor verification, and L1–L5 trust levels. The result is a session that a reviewer can trust: every claim carries provenance, every promotion clears a gate, and the run layout on disk is canonical and sealable.
Naming
Pi runs on an upstream open-source terminal coding-agent runtime. One package in that chain is mid-promotion from an external-org mirror; it is referenced here by role only, never by name, until the promotion lands.
Quickstart: start a governed session
One command turns a plain checkout into a governed Pi science session. The
pi-science surface scaffolds the canonical run directory, seals the inputs,
and drops you into an interactive agent that already knows the phase rules.
# Start a governed Pi science session from a domain pack.
# The pack templates the session; the harness governs it.
pi-science start --pack fine-mapping --run-dir runs/locus-2p23
# Inside the session, the state-transition discipline is always available
# (see agents/skills/scipi/SKILL.md):
scipi status # where the run is in the phase FSM
scipi next # the deterministic next legal action
scipi preview # dry-run a transition before it mutates state
scipi transition # advance a phase — only through a gate, never by editing state
Where the surface lives
The interactive pi-science surface and the headless science-runtime
PiAdapter ship as sibling packages that this portal documents; the pack
resolution, phase FSM, and gate contracts they drive are the Hordago-core
modules cited under Source pointers. The scipi
state-transition commands are defined by the skill in this repo
(agents/skills/scipi/SKILL.md).
A vanilla Pi session becomes a governed science session through five layers, each backed by a concrete component:
| Layer | Component | Role |
|---|---|---|
| Session surface | pi-science (≈26 tools, ≈30 slash commands, 6 skills) | The governed terminal agent the operator talks to |
| Headless driver | science-runtime PiAdapter | Drives the same session non-interactively for batch and CI |
| Gate authority | sci-lock-os | Seals artifacts and holds deterministic FAIL/BLOCK verdicts |
| Templating | Sci-PI Packs (scipi.pack.v1) | Domain templates: questions, validators, workflows, L1–L5, evals |
| Planning | /scipi:planning-* flow | Claim-scoped, falsification-first plan approved before execution |
Why a harness around Pi
Pi on its own is a capable generic coding agent, but "capable" is not "admissible." Science needs the run to be reproducible, the claims to be provenance-backed, and the gates to be computed by code rather than narrated by a model. That is exactly the delta the harness supplies.
| Concern | Raw Pi session | Governed Pi science session |
|---|---|---|
| Phase control | Free-form; the agent does whatever the prompt implies | Phase FSM (plan → eda → lock → analyze → report); advances only through explicit policy checks |
| Evidence | Text in the transcript | Evidence contracts — claims carried with provenance and gate state |
| Gates | Model self-assessment | Deterministic verdicts computed in code; FAIL/BLOCK beats any confidence display |
| Reproducibility | Whatever files happen to exist | Canonical run layout with sealed inputs and a freeze.lock |
| Verification | Single-model say-so | Cross-vendor verify — a second lane must corroborate before promotion |
| Autonomy | Unbounded | Scaled by L1–L5 trust levels against evidence strength and reversibility |
Fail-closed is the whole point
Per the scipi state discipline, a deterministic FAIL or BLOCK verdict takes
precedence over confidence, calibration, or display-only tier labels. A
passing model-generated explanation is not verifier evidence; missing
verifier evidence is a block, not approval to continue.
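The precedence rule above can be sketched in a few lines. This is a minimal illustration, not the sci-lock-os implementation: the `GateInput` type, the `resolve_gate` function, and the `PASS` label are hypothetical names introduced here, and the real gate vocabulary may differ.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical verdict labels; FAIL and BLOCK come from the text, PASS is assumed.
PASS, FAIL, BLOCK = "PASS", "FAIL", "BLOCK"


@dataclass(frozen=True)
class GateInput:
    verifier_verdict: Optional[str]  # None means no verifier evidence exists
    model_confidence: float          # display-only; never consulted below


def resolve_gate(gate: GateInput) -> str:
    """Fail closed: only an explicit verifier PASS allows promotion."""
    if gate.verifier_verdict is None:
        return BLOCK  # missing verifier evidence is a block, not approval
    if gate.verifier_verdict in (FAIL, BLOCK):
        return gate.verifier_verdict  # deterministic verdict beats confidence
    if gate.verifier_verdict == PASS:
        return PASS
    return BLOCK  # unknown labels also fail closed


print(resolve_gate(GateInput(None, 0.99)))  # BLOCK
print(resolve_gate(GateInput(FAIL, 0.99)))  # FAIL
print(resolve_gate(GateInput(PASS, 0.10)))  # PASS
```

Note that `model_confidence` is carried but never read: in this sketch a high confidence cannot rescue a failed or missing verdict, and a low confidence cannot veto a verifier pass.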
flowchart LR
OP[Operator in terminal] --> PI[pi-science session]
PI --> RT[science-runtime PiAdapter]
RT --> FSM{Phase FSM<br/>plan→eda→lock→analyze→report}
FSM --> GATE[sci-lock-os gates<br/>L1–L5, deterministic]
GATE -->|pass| SEAL[Sealed artifacts + manifest]
GATE -->|FAIL / BLOCK| STOP[Halt — no promotion]
PACK[(Sci-PI Pack<br/>scipi.pack.v1)] -.templates.-> PI
PACK -.wires validators/evals.-> GATE
The kernel and gate internals are documented in Kernel & Gates; this page is the operator's view of the surface that sits on top of them.
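As a rough mental model of the phase FSM in the diagram, the sketch below advances one phase at a time and only when a gate callback agrees. The phase order comes from this page; `next_phase`, `transition`, and `TransitionBlocked` are hypothetical names for illustration and do not describe the actual science-runtime API.

```python
from typing import Callable, Optional

# Phase order as given on this page.
PHASES = ("plan", "eda", "lock", "analyze", "report")


class TransitionBlocked(RuntimeError):
    """Raised when a transition is refused or impossible."""


def next_phase(current: str) -> Optional[str]:
    """Return the single legal successor, or None at the final phase."""
    index = PHASES.index(current)  # ValueError on an unknown phase
    return PHASES[index + 1] if index + 1 < len(PHASES) else None


def transition(current: str, gate: Callable[[str, str], bool]) -> str:
    """Advance one phase, but only if the gate approves the move."""
    target = next_phase(current)
    if target is None:
        raise TransitionBlocked(f"{current!r} is the final phase")
    if not gate(current, target):
        raise TransitionBlocked(f"gate refused {current} -> {target}")
    return target


print(transition("plan", lambda src, dst: True))  # eda
```

There is deliberately no function that sets the phase directly; the only way forward in this sketch is through `transition`, which mirrors the rule that state advances through a gate rather than by editing it.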
Pack / template system
Packs are how a new scientific domain enters the system without a code change.
A pack is a declarative scipi.pack.v1 document (science-packs/<name>/pack.yaml)
that names the session's skills, commands, required_questions,
validators, golden_cases, eval_tasks, and an assurance block that wires
each validator and eval to a trust level. Packs compose by inheritance through
extends.
The shipped chain is base → bioinformatics → fine-mapping:
- base supplies the sealed-analysis loop skill, the research_question/input_data_refs questions, and the two contract validators every run needs (canonical-artifacts-present at L1, provenance-sha256-complete at L5).
- bioinformatics extends base with domain skills and shape validators.
- fine-mapping extends bioinformatics with LD-panel questions, GWAS domain validators, and two eval tasks.
# science-packs/fine-mapping/pack.yaml (excerpt)
name: fine-mapping
version: 1.0.0
extends: [bioinformatics] # base → bioinformatics → fine-mapping
required_questions:
  - id: ld_reference_panel
    prompt: "Which LD reference panel matches the study ancestry?"
    required: true
validators:
  - id: fine-mapping-ld-panel-match
    phase: L2                 # domain check, gated at trust level 2
    kind: domain
eval_tasks:
  - id: PACK-FM-001
    domain: gwas_causal_variant
    subtype: fine_mapping
    expected_decision: answer
  - id: PACK-FM-002
    subtype: colocalization
    expected_decision: insufficient_evidence   # defer when eQTL evidence is absent
Resolving a pack flattens the inheritance chain, then generates a runnable instance — a sealed manifest, a lock, and the per-phase artifacts the run expects:
from hordago.science_packs import generate_science_instance, load_science_pack
pack = load_science_pack("fine-mapping")
assert pack.extends == ["bioinformatics"] # inheritance is explicit
generated = generate_science_instance("fine-mapping", "runs/locus-2p23")
# writes: <manifest> (hordago.science_pack_manifest.v1),
# <lock> (hordago.science_pack_lock.v1, manifest_sha256: "sha256:…"),
# and skills.md, commands.md, required-questions.json, validators.json,
# golden-cases.json, eval-tasks.json, assurance-plan.json
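To make "flattens the inheritance chain" concrete, here is a minimal sketch that resolves extends parents-first and merges validators in that order. It does not reproduce hordago.science_packs: the in-memory `PACKS` dictionary and the `resolution_order` and `flatten_validators` helpers are hypothetical, and the real merge rules may differ.

```python
from typing import Dict, List, Tuple

# Hypothetical in-memory stand-ins for the three pack.yaml documents.
PACKS: Dict[str, dict] = {
    "base": {
        "extends": [],
        "validators": ["canonical-artifacts-present", "provenance-sha256-complete"],
    },
    # Shape validators exist per the text, but their ids are not given here.
    "bioinformatics": {"extends": ["base"], "validators": []},
    "fine-mapping": {
        "extends": ["bioinformatics"],
        "validators": ["fine-mapping-ld-panel-match"],
    },
}


def resolution_order(name: str, packs: Dict[str, dict],
                     seen: Tuple[str, ...] = ()) -> List[str]:
    """Return pack names parents-first, rejecting inheritance cycles."""
    if name in seen:
        raise ValueError(f"inheritance cycle at {name!r}")
    order: List[str] = []
    for parent in packs[name]["extends"]:
        for item in resolution_order(parent, packs, seen + (name,)):
            if item not in order:
                order.append(item)
    order.append(name)
    return order


def flatten_validators(name: str, packs: Dict[str, dict]) -> List[str]:
    """Merge validators along the chain, keeping the first occurrence."""
    merged: List[str] = []
    for pack_name in resolution_order(name, packs):
        for validator in packs[pack_name]["validators"]:
            if validator not in merged:
                merged.append(validator)
    return merged


print(resolution_order("fine-mapping", PACKS))
# ['base', 'bioinformatics', 'fine-mapping']
print(flatten_validators("fine-mapping", PACKS))
```

Resolving parents first means a child pack only ever adds to what its ancestors declared, which is one simple way to keep the base contract validators present in every derived pack.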
The assurance block is what makes a pack governed rather than merely
descriptive: build_pack_assurance_plan maps every declared validator, golden
case, and eval task onto the L1–L5 ladder.
| Level | What it gates | fine-mapping example |
|---|---|---|
| L1 | Canonical-artifact contract | canonical-artifacts-present |
| L2 | Domain validators | fine-mapping-ld-panel-match |
| L3 | Golden cases | fine-mapping-susie-credible-set |
| L4 | Eval tasks / scorecards | PACK-FM-001 |
| L5 | Release validators | provenance-sha256-complete |
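The table can be read as a lookup from item kind to trust level. The sketch below illustrates that reading only; it is not build_pack_assurance_plan. The `sketch_assurance_plan` function and every kind label except `domain` (which appears in the pack excerpt) are assumptions made for this example.

```python
from typing import Dict, Iterable, List, Tuple

LEVELS = ("L1", "L2", "L3", "L4", "L5")

# Assumed kind labels; only "domain" appears in the pack excerpt on this page.
LEVEL_BY_KIND = {
    "contract": "L1",
    "domain": "L2",
    "golden_case": "L3",
    "eval_task": "L4",
    "release": "L5",
}


def sketch_assurance_plan(items: Iterable[Tuple[str, str]]) -> Dict[str, List[str]]:
    """Group (id, kind) pairs by trust level; unknown kinds are rejected."""
    plan: Dict[str, List[str]] = {level: [] for level in LEVELS}
    for item_id, kind in items:
        if kind not in LEVEL_BY_KIND:
            raise ValueError(f"{item_id}: unknown kind {kind!r}")
        plan[LEVEL_BY_KIND[kind]].append(item_id)
    return plan


plan = sketch_assurance_plan([
    ("canonical-artifacts-present", "contract"),
    ("fine-mapping-ld-panel-match", "domain"),
    ("fine-mapping-susie-credible-set", "golden_case"),
    ("PACK-FM-001", "eval_task"),
    ("provenance-sha256-complete", "release"),
])
for level in LEVELS:
    print(level, plan[level])
```

Rejecting unknown kinds, rather than dropping them silently, keeps the sketch consistent with the fail-closed stance described earlier on this page.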
New domain = new pack, no code change
To onboard a domain, author a pack.yaml that extends an existing pack and
declares its questions, validators, and evals. Pack promotion and
renamespacing are tracked upstream; this page documents the system as it
lands. Pack files must stay free of external-org identifiers — a test
(tests/test_science_packs.py::test_pack_files_do_not_contain_external_org_identifiers)
enforces it.
Planning system
Before a governed session touches data, it plans. The /scipi:planning-* flow
produces a ScientificPlanningPacket: a claim-scoped, falsification-first
plan that a human approves before execution.
- Claim scope: the plan states exactly which claim the run is allowed to make, so downstream evidence and gates can be checked against a fixed target rather than a drifting narrative.
- Falsification: each claim is paired with what would refute it, so an insufficient_evidence outcome (see PACK-FM-002) is a first-class, gate-respecting result, not a failure to explain away.
- Approve-before-execute: the packet is a human gate. The FSM will not leave plan for eda/lock until the packet is approved, keeping operator review explicit at the first major handoff.
/scipi:planning-draft → assemble a ScientificPlanningPacket (claim + falsifiers)
/scipi:planning-review → operator reviews claim scope and falsification design
/scipi:planning-approve → seal the packet; unlock the plan→eda transition
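The draft, review, approve sequence can be pictured as a small data structure with one guarded method. This is a sketch under stated assumptions: `PlanningPacketSketch`, its fields, and `can_leave_plan` are hypothetical, and the real ScientificPlanningPacket schema is not shown on this page.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class PlanningPacketSketch:
    """Hypothetical stand-in for a ScientificPlanningPacket."""
    claim: str
    falsifiers: List[str]
    approved_by: str = ""  # stays empty until a human approves

    def approve(self, reviewer: str) -> None:
        """Record approval only for a scoped claim with stated falsifiers."""
        if not self.claim.strip():
            raise ValueError("claim scope must be stated")
        if not self.falsifiers:
            raise ValueError("falsification-first: at least one falsifier required")
        if not reviewer.strip():
            raise ValueError("approval requires a named human reviewer")
        self.approved_by = reviewer

    @property
    def approved(self) -> bool:
        return bool(self.approved_by)


def can_leave_plan(packet: PlanningPacketSketch) -> bool:
    """The plan phase stays closed until the packet is approved."""
    return packet.approved


packet = PlanningPacketSketch(
    claim="Example claim text",
    falsifiers=["Example observation that would refute the claim"],
)
print(can_leave_plan(packet))  # False
packet.approve("reviewer-name")
print(can_leave_plan(packet))  # True
```

A packet with no falsifiers cannot be approved in this sketch, which is one way to make the falsification-first rule structural rather than advisory.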
File discipline and directory scaffolding
Governance is legible on disk. Every governed session materializes the same
canonical run layout, so any reviewer — or the headless PiAdapter in CI — finds
artifacts in the same place across every domain.
runs/locus-2p23/
├── state/ # phase_state.json — the FSM cursor; never hand-edited
├── inputs/ # sealed input references (input_data_refs)
├── provenance/ # SHA-256 provenance records for every sealed artifact
├── results/
│ ├── validation/ # validator outputs by phase (L1–L5)
│ ├── qc/ # quality-control artifacts
│ └── final/ # promoted, release-ready results
├── scratch/ # ephemeral working space; never promoted
├── archive/ # superseded runs, retained for audit
└── freeze.lock # the seal — closes the run to further mutation
The canonical artifacts a sealed run is expected to produce are fixed by the
pack's golden cases — for fine-mapping: result.json, report.md,
provenance.json, gate_status.json, and session_summary.json; for base:
phase_state.json and freeze.lock.
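A presence check in the spirit of canonical-artifacts-present might look like the sketch below. The artifact names are the fine-mapping ones listed above; the `missing_artifacts` helper is hypothetical, and placing all artifacts in one flat directory is an assumption made for illustration, since this page does not say where each file sits in the run layout.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

# Canonical fine-mapping artifacts, as listed on this page.
FINE_MAPPING_ARTIFACTS = (
    "result.json",
    "report.md",
    "provenance.json",
    "gate_status.json",
    "session_summary.json",
)


def missing_artifacts(artifact_dir: Path, expected: Iterable[str]) -> List[str]:
    """Return expected artifact names that are not regular files (assumed flat layout)."""
    return [name for name in expected if not (artifact_dir / name).is_file()]


# Demo against a throwaway directory holding only two of the five artifacts.
with tempfile.TemporaryDirectory() as tmp:
    run = Path(tmp)
    for name in ("result.json", "report.md"):
        (run / name).write_text("{}")
    print(missing_artifacts(run, FINE_MAPPING_ARTIFACTS))
    # ['provenance.json', 'gate_status.json', 'session_summary.json']
```

An empty return value is the only passing outcome; any missing name would be grounds for an L1 failure under the contract described above.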
Lease and seal
Protected state files are never mutated directly. State advances under a
lease (acquired for a transition) and is closed by a seal (freeze.lock
plus SHA-256 provenance). Use scipi transition to change state; editing
state/ or provenance/ by hand breaks the seal chain and the L5 release
validator (provenance-sha256-complete) will block promotion.
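Why a hand edit breaks the seal chain is easiest to see with digests. The sketch below recomputes SHA-256 for each recorded file and reports any mismatch. It uses only the standard library; the record format (relative path mapped to a "sha256:…" string) and the `sha256_of` and `verify_provenance` helpers are assumptions for illustration, not the provenance-sha256-complete implementation.

```python
import hashlib
import tempfile
from pathlib import Path
from typing import Dict, List


def sha256_of(path: Path) -> str:
    """Stream a file through SHA-256 and return a 'sha256:<hex>' string."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def verify_provenance(root: Path, records: Dict[str, str]) -> List[str]:
    """Compare recorded digests (assumed format) against files on disk."""
    problems: List[str] = []
    for rel_path, expected in sorted(records.items()):
        target = root / rel_path
        if not target.is_file():
            problems.append(f"missing: {rel_path}")
        elif sha256_of(target) != expected:
            problems.append(f"digest mismatch: {rel_path}")
    return problems


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "result.json").write_text('{"ok": true}')
    records = {"result.json": sha256_of(root / "result.json")}
    print(verify_provenance(root, records))  # []
    (root / "result.json").write_text('{"ok": false}')  # simulate a hand edit
    print(verify_provenance(root, records))  # ['digest mismatch: result.json']
```

Any non-empty result would be treated as a block in a fail-closed setting, which is why state changes are expected to go through scipi transition rather than a text editor.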
Pi as a named surface in the trinity
Pi is one of Hordago's first-class co-scientist surfaces, not an afterthought.
The co-scientist surface registry enumerates the
parity-grouped surfaces (rest, sdk_python, sdk_typescript, cli, mcp,
ui_card) that all drive the same runScienceLoop. Pi is the named terminal
surface in the trinity this page describes: the governed interactive agent,
backed by the headless PiAdapter for batch/CI parity and by the run cockpit
for the UI view. All three enter the same phase FSM and clear the same gates;
the surface changes, the governance does not.
Source pointers
- agents/skills/scipi/SKILL.md: Sci-Pi state-transition discipline
- src/hordago/science_packs.py: pack resolution, inheritance, instance generation
- science-packs/base/pack.yaml, science-packs/bioinformatics/pack.yaml, science-packs/fine-mapping/pack.yaml
- src/hordago/science_runtime.py: phase FSM and sealed-run execution
- tests/test_science_packs.py: pack inheritance, assurance-plan wiring, and leak guard
- references/co-scientist-surface-registry.md: parity-grouped co-scientist surfaces
- references/engine-catalog.md: engines reachable from the orchestrator