Pi Coding Experience

Pi is Hordago's governed terminal science agent — a coding agent you drive from your shell, wrapped in the phase-FSM governance, evidence-locking, and deterministic gates that turn an interactive session into a sealed, auditable scientific run.

A vanilla Pi session is a generic terminal coding agent: a fast, general-purpose runtime substrate. Hordago keeps those ergonomics and adds what raw Pi lacks: a phase state machine, evidence contracts, code-computed gates, manifest sealing, cross-vendor verification, and L1–L5 trust levels. The result is a session that a reviewer can trust: every claim carries provenance, every promotion clears a gate, and the run layout on disk is canonical and sealable.

Naming

Pi runs on an upstream open-source terminal coding-agent runtime. One package in that chain is mid-promotion from an external-org mirror; it is referenced here by role only, never by name, until the promotion lands.

Quickstart — start a governed session

One command turns a plain checkout into a governed Pi science session. The pi-science surface scaffolds the canonical run directory, seals the inputs, and drops you into an interactive agent that already knows the phase rules.

# Start a governed Pi science session from a domain pack.
# The pack templates the session; the harness governs it.
pi-science start --pack fine-mapping --run-dir runs/locus-2p23

# Inside the session, the state-transition discipline is always available
# (see agents/skills/scipi/SKILL.md):
scipi status      # where the run is in the phase FSM
scipi next        # the deterministic next legal action
scipi preview     # dry-run a transition before it mutates state
scipi transition  # advance a phase — only through a gate, never by editing state
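
The relationship between preview and transition can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the scipi implementation: only the phase order (plan, eda, lock, analyze, report) comes from this page, while the PhaseCursor class, its method names, and the boolean gate_passed argument are hypothetical.

```python
from dataclasses import dataclass

# Phase order as listed on this page; everything else here is illustrative.
PHASES = ("plan", "eda", "lock", "analyze", "report")


@dataclass
class PhaseCursor:
    """Hypothetical stand-in for the FSM cursor; not the real scipi state."""
    phase: str = "plan"

    def next_phase(self):
        """Return the only legal successor, or None at the final phase."""
        i = PHASES.index(self.phase)
        return PHASES[i + 1] if i + 1 < len(PHASES) else None

    def preview(self, gate_passed: bool) -> dict:
        """Dry run: describe the transition without mutating the cursor."""
        target = self.next_phase()
        allowed = target is not None and gate_passed
        return {"from": self.phase, "to": target, "allowed": allowed}

    def transition(self, gate_passed: bool) -> str:
        """Advance one phase, but only when the gate verdict allows it."""
        plan = self.preview(gate_passed)
        if not plan["allowed"]:
            raise PermissionError(f"blocked: {plan['from']} -> {plan['to']}")
        self.phase = plan["to"]
        return self.phase


cursor = PhaseCursor()
print(cursor.preview(gate_passed=False))    # the cursor stays on "plan"
print(cursor.transition(gate_passed=True))  # the cursor moves to "eda"
```

The design point is that transition is built on top of preview, so a dry run and a real advance can never disagree about what is legal.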

Where the surface lives

The interactive pi-science surface and the headless science-runtime PiAdapter ship as sibling packages that this portal documents; the pack resolution, phase FSM, and gate contracts they drive are the Hordago-core modules cited under Source pointers. The scipi state-transition commands are defined by the skill in this repo (agents/skills/scipi/SKILL.md).

A vanilla Pi session becomes a governed science session through five layers, each a real component in this repo:

| Layer | Component | Role |
| --- | --- | --- |
| Session surface | pi-science (≈26 tools, ≈30 slash commands, 6 skills) | The governed terminal agent the operator talks to |
| Headless driver | science-runtime PiAdapter | Drives the same session non-interactively for batch and CI |
| Gate authority | sci-lock-os | Seals artifacts and holds deterministic FAIL/BLOCK verdicts |
| Templating | Sci-PI Packs (scipi.pack.v1) | Domain templates: questions, validators, workflows, L1–L5, evals |
| Planning | /scipi:planning-* flow | Claim-scoped, falsification-first plan approved before execution |

Why a harness around Pi

Pi on its own is a capable generic coding agent, but "capable" is not "admissible." Science needs the run to be reproducible, the claims to be provenance-backed, and the gates to be computed by code rather than narrated by a model. That is exactly the delta the harness supplies.

| Concern | Raw Pi session | Governed Pi science session |
| --- | --- | --- |
| Phase control | Free-form; the agent does whatever the prompt implies | Phase FSM (plan → eda → lock → analyze → report); advances only through explicit policy checks |
| Evidence | Text in the transcript | Evidence contracts: claims carried with provenance and gate state |
| Gates | Model self-assessment | Deterministic verdicts computed in code; FAIL/BLOCK beats any confidence display |
| Reproducibility | Whatever files happen to exist | Canonical run layout with sealed inputs and a freeze.lock |
| Verification | Single-model say-so | Cross-vendor verify: a second lane must corroborate before promotion |
| Autonomy | Unbounded | Scaled by L1–L5 trust levels against evidence strength and reversibility |

Fail-closed is the whole point

Per the scipi state discipline, a deterministic FAIL or BLOCK verdict takes precedence over confidence, calibration, or display-only tier labels. A passing model-generated explanation is not verifier evidence; missing verifier evidence is a block, not approval to continue.
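
As a rough illustration of that precedence rule, the sketch below resolves a final verdict from a deterministic gate output and a model confidence score. The function name, the verdict strings as Python values, and the confidence parameter are assumptions made for this example; the real gate logic lives in the kernel and is not reproduced here.

```python
from typing import Optional


def resolve_verdict(gate_verdict: Optional[str], model_confidence: float) -> str:
    """Fail-closed resolution (illustrative; names are not from the codebase).

    gate_verdict is the deterministic verifier output: "PASS", "FAIL",
    "BLOCK", or None when no verifier evidence exists. model_confidence is
    accepted only to show that it never influences the result.
    """
    if gate_verdict is None:
        return "BLOCK"        # missing verifier evidence is a block
    if gate_verdict in ("FAIL", "BLOCK"):
        return gate_verdict   # a deterministic verdict wins outright
    if gate_verdict == "PASS":
        return "PASS"
    return "BLOCK"            # unrecognized verdicts also fail closed


print(resolve_verdict(None, model_confidence=0.99))
print(resolve_verdict("FAIL", model_confidence=0.99))
```

Both calls print a non-passing verdict even though the confidence is high, which is the behavior the rule describes.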

flowchart LR
    OP[Operator in terminal] --> PI[pi-science session]
    PI --> RT[science-runtime PiAdapter]
    RT --> FSM{Phase FSM<br/>plan→eda→lock→analyze→report}
    FSM --> GATE[sci-lock-os gates<br/>L1–L5, deterministic]
    GATE -->|pass| SEAL[Sealed artifacts + manifest]
    GATE -->|FAIL / BLOCK| STOP[Halt — no promotion]
    PACK[(Sci-PI Pack<br/>scipi.pack.v1)] -.templates.-> PI
    PACK -.wires validators/evals.-> GATE

The kernel and gate internals are documented in Kernel & Gates; this page is the operator's view of the surface that sits on top of them.

Pack / template system

Packs are how a new scientific domain enters the system without a code change. A pack is a declarative scipi.pack.v1 document (science-packs/<name>/pack.yaml) that names the session's skills, commands, required_questions, validators, golden_cases, eval_tasks, and an assurance block that wires each validator and eval to a trust level. Packs compose by inheritance through extends.

The shipped chain is base → bioinformatics → fine-mapping:

  • base supplies the sealed-analysis loop skill, the research_question / input_data_refs questions, and the two contract validators every run needs (canonical-artifacts-present at L1, provenance-sha256-complete at L5).
  • bioinformatics extends base with domain skills and shape validators.
  • fine-mapping extends bioinformatics with LD-panel questions, GWAS domain validators, and two eval tasks.

# science-packs/fine-mapping/pack.yaml (excerpt)
name: fine-mapping
version: 1.0.0
extends: [bioinformatics]          # base → bioinformatics → fine-mapping
required_questions:
  - id: ld_reference_panel
    prompt: "Which LD reference panel matches the study ancestry?"
    required: true
validators:
  - id: fine-mapping-ld-panel-match
    phase: L2                      # domain check, gated at trust level 2
    kind: domain
eval_tasks:
  - id: PACK-FM-001
    domain: gwas_causal_variant
    subtype: fine_mapping
    expected_decision: answer
  - id: PACK-FM-002
    subtype: colocalization
    expected_decision: insufficient_evidence   # defer when eQTL evidence is absent

Resolving a pack flattens the inheritance chain, then generates a runnable instance — a sealed manifest, a lock, and the per-phase artifacts the run expects:

from hordago.science_packs import generate_science_instance, load_science_pack

pack = load_science_pack("fine-mapping")
assert pack.extends == ["bioinformatics"]     # inheritance is explicit

generated = generate_science_instance("fine-mapping", "runs/locus-2p23")
# writes: <manifest> (hordago.science_pack_manifest.v1),
#         <lock> (hordago.science_pack_lock.v1, manifest_sha256: "sha256:…"),
#         and skills.md, commands.md, required-questions.json, validators.json,
#         golden-cases.json, eval-tasks.json, assurance-plan.json
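
To make "flattening the inheritance chain" concrete, here is a minimal sketch that walks extends from the root ancestor down and merges validator ids along the way. It assumes packs are plain dictionaries held in memory; the PACKS table and both helper functions are hypothetical and do not mirror the internals of hordago.science_packs.

```python
# Hypothetical in-memory packs; real packs live in science-packs/<name>/pack.yaml.
PACKS = {
    "base": {
        "extends": [],
        "validators": ["canonical-artifacts-present", "provenance-sha256-complete"],
    },
    # bioinformatics adds shape validators, but this page does not name them.
    "bioinformatics": {"extends": ["base"], "validators": []},
    "fine-mapping": {
        "extends": ["bioinformatics"],
        "validators": ["fine-mapping-ld-panel-match"],
    },
}


def resolve_chain(name, packs, _seen=()):
    """Return pack names from the root ancestor down to `name`; reject cycles."""
    if name in _seen:
        raise ValueError("cyclic extends: " + " -> ".join(_seen + (name,)))
    chain = []
    for parent in packs[name]["extends"]:
        for ancestor in resolve_chain(parent, packs, _seen + (name,)):
            if ancestor not in chain:
                chain.append(ancestor)
    chain.append(name)
    return chain


def flatten_validators(name, packs):
    """Merge validator ids along the chain: parents first, duplicates dropped."""
    merged = []
    for pack_name in resolve_chain(name, packs):
        for validator in packs[pack_name]["validators"]:
            if validator not in merged:
                merged.append(validator)
    return merged


print(resolve_chain("fine-mapping", PACKS))
print(flatten_validators("fine-mapping", PACKS))
```

Resolving parents before children means a child pack's declarations always come last, which is one common way to let the most specific pack take precedence.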

The assurance block is what makes a pack governed rather than merely descriptive: build_pack_assurance_plan maps every declared validator, golden case, and eval task onto the L1–L5 ladder.

| Level | What it gates | fine-mapping example |
| --- | --- | --- |
| L1 | Canonical-artifact contract | canonical-artifacts-present |
| L2 | Domain validators | fine-mapping-ld-panel-match |
| L3 | Golden cases | fine-mapping-susie-credible-set |
| L4 | Eval tasks / scorecards | PACK-FM-001 |
| L5 | Release validators | provenance-sha256-complete |
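
The ladder can be pictured as a grouping step. The sketch below is an assumption-laden illustration, not the body of build_pack_assurance_plan: it supposes each validator carries its own level in a phase field (as in the pack excerpt), that golden cases land on L3, and that eval tasks land on L4. The function name sketch_assurance_plan is hypothetical.

```python
from collections import defaultdict

LEVELS = ("L1", "L2", "L3", "L4", "L5")


def sketch_assurance_plan(validators, golden_cases, eval_tasks):
    """Group declared items by trust level (illustrative mapping only)."""
    plan = defaultdict(list)
    for validator in validators:
        plan[validator["phase"]].append(validator["id"])  # level declared per validator
    for case in golden_cases:
        plan["L3"].append(case["id"])                     # golden cases gate at L3
    for task in eval_tasks:
        plan["L4"].append(task["id"])                     # eval tasks gate at L4
    # Emit every level, including empty ones, so gaps in coverage are visible.
    return {level: plan[level] for level in LEVELS}


plan = sketch_assurance_plan(
    validators=[
        {"id": "canonical-artifacts-present", "phase": "L1"},
        {"id": "fine-mapping-ld-panel-match", "phase": "L2"},
        {"id": "provenance-sha256-complete", "phase": "L5"},
    ],
    golden_cases=[{"id": "fine-mapping-susie-credible-set"}],
    eval_tasks=[{"id": "PACK-FM-001"}, {"id": "PACK-FM-002"}],
)
for level, items in plan.items():
    print(level, items)
```

Returning all five levels, even when one is empty, makes an unwired level easy to spot in review.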

New domain = new pack, no code change

To onboard a domain, author a pack.yaml that extends an existing pack and declares its questions, validators, and evals. Pack promotion and renamespacing are tracked upstream; this page documents the system as it lands. Pack files must stay free of external-org identifiers — a test (tests/test_science_packs.py::test_pack_files_do_not_contain_external_org_identifiers) enforces it.

Planning system

Before a governed session touches data, it plans. The /scipi:planning-* flow produces a ScientificPlanningPacket: a claim-scoped, falsification-first plan that a human approves before execution.

  • Claim scope: the plan states exactly which claim the run is allowed to make, so downstream evidence and gates can be checked against a fixed target rather than a drifting narrative.
  • Falsification: each claim is paired with what would refute it, so an insufficient_evidence outcome (see PACK-FM-002) is a first-class, gate-respecting result, not a failure to explain away.
  • Approve-before-execute: the packet is a human gate. The FSM will not leave plan for eda/lock until the packet is approved, keeping operator review explicit at the first major handoff.

/scipi:planning-draft     → assemble a ScientificPlanningPacket (claim + falsifiers)
/scipi:planning-review    → operator reviews claim scope and falsification design
/scipi:planning-approve   → seal the packet; unlock the plan→eda transition
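
The approve-before-execute rule can be sketched as a small data structure. The class below is a hypothetical stand-in: the field names, the approve method, and the example claim text are invented for illustration and are not the schema of ScientificPlanningPacket.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class PlanningPacketSketch:
    """Illustrative stand-in for a planning packet; all fields are assumed."""
    claim: str
    falsifiers: List[str] = field(default_factory=list)
    approved_by: Optional[str] = None

    def approve(self, reviewer: str) -> None:
        """Record human approval; refuse a plan that cannot be falsified."""
        if not self.claim.strip():
            raise ValueError("packet needs a scoped claim")
        if not self.falsifiers:
            raise ValueError("packet needs at least one falsifier")
        self.approved_by = reviewer

    def may_leave_plan(self) -> bool:
        """The plan -> eda transition stays locked until a human approves."""
        return self.approved_by is not None


packet = PlanningPacketSketch(
    claim="Hypothetical claim: one credible set explains the signal at the locus",
    falsifiers=["Hypothetical falsifier: the signal vanishes under a matched LD panel"],
)
print(packet.may_leave_plan())  # locked before approval
packet.approve("operator")
print(packet.may_leave_plan())  # unlocked after approval
```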

File discipline and directory scaffolding

Governance is legible on disk. Every governed session materializes the same canonical run layout, so any reviewer — or the headless PiAdapter in CI — finds artifacts in the same place across every domain.

runs/locus-2p23/
├── state/            # phase_state.json — the FSM cursor; never hand-edited
├── inputs/           # sealed input references (input_data_refs)
├── provenance/       # SHA-256 provenance records for every sealed artifact
├── results/
│   ├── validation/   # validator outputs by phase (L1–L5)
│   ├── qc/           # quality-control artifacts
│   └── final/        # promoted, release-ready results
├── scratch/          # ephemeral working space; never promoted
├── archive/          # superseded runs, retained for audit
└── freeze.lock       # the seal — closes the run to further mutation

The canonical artifacts a sealed run is expected to produce are fixed by the pack's golden cases — for fine-mapping: result.json, report.md, provenance.json, gate_status.json, and session_summary.json; for base: phase_state.json and freeze.lock.
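
For orientation, the directory tree above can be materialized with a few lines of standard-library Python. This is only a sketch of the layout, not how pi-science scaffolds a run: the scaffold_run_dir helper is hypothetical, and it deliberately does not create freeze.lock, since the page describes that file as the seal that closes a run.

```python
import tempfile
from pathlib import Path

# Directory names copied from the layout on this page.
CANONICAL_DIRS = (
    "state",
    "inputs",
    "provenance",
    "results/validation",
    "results/qc",
    "results/final",
    "scratch",
    "archive",
)


def scaffold_run_dir(root):
    """Create the canonical directories under root; return them sorted."""
    root = Path(root)
    for rel in CANONICAL_DIRS:
        (root / rel).mkdir(parents=True, exist_ok=True)
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_dir()
    )


with tempfile.TemporaryDirectory() as tmp:
    for name in scaffold_run_dir(Path(tmp) / "locus-2p23"):
        print(name)
```

Because mkdir is called with exist_ok=True, running the helper twice on the same root is harmless.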

Lease and seal

Protected state files are never mutated directly. State advances under a lease (acquired for a transition) and is closed by a seal (freeze.lock plus SHA-256 provenance). Use scipi transition to change state; editing state/ or provenance/ by hand breaks the seal chain and the L5 release validator (provenance-sha256-complete) will block promotion.
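
Why a hand edit breaks the seal chain is easy to see with a digest check. The sketch below records SHA-256 digests for a set of files and later reports any file whose content no longer matches. The seal and verify_seal helpers and the record shape are assumptions for illustration; only the "sha256:" prefix style is taken from the lock example on this page.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_of(path):
    """Stream a file through SHA-256 and return a prefixed hex digest."""
    digest = hashlib.sha256()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(65536), b""):
            digest.update(chunk)
    return "sha256:" + digest.hexdigest()


def seal(run_dir, rel_paths):
    """Record a digest per relative path (hypothetical provenance record)."""
    run_dir = Path(run_dir)
    return {rel: sha256_of(run_dir / rel) for rel in rel_paths}


def verify_seal(run_dir, record):
    """Return the relative paths that are missing or no longer match."""
    run_dir = Path(run_dir)
    return [
        rel
        for rel, digest in sorted(record.items())
        if not (run_dir / rel).is_file() or sha256_of(run_dir / rel) != digest
    ]


with tempfile.TemporaryDirectory() as tmp:
    state = Path(tmp) / "phase_state.json"
    state.write_text('{"phase": "plan"}')
    record = seal(tmp, ["phase_state.json"])
    print(verify_seal(tmp, record))        # nothing has drifted yet
    state.write_text('{"phase": "report"}')  # simulate a hand edit
    print(verify_seal(tmp, record))        # the edited file is reported
```

A check of this kind is deterministic: the same bytes always give the same digest, so a mismatch cannot be argued away by a confident explanation.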

Pi as a named surface in the trinity

Pi is one of Hordago's first-class co-scientist surfaces, not an afterthought. The co-scientist surface registry enumerates the parity-grouped surfaces (rest, sdk_python, sdk_typescript, cli, mcp, ui_card) that all drive the same runScienceLoop. The trinity is the governed interactive agent (Pi, the named terminal surface), the headless PiAdapter that backs it for batch/CI parity, and the run cockpit for the UI view. All three enter the same phase FSM and clear the same gates: the surface changes, the governance does not.

Source pointers

  • agents/skills/scipi/SKILL.md — Sci-Pi state-transition discipline
  • src/hordago/science_packs.py — pack resolution, inheritance, instance generation
  • science-packs/base/pack.yaml, science-packs/bioinformatics/pack.yaml, science-packs/fine-mapping/pack.yaml
  • src/hordago/science_runtime.py — phase FSM and sealed-run execution
  • tests/test_science_packs.py — pack inheritance, assurance-plan wiring, and leak guard
  • references/co-scientist-surface-registry.md — parity-grouped co-scientist surfaces
  • references/engine-catalog.md — engines reachable from the orchestrator