Context Engine
The Context Engine is the persistent-memory layer of the platform: the
hordago-knowledge-graph engine that stores biomedical entities, claims, and
their provenance as a queryable graph and serves them back as graph-grounded
evidence context. Hordago positions it as the Tier 2a knowledge_query engine
and routes to it through the hordago-kg MCP category.
Three Distinct Context Surfaces
The platform runs three separate context/retrieval surfaces. They are complementary, not interchangeable. This page describes the Context Engine and draws the boundary against the other two so that audits do not conflate them.
| Surface | Retrieval model | Role |
|---|---|---|
| Context Engine (hordago-knowledge-graph) | Persistent graph + GraphRAG + vector/hybrid | Reasoning over a durable biomedical knowledge graph |
| context-discovery-mcp | BM25 + Reciprocal Rank Fusion (RRF) | Stateless skill/tool discovery index over repo surfaces |
| biocontext7 | Skill catalog (bioinformatics tools) | Tool-discovery MCP over 47K+ bioinformatics tools |
The Context Engine is the only one of the three that persists a knowledge graph;
context-discovery-mcp is a stateless lexical index and biocontext7 is a skill
catalog.
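
To make the contrast concrete, the Reciprocal Rank Fusion step named for context-discovery-mcp can be sketched as follows. This is a minimal illustration of the general technique, not that service's actual code: the function name, the skill ids, and the constant `k=60` (the conventional default for RRF) are all assumptions, as is the second ranked list being fused with the BM25 hits.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of ids into one list, best first.

    Each id scores 1 / (k + rank) per list it appears in (rank starts at 1);
    the scores are summed across lists.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by fused score (descending), then by id for a deterministic order.
    return sorted(scores, key=lambda d: (-scores[d], d))


# Hypothetical ranked lists from two lexical passes over the same index.
bm25_hits = ["skill-a", "skill-b", "skill-c"]
other_hits = ["skill-b", "skill-d", "skill-a"]

fused = reciprocal_rank_fusion([bm25_hits, other_hits])
print(fused)
```

Because RRF only needs rank positions, it holds no state between queries, which matches the stateless role described for that surface.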
Storage & Retrieval Stack
The engine layers a graph store, an analytical store, and a full-text/vector index so a single scientific question can be answered by lexical lookup, vector similarity, hybrid fusion, or multi-hop graph traversal.
| Layer | Technology | Purpose |
|---|---|---|
| Graph store | Neo4j | Nodes/edges for genes, variants, pathways, drugs; Cypher query |
| Analytical store | DuckDB | Columnar analytics over graph-derived tables |
| Full-text index | SQLite FTS5 | Lexical search over node/claim text |
| Vector / hybrid search | HNSW index over embeddings | Semantic and hybrid (lexical + vector) retrieval |
| Community detection | Louvain | Community summaries for DRIFT-style exploration |
Embeddings are produced with BGE and specter2 models (the embedding
work consumed from hordago-knowledge-graph#162), giving the vector and hybrid
search paths biomedical-tuned representations.
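
As a rough picture of what the vector path computes, the sketch below runs an exact cosine-similarity top-k search over a toy in-memory index. An HNSW index approximates this same nearest-neighbour query without scanning every vector. The node ids, the three-dimensional vectors, and the function names are hypothetical; real BGE or specter2 embeddings have hundreds of dimensions and are not reproduced here.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query, index, k=2):
    """Return the k node ids whose vectors are most similar to the query."""
    scored = [(cosine(query, vec), node_id) for node_id, vec in index.items()]
    # Highest similarity first; ties broken by node id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [node_id for _, node_id in scored[:k]]


# Toy embeddings (assumed values, for illustration only).
toy_index = {
    "gene:TP53": [0.9, 0.1, 0.0],
    "gene:BRCA1": [0.8, 0.2, 0.1],
    "drug:aspirin": [0.0, 0.1, 0.9],
}

nearest = top_k([1.0, 0.0, 0.0], toy_index, k=2)
print(nearest)
```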
GraphRAG & DRIFT Reasoning
Retrieval is exposed as GraphRAG: neighborhood exploration, global summaries, and Microsoft GraphRAG-style DRIFT paths across the graph. The reasoning layer roadmap (entity linking, query decomposition, DRIFT path scoring, Bayesian evidence aggregation, answer synthesis) is tracked as epic E-02.
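
Neighborhood exploration amounts to a bounded multi-hop traversal from a seed entity. The sketch below shows the idea as a breadth-first search over a plain adjacency dictionary. It is an assumption-laden illustration rather than the engine's implementation: the entity ids are invented placeholders, and the real engine would run the traversal against the graph store, not an in-memory dict.

```python
from collections import deque


def neighborhood(adjacency, start, max_hops):
    """Return {node: hop_distance} for every node within max_hops of start."""
    distances = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if distances[node] == max_hops:
            continue  # Do not expand past the hop limit.
        for neighbor in adjacency.get(node, ()):
            if neighbor not in distances:
                distances[neighbor] = distances[node] + 1
                queue.append(neighbor)
    return distances


# Hypothetical variant -> gene -> pathway -> drug chain.
toy_graph = {
    "variant:rs0001": ["gene:GENE_A"],
    "gene:GENE_A": ["variant:rs0001", "pathway:PATH_X"],
    "pathway:PATH_X": ["gene:GENE_A", "drug:DRUG_Y"],
    "drug:DRUG_Y": ["pathway:PATH_X"],
}

two_hop = neighborhood(toy_graph, "variant:rs0001", max_hops=2)
print(two_hop)
```

With `max_hops=2` the drug node, three hops from the seed variant, is excluded. A hop limit of this kind is what keeps neighborhood context bounded.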
MCP Tool Surface
The engine ships as an MCP server exposing roughly 20 MCP tools grouped into five skills:
| Skill | Coverage |
|---|---|
| graph-query | Query genes, variants, pathways, drugs, and ad-hoc Cypher |
| graph-ingest | Ingest external nodes/edges and sources into the KG |
| graph-search | Full-text and hybrid search |
| graph-explore | Neighborhoods, global summaries, and DRIFT-style paths |
| graph-status | Graph health, schema, cache, and namespace audit |
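
For orientation, an MCP client invokes any of these tools with a JSON-RPC 2.0 `tools/call` request. The sketch below builds such a request in Python. The envelope (`jsonrpc`, `id`, `method`, `params.name`, `params.arguments`) follows the MCP specification, but the tool name `graph_search_hybrid` and its arguments are hypothetical, since this page does not list the individual tool names or their schemas.

```python
import json


def build_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request for the MCP `tools/call` method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


# Hypothetical tool name and arguments, for illustration only.
request = build_tool_call(
    1,
    "graph_search_hybrid",
    {"query": "TP53 missense variants", "limit": 5},
)
payload = json.dumps(request)
print(payload)
```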
Source Pointers
- references/plugins/hordago-knowledge-graph.md
- references/kg-reasoning-epic.md
- docs/adr/003-kg-reasoning-layer.md
- references/engine-catalog.md
- src/hordago/kg_ingest.py
- src/hordago/route_intent.py