# 01 · The ByteLyst Ecosystem as an Agentic RAG Fabric
The trick in this interview is to stop treating ByteLyst as "a bunch of side projects"
and start describing it as **one governed retrieval fabric with multiple agentic
front-ends**. Every diagram below is something you can reproduce on a whiteboard.
---
## 1. System context — what we actually run
```mermaid
flowchart TB
subgraph Users["👤 Humans & Agents"]
U1["End users<br/>web / mobile"]
U2["Coding agents<br/>claude · codex · devin"]
U3["Operators<br/>Hermes Mission Control"]
end
subgraph Fronts["Agentic Product Surfaces"]
P1["invt_trdg
AI trading chat
(tool-calling over markets)"]
P2["flowmonk
planning + bounded AI layer"]
P3["notelett
notes for humans + agents"]
P4["chronomind
contextual time AI"]
end
subgraph Platform["common_plat — the shared fabric"]
PS["platform-service :4003
auth · flags · telemetry · billing · blob"]
ES["extraction-service :4005
URL / doc → retrievable units"]
MCP["mcp-server :4007
tool / resource registration"]
LR["packages/llm-router
provider abstraction"]
end
subgraph Data["Governed Data Sources"]
DB[("Cosmos DB
docs + Gremlin graph*")]
PG[("Postgres
structured + pgvector*")]
EV[("event-store
immutable audit")]
BLOB[("blob
raw documents")]
end
subgraph Ops["Control Plane"]
HERMES["Hermes Mission Control
(devops_tools/dashboard)"]
AQ["agent-queue
multi-agent runner"]
end
U1 --> P1 & P2 & P3 & P4
U2 --> AQ
U3 --> HERMES
P1 & P2 & P3 & P4 --> PS
P1 & P2 & P3 & P4 --> MCP
MCP --> LR
ES --> BLOB
PS --> DB & PG & EV
MCP --> DB & PG
AQ --> MCP
HERMES --> PS
HERMES -.observes.-> ES & MCP & LR
classDef plan fill:#fef3c7,stroke:#d97706
class PG,LR plan
```
> `*` pgvector and the Gremlin graph are the planned hardening (see `04-enhancement-roadmap.md`).
> Everything else is a real, deployed component of the ecosystem.
**How to narrate it:** *"The platform-service is my policy/identity plane, the mcp-server
is my tool-boundary plane, llm-router is my model plane, and the data sources are governed
behind both. Any product surface is just a thin agentic UI over that fabric — which is
exactly the shape of an enterprise agentic-RAG platform."*
---
## 2. The reference agentic-RAG container view
This is the canonical picture the interviewer wants to see — drawn in *our* components.
```mermaid
flowchart LR
Q[User query] --> ROUTER
subgraph Orchestration["Agentic Orchestration (LangGraph-shaped)"]
ROUTER{{"Router / planner agent
intent + complexity"}}
RETR["Retriever agent"]
GRADE{{"Relevance grader
(CRAG gate)"}}
REWRITE["Query rewriter
(HyDE)"]
GEN["Generator agent
+ citation enforcer"]
CRITIC{{"Self-RAG critic
groundedness check"}}
end
subgraph Retrieval["Hybrid Retrieval Fabric"]
VEC[("Vector
pgvector / Azure AI Search")]
BM25[("Lexical
BM25")]
GRAPH[("Graph traversal
Cosmos Gremlin")]
SQL[("Structured
schema-aware SQL tool")]
RERANK["Cross-encoder rerank
+ context compression"]
end
subgraph Gov["Governance plane (every hop)"]
ACL["Access-controlled retrieval
auth + row/col masking"]
AUDIT["event-store audit trail"]
KILL["kill-switch / flags"]
end
ROUTER --> RETR
RETR --> VEC & BM25 & GRAPH & SQL
VEC & BM25 & GRAPH & SQL --> RERANK
RERANK --> GRADE
GRADE -- "low relevance" --> REWRITE --> RETR
GRADE -- "ok" --> GEN
GEN --> CRITIC
CRITIC -- "ungrounded" --> REWRITE
CRITIC -- "grounded + cited" --> A[Answer + citations]
RETR -.enforced by.-> ACL
GEN -.logged to.-> AUDIT
ROUTER -.gated by.-> KILL
```
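The fan-out / fan-in in the diagram above can be made concrete with a small merge step that runs before the cross-encoder. This is a minimal sketch, assuming each retriever returns an ordered list of chunk ids. It uses reciprocal rank fusion (RRF) as a stand-in technique; it is not a description of what the `RERANK` node actually does. The function name, the sample ids, and the constant `k=60` (a commonly used default) are assumptions.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists of chunk ids into one ranked list.

    Each chunk scores sum(1 / (k + rank)) over the lists it appears in,
    so chunks ranked well by several retrievers rise to the top.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, chunk_id in enumerate(ranking, start=1):
            scores[chunk_id] += 1.0 / (k + rank)
    # Highest fused score first; ties broken by id so the output is deterministic.
    return sorted(scores, key=lambda cid: (-scores[cid], cid))


# Hypothetical outputs of the four retrievers in the diagram.
vector_hits = ["c3", "c1", "c7"]
bm25_hits = ["c1", "c9", "c3"]
graph_hits = ["c1", "c4"]
sql_hits = ["c8"]

candidates = reciprocal_rank_fusion([vector_hits, bm25_hits, graph_hits, sql_hits])
```

The fused list is only a candidate set. A cross-encoder would then rescore the top few candidates against the query before context compression.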
**Key talking points keyed to the JD:**
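- *Hybrid search (vector + BM25 + graph)* → the four parallel retrievers (vector, BM25, graph, and the structured SQL tool) fan out; the reranker fans in.
- *Reranking + context compression* → the `RERANK` node (a cross-encoder such as a bge-reranker; ColBERT-style late interaction is an alternative, though it is not strictly a cross-encoder).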
- *CRAG* → the `GRADE` gate that triggers corrective re-retrieval.
- *HyDE* → the `REWRITE` node generating a hypothetical answer to embed.
- *Self-RAG* → the `CRITIC` node reflecting on groundedness before release.
- *Access-controlled retrieval / Zero Trust / audit* → the governance plane wraps **every** hop, not just the entrance.
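The control flow behind the `GRADE`, `REWRITE`, and `CRITIC` nodes can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the ByteLyst implementation: every function name is hypothetical, the retry cap of 3 is arbitrary, and the toy `rewrite` is a string substitution standing in for HyDE so the loop can run without a model.

```python
from dataclasses import dataclass, field


@dataclass
class RagState:
    query: str
    chunks: list = field(default_factory=list)
    answer: str = ""
    attempts: int = 0


def run_corrective_rag(query, retrieve, grade, rewrite, generate, critic,
                       max_attempts=3):
    """Retrieve -> grade -> (rewrite | generate) -> critic, with a retry cap."""
    state = RagState(query=query)
    while state.attempts < max_attempts:
        state.attempts += 1
        state.chunks = retrieve(state.query)
        if not grade(state.query, state.chunks):   # CRAG gate: low relevance
            state.query = rewrite(state.query)     # corrective re-retrieval
            continue
        state.answer = generate(state.query, state.chunks)
        if critic(state.answer, state.chunks):     # Self-RAG groundedness check
            return state
        state.query = rewrite(state.query)         # ungrounded: try again
    state.answer = ""                              # budget spent: release nothing
    return state


# Toy stand-ins (all hypothetical) so the loop can be exercised offline.
corpus = {"kill switch": "flags can disable a tool without redeploying"}

def retrieve(q):
    return [text for key, text in corpus.items() if key in q]

def grade(q, chunks):
    return len(chunks) > 0

def rewrite(q):
    return q.replace("off switch", "kill switch")

def generate(q, chunks):
    return f"{chunks[0]} [source: 1]"

def critic(answer, chunks):
    return any(chunk in answer for chunk in chunks)


result = run_corrective_rag("how does the off switch work",
                            retrieve, grade, rewrite, generate, critic)
```

The retry cap matters in the room: without it, the two cyclic edges in the diagram can loop indefinitely on a query the corpus cannot answer.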
---
## 3. Multi-agent orchestration topology (we run a real one)
`agent-queue/` is a production folder-kanban that drives **three different agent engines**
(`claude`, `codex`, `devin`) through an explicit state machine. That *is* multi-agent
orchestration — and it's the strongest "I've shipped agents" story you have.
```mermaid
stateDiagram-v2
[*] --> inbox: drop prompt .md
inbox --> doing: runner claims (auto-approve)
doing --> done: success
doing --> failed: error / timeout
failed --> inbox: requeue (human-in-loop)
done --> [*]
note right of doing
Engine selected per task:
claude · codex · devin
= heterogeneous agent pool
end note
```
Map this to LangGraph vocabulary in the room:
| agent-queue concept | LangGraph / agentic equivalent |
|---|---|
| `inbox/doing/done/failed` folders | graph **nodes** / state enum |
| runner claiming + transitioning | **conditional edges** |
| engine flag (claude/codex/devin) | **tool/agent binding** per node |
| `failed → inbox` requeue | **cyclic edge** w/ human-in-the-loop checkpoint |
| live `status`/`watch` | **state checkpointer** + observability |
> Honest framing: *"I built this deliberately framework-light to stay bash-portable and
> dependency-free. The state model maps directly onto LangGraph's; porting it onto a
> `StateGraph` mostly buys me typed state, built-in checkpointing, and the A2A handoff
> contract, which is exactly the enhancement I've scoped."*
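The state machine above is small enough to sketch in a few lines. This is a hedged Python illustration of the same transitions, not the actual `agent-queue` runner (which the framing above describes as bash). The `move` helper, the task file name, and the `engine: claude` line inside the file are hypothetical.

```python
import os
import tempfile
from pathlib import Path

# Allowed transitions, mirroring the state diagram.
TRANSITIONS = {
    "inbox": {"doing"},
    "doing": {"done", "failed"},
    "failed": {"inbox"},
    "done": set(),
}


def move(root: Path, task: str, src: str, dst: str) -> Path:
    """Move a task file between state folders, rejecting illegal edges."""
    if dst not in TRANSITIONS[src]:
        raise ValueError(f"illegal transition {src} -> {dst}")
    target = root / dst / task
    # On POSIX a same-filesystem rename is atomic: if two runners race to
    # claim a task, the loser gets FileNotFoundError instead of a duplicate.
    os.rename(root / src / task, target)
    return target


# Build a throwaway queue and walk one task through a failure and requeue.
root = Path(tempfile.mkdtemp())
for state in TRANSITIONS:
    (root / state).mkdir()
(root / "inbox" / "task-001.md").write_text("engine: claude\n")

move(root, "task-001.md", "inbox", "doing")    # runner claims
move(root, "task-001.md", "doing", "failed")   # error / timeout
move(root, "task-001.md", "failed", "inbox")   # human requeues
```

The transition table is the piece that carries over to a graph framework: each key becomes a node and each allowed target becomes an edge.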
---
## 4. MCP server — Zero-Trust tool boundary
This is your strongest *governance* asset and a direct hit on a Preferred Qualification
("MCP server architecture, tool/resource registration patterns, agentic security threat
modeling"). We run `mcp-server` on :4007 with `packages/mcp-client`.
```mermaid
flowchart TB
subgraph Agent["Agent (untrusted by default)"]
A[LLM reasoning loop]
end
subgraph Boundary["mcp-server :4007 — policy enforcement point"]
REG["Tool / resource registry
(declared, typed, versioned)"]
AUTHZ{"AuthZ check
identity + scope + role"}
MASK["Row/column masking
field-encrypt"]
RATE["Rate / cost limits + kill-switch"]
LOG["Audit emit → event-store"]
end
subgraph Resources["Governed resources"]
T1[Market data tool]
T2[Doc retrieval tool]
T3[Graph query tool]
T4["Text-to-SQL tool<br/>read-only views"]
end
A -- "tool call (intent)" --> REG
REG --> AUTHZ
AUTHZ -- deny --> A
AUTHZ -- allow --> MASK
MASK --> RATE
RATE --> T1 & T2 & T3 & T4
T1 & T2 & T3 & T4 --> LOG
LOG --> A
```
**Threat-model talking points** (say these — they signal seniority):
- **Confused-deputy:** the agent never holds raw credentials; the MCP server calls each tool under the *user's* scoped identity, so a tool can't be tricked into over-broad reads.
- **Tool-poisoning / prompt injection via retrieved content:** retrieved text is treated as data, never as instructions; the generator is sandboxed from re-invoking tools without re-passing the AuthZ gate.
- **Exfiltration:** column masking + egress logging means even a successful injection can't surface PII it wasn't entitled to.
- **Blast radius:** `kill-switch-client` lets us disable a model or a single tool instantly without redeploying, which maps to the SR 11-7 emphasis on being able to limit a model's use in production.
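The enforcement path in the diagram can be sketched as a single gate function. This is a minimal illustration under stated assumptions, not the real `mcp-server` code: the tool name, scope string, and data shapes are hypothetical; masking is applied to the tool's output for simplicity; rate limiting is omitted; and denials are audited too, which is a design choice the diagram does not show.

```python
import time
from dataclasses import dataclass


@dataclass(frozen=True)
class Caller:
    user_id: str
    scopes: frozenset


def market_quote(symbol):
    # Stand-in tool that returns one field the caller should never see.
    return {"symbol": symbol, "price": 101.5, "account_id": "ACC-42"}


# Hypothetical registry: tool name -> (required scope, masked fields, handler).
REGISTRY = {
    "market.quote": ("markets:read", {"account_id"}, market_quote),
}
DISABLED_TOOLS = set()   # stand-in for the kill switch
AUDIT_LOG = []           # stand-in for the event-store


def call_tool(caller, tool, **args):
    """Registry lookup -> AuthZ -> kill switch -> execute -> mask -> audit."""
    if tool not in REGISTRY:
        decision, result = "deny:unregistered", None
    else:
        scope, masked, handler = REGISTRY[tool]
        if scope not in caller.scopes:
            decision, result = "deny:scope", None
        elif tool in DISABLED_TOOLS:
            decision, result = "deny:disabled", None
        else:
            raw = handler(**args)
            result = {k: ("***" if k in masked else v) for k, v in raw.items()}
            decision = "allow"
    # Every decision is recorded, including denials.
    AUDIT_LOG.append({"ts": time.time(), "user": caller.user_id,
                      "tool": tool, "decision": decision})
    return decision, result


alice = Caller("alice", frozenset({"markets:read"}))
bob = Caller("bob", frozenset())

allowed = call_tool(alice, "market.quote", symbol="MSFT")
denied = call_tool(bob, "market.quote", symbol="MSFT")
DISABLED_TOOLS.add("market.quote")          # flip the kill switch
killed = call_tool(alice, "market.quote", symbol="MSFT")
```

Note that the agent only ever receives the `(decision, result)` pair: it never sees the handler, the credentials behind it, or the unmasked record.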
---
## 5. Governance & grounding plane (the part that wins regulated deals)
```mermaid
flowchart LR
subgraph Ingest["Ingestion governance"]
CLASS["Data classification
(public / internal / PII)"]
EMB["Embedding + metadata tags
tenant · sensitivity · source"]
end
subgraph Query["Query-time governance"]
IDENT["Caller identity + role"]
FILTER["Namespace + ACL filter
(pre-retrieval)"]
RETR2["Retrieve only entitled chunks"]
end
subgraph Answer["Answer governance"]
CITE["Mandatory citation
(source attribution)"]
FAITH["Faithfulness score
(RAGAS / LLM-as-judge)"]
CARD["Model card + decision log"]
end
CLASS --> EMB --> RETR2
IDENT --> FILTER --> RETR2 --> CITE --> FAITH --> CARD
FAITH -- "below SLA" --> ABSTAIN["Abstain / escalate to human"]
```
This single diagram covers four JD bullets at once: **access-controlled retrieval**,
**citation/source attribution**, **faithfulness SLAs**, and **model cards / audit**.
The `ABSTAIN` branch is the line that separates a demo from a regulated system — *"in
banking, a confident wrong answer is a worse outcome than 'I don't know, here's a human.'"*
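The two ends of that diagram, the pre-retrieval filter and the `ABSTAIN` branch, fit in a short sketch. This is an illustration under stated assumptions: the role-to-sensitivity table, the sample chunks, and the `0.8` threshold are hypothetical, and the faithfulness score is passed in as a number rather than computed by RAGAS or an LLM judge.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Chunk:
    text: str
    tenant: str
    sensitivity: str   # "public" | "internal" | "pii"
    source: str


# Hypothetical clearance model: which sensitivity labels each role may read.
ROLE_CLEARANCE = {
    "analyst": {"public", "internal"},
    "guest": {"public"},
}


def entitled_chunks(chunks, tenant, role):
    """Pre-retrieval filter: drop everything the caller may not see."""
    allowed = ROLE_CLEARANCE.get(role, set())   # unknown role sees nothing
    return [c for c in chunks if c.tenant == tenant and c.sensitivity in allowed]


def release(answer, sources, faithfulness, threshold=0.8):
    """Release only cited answers whose faithfulness score meets the SLA."""
    if not sources or faithfulness < threshold:
        return {"status": "abstain", "answer": None, "citations": []}
    return {"status": "answered", "answer": answer,
            "citations": [c.source for c in sources]}


index = [
    Chunk("Fee schedule 2024", "invt_trdg", "public", "fees.pdf"),
    Chunk("Internal risk memo", "invt_trdg", "internal", "risk-memo.md"),
    Chunk("Customer SSN list", "invt_trdg", "pii", "kyc.csv"),
    Chunk("Shared note", "notelett", "public", "note-1.md"),
]

visible = entitled_chunks(index, tenant="invt_trdg", role="guest")
ok = release("Fees are listed in the schedule.", visible, faithfulness=0.93)
low = release("Fees are probably zero.", visible, faithfulness=0.41)
```

Filtering before retrieval, rather than redacting afterwards, means restricted chunks never reach the model's context in the first place.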
---
## 6. Multi-tenant / namespace isolation (real concern here already)
We *already* think in tenants: every product has a `productId`, and Hermes runs **two
isolated instances (Vijay / Bheem)** with separate users, services, and backup repos. That
is the same isolation discipline a vector DB needs.
```mermaid
flowchart TB
subgraph T_A["Tenant A (productId=invt_trdg)"]
NSA["Vector namespace A"]
GA["Graph partition A"]
SA["SQL schema A (RLS)"]
end
subgraph T_B["Tenant B (productId=notelett)"]
NSB["Vector namespace B"]
GB["Graph partition B"]
SB["SQL schema B (RLS)"]
end
POLICY["platform-service
tenant resolver + auth"] --> NSA & NSB & GA & GB & SA & SB
```
> *"Namespace isolation isn't a vector-DB feature I'd discover late — it's how the whole
> platform is partitioned. Pinecone namespaces / Azure AI Search index-per-tenant /
> pgvector schema-per-tenant are just the storage expression of a `productId` model I
> already run."*
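A tenant resolver of the kind the diagram attributes to `platform-service` can be sketched as follows. This is a minimal illustration, not the real resolver: the naming scheme (`ns_`, `tenant_` prefixes), the function names, and the assumption that each product's `productId` equals its product name are all hypothetical.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class TenantScope:
    vector_namespace: str
    graph_partition: str
    sql_schema: str


# Assumed productId values, taken from the product names in this document.
KNOWN_PRODUCTS = {"invt_trdg", "flowmonk", "notelett", "chronomind"}
_SAFE_ID = re.compile(r"^[a-z][a-z0-9_]*$")


def resolve_tenant(product_id: str) -> TenantScope:
    """Map a productId to its storage scopes; unknown ids are rejected, never defaulted."""
    if product_id not in KNOWN_PRODUCTS or not _SAFE_ID.match(product_id):
        raise PermissionError(f"unknown tenant: {product_id!r}")
    return TenantScope(
        vector_namespace=f"ns_{product_id}",
        graph_partition=product_id,
        sql_schema=f"tenant_{product_id}",
    )


def scoped_vector_query(product_id: str, embedding: list, top_k: int = 5) -> dict:
    """Build a query whose namespace is set by the resolver, not by the caller."""
    scope = resolve_tenant(product_id)
    return {"namespace": scope.vector_namespace, "vector": embedding, "top_k": top_k}


query = scoped_vector_query("notelett", [0.1, 0.2, 0.3])
```

The point to draw out is that callers never name a namespace directly: they present a `productId`, and the resolver decides which vector namespace, graph partition, and SQL schema that identity maps to.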
---
## Cheat-sheet: which diagram answers which question
| If they ask… | Draw |
|---|---|
| "Walk me through your RAG architecture" | §2 container view |
| "How do you orchestrate multiple agents?" | §3 state machine |
| "How is this secure / Zero Trust?" | §4 MCP boundary |
| "How do you prevent hallucination in production?" | §2 `CRITIC` gate + §5 governance plane (`ABSTAIN`) |
| "How do you handle multi-tenancy at scale?" | §6 isolation |
| "What does your whole platform look like?" | §1 context |