7-doc kit mapping the JD competency matrix to the ByteLyst ecosystem: ecosystem-as-RAG-fabric architecture, competency deep-dives, STAR bank, enhancement roadmap, banking blueprints, and a glossary quick-ref. Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
# 01 · The ByteLyst Ecosystem as an Agentic RAG Fabric

The trick in this interview is to stop treating ByteLyst as "a bunch of side projects"
and start describing it as **one governed retrieval fabric with multiple agentic
front-ends**. Every diagram below is something you can reproduce on a whiteboard.

---
## 1. System context — what we actually run

```mermaid
flowchart TB
    subgraph Users["👤 Humans & Agents"]
        U1[End users<br/>web / mobile]
        U2[Coding agents<br/>claude · codex · devin]
        U3[Operators<br/>Hermes Mission Control]
    end

    subgraph Fronts["Agentic Product Surfaces"]
        P1["invt_trdg<br/>AI trading chat<br/>(tool-calling over markets)"]
        P2["flowmonk<br/>planning + bounded AI layer"]
        P3["notelett<br/>notes for humans + agents"]
        P4["chronomind<br/>contextual time AI"]
    end

    subgraph Platform["common_plat — the shared fabric"]
        PS["platform-service :4003<br/>auth · flags · telemetry · billing · blob"]
        ES["extraction-service :4005<br/>URL / doc → retrievable units"]
        MCP["mcp-server :4007<br/>tool / resource registration"]
        LR["packages/llm-router<br/>provider abstraction"]
    end

    subgraph Data["Governed Data Sources"]
        DB[("Cosmos DB<br/>docs + Gremlin graph")]
        PG[("Postgres<br/>structured + pgvector*")]
        EV[("event-store<br/>immutable audit")]
        BLOB[("blob<br/>raw documents")]
    end

    subgraph Ops["Control Plane"]
        HERMES["Hermes Mission Control<br/>(devops_tools/dashboard)"]
        AQ["agent-queue<br/>multi-agent runner"]
    end

    U1 --> P1 & P2 & P3 & P4
    U2 --> AQ
    U3 --> HERMES
    P1 & P2 & P3 & P4 --> PS
    P1 & P2 & P3 & P4 --> MCP
    MCP --> LR
    ES --> BLOB
    PS --> DB & PG & EV
    MCP --> DB & PG
    AQ --> MCP
    HERMES --> PS
    HERMES -.observes.-> ES & MCP & LR

    classDef plan fill:#fef3c7,stroke:#d97706
    class PG,LR plan
```

> `*` pgvector and the Gremlin graph are the planned hardening (see `04-enhancement-roadmap.md`).
> Everything else is a real, deployed component of the ecosystem.

**How to narrate it:** *"The platform-service is my policy/identity plane, the mcp-server
is my tool-boundary plane, llm-router is my model plane, and the data sources are governed
behind both. Any product surface is just a thin agentic UI over that fabric — which is
exactly the shape of an enterprise agentic-RAG platform."*

---
## 2. The reference agentic-RAG container view

This is the canonical picture the interviewer wants to see — drawn in *our* components.

```mermaid
flowchart LR
    Q[User query] --> ROUTER

    subgraph Orchestration["Agentic Orchestration (LangGraph-shaped)"]
        ROUTER{{"Router / planner agent<br/>intent + complexity"}}
        RETR["Retriever agent"]
        GRADE{{"Relevance grader<br/>(CRAG gate)"}}
        REWRITE["Query rewriter<br/>(HyDE)"]
        GEN["Generator agent<br/>+ citation enforcer"]
        CRITIC{{"Self-RAG critic<br/>groundedness check"}}
    end

    subgraph Retrieval["Hybrid Retrieval Fabric"]
        VEC[("Vector<br/>pgvector / Azure AI Search")]
        BM25[("Lexical<br/>BM25")]
        GRAPH[("Graph traversal<br/>Cosmos Gremlin")]
        SQL[("Structured<br/>schema-aware SQL tool")]
        RERANK["Cross-encoder rerank<br/>+ context compression"]
    end

    subgraph Gov["Governance plane (every hop)"]
        ACL["Access-controlled retrieval<br/>auth + row/col masking"]
        AUDIT["event-store audit trail"]
        KILL["kill-switch / flags"]
    end

    ROUTER --> RETR
    RETR --> VEC & BM25 & GRAPH & SQL
    VEC & BM25 & GRAPH & SQL --> RERANK
    RERANK --> GRADE
    GRADE -- "low relevance" --> REWRITE --> RETR
    GRADE -- "ok" --> GEN
    GEN --> CRITIC
    CRITIC -- "ungrounded" --> REWRITE
    CRITIC -- "grounded + cited" --> A[Answer + citations]

    RETR -.enforced by.-> ACL
    GEN -.logged to.-> AUDIT
    ROUTER -.gated by.-> KILL
```
**Key talking points keyed to the JD:**

- *Hybrid search (vector + BM25 + graph)* → the four parallel retrievers fan out; the reranker fans in.
- *Reranking + context compression* → the `RERANK` node (a cross-encoder such as a bge-reranker, or ColBERT-style late interaction as a lighter-weight alternative).
- *CRAG* → the `GRADE` gate that triggers corrective re-retrieval.
- *HyDE* → the `REWRITE` node generating a hypothetical answer to embed.
- *Self-RAG* → the `CRITIC` node reflecting on groundedness before release.
- *Access-controlled retrieval / Zero Trust / audit* → the governance plane wraps **every** hop, not just the entrance.
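
The corrective loop in the diagram (`GRADE` and `CRITIC` both routing back through `REWRITE`) can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the ByteLyst implementation: `answer`, `Chunk`, the injected callables, and the `min_relevance` / `max_rounds` defaults are all hypothetical names and values, and no LangGraph API is used. The retry budget is the design point worth saying out loud: without it, the two back-edges into `REWRITE` could loop forever, so the sketch returns `None` when the budget runs out and leaves abstention to the caller.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Chunk:
    source: str   # citation target, e.g. a document id (hypothetical shape)
    text: str
    score: float  # relevance assigned by the grader, assumed to lie in [0, 1]


def answer(
    query: str,
    retrieve: Callable[[str], List[Chunk]],
    rewrite: Callable[[str], str],
    generate: Callable[[str, List[Chunk]], str],
    is_grounded: Callable[[str, List[Chunk]], bool],
    min_relevance: float = 0.5,   # assumed threshold for the GRADE gate
    max_rounds: int = 3,          # assumed retry budget
) -> Optional[dict]:
    """Run retrieve -> grade -> generate -> critic, rewriting the query on failure."""
    current = query
    for _ in range(max_rounds):
        # GRADE gate: keep only chunks the grader scored as relevant enough.
        kept = [c for c in retrieve(current) if c.score >= min_relevance]
        if kept:
            # GEN always answers the original question, not the rewritten one.
            draft = generate(query, kept)
            # CRITIC: release only if the draft is grounded in the kept chunks.
            if is_grounded(draft, kept):
                return {"answer": draft, "citations": [c.source for c in kept]}
        # Low relevance or ungrounded draft: REWRITE and go around again.
        current = rewrite(current)
    # Budget exhausted: no answer is released.
    return None
```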

---

## 3. Multi-agent orchestration topology (we run a real one)

`agent-queue/` is a production folder-kanban that drives **three different agent engines**
(`claude`, `codex`, `devin`) through an explicit state machine. That *is* multi-agent
orchestration — and it's the strongest "I've shipped agents" story you have.

```mermaid
stateDiagram-v2
    [*] --> inbox: drop prompt .md
    inbox --> doing: runner claims (auto-approve)
    doing --> done: success
    doing --> failed: error / timeout
    failed --> inbox: requeue (human-in-loop)
    done --> [*]

    note right of doing
        Engine selected per task:
        claude · codex · devin
        = heterogeneous agent pool
    end note
```
Map this to LangGraph vocabulary in the room:

| agent-queue concept | LangGraph / agentic equivalent |
|---|---|
| `inbox/doing/done/failed` folders | graph **nodes** / state enum |
| runner claiming + transitioning | **conditional edges** |
| engine flag (claude/codex/devin) | **tool/agent binding** per node |
| `failed → inbox` requeue | **cyclic edge** w/ human-in-the-loop checkpoint |
| live `status`/`watch` | **state checkpointer** + observability |

> Honest framing: *"I built this deliberately framework-light to stay bash-portable and
> dependency-free. The state model maps directly onto a LangGraph graph; porting it onto LangGraph's
> `StateGraph` mostly buys me typed state, built-in checkpointing, and the A2A handoff
> contract, which is exactly the enhancement I've scoped."*
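
To make the folder-kanban concrete on a whiteboard, here is a minimal Python sketch of the same state machine. It is an illustration under assumptions, not the real runner (which the framing above describes as bash): the function names, the `ALLOWED` transition set, and the convention that a task file's first line reads `engine: <name>` are all hypothetical. The transition table is the part that corresponds to graph edges; claiming a task is a rename, which is atomic on a single POSIX filesystem, so two runners cannot both claim the same file.

```python
from pathlib import Path
from typing import Callable, Dict, Optional

STATES = ("inbox", "doing", "done", "failed")

# Legal edges of the state diagram; anything else is rejected.
ALLOWED = {
    ("inbox", "doing"),
    ("doing", "done"),
    ("doing", "failed"),
    ("failed", "inbox"),  # human-in-the-loop requeue
}


def init_board(root: Path) -> None:
    """Create one folder per state."""
    for state in STATES:
        (root / state).mkdir(parents=True, exist_ok=True)


def move(root: Path, task: str, src: str, dst: str) -> Path:
    """Transition a task file along a legal edge by renaming it."""
    if (src, dst) not in ALLOWED:
        raise ValueError(f"illegal transition {src} -> {dst}")
    target = root / dst / task
    (root / src / task).rename(target)
    return target


def run_next(root: Path, engines: Dict[str, Callable[[str], None]]) -> Optional[str]:
    """Claim the oldest-named prompt, run it on its engine, and file the result."""
    pending = sorted((root / "inbox").glob("*.md"))
    if not pending:
        return None
    task = pending[0].name
    claimed = move(root, task, "inbox", "doing")
    # Hypothetical convention: first line is "engine: <name>", the rest is the prompt.
    header, _, prompt = claimed.read_text().partition("\n")
    engine = header.partition(":")[2].strip()
    try:
        engines[engine](prompt)  # an unknown engine raises KeyError and fails the task
    except Exception:
        move(root, task, "doing", "failed")
        return "failed"
    move(root, task, "doing", "done")
    return "done"
```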

---

## 4. MCP server — Zero-Trust tool boundary

This is your strongest *governance* asset and a direct hit on a Preferred Qualification
("MCP server architecture, tool/resource registration patterns, agentic security threat
modeling"). We run `mcp-server` on :4007 with `packages/mcp-client`.

```mermaid
flowchart TB
    subgraph Agent["Agent (untrusted by default)"]
        A[LLM reasoning loop]
    end

    subgraph Boundary["mcp-server :4007 — policy enforcement point"]
        REG["Tool / resource registry<br/>(declared, typed, versioned)"]
        AUTHZ{"AuthZ check<br/>identity + scope + role"}
        MASK["Row/column masking<br/>field-encrypt"]
        RATE["Rate / cost limits + kill-switch"]
        LOG["Audit emit → event-store"]
    end

    subgraph Resources["Governed resources"]
        T1[Market data tool]
        T2[Doc retrieval tool]
        T3[Graph query tool]
        T4[Text-to-SQL tool<br/>read-only views]
    end

    A -- "tool call (intent)" --> REG
    REG --> AUTHZ
    AUTHZ -- deny --> A
    AUTHZ -- allow --> MASK
    MASK --> RATE
    RATE --> T1 & T2 & T3 & T4
    T1 & T2 & T3 & T4 --> LOG
    LOG --> A
```
**Threat-model talking points** (say these — they signal seniority):

- **Confused-deputy:** the agent never holds raw credentials; the MCP server exchanges the *user's* scoped identity, so a tool can't be tricked into over-broad reads.
- **Tool-poisoning / prompt injection via retrieved content:** retrieved text is treated as data, never as instructions; the generator is sandboxed from re-invoking tools without re-passing the AuthZ gate.
- **Exfiltration:** column masking + egress logging means even a successful injection can't surface PII it wasn't entitled to.
- **Blast radius:** `kill-switch-client` lets us disable a model or a single tool instantly without redeploying — critical for SR 11-7 "ability to constrain a model in production."
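
The boundary can be illustrated with a small in-process policy enforcement point. This is a sketch under assumptions rather than the `mcp-server` code: `Tool`, `ToolBoundary`, the scope strings, and the in-memory `audit` list (a stand-in for the event-store) are hypothetical, and rate/cost limiting is left out for brevity. It shows three of the talking points: the caller's scopes, not the agent's, decide access; masked columns are stripped before anything returns to the agent; and the kill-switch is a runtime set that needs no redeploy. Both allow and deny decisions are recorded.

```python
from dataclasses import dataclass
from typing import Callable, Dict, FrozenSet, List, Set


@dataclass
class Tool:
    name: str
    required_scope: str                      # scope the *user* must hold
    handler: Callable[[dict], List[dict]]    # returns rows as plain dicts
    masked_fields: FrozenSet[str] = frozenset()  # columns never shown to the agent


class ToolBoundary:
    """Hypothetical policy enforcement point in front of registered tools."""

    def __init__(self) -> None:
        self.registry: Dict[str, Tool] = {}
        self.disabled: Set[str] = set()   # kill-switch state, flipped at runtime
        self.audit: List[dict] = []       # stand-in for the event-store

    def register(self, tool: Tool) -> None:
        self.registry[tool.name] = tool

    def _emit(self, user: str, tool_name: str, decision: str) -> None:
        self.audit.append({"user": user, "tool": tool_name, "decision": decision})

    def call(self, user: str, scopes: Set[str], tool_name: str, args: dict) -> List[dict]:
        tool = self.registry.get(tool_name)
        # Deny unregistered tools, killed tools, and callers lacking the scope.
        if tool is None or tool_name in self.disabled or tool.required_scope not in scopes:
            self._emit(user, tool_name, "deny")
            raise PermissionError(f"{user} may not call {tool_name}")
        rows = tool.handler(args)
        # Column masking: drop protected fields before the agent sees the rows.
        visible = [
            {key: value for key, value in row.items() if key not in tool.masked_fields}
            for row in rows
        ]
        self._emit(user, tool_name, "allow")
        return visible
```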

---

## 5. Governance & grounding plane (the part that wins regulated deals)

```mermaid
flowchart LR
    subgraph Ingest["Ingestion governance"]
        CLASS["Data classification<br/>(public / internal / PII)"]
        EMB["Embedding + metadata tags<br/>tenant · sensitivity · source"]
    end
    subgraph Query["Query-time governance"]
        IDENT["Caller identity + role"]
        FILTER["Namespace + ACL filter<br/>(pre-retrieval)"]
        RETR2["Retrieve only entitled chunks"]
    end
    subgraph Answer["Answer governance"]
        CITE["Mandatory citation<br/>(source attribution)"]
        FAITH["Faithfulness score<br/>(RAGAS / LLM-as-judge)"]
        CARD["Model card + decision log"]
    end
    CLASS --> EMB --> RETR2
    IDENT --> FILTER --> RETR2 --> CITE --> FAITH --> CARD
    FAITH -- "below SLA" --> ABSTAIN["Abstain / escalate to human"]
```

This single diagram covers four JD bullets at once: **access-controlled retrieval**,
**citation/source attribution**, **faithfulness SLAs**, and **model cards / audit**.
The `ABSTAIN` branch is the line that separates a demo from a regulated system — *"in
banking, a confident wrong answer is a worse outcome than 'I don't know, here's a human.'"*
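
The two ends of that diagram, the pre-retrieval entitlement filter and the `ABSTAIN` branch, fit in a short sketch. Everything named here is an assumption for illustration: the `Chunk` fields mirror the metadata tags in the diagram, the `LEVELS` ordering is a hypothetical clearance ladder, and the `0.8` default is a placeholder SLA, not a recommended value. The faithfulness score is taken as an input (from RAGAS or an LLM judge, as in the diagram) rather than computed here.

```python
from dataclasses import dataclass
from typing import List

# Hypothetical clearance ladder matching the ingestion classes in the diagram.
LEVELS = {"public": 0, "internal": 1, "pii": 2}


@dataclass(frozen=True)
class Chunk:
    source: str       # citation target
    tenant: str       # tag written at ingestion time
    sensitivity: str  # one of LEVELS
    text: str


def entitled(chunks: List[Chunk], tenant: str, clearance: str) -> List[Chunk]:
    """Pre-retrieval filter: only the caller's tenant, only at or below clearance."""
    ceiling = LEVELS[clearance]
    return [
        c for c in chunks
        if c.tenant == tenant and LEVELS[c.sensitivity] <= ceiling
    ]


def release(answer: str, citations: List[str], faithfulness: float, sla: float = 0.8) -> dict:
    """Answer governance: no citations or a score below the SLA means abstain."""
    if not citations or faithfulness < sla:
        return {"status": "abstain", "answer": None, "citations": []}
    return {"status": "answered", "answer": answer, "citations": citations}
```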

---

## 6. Multi-tenant / namespace isolation (real concern here already)

We *already* think in tenants: every product has a `productId`, and Hermes runs **two
isolated instances (Vijay / Bheem)** with separate users, services, and backup repos. That
is the same isolation discipline a vector DB needs.

```mermaid
flowchart TB
    subgraph T_A["Tenant A (productId=invt_trdg)"]
        NSA["Vector namespace A"]
        GA["Graph partition A"]
        SA["SQL schema A (RLS)"]
    end
    subgraph T_B["Tenant B (productId=notelett)"]
        NSB["Vector namespace B"]
        GB["Graph partition B"]
        SB["SQL schema B (RLS)"]
    end
    POLICY["platform-service<br/>tenant resolver + auth"] --> NSA & NSB & GA & GB & SA & SB
```

> *"Namespace isolation isn't a vector-DB feature I'd discover late — it's how the whole
> platform is partitioned. Pinecone namespaces / Azure AI Search index-per-tenant /
> pgvector schema-per-tenant are just the storage expression of a `productId` model I
> already run."*
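
A tenant resolver of the kind the diagram assigns to `platform-service` might look like the sketch below. The names are hypothetical throughout: `TenantScope`, `resolve`, and the `vec_` / `tenant_` prefixes are invented for illustration, and the allow-list simply reuses the four product ids that appear in this document. The point of the allow-list is that a `productId` is never interpolated into a namespace or schema name unless it is already known, so a crafted id cannot address another tenant's storage.

```python
from dataclasses import dataclass

# Product ids taken from this document; a real resolver would load them from config.
KNOWN_PRODUCTS = frozenset({"invt_trdg", "flowmonk", "notelett", "chronomind"})


@dataclass(frozen=True)
class TenantScope:
    vector_namespace: str   # e.g. a vector-store namespace or index name
    graph_partition: str    # partition key for graph queries
    sql_schema: str         # schema that row-level security is applied to


def resolve(product_id: str) -> TenantScope:
    """Map a productId to its storage scope, rejecting anything not allow-listed."""
    if product_id not in KNOWN_PRODUCTS:
        raise KeyError(f"unknown productId: {product_id!r}")
    return TenantScope(
        vector_namespace=f"vec_{product_id}",   # hypothetical naming scheme
        graph_partition=product_id,
        sql_schema=f"tenant_{product_id}",      # hypothetical naming scheme
    )
```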

---

## Cheat-sheet: which diagram answers which question

| If they ask… | Draw |
|---|---|
| "Walk me through your RAG architecture" | §2 container view |
| "How do you orchestrate multiple agents?" | §3 state machine |
| "How is this secure / Zero Trust?" | §4 MCP boundary |
| "How do you prevent hallucination in production?" | §2 `CRITIC` gate + §5 governance plane (`ABSTAIN`) |
| "How do you handle multi-tenancy at scale?" | §6 isolation |
| "What does your whole platform look like?" | §1 context |