# 01 · The ByteLyst Ecosystem as an Agentic RAG Fabric

The trick in this interview is to stop treating ByteLyst as "a bunch of side projects" and start describing it as **one governed retrieval fabric with multiple agentic front-ends**. Every diagram below is something you can reproduce on a whiteboard.

---

## 1. System context — what we actually run

```mermaid
flowchart TB
    subgraph Users["👤 Humans & Agents"]
        U1[End users<br/>web / mobile]
        U2[Coding agents<br/>claude · codex · devin]
        U3[Operators<br/>Hermes Mission Control]
    end

    subgraph Fronts["Agentic Product Surfaces"]
        P1["invt_trdg<br/>AI trading chat<br/>(tool-calling over markets)"]
        P2["flowmonk<br/>planning + bounded AI layer"]
        P3["notelett<br/>notes for humans + agents"]
        P4["chronomind<br/>contextual time AI"]
    end

    subgraph Platform["common_plat — the shared fabric"]
        PS["platform-service :4003<br/>auth · flags · telemetry · billing · blob"]
        ES["extraction-service :4005<br/>URL / doc → retrievable units"]
        MCP["mcp-server :4007<br/>tool / resource registration"]
        LR["packages/llm-router<br/>provider abstraction"]
    end

    subgraph Data["Governed Data Sources"]
        DB[("Cosmos DB<br/>docs + Gremlin graph")]
        PG[("Postgres<br/>structured + pgvector*")]
        EV[("event-store<br/>immutable audit")]
        BLOB[("blob<br/>raw documents")]
    end

    subgraph Ops["Control Plane"]
        HERMES["Hermes Mission Control<br/>(devops_tools/dashboard)"]
        AQ["agent-queue<br/>multi-agent runner"]
    end

    U1 --> P1 & P2 & P3 & P4
    U2 --> AQ
    U3 --> HERMES
    P1 & P2 & P3 & P4 --> PS
    P1 & P2 & P3 & P4 --> MCP
    MCP --> LR
    ES --> BLOB
    PS --> DB & PG & EV
    MCP --> DB & PG
    AQ --> MCP
    HERMES --> PS
    HERMES -.observes.-> ES & MCP & LR

    classDef plan fill:#fef3c7,stroke:#d97706
    class PG,LR plan
```

> `*` pgvector and the Gremlin graph are the planned hardening (see `04-enhancement-roadmap.md`).
> Everything else is a real, deployed component of the ecosystem.

**How to narrate it:** *"The platform-service is my policy/identity plane, the mcp-server is my tool-boundary plane, llm-router is my model plane, and the data sources are governed behind both. Any product surface is just a thin agentic UI over that fabric — which is exactly the shape of an enterprise agentic-RAG platform."*

---

## 2. The reference agentic-RAG container view

This is the canonical picture the interviewer wants to see — drawn in *our* components.

```mermaid
flowchart LR
    Q[User query] --> ROUTER

    subgraph Orchestration["Agentic Orchestration (LangGraph-shaped)"]
        ROUTER{{"Router / planner agent<br/>intent + complexity"}}
        RETR["Retriever agent"]
        GRADE{{"Relevance grader<br/>(CRAG gate)"}}
        REWRITE["Query rewriter<br/>(HyDE)"]
        GEN["Generator agent<br/>+ citation enforcer"]
        CRITIC{{"Self-RAG critic<br/>groundedness check"}}
    end

    subgraph Retrieval["Hybrid Retrieval Fabric"]
        VEC[("Vector<br/>pgvector / Azure AI Search")]
        BM25[("Lexical<br/>BM25")]
        GRAPH[("Graph traversal<br/>Cosmos Gremlin")]
        SQL[("Structured<br/>schema-aware SQL tool")]
        RERANK["Cross-encoder rerank<br/>+ context compression"]
    end

    subgraph Gov["Governance plane (every hop)"]
        ACL["Access-controlled retrieval<br/>auth + row/col masking"]
        AUDIT["event-store audit trail"]
        KILL["kill-switch / flags"]
    end

    ROUTER --> RETR
    RETR --> VEC & BM25 & GRAPH & SQL
    VEC & BM25 & GRAPH & SQL --> RERANK
    RERANK --> GRADE
    GRADE -- "low relevance" --> REWRITE --> RETR
    GRADE -- "ok" --> GEN
    GEN --> CRITIC
    CRITIC -- "ungrounded" --> REWRITE
    CRITIC -- "grounded + cited" --> A[Answer + citations]

    RETR -.enforced by.-> ACL
    GEN -.logged to.-> AUDIT
    ROUTER -.gated by.-> KILL
```

**Key talking points keyed to the JD:**

- *Hybrid search (vector + BM25 + graph)* → the four parallel retrievers fan out, the reranker fans in.
- *Reranking + context compression* → the `RERANK` node (a cross-encoder such as a bge-reranker, or a late-interaction model such as ColBERT).
- *CRAG* → the `GRADE` gate that triggers corrective re-retrieval.
- *HyDE* → the `REWRITE` node generating a hypothetical answer to embed.
- *Self-RAG* → the `CRITIC` node reflecting on groundedness before release.
- *Access-controlled retrieval / Zero Trust / audit* → the governance plane wraps **every** hop, not just the entrance.

---

## 3. Multi-agent orchestration topology (we run a real one)

`agent-queue/` is a production folder-kanban that drives **three different agent engines** (`claude`, `codex`, `devin`) through an explicit state machine. That *is* multi-agent orchestration — and it's the strongest "I've shipped agents" story you have.
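A minimal Python sketch of the folder-kanban mechanics described above, under stated assumptions. The real `agent-queue` is described only as bash-portable, so every name here (`init_queue`, `move`, `claim_next`, `run_one`, `requeue`, and the `engine` callable standing in for `claude` / `codex` / `devin`) is hypothetical and not the actual runner. The allowed transitions follow the state diagram in this section.

```python
import os
from pathlib import Path
from typing import Callable, Optional, Tuple

STATES = ("inbox", "doing", "done", "failed")

# Allowed transitions; ("failed", "inbox") is the human-approved requeue.
TRANSITIONS = {
    ("inbox", "doing"),
    ("doing", "done"),
    ("doing", "failed"),
    ("failed", "inbox"),
}


def init_queue(root: Path) -> None:
    """Create one folder per state (hypothetical layout)."""
    for state in STATES:
        (root / state).mkdir(parents=True, exist_ok=True)


def move(root: Path, name: str, src: str, dst: str) -> Path:
    """Move a task file between state folders, rejecting illegal transitions."""
    if (src, dst) not in TRANSITIONS:
        raise ValueError(f"illegal transition {src} -> {dst}")
    target = root / dst / name
    os.rename(root / src / name, target)  # atomic on the same POSIX filesystem
    return target


def claim_next(root: Path) -> Optional[str]:
    """Claim the first prompt in inbox/ by renaming it into doing/."""
    for task in sorted((root / "inbox").glob("*.md")):
        try:
            move(root, task.name, "inbox", "doing")
            return task.name
        except FileNotFoundError:
            continue  # another runner claimed this file first
    return None


def run_one(root: Path, engine: Callable[[str], bool]) -> Optional[Tuple[str, str]]:
    """Run one task through the given engine; return (task name, final state)."""
    name = claim_next(root)
    if name is None:
        return None
    prompt = (root / "doing" / name).read_text()
    try:
        ok = engine(prompt)
    except Exception:
        ok = False  # any engine error lands the task in failed/
    state = "done" if ok else "failed"
    move(root, name, "doing", state)
    return name, state


def requeue(root: Path, name: str) -> Path:
    """Human-in-the-loop step: put a failed task back into inbox/."""
    return move(root, name, "failed", "inbox")
```

Using a rename as the claim step is a deliberate choice in this sketch: two runners racing for the same file cannot both win, which gives the queue a lock without a lock service.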
```mermaid
stateDiagram-v2
    [*] --> inbox: drop prompt .md
    inbox --> doing: runner claims (auto-approve)
    doing --> done: success
    doing --> failed: error / timeout
    failed --> inbox: requeue (human-in-loop)
    done --> [*]

    note right of doing
        Engine selected per task:
        claude · codex · devin
        = heterogeneous agent pool
    end note
```

Map this to LangGraph vocabulary in the room:

| agent-queue concept | LangGraph / agentic equivalent |
|---|---|
| `inbox/doing/done/failed` folders | graph **nodes** / state enum |
| runner claiming + transitioning | **conditional edges** |
| engine flag (claude/codex/devin) | **tool/agent binding** per node |
| `failed → inbox` requeue | **cyclic edge** w/ human-in-the-loop checkpoint |
| live `status`/`watch` | **state checkpointer** + observability |

> Honest framing: *"I built this deliberately framework-light to stay bash-portable and
> dependency-free. The state model is identical to LangGraph; porting it onto LangGraph's
> `StateGraph` mostly buys me typed state, built-in checkpointing, and the A2A handoff
> contract — which is exactly the enhancement I've scoped."*

---

## 4. MCP server — Zero-Trust tool boundary

This is your strongest *governance* asset and a direct hit on a Preferred Qualification ("MCP server architecture, tool/resource registration patterns, agentic security threat modeling"). We run `mcp-server` on :4007 with `packages/mcp-client`.

```mermaid
flowchart TB
    subgraph Agent["Agent (untrusted by default)"]
        A[LLM reasoning loop]
    end

    subgraph Boundary["mcp-server :4007 — policy enforcement point"]
        REG["Tool / resource registry<br/>(declared, typed, versioned)"]
        AUTHZ{"AuthZ check<br/>identity + scope + role"}
        MASK["Row/column masking<br/>field-encrypt"]
        RATE["Rate / cost limits + kill-switch"]
        LOG["Audit emit → event-store"]
    end

    subgraph Resources["Governed resources"]
        T1[Market data tool]
        T2[Doc retrieval tool]
        T3[Graph query tool]
        T4[Text-to-SQL tool<br/>read-only views]
    end

    A -- "tool call (intent)" --> REG
    REG --> AUTHZ
    AUTHZ -- deny --> A
    AUTHZ -- allow --> MASK
    MASK --> RATE
    RATE --> T1 & T2 & T3 & T4
    T1 & T2 & T3 & T4 --> LOG
    LOG --> A
```

**Threat-model talking points** (say these — they signal seniority):

- **Confused-deputy:** the agent never holds raw credentials; the MCP server exchanges the *user's* scoped identity, so a tool can't be tricked into over-broad reads.
- **Tool-poisoning / prompt injection via retrieved content:** retrieved text is treated as data, never as instructions; the generator is sandboxed from re-invoking tools without re-passing the AuthZ gate.
- **Exfiltration:** column masking + egress logging means even a successful injection can't surface PII it wasn't entitled to.
- **Blast radius:** `kill-switch-client` lets us disable a model or a single tool instantly without redeploying — critical for SR 11-7 "ability to constrain a model in production."

---

## 5. Governance & grounding plane (the part that wins regulated deals)

```mermaid
flowchart LR
    subgraph Ingest["Ingestion governance"]
        CLASS["Data classification<br/>(public / internal / PII)"]
        EMB["Embedding + metadata tags<br/>tenant · sensitivity · source"]
    end

    subgraph Query["Query-time governance"]
        IDENT["Caller identity + role"]
        FILTER["Namespace + ACL filter<br/>(pre-retrieval)"]
        RETR2["Retrieve only entitled chunks"]
    end

    subgraph Answer["Answer governance"]
        CITE["Mandatory citation<br/>(source attribution)"]
        FAITH["Faithfulness score<br/>(RAGAS / LLM-as-judge)"]
        CARD["Model card + decision log"]
    end

    CLASS --> EMB --> RETR2
    IDENT --> FILTER --> RETR2 --> CITE --> FAITH --> CARD
    FAITH -- "below SLA" --> ABSTAIN["Abstain / escalate to human"]
```

This single diagram covers four JD bullets at once: **access-controlled retrieval**, **citation/source attribution**, **faithfulness SLAs**, and **model cards / audit**. The `ABSTAIN` branch is the line that separates a demo from a regulated system — *"in banking, a confident wrong answer is a worse outcome than 'I don't know, here's a human.'"*

---

## 6. Multi-tenant / namespace isolation (real concern here already)

We *already* think in tenants: every product has a `productId`, and Hermes runs **two isolated instances (Vijay / Bheem)** with separate users, services, and backup repos. That is the same isolation discipline a vector DB needs.

```mermaid
flowchart TB
    subgraph T_A["Tenant A (productId=invt_trdg)"]
        NSA["Vector namespace A"]
        GA["Graph partition A"]
        SA["SQL schema A (RLS)"]
    end

    subgraph T_B["Tenant B (productId=notelett)"]
        NSB["Vector namespace B"]
        GB["Graph partition B"]
        SB["SQL schema B (RLS)"]
    end

    POLICY["platform-service<br/>tenant resolver + auth"] --> NSA & NSB & GA & GB & SA & SB
```

> *"Namespace isolation isn't a vector-DB feature I'd discover late — it's how the whole
> platform is partitioned. Pinecone namespaces / Azure AI Search index-per-tenant /
> pgvector schema-per-tenant are just the storage expression of a `productId` model I
> already run."*

---

## Cheat-sheet: which diagram answers which question

| If they ask… | Draw |
|---|---|
| "Walk me through your RAG architecture" | §2 container view |
| "How do you orchestrate multiple agents?" | §3 state machine |
| "How is this secure / Zero Trust?" | §4 MCP boundary |
| "How do you prevent hallucination in production?" | §5 governance plane (FAITH + ABSTAIN), backed by the §2 CRITIC loop |
| "How do you handle multi-tenancy at scale?" | §6 isolation |
| "What does your whole platform look like?" | §1 context |
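
If the follow-up to "walk me through your RAG architecture" is "show me the control flow", the §2 loop is small enough to write out by hand. The sketch below is an assumption-laden illustration, not the production code: `corrective_rag`, `Answer`, and the five injected callables are hypothetical names. The `max_attempts` budget with a final abstain is borrowed from §5's `ABSTAIN` branch; the §2 diagram itself shows no such cap.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Answer:
    text: Optional[str]    # None means the loop abstained
    citations: List[str]   # chunks the answer was generated from
    attempts: int          # retrieval rounds used


def corrective_rag(
    query: str,
    retrieve: Callable[[str], List[str]],
    grade: Callable[[str, List[str]], bool],
    rewrite: Callable[[str], str],
    generate: Callable[[str, List[str]], str],
    is_grounded: Callable[[str, List[str]], bool],
    max_attempts: int = 3,
) -> Answer:
    """Retrieve, grade, generate, critique; rewrite and retry on failure."""
    current = query
    for attempt in range(1, max_attempts + 1):
        chunks = retrieve(current)             # RETR (fan-out + rerank hidden inside)
        if not grade(query, chunks):           # GRADE: low relevance -> REWRITE -> RETR
            current = rewrite(current)
            continue
        draft = generate(query, chunks)        # GEN
        if is_grounded(draft, chunks):         # CRITIC: grounded + cited -> release
            return Answer(draft, chunks, attempt)
        current = rewrite(current)             # CRITIC: ungrounded -> REWRITE
    return Answer(None, [], max_attempts)      # budget exhausted: abstain / escalate
```

Injecting the five steps as callables keeps the loop testable with stubs and leaves the choice of retriever, grader, and judge model open.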