bytelyst-devops-tools/docs/INTERVIEW/README.md
Hermes VM 076449268b docs(interview): add Senior Agentic RAG Architect prep kit
7-doc kit mapping the JD competency matrix to the ByteLyst ecosystem:
ecosystem-as-RAG-fabric architecture, competency deep-dives, STAR bank,
enhancement roadmap, banking blueprints, and a glossary quick-ref.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
2026-05-31 10:48:52 +00:00

# Senior Agentic RAG Architect — Interview Prep Kit
> Target role: **Senior Agentic RAG Architect — TEKsystems Global Services, Product Engineering Group**
> Candidate anchor: the **ByteLyst ecosystem** (this monorepo workspace).
> Purpose: turn what we already run in production into a defensible, evidence-backed
> narrative for every line of the job description — plus a concrete roadmap of
> enhancements that make each claim *literally true* if we choose to build them.

This kit is deliberately structured so you can walk into the interview and, for **any**
competency on the matrix, point to (a) a real system we run, (b) an architecture diagram,
(c) a STAR story, and (d) a credible "here's how I'd take it to enterprise scale" answer.

---
## How to use this kit
| If you have… | Read |
|---|---|
| 60 minutes the night before | `06-glossary-quickref.md` then this README's matrix |
| A full prep day | All docs in order 01 → 06 |
| A whiteboard / panel round | `01-ecosystem-rag-fabric.md` + `05-banking-blueprints.md` |
| A behavioral / leadership round | `03-star-interview-bank.md` |
| A "what would you build here" round | `04-enhancement-roadmap.md` |
### Documents
1. **[01-ecosystem-rag-fabric.md](01-ecosystem-rag-fabric.md)** — The ByteLyst ecosystem re-drawn as an agentic RAG retrieval fabric. Context, container, retrieval-pipeline, multi-agent topology, MCP Zero-Trust, and governance diagrams.
2. **[02-competency-deepdives.md](02-competency-deepdives.md)** — Every competency-matrix row: the concept, how it maps to our code, talking points, and honest gaps.
3. **[03-star-interview-bank.md](03-star-interview-bank.md)** — 12 STAR stories grounded in real ecosystem work (Hermes, agent-queue, mcp-server, invt_trdg AI chat, flowmonk grounding, llm-router).
4. **[04-enhancement-roadmap.md](04-enhancement-roadmap.md)** — Buildable enhancements that convert "I understand X" into "I shipped X here": pgvector hybrid retrieval, CRAG/Self-RAG loops, RAGAS eval harness, Cosmos Gremlin knowledge graph, model-card registry in Hermes.
5. **[05-banking-blueprints.md](05-banking-blueprints.md)** — Two client-ready solution blueprints (compliance-document retrieval; customer-support automation) with ADRs, SR 11-7 / EU AI Act alignment, and phased delivery.
6. **[06-glossary-quickref.md](06-glossary-quickref.md)** — Rapid-fire definitions and crisp answers: RAPTOR, HyDE, CRAG, Self-RAG, ColBERT, RAGAS metrics, SR 11-7, EU AI Act, Zero Trust for agents.
---
## The role in one paragraph
Design, build, and tune **enterprise-grade RAG systems that power agentic applications**,
fusing **structured (RDBMS / warehouse), unstructured (PDF / docs / email), and graph
(knowledge-graph / ontology)** sources into one **governed** retrieval fabric. Be the
technical authority across **financial-services** engagements; enforce **grounding,
citation, hallucination mitigation**; own **evaluation harnesses (RAGAS / TruLens /
DeepEval)**; embed **Zero Trust, access-controlled retrieval, SR 11-7 / EU AI Act**
governance; and lead **ADRs, blueprints, roadmaps** for execs and engineers.

---
## Competency matrix → ByteLyst evidence
The JD's matrix is reproduced verbatim in the left columns; the right column is **our
real anchor** in this ecosystem (where it exists today) plus a pointer to the enhancement
that hardens it.

| Competency | Must-have | Nice-to-have | ByteLyst anchor (today → planned) |
|---|---|---|---|
| **Agentic Frameworks** | LangGraph, LangChain (prod-grade) | Google ADK, A2A, AutoGen | `agent-queue/` multi-engine runner (claude·codex·devin) is a real folder-kanban orchestration topology with state transitions (`inbox→doing→done/failed`) — a hand-rolled state machine analogous to LangGraph nodes/edges. `packages/mcp-client` + `mcp-server` (:4007) provide tool binding. → **04 §A** ports the topology onto LangGraph and adds an A2A handoff contract. |
| **RAG Architecture** | Hybrid retrieval, reranking, HyDE, Self-RAG | RAPTOR, multimodal | `packages/extraction` + `extraction-service` (:4005) parse URLs/docs into retrievable units today; `invt_trdg` AI chat already does retrieve-then-reason over structured data. → **04 §B** adds vector+BM25 hybrid, cross-encoder rerank, HyDE & CRAG loops. |
| **Structured Retrieval** | Text-to-SQL, schema-aware retrieval | Snowflake Cortex, BigQuery ML | `invt_trdg` AI chat assistant maps natural language → trading actions/queries over a typed domain (quotes, plans, watchlists). This is schema-aware tool-calling, the safe cousin of free-form Text-to-SQL. → **04 §C** adds a guarded Text-to-SQL tool with read-only views + row-level filters. |
| **Unstructured Retrieval** | PDF parsing, layout-aware chunking | Multi-modal pipelines | `packages/extraction` + `extraction-service`; `notelett` ingests structured notes for humans+agents. → **04 §B** adds PyMuPDF/Unstructured.io layout-aware chunking + OCR fallback. |
| **Graph RAG** | KG + vector hybrid | SPARQL, ontology design | We run **Azure Cosmos DB** (`packages/cosmos`); Cosmos exposes the **Gremlin** graph API. `event-store`/`events` already model entity relationships. → **04 §D** stands up a Cosmos Gremlin knowledge graph + graph-augmented retrieval. |
| **Vector Databases** | Pinecone / Weaviate / Azure AI Search | Qdrant, pgvector, multi-tenancy | Postgres is in the stack; **pgvector** is the lowest-friction path. Multi-tenant namespace isolation is already a first-class concern (per-product `productId`, two-instance Hermes Vijay/Bheem). → **04 §B** adds pgvector with per-tenant namespaces. |
| **Grounding & Eval** | RAGAS, TruLens, faithfulness SLAs | LangSmith, LLM-as-judge | `flowmonk` deliberately **bounds the AI layer to explanation/safe recommendation** over a deterministic engine — a production grounding pattern. `diagnostics-client`/`telemetry-client`/`monitoring` + Hermes dashboards are the eval-harness home. → **04 §E** wires a RAGAS/DeepEval harness + drift monitor pane in Hermes. |
| **Cloud Platform** | Azure (AI Foundry, OpenAI, Search) | AWS Bedrock, GCP Vertex | Azure Cosmos DB in prod (`_AZURE/`, `packages/cosmos`); `packages/llm-router` abstracts providers so Azure OpenAI / Bedrock / Vertex are swap-in. → **02** discusses Azure AI Search as the managed hybrid index. |
| **AI Governance** | Access-controlled RAG, Zero Trust | SR 11-7, EU AI Act | `packages/auth` + `fastify-auth`, `field-encrypt`/`client-encrypt` (column/field masking), `feature-flag-client` + `kill-switch-client` (instant model kill), `event-store` (immutable audit). MCP tool boundaries are explicit. → **05** maps all of this to SR 11-7 + EU AI Act. |
| **Domain: Banking** | Support / compliance automation | Model risk mgmt, KYC/AML | `invt_trdg` is our regulated-industry analog (markets, trade plans, alerts, auditability). → **05** translates it into a bank customer-support + compliance-retrieval blueprint. |
---
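The `inbox→doing→done/failed` flow in the Agentic Frameworks row can be pictured as a small state machine over folders. The following is a minimal Python sketch under stated assumptions, not the actual `agent-queue/` implementation: the function names (`init_queue`, `locate`, `move`), the one-file-per-task layout, and the rule that `done` and `failed` are terminal are all hypothetical.

```python
from enum import Enum
from pathlib import Path
import tempfile


class State(str, Enum):
    INBOX = "inbox"
    DOING = "doing"
    DONE = "done"
    FAILED = "failed"


# Allowed edges, mirroring the inbox -> doing -> done/failed flow.
# Treating done/failed as terminal is an assumption of this sketch.
TRANSITIONS = {
    State.INBOX: {State.DOING},
    State.DOING: {State.DONE, State.FAILED},
    State.DONE: set(),
    State.FAILED: set(),
}


def init_queue(root: Path) -> None:
    """Create one folder per state under the queue root."""
    for state in State:
        (root / state.value).mkdir(parents=True, exist_ok=True)


def locate(root: Path, task: str) -> State:
    """Return the state whose folder currently holds the task file."""
    for state in State:
        if (root / state.value / task).exists():
            return state
    raise FileNotFoundError(task)


def move(root: Path, task: str, target: State) -> Path:
    """Move a task file to the target folder if the edge is allowed."""
    current = locate(root, task)
    if target not in TRANSITIONS[current]:
        raise ValueError(f"illegal transition {current.value} -> {target.value}")
    dest = root / target.value / task
    (root / current.value / task).rename(dest)
    return dest


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        queue = Path(tmp)
        init_queue(queue)
        (queue / "inbox" / "task-1.md").write_text("summarise the release notes")
        move(queue, "task-1.md", State.DOING)
        move(queue, "task-1.md", State.DONE)
        print(locate(queue, "task-1.md").value)
```

Keeping the allowed edges in one table is what makes the LangGraph analogy easy to explain: each folder plays the role of a node and each entry in `TRANSITIONS` plays the role of an edge.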
## Honest gap analysis (say this out loud — it builds trust)
Be candid in the interview. Frame it as *"here's what's production-real, here's what's
adjacent, here's exactly how I'd close the gap."*
```mermaid
quadrantChart
    title Evidence strength vs. JD centrality
    x-axis "Adjacent / planned" --> "Production-real today"
    y-axis "Nice-to-have" --> "Core to the role"
    quadrant-1 "Lead with these"
    quadrant-2 "Build before interview if possible"
    quadrant-3 "Mention, don't dwell"
    quadrant-4 "Frame as quick wins"
    "MCP tool boundaries": [0.82, 0.78]
    "Multi-agent orchestration (agent-queue)": [0.75, 0.7]
    "Access-controlled / Zero-Trust retrieval": [0.8, 0.85]
    "Bounded grounding (flowmonk)": [0.78, 0.9]
    "Schema-aware tool-calling (invt_trdg)": [0.72, 0.72]
    "LangGraph (prod-grade)": [0.3, 0.88]
    "RAGAS / TruLens eval harness": [0.25, 0.86]
    "pgvector hybrid retrieval": [0.35, 0.8]
    "Cosmos Gremlin Graph RAG": [0.3, 0.6]
    "Google ADK / A2A": [0.2, 0.4]
    "RAPTOR / HyDE / CRAG / Self-RAG": [0.28, 0.65]
    "SR 11-7 / EU AI Act docs": [0.45, 0.7]
```
**Three sentences to own the gap:**
> "My production depth is in **agentic orchestration, MCP tool boundaries, and bounded
> grounding** — the parts that decide whether an agentic system is *safe* in a regulated
> setting. The classic LangChain/LangGraph and RAGAS surface area I've architected and
> can stand up fast; in fact I've scoped exactly that as a roadmap on our own platform.
> What I bring that's harder to hire is the **governance instinct** — designing retrieval
> so that masking, kill-switches, and audit trails are structural, not bolted on."
---
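To make "structural, not bolted on" concrete for a whiteboard round, here is a minimal Python sketch in which the kill-switch check, role filter, masking, and audit record all live inside the single retrieval path, so a caller cannot skip them. Everything named here (`Chunk`, `GovernedRetriever`, `mask_digits`, the role model, the audit record shape) is hypothetical and illustrative; it is not the API of `packages/auth`, `kill-switch-client`, `field-encrypt`, or `event-store`.

```python
import re
from dataclasses import dataclass, field, replace
from typing import Callable, Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Chunk:
    doc_id: str
    text: str
    allowed_roles: FrozenSet[str]


def mask_digits(text: str) -> str:
    """Redact runs of six or more digits (a stand-in for real field masking)."""
    return re.sub(r"\d{6,}", lambda m: "*" * len(m.group()), text)


@dataclass
class GovernedRetriever:
    search: Callable[[str], List[Chunk]]   # any underlying retriever
    kill_switch: Callable[[], bool]        # True means retrieval is disabled
    mask: Callable[[str], str]             # applied to every returned chunk
    audit_log: List[Dict] = field(default_factory=list)

    def retrieve(self, query: str, user: str, roles: Set[str]) -> List[Chunk]:
        # 1. Kill-switch is checked first, and the refusal is itself audited.
        if self.kill_switch():
            self.audit_log.append({"user": user, "query": query, "outcome": "blocked"})
            return []
        # 2. Access control: drop chunks the caller's roles cannot see.
        hits = self.search(query)
        visible = [c for c in hits if c.allowed_roles & roles]
        # 3. Masking happens before anything leaves the retriever.
        masked = [replace(c, text=self.mask(c.text)) for c in visible]
        # 4. Every served request leaves an audit record.
        self.audit_log.append({
            "user": user,
            "query": query,
            "outcome": "served",
            "returned": [c.doc_id for c in masked],
            "withheld": len(hits) - len(masked),
        })
        return masked
```

The design point to narrate is that the retriever has no unguarded entry point: the only public method enforces all four controls in a fixed order, and the audit log records what was withheld as well as what was returned.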
## One-line elevator pitch for the role
> *"I build agentic systems where the interesting engineering is in the **boundaries** —
> what a tool is allowed to retrieve, how a model's output is grounded and cited, and how
> every hop is audited — and I've been running a multi-product ecosystem (MCP servers,
> a multi-agent runner, provider-routed LLMs, encrypted/flagged data access) that is one
> deliberate roadmap away from being a textbook enterprise agentic-RAG fabric."*