Hermes VM 076449268b docs(interview): add Senior Agentic RAG Architect prep kit

7-doc kit mapping the JD competency matrix to the ByteLyst ecosystem:
ecosystem-as-RAG-fabric architecture, competency deep-dives, STAR bank,
enhancement roadmap, banking blueprints, and a glossary quick-ref.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

2026-05-31 10:48:52 +00:00

Senior Agentic RAG Architect — Interview Prep Kit

Target role: Senior Agentic RAG Architect (TEKsystems Global Services, Product Engineering Group)

Candidate anchor: the ByteLyst ecosystem (this monorepo workspace).

Purpose: turn what we already run in production into a defensible, evidence-backed narrative for every line of the job description, plus a concrete roadmap of enhancements that make each claim literally true if we choose to build them.

This kit is deliberately structured so you can walk into the interview and, for any competency on the matrix, point to (a) a real system we run, (b) an architecture diagram, (c) a STAR story, and (d) a credible "here's how I'd take it to enterprise scale" answer.

How to use this kit

If you have…	Read
60 minutes the night before	`06-glossary-quickref.md` then this README's matrix
A full prep day	All docs in order 01 → 06
A whiteboard / panel round	`01-ecosystem-rag-fabric.md` + `05-banking-blueprints.md`
A behavioral / leadership round	`03-star-interview-bank.md`
A "what would you build here" round	`04-enhancement-roadmap.md`

Documents

01-ecosystem-rag-fabric.md — The ByteLyst ecosystem re-drawn as an agentic RAG retrieval fabric. Context, container, retrieval-pipeline, multi-agent topology, MCP Zero-Trust, and governance diagrams.
02-competency-deepdives.md — Every competency-matrix row: the concept, how it maps to our code, talking points, and honest gaps.
03-star-interview-bank.md — 12 STAR stories grounded in real ecosystem work (Hermes, agent-queue, mcp-server, invt_trdg AI chat, flowmonk grounding, llm-router).
04-enhancement-roadmap.md — Buildable enhancements that convert "I understand X" into "I shipped X here": pgvector hybrid retrieval, CRAG/Self-RAG loops, RAGAS eval harness, Cosmos Gremlin knowledge graph, model-card registry in Hermes.
05-banking-blueprints.md — Two client-ready solution blueprints (compliance-document retrieval; customer-support automation) with ADRs, SR 11-7 / EU AI Act alignment, and phased delivery.
06-glossary-quickref.md — Rapid-fire definitions and crisp answers: RAPTOR, HyDE, CRAG, Self-RAG, ColBERT, RAGAS metrics, SR 11-7, EU AI Act, Zero Trust for agents.
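
The "hybrid retrieval" item in the roadmap (vector + BM25) needs some way to merge two ranked lists. A minimal sketch, assuming reciprocal rank fusion (RRF) as the merge step: the kit does not say which fusion method 04 uses, so RRF, the function name, and the constant `k=60` are illustrative assumptions rather than the ecosystem's implementation.

```python
from collections import defaultdict
from typing import Dict, List, Sequence


def reciprocal_rank_fusion(rankings: Sequence[Sequence[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of document ids into one list.

    Each document scores 1 / (k + rank) for every list it appears in,
    where rank starts at 1. Higher total score ranks first; ties are
    broken by document id so the output is deterministic.
    """
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


# Hypothetical result lists: one from a keyword (BM25) index, one from a vector index.
bm25_hits = ["a", "b", "c"]
vector_hits = ["b", "c", "d"]
fused = reciprocal_rank_fusion([bm25_hits, vector_hits])
print(fused)
```

Documents that appear in both lists ("b" and "c" here) rise above documents found by only one retriever, which is the behaviour a cross-encoder reranker would then refine.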

The role in one paragraph

Design, build, and tune enterprise-grade RAG systems that power agentic applications, fusing structured (RDBMS / warehouse), unstructured (PDF / docs / email), and graph (knowledge-graph / ontology) sources into one governed retrieval fabric. Be the technical authority across financial-services engagements; enforce grounding, citation, hallucination mitigation; own evaluation harnesses (RAGAS / TruLens / DeepEval); embed Zero Trust, access-controlled retrieval, SR 11-7 / EU AI Act governance; and lead ADRs, blueprints, roadmaps for execs and engineers.
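
"Access-controlled retrieval" in the paragraph above can be sketched as a filter that runs before any ranking or prompting, so that unauthorized text never reaches the model. This is a minimal illustration under stated assumptions: the `Chunk` and `Caller` types and their fields are hypothetical and are not taken from `packages/auth` or any other ByteLyst package.

```python
from dataclasses import dataclass
from typing import FrozenSet, List


@dataclass(frozen=True)
class Chunk:
    """A retrievable unit carrying its own access metadata (hypothetical shape)."""
    chunk_id: str
    tenant_id: str
    allowed_roles: FrozenSet[str]
    text: str


@dataclass(frozen=True)
class Caller:
    """The identity on whose behalf an agent retrieves (hypothetical shape)."""
    user_id: str
    tenant_id: str
    roles: FrozenSet[str]


def authorized_chunks(caller: Caller, chunks: List[Chunk]) -> List[Chunk]:
    """Keep only chunks in the caller's tenant that share at least one role."""
    return [
        chunk
        for chunk in chunks
        if chunk.tenant_id == caller.tenant_id
        and chunk.allowed_roles & caller.roles
    ]


corpus = [
    Chunk("c1", "tenant-a", frozenset({"analyst"}), "Q3 exposure report"),
    Chunk("c2", "tenant-a", frozenset({"compliance"}), "AML escalation notes"),
    Chunk("c3", "tenant-b", frozenset({"analyst"}), "Other tenant's data"),
]
analyst = Caller("u1", "tenant-a", frozenset({"analyst"}))
visible = authorized_chunks(analyst, corpus)
print([chunk.chunk_id for chunk in visible])
```

Filtering before retrieval scoring, rather than redacting the model's answer afterwards, is what makes the control structural: a chunk the caller cannot see is never a candidate.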

Competency matrix → ByteLyst evidence

The JD's matrix is reproduced verbatim in the left columns; the right column is our real anchor in this ecosystem (where it exists today) plus a pointer to the enhancement that hardens it.

Competency	Must-have	Nice-to-have	ByteLyst anchor (today → planned)
Agentic Frameworks	LangGraph, LangChain (prod-grade)	Google ADK, A2A, AutoGen	`agent-queue/` multi-engine runner (claude·codex·devin) is a real folder-kanban orchestration topology with state transitions (`inbox→doing→done/failed`) — a hand-rolled state machine analogous to LangGraph nodes/edges. `packages/mcp-client` + `mcp-server` (:4007) provide tool binding. → 04 §A ports the topology onto LangGraph and adds an A2A handoff contract.
RAG Architecture	Hybrid retrieval, reranking, HyDE, Self-RAG	RAPTOR, multimodal	`packages/extraction` + `extraction-service` (:4005) parse URLs/docs into retrievable units today; `invt_trdg` AI chat already does retrieve-then-reason over structured data. → 04 §B adds vector+BM25 hybrid, cross-encoder rerank, HyDE & CRAG loops.
Structured Retrieval	Text-to-SQL, schema-aware retrieval	Snowflake Cortex, BigQuery ML	`invt_trdg` AI chat assistant maps NL → trading actions/queries over a typed domain (quotes, plans, watchlists). This is schema-aware tool-calling, a safer cousin of free-form Text-to-SQL because the model can only invoke typed operations. → 04 §C adds a guarded Text-to-SQL tool with read-only views + row-level filters.
Unstructured Retrieval	PDF parsing, layout-aware chunking	Multi-modal pipelines	`packages/extraction` + `extraction-service`; `notelett` ingests structured notes for humans+agents. → 04 §B adds PyMuPDF/Unstructured.io layout-aware chunking + OCR fallback.
Graph RAG	KG + vector hybrid	SPARQL, ontology design	We run Azure Cosmos DB (`packages/cosmos`); Cosmos exposes the Gremlin graph API. `event-store`/`events` already model entity relationships. → 04 §D stands up a Cosmos Gremlin knowledge graph + graph-augmented retrieval.
Vector Databases	Pinecone / Weaviate / Azure AI Search	Qdrant, pgvector, multi-tenancy	Postgres is in the stack; pgvector is the lowest-friction path. Multi-tenant namespace isolation is already a first-class concern (per-product `productId`, two-instance Hermes Vijay/Bheem). → 04 §B adds pgvector with per-tenant namespaces.
Grounding & Eval	RAGAS, TruLens, faithfulness SLAs	LangSmith, LLM-as-judge	`flowmonk` deliberately bounds the AI layer to explanation/safe recommendation over a deterministic engine, which is a production grounding pattern. `diagnostics-client`/`telemetry-client`/`monitoring` + Hermes dashboards are the natural home for an eval harness, which does not exist yet. → 04 §E wires a RAGAS/DeepEval harness + drift monitor pane in Hermes.
Cloud Platform	Azure (AI Foundry, OpenAI, Search)	AWS Bedrock, GCP Vertex	Azure Cosmos DB in prod (`_AZURE/`, `packages/cosmos`); `packages/llm-router` abstracts providers so Azure OpenAI / Bedrock / Vertex are swap-in. → 02 discusses Azure AI Search as the managed hybrid index.
AI Governance	Access-controlled RAG, Zero Trust	SR 11-7, EU AI Act	`packages/auth` + `fastify-auth`, `field-encrypt`/`client-encrypt` (column/field masking), `feature-flag-client` + `kill-switch-client` (instant model kill), `event-store` (immutable audit). MCP tool boundaries are explicit. → 05 maps all of this to SR 11-7 + EU AI Act.
Domain: Banking	Support / compliance automation	Model risk mgmt, KYC/AML	`invt_trdg` is our regulated-industry analog (markets, trade plans, alerts, auditability). → 05 translates it into a bank customer-support + compliance-retrieval blueprint.
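
The `inbox→doing→done/failed` lifecycle in the Agentic Frameworks row is easiest to defend on a whiteboard as an explicit transition table. The sketch below is a hypothetical reconstruction from that one arrow diagram; it is not the `agent-queue/` source, and the names `TaskState`, `ALLOWED`, and `transition` are invented for illustration.

```python
from enum import Enum


class TaskState(str, Enum):
    INBOX = "inbox"
    DOING = "doing"
    DONE = "done"
    FAILED = "failed"


# Allowed edges, mirroring inbox -> doing -> done/failed.
# DONE and FAILED are terminal in this sketch (an assumption; a real
# runner might allow retries from FAILED back to INBOX).
ALLOWED = {
    TaskState.INBOX: {TaskState.DOING},
    TaskState.DOING: {TaskState.DONE, TaskState.FAILED},
    TaskState.DONE: set(),
    TaskState.FAILED: set(),
}


def transition(current: TaskState, target: TaskState) -> TaskState:
    """Return the new state, or raise if the edge is not in the table."""
    if target not in ALLOWED[current]:
        raise ValueError(f"illegal transition: {current.value} -> {target.value}")
    return target


state = TaskState.INBOX
state = transition(state, TaskState.DOING)
state = transition(state, TaskState.DONE)
print(state.value)
```

Each key in `ALLOWED` corresponds to what a graph framework would call a node and each set member to an outgoing edge, which is the analogy the matrix row draws with LangGraph.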

Honest gap analysis (say this out loud — it builds trust)

Be candid in the interview. Frame it as "here's what's production-real, here's what's adjacent, here's exactly how I'd close the gap."

quadrantChart
    title Evidence strength vs. JD centrality
    x-axis "Adjacent / planned" --> "Production-real today"
    y-axis "Nice-to-have" --> "Core to the role"
    quadrant-1 "Lead with these"
    quadrant-2 "Build before interview if possible"
    quadrant-3 "Mention, don't dwell"
    quadrant-4 "Frame as quick wins"
    "MCP tool boundaries": [0.82, 0.78]
    "Multi-agent orchestration (agent-queue)": [0.75, 0.7]
    "Access-controlled / Zero-Trust retrieval": [0.8, 0.85]
    "Bounded grounding (flowmonk)": [0.78, 0.9]
    "Schema-aware tool-calling (invt_trdg)": [0.72, 0.72]
    "LangGraph (prod-grade)": [0.3, 0.88]
    "RAGAS / TruLens eval harness": [0.25, 0.86]
    "pgvector hybrid retrieval": [0.35, 0.8]
    "Cosmos Gremlin Graph RAG": [0.3, 0.6]
    "Google ADK / A2A": [0.2, 0.4]
    "RAPTOR / HyDE / CRAG / Self-RAG": [0.28, 0.65]
    "SR 11-7 / EU AI Act docs": [0.45, 0.7]

Three sentences to own the gap:

"My production depth is in agentic orchestration, MCP tool boundaries, and bounded grounding — the parts that decide whether an agentic system is safe in a regulated setting. The classic LangChain/LangGraph and RAGAS surface area I've architected and can stand up fast; in fact I've scoped exactly that as a roadmap on our own platform. What I bring that's harder to hire is the governance instinct — designing retrieval so that masking, kill-switches, and audit trails are structural, not bolted on."

One-line elevator pitch for the role

"I build agentic systems where the interesting engineering is in the boundaries — what a tool is allowed to retrieve, how a model's output is grounded and cited, and how every hop is audited — and I've been running a multi-product ecosystem (MCP servers, a multi-agent runner, provider-routed LLMs, encrypted/flagged data access) that is one deliberate roadmap away from being a textbook enterprise agentic-RAG fabric."
