learning_ai_notes/docs/SMART_ACTIONS_ROADMAP.md
saravanakumardb1 3f903a7a70 docs: mark 7 open review findings as resolved in SMART_ACTIONS_ROADMAP
- #8 (embeddings): @bytelyst/llm embed() already implemented in all providers
- #9 (voice transcription): POST /api/transcribe added to extraction-service
- #10 (URL-extract phase): already implemented in Phase 1
- #11 (webhook container): note_prompt_webhooks already in cosmos-init.ts
- #12 (LLM factory vs config): clarified, no conflict
- #13 (streaming): chatCompletionStream() already in OpenAI/Azure/Mock
- #14 (CopilotAction expansion): fix-rewrite, change-tone, continue, explain added

All 18 findings now resolved or noted. Zero open items remain.
2026-04-06 11:12:24 -07:00

# Smart Actions Roadmap — End-to-End Implementation (v2)
> **Feature:** AI-powered note intelligence — Smart Actions, inline editor AI, capture enhancement, cross-note intelligence, agent workflows
> **Scope:** `learning_ai_common_plat` (shared LLM packages) + `learning_ai_notes` (backend + web + mobile)
> **Author:** Product Team
> **Date:** April 2026
> **Version:** 2.0 — Full 27-feature roadmap across 6 categories, 7 phases
---
## Executive Summary
This roadmap delivers a comprehensive AI layer for NoteLett across **27 features in 6 categories**, spanning the shared platform, backend, web, and mobile. It transforms NoteLett into an AI-native knowledge workspace where:
- Users run **Smart Actions** on notes (text + images) — summarize, translate, rate food labels, parse receipts
- The **editor** has inline AI — rewrite, change tone, continue writing, explain highlighted text
- **Note intelligence** runs in the background — auto-summarize, auto-tag, detect duplicates, suggest links
- **Capture** is AI-enhanced — voice-to-note, screenshot OCR, URL extraction, multi-image processing
- **Cross-note intelligence** — weekly digests, knowledge gap detection, note merge/compare
- **Agent workflows** — scheduled actions, webhook triggers, approval-gated actions, action chains
The feature spans two codebases:
1. **`learning_ai_common_plat`** — Enhance `@bytelyst/llm` with vision + embedding support
2. **`learning_ai_notes`** — Backend module + web UI + mobile app
---
## Timeline Overview
| Phase | What | Where | Duration | Depends on | Parallel? |
|-------|------|-------|----------|------------|-----------|
| **0** | LLM vision + embedding support | common-plat | 3 days | — | — |
| **1** | Note prompts core + copilot upgrade | backend | 4-5 days | Phase 0 | — |
| **2** | Note intelligence (background AI) | backend | 2-3 days | Phase 1 | — |
| **3** | Smart Actions web UI + editor AI | web | 4-5 days | Phase 1 | Yes (with 4, 5) |
| **4** | Smart Actions mobile + capture | mobile | 4-5 days | Phase 1 | Yes (with 3, 5) |
| **5** | Agent & workflow intelligence | backend + web | 2-3 days | Phase 2 | Yes (with 3, 4) |
| **6** | Polish, E2E, documentation | all | 2-3 days | Phases 3-5 | — |
**Total: ~21-27 days sequential, ~13-17 days with parallel execution**
```
Phase 0 (3d) → Phase 1 (5d) ─┬──→ Phase 2 (3d) ──→ Phase 5 (3d) ──┐
                             ├──→ Phase 3 (5d) ────────────────────→ Phase 6 (3d)
                             └──→ Phase 4 (5d) ────────────────────┘
```
---
## Feature Master List — 27 Features, 6 Categories
### Cat 1: Inline Editor AI
| # | Feature | Description | Phase |
|---|---------|-------------|-------|
| F1 | Fix & Rewrite | Select text → rewrite with proper grammar, tone, clarity | 3 |
| F2 | Change Tone | Rewrite selection as formal / casual / professional / friendly | 3 |
| F3 | Continue Writing | LLM generates next 2-3 paragraphs from cursor context (streaming) | 3 |
| F4 | Inline Q&A | Highlight term → "Explain this" → tooltip with definition | 3 |
| F5 | Auto-tag suggestion | After save, LLM suggests 3-5 tags based on content | 1 |
### Cat 2: Note Intelligence
| # | Feature | Description | Phase |
|---|---------|-------------|-------|
| F6 | Auto-summarize on save | Note body > 300 words → auto-generate summary artifact | 2 |
| F7 | Smart title suggestion | Upgrade existing `suggestTitleFromBody()` to use `@bytelyst/llm` | 1 |
| F8 | Duplicate/similar note detection | Before save, warn if semantically similar notes exist | 2 |
| F9 | Auto-link related notes | After creation, suggest 3-5 related notes to link to | 2 |
| F10 | Reading time estimate | Display estimated reading time on each note | 1 |
### Cat 3: Multi-Note Intelligence
| # | Feature | Description | Phase |
|---|---------|-------------|-------|
| F11 | Weekly workspace digest | Auto-generate summary of all workspace activity this week | 5 |
| F12 | Knowledge gap detection | Identify topics mentioned but under-covered in workspace | 2 |
| F13 | Note merge | Select 2+ notes → LLM merges into single coherent note | 1 |
| F14 | Compare notes | Select 2 notes → LLM produces comparison summary | 1 |
### Cat 4: Capture Enhancement
| # | Feature | Description | Phase |
|---|---------|-------------|-------|
| F15 | Voice-to-note | Record audio → transcribe → save as note | 4 |
| F16 | Screenshot-to-note | Share screenshot → OCR + LLM cleanup → structured note | 4 |
| F17 | URL-to-note | Paste URL → extract content → summarize → save | 4 |
| F18 | Multi-image capture | Photograph multiple pages → combine into one note | 4 |
| F19 | Clipboard AI paste | Paste messy text → LLM cleans and structures it | 4 |
### Cat 5: Export & Sharing Intelligence
| # | Feature | Description | Phase |
|---|---------|-------------|-------|
| F20 | Shareable summary | One-click polished shareable version of a note | 1+3 |
| F21 | Presentation outline | Note → structured slide outline (title + bullets) | 1+3 |
| F22 | Email draft | Note → formatted email with subject, greeting, body | 1+3 |
| F23 | Social post | Note → Twitter/LinkedIn post draft | 1+3 |
### Cat 6: Agent & Workflow Intelligence
| # | Feature | Description | Phase |
|---|---------|-------------|-------|
| F24 | Smart Action chains | Pipe output of one action as input to next | 1 |
| F25 | Scheduled Smart Actions | Cron-like: "Summarize workspace every Friday" | 5 |
| F26 | Webhook-triggered actions | External event → auto-run a Smart Action | 5 |
| F27 | Approval-gated actions | High-risk actions require human review before applying | 5 |
---
## Phase 0 — Common Platform: LLM Vision + Embedding Support
**Repo:** `learning_ai_common_plat`
**Duration:** 3 days
**Depends on:** Nothing
**Features enabled:** Foundation for all F1-F27
### 0.1 Enhance `@bytelyst/llm` ChatMessage for multipart content
The current `ChatMessage.content` is `string`-only. Vision models (GPT-4o, Gemini) require multipart content arrays.
**File:** `packages/llm/src/types.ts`
| Change | Detail |
|--------|--------|
| New `ContentPart` type | `{ type: 'text'; text: string } \| { type: 'image_url'; image_url: { url: string; detail?: 'auto' \| 'low' \| 'high' } }` |
| Update `ChatMessage.content` | `string \| ContentPart[]` |
| New `isVisionMessage()` helper | Type guard to check if a message contains image parts |
| New `buildVisionMessage()` helper | Convenience: `(text: string, imageUrl: string) => ChatMessage` |
**Tests:** 8-10 new tests
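
A minimal sketch of the types and helpers listed in the table above. The shapes follow the table; the `role` union and the narrowed return type of `isVisionMessage()` are assumptions, not confirmed `@bytelyst/llm` API.

```typescript
// Sketch only: shapes follow the 0.1 table; the role union is an assumption.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string; detail?: 'auto' | 'low' | 'high' } };

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string | ContentPart[];
}

/** Type guard: true when the message carries at least one image part. */
function isVisionMessage(
  message: ChatMessage,
): message is ChatMessage & { content: ContentPart[] } {
  return (
    Array.isArray(message.content) &&
    message.content.some((part) => part.type === 'image_url')
  );
}

/** Builds a user message with one text part followed by one image part. */
function buildVisionMessage(text: string, imageUrl: string): ChatMessage {
  return {
    role: 'user',
    content: [
      { type: 'text', text },
      { type: 'image_url', image_url: { url: imageUrl } },
    ],
  };
}
```

Keeping `content` as `string | ContentPart[]` means existing text-only callers compile unchanged, while vision callers opt in through the helper.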
### 0.2 Update `OpenAIProvider` for vision
**File:** `packages/llm/src/providers/openai.ts`
| Change | Detail |
|--------|--------|
| Pass multipart `content` to API | When `content` is an array, send as-is (OpenAI format) |
| Default model upgrade | If any message has image content, auto-suggest `gpt-4o` |
**Tests:** 4-6 new tests (mock HTTP)
### 0.3 Update `AzureOpenAIProvider` for vision
Same multipart content handling as OpenAI provider. **Tests:** 4-6 new tests.
### 0.4 Update `MockLLMProvider`
Return deterministic mock responses when vision content is detected, for downstream test use.
### 0.5 Add streaming support enhancement
Ensure `chatCompletionStream()` works with multipart content for F3 (Continue Writing).
### 0.6 Add embedding support (for F8, F9, F12)
**File:** `packages/llm/src/types.ts` + providers
| Change | Detail |
|--------|--------|
| New `EmbeddingRequest` type | `{ input: string \| string[]; model?: string }` |
| New `EmbeddingResponse` type | `{ embeddings: number[][]; model: string; usage: TokenUsage }` |
| Add `embed()` to `LLMProvider` | Optional method for embedding generation |
| Implement in `OpenAIProvider` | Call `/v1/embeddings` endpoint |
| Implement in `AzureOpenAIProvider` | Call Azure embeddings endpoint |
| Implement in `MockLLMProvider` | Return deterministic fake embeddings |
**Tests:** 6-8 new tests
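
One way the deterministic mock could look, as a hedged sketch. `EmbeddingRequest` and `EmbeddingResponse` follow the table above; the `TokenUsage` fields, the 8-dimension vector size, the `'mock-embedding'` model name, and the helper names `fakeEmbedding` and `mockEmbed` are all assumptions made for illustration. The function is synchronous here to keep it easy to test, whereas the provider's `embed()` is awaited elsewhere in this roadmap.

```typescript
// Sketch only: TokenUsage fields and all helper names are assumptions.
interface TokenUsage { promptTokens: number; completionTokens: number; totalTokens: number }
interface EmbeddingRequest { input: string | string[]; model?: string }
interface EmbeddingResponse { embeddings: number[][]; model: string; usage: TokenUsage }

const MOCK_DIMENSIONS = 8; // hypothetical size, chosen to keep test fixtures small

/** Maps text to a unit-length vector using character codes only, so output is repeatable. */
function fakeEmbedding(text: string, dimensions: number = MOCK_DIMENSIONS): number[] {
  const vector: number[] = new Array(dimensions).fill(0);
  for (let i = 0; i < text.length; i++) {
    vector[i % dimensions] += text.charCodeAt(i);
  }
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? vector : vector.map((v) => v / norm);
}

/** Returns one fake embedding per input string. */
function mockEmbed(request: EmbeddingRequest): EmbeddingResponse {
  const inputs = Array.isArray(request.input) ? request.input : [request.input];
  // Character count is a crude stand-in for a real token count.
  const promptTokens = inputs.reduce((sum, text) => sum + text.length, 0);
  return {
    embeddings: inputs.map((text) => fakeEmbedding(text)),
    model: request.model ?? 'mock-embedding',
    usage: { promptTokens, completionTokens: 0, totalTokens: promptTokens },
  };
}
```

Because the vectors depend only on the input text, downstream similarity tests can assert exact values without calling a real provider.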
### 0.7 Export new types + helpers
**File:** `packages/llm/src/index.ts` — export `ContentPart`, `EmbeddingRequest`, `EmbeddingResponse`, `isVisionMessage`, `buildVisionMessage`
### 0.8 Update `@bytelyst/llm-router`
| Change | Detail |
|--------|--------|
| Vision-aware routing | `classifyPrompt()` detects image content → routes to vision-capable models |
| Model capability flags | Add `supportsVision: boolean` and `supportsEmbedding: boolean` to `ModelConfig` |
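
A rough sketch of how the capability flags could drive model selection. The signature of `classifyPrompt()` is not shown in this roadmap, so `needsVision` and `pickModel` are hypothetical helpers, and the model names in the usage note are placeholders.

```typescript
// Sketch only: pickModel is a hypothetical helper, not the real classifyPrompt().
interface ModelConfig {
  name: string;
  supportsVision: boolean;
  supportsEmbedding: boolean;
}

type MessageContent = string | Array<{ type: string }>;

/** True when any message contains an image part. */
function needsVision(messages: Array<{ content: MessageContent }>): boolean {
  return messages.some(
    (m) => Array.isArray(m.content) && m.content.some((part) => part.type === 'image_url'),
  );
}

/** Returns the first configured model able to serve the request, if any. */
function pickModel(
  models: ModelConfig[],
  messages: Array<{ content: MessageContent }>,
): ModelConfig | undefined {
  const vision = needsVision(messages);
  return models.find((model) => !vision || model.supportsVision);
}
```

With a list such as `text-only-model` followed by `vision-model`, a plain text prompt resolves to the first entry and a prompt with an image part skips to the second. Returning `undefined` lets the caller decide whether to fail or fall back.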
### 0.9 Publish updated packages
Bump versions → publish to Gitea npm registry.
**Phase 0 Deliverables:**
- [ ] `@bytelyst/llm@0.2.0` — vision + embedding + streaming enhancements
- [ ] `@bytelyst/llm-router@0.2.0` — vision-aware routing + capability flags
- [ ] All existing tests pass + **25-30 new tests**
- [ ] Published to Gitea npm registry
---
## Phase 1 — Backend: Note Prompts Core + Copilot Upgrade
**Repo:** `learning_ai_notes`
**Duration:** 4-5 days
**Depends on:** Phase 0
**Features:** F5 (auto-tag), F7 (smart title), F10 (reading time), F13 (merge), F14 (compare), F20-F23 (templates), F24 (chains)
### 1.1 Add LLM dependency to backend
**File:** `backend/package.json` — add `"@bytelyst/llm": "^0.2.0"`
### 1.2 Create `backend/src/lib/llm.ts`
Singleton wrapper over `@bytelyst/llm`:
```typescript
import { getLLM, type LLMProvider } from '@bytelyst/llm';

// Lazily created, process-wide provider instance.
let _llm: LLMProvider | null = null;

export function getNoteLettLLM(): LLMProvider {
  if (!_llm) _llm = getLLM();
  return _llm;
}
```
### 1.3 Add LLM env vars to config
**File:** `backend/src/lib/config.ts`
| Variable | Default | Description |
|----------|---------|-------------|
| `LLM_PROVIDER` | `openai` | `openai` / `azure` / `mock` |
| `OPENAI_API_KEY` | — | OpenAI API key |
| `OPENAI_BASE_URL` | — | Optional base URL override |
| `AZURE_OPENAI_ENDPOINT` | — | Azure OpenAI endpoint |
| `AZURE_OPENAI_API_KEY` | — | Azure OpenAI key |
| `LLM_DEFAULT_MODEL` | `gpt-4o-mini` | Default model for text prompts |
| `LLM_VISION_MODEL` | `gpt-4o` | Default model for image prompts |
| `LLM_EMBEDDING_MODEL` | `text-embedding-3-small` | Default model for embeddings |
### 1.4 New Cosmos container: `note_prompts`
**File:** `backend/src/lib/cosmos-init.ts` — register `note_prompts` container (partition key: `/userId`)
### 1.5 Create `backend/src/modules/note-prompts/types.ts`
Key types:
- **`PromptTemplateDoc`** — id, productId, userId, slug, name, description, category, systemPrompt, userPromptTemplate, inputType (`text`/`image`/`text+image`/`multi-note`), outputFormat, outputAction (`new_note`/`artifact`/`update_note`), parameters, builtIn, createdAt, updatedAt
- **`PromptParameter`** — key, label, type (`string`/`select`), options, default, required
- **`RunPromptInput`** — noteId, workspaceId, promptTemplateId OR inlinePrompt, parameters, imageUrls, additionalNoteIds (for F13/F14 merge/compare), previousResultNoteId (for F24 chains), dryRun, agentId
- **`RunPromptOutput`** — resultNoteId, resultArtifactId, content, model, tokenUsage, agentActionId, suggestedTags (for F5)
Zod schemas for all of the above.
### 1.6 Create `backend/src/modules/note-prompts/repository.ts`
CRUD for `PromptTemplateDoc`:
- `listTemplates(userId)` — returns built-in + user's custom templates
- `getTemplate(id, userId)`
- `createTemplate(doc)`
- `updateTemplate(id, userId, updates)`
- `deleteTemplate(id, userId)` — cannot delete built-in
### 1.7 Create `backend/src/modules/note-prompts/runner.ts`
The core orchestration logic:
```
1. Validate input (template or inline prompt)
2. Fetch the source note (verify ownership + productId)
3. If additionalNoteIds provided (F13/F14 merge/compare):
   a. Fetch all additional notes
   b. Combine content into multi-note context
4. If template has inputType 'image' or 'text+image':
   a. Fetch artifact images from blob storage via SAS URLs
   b. Build vision message with buildVisionMessage()
5. If previousResultNoteId provided (F24 chains):
   a. Fetch previous result note
   b. Include its content as additional context
6. Build LLM messages array:
   - System: template.systemPrompt (or default)
   - User: interpolated template with note content + images + additional notes
7. Call getNoteLettLLM().chatCompletion(request)
8. Post-process response:
   - If template slug is 'auto-tag': parse tags from response → return as suggestedTags
   - If outputAction is 'new_note': createNote() → link to source via note-relationships
   - If outputAction is 'artifact': createNoteArtifact() on source note
   - If outputAction is 'update_note': updateNote() body/tags on source note
9. Record NoteAgentActionDoc (actionType: 'smart_action')
10. Return RunPromptOutput
```
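
Step 6 relies on template interpolation. The sketch below assumes the placeholder syntax listed for the template editor (`{{note.title}}`, `{{note.body}}`, `{{note.tags}}`, `{{params.X}}`); the function name `interpolateTemplate`, the `NoteContext` shape, and the choice to leave unknown placeholders untouched are assumptions.

```typescript
// Sketch only: interpolateTemplate and NoteContext are hypothetical names.
interface NoteContext {
  title: string;
  body: string;
  tags: string[];
}

/** Replaces {{note.*}} and {{params.*}} placeholders in a user prompt template. */
function interpolateTemplate(
  template: string,
  note: NoteContext,
  params: Record<string, string> = {},
): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match: string, path: string) => {
    if (path === 'note.title') return note.title;
    if (path === 'note.body') return note.body;
    if (path === 'note.tags') return note.tags.join(', ');
    if (path.startsWith('params.')) {
      const value = params[path.slice('params.'.length)];
      return value !== undefined ? value : match;
    }
    return match; // unknown placeholders stay visible so template bugs are easy to spot
  });
}
```

Using a replacer function rather than a replacement string means note bodies containing `$1` or `$&` are inserted literally.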
### 1.8 Create `backend/src/modules/note-prompts/routes.ts`
| Method | Path | Auth | Description | Features |
|--------|------|------|-------------|----------|
| `GET` | `/api/prompt-templates` | viewer | List built-in + user templates | Core |
| `GET` | `/api/prompt-templates/:id` | viewer | Get single template | Core |
| `POST` | `/api/prompt-templates` | admin | Create custom template | Core |
| `PATCH` | `/api/prompt-templates/:id` | admin | Update custom template | Core |
| `DELETE` | `/api/prompt-templates/:id` | admin | Delete custom template | Core |
| `POST` | `/api/note-prompts/run` | admin | Run a prompt on a note | Core |
| `POST` | `/api/note-prompts/run-stream` | admin | Run with SSE streaming | F3 |
| `GET` | `/api/note-prompts/history` | viewer | List past prompt runs | Core |
| `POST` | `/api/notes/:id/suggest-tags` | admin | Suggest tags via LLM | F5 |
| `GET` | `/api/notes/:id/reading-time` | viewer | Calculate reading time | F10 |
| `POST` | `/api/notes/compare` | admin | Compare 2+ notes | F14 |
| `POST` | `/api/notes/merge` | admin | Merge 2+ notes | F13 |
### 1.9 Seed 20 built-in prompt templates
**File:** `backend/src/modules/note-prompts/seed.ts`
| # | Slug | Name | Input | Output | Category | Feature |
|---|------|------|-------|--------|----------|---------|
| 1 | `summarize` | Summarize | text | new_note | transform | Core |
| 2 | `translate` | Translate | text | new_note | transform | Core |
| 3 | `simplify` | Simplify / ELI5 | text | artifact | transform | Core |
| 4 | `extract-key-facts` | Extract Key Facts | text | artifact | extract | Core |
| 5 | `food-label-rating` | Rate Food Label | image | new_note | analysis | Core |
| 6 | `parse-receipt` | Parse Receipt | image | new_note | extract | Core |
| 7 | `read-business-card` | Read Business Card | image | new_note | extract | Core |
| 8 | `handwriting-to-text` | Handwriting to Text | image | new_note | transform | Core |
| 9 | `generate-flashcards` | Generate Flashcards | text | new_note | generate | Core |
| 10 | `pros-and-cons` | Pros & Cons | text | artifact | analysis | Core |
| 11 | `presentation-outline` | Presentation Outline | text | new_note | generate | F21 |
| 12 | `email-draft` | Email Draft | text | new_note | generate | F22 |
| 13 | `social-post` | Social Post | text | artifact | generate | F23 |
| 14 | `shareable-summary` | Shareable Summary | text | new_note | transform | F20 |
| 15 | `compare-notes` | Compare Notes | multi-note | new_note | analysis | F14 |
| 16 | `merge-notes` | Merge Notes | multi-note | new_note | transform | F13 |
| 17 | `fix-rewrite` | Fix & Rewrite | text | update_note | transform | F1 |
| 18 | `change-tone` | Change Tone | text | update_note | transform | F2 |
| 19 | `continue-writing` | Continue Writing | text | update_note | generate | F3 |
| 20 | `auto-tag` | Auto-Tag | text | update_note | extract | F5 |
### 1.10 Upgrade `copilot-transform.ts` to use `@bytelyst/llm` (F1, F2, F7)
**File:** `backend/src/lib/copilot-transform.ts`
Replace extraction-service calls with direct `@bytelyst/llm` calls:
- `runCopilotTransform()` → `getNoteLettLLM().chatCompletion()` with action-specific system prompts
- `suggestTitleFromBody()` → `getNoteLettLLM().chatCompletion()` with title-suggestion prompt
- Add `rewriteText(text, style)` for F1/F2 — accepts tone parameter
- Keep extraction-service fallback for graceful degradation
### 1.11 Reading time utility (F10)
**File:** `backend/src/lib/reading-time.ts`
```typescript
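export function estimateReadingTime(html: string): { minutes: number; words: number } {
  // Strip tags and collapse whitespace to get plain text.
  const plain = html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
  // An empty body has zero words; splitting '' would otherwise report one.
  const words = plain === '' ? 0 : plain.split(' ').length;
  // 238 words per minute, with a one-minute floor.
  return { minutes: Math.max(1, Math.ceil(words / 238)), words };
}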
```
Pure calculation — no LLM needed. Expose via `GET /api/notes/:id` response and note detail endpoints.
### 1.12 Extend agent action types
**File:** `backend/src/modules/note-agent-actions/types.ts`
Add `'smart_action'` and `'auto_enrich'` to `NOTE_AGENT_ACTION_TYPES`.
### 1.13 Register routes in server.ts + MCP tool
- **`backend/src/server.ts`** — register `notePromptRoutes`
- **`backend/src/mcp/note-tool-contracts.ts`** — add `notes.prompts.run` to `NOTES_MCP_TOOL_NAMES`
- **`backend/src/mcp/note-tools.ts`** — implement `executeRunPrompt()`
### 1.14 Tests
| Test file | Coverage | Count |
|-----------|----------|-------|
| `note-prompts/repository.test.ts` | Template CRUD | 8-10 |
| `note-prompts/runner.test.ts` | Prompt execution with mock LLM, chains, multi-note | 15-18 |
| `note-prompts/routes.test.ts` | API endpoint integration | 10-12 |
| `lib/copilot-transform.test.ts` | Upgraded copilot with LLM | 4-6 |
| `lib/reading-time.test.ts` | Reading time calculation | 4-5 |
| `mcp/note-tools.test.ts` | `notes.prompts.run` MCP tool | 4-6 |
**Phase 1 Deliverables:**
- [ ] `note-prompts` module: types, repository, runner, routes, seed (20 templates)
- [ ] `lib/llm.ts` singleton + config extended with LLM env vars
- [ ] `lib/reading-time.ts` pure utility (F10)
- [ ] Upgraded `copilot-transform.ts` using `@bytelyst/llm` (F1, F2, F7)
- [ ] Multi-note support in runner (F13 merge, F14 compare)
- [ ] Chain support in runner via `previousResultNoteId` (F24)
- [ ] `smart_action` + `auto_enrich` agent action types
- [ ] `notes.prompts.run` MCP tool
- [ ] `note_prompts` Cosmos container
- [ ] **45-57 new tests**
---
## Phase 2 — Backend: Note Intelligence (Background AI)
**Repo:** `learning_ai_notes`
**Duration:** 2-3 days
**Depends on:** Phase 1
**Features:** F6 (auto-summarize), F8 (duplicate detection), F9 (auto-link), F12 (knowledge gaps)
### 2.1 Embedding service: `backend/src/lib/embeddings.ts` (F8, F9, F12)
```typescript
import { getNoteLettLLM } from './llm.js';

export async function embedText(text: string): Promise<number[]> {
  const llm = getNoteLettLLM();
  // embed() is optional on LLMProvider, so fail loudly when it is missing.
  if (!llm.embed) throw new Error('Embedding not supported by current LLM provider');
  const res = await llm.embed({ input: text });
  return res.embeddings[0];
}

export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  // A zero vector has no direction; return 0 instead of NaN.
  if (magA === 0 || magB === 0) return 0;
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}
```
### 2.2 Note embedding storage
**File:** `backend/src/modules/notes/types.ts` — add optional `embedding: number[]` field to `NoteDoc`
On note create/update, compute embedding in background (non-blocking). Store in Cosmos alongside the note.
### 2.3 Auto-summarize on save (F6)
**File:** `backend/src/lib/note-hooks.ts`
After a note is saved with body > 300 words:
1. Run "summarize" template via `runner.ts`
2. Store result as artifact (type: `summary`) on the note
3. Record agent action with `actionType: 'auto_enrich'`
4. Gated behind feature flag `notelett_auto_summarize_enabled`
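
The trigger condition can be sketched as a pure function, which keeps the hook easy to unit test. The names `countWords` and `shouldAutoSummarize` are hypothetical, and the sketch assumes the note body is HTML and that the flag value has already been resolved to a boolean.

```typescript
// Sketch only: helper names are hypothetical; the 300-word threshold comes from the text above.
const AUTO_SUMMARIZE_MIN_WORDS = 300;

/** Counts words in an HTML body after stripping tags. */
function countWords(html: string): number {
  const plain = html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
  return plain === '' ? 0 : plain.split(' ').length;
}

/** True when the feature flag is on and the body is strictly longer than the threshold. */
function shouldAutoSummarize(bodyHtml: string, flagEnabled: boolean): boolean {
  return flagEnabled && countWords(bodyHtml) > AUTO_SUMMARIZE_MIN_WORDS;
}
```

A body of exactly 300 words does not trigger a summary, matching the "> 300 words" wording.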
### 2.4 Duplicate/similar note detection (F8)
**File:** `backend/src/modules/note-prompts/routes.ts`
New endpoint: `POST /api/notes/:id/check-duplicates`
1. Embed the current note's body
2. Fetch all notes in workspace with embeddings
3. Compute cosine similarity
4. Return notes with similarity > 0.85 threshold
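
Steps 3 and 4 amount to a filter over precomputed embeddings. The sketch below is one possible shape, assuming notes are already loaded with their vectors; `EmbeddedNote`, `SimilarNote`, and `findSimilarNotes` are hypothetical names, and cosine similarity is repeated locally so the example stands alone.

```typescript
// Sketch only: type and function names are hypothetical.
interface EmbeddedNote { id: string; title: string; embedding: number[] }
interface SimilarNote { id: string; title: string; similarity: number }

function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  if (magA === 0 || magB === 0) return 0; // zero vectors match nothing
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

/** Returns candidates above the threshold, most similar first, excluding the target itself. */
function findSimilarNotes(
  target: EmbeddedNote,
  candidates: EmbeddedNote[],
  threshold: number = 0.85,
  limit: number = Infinity,
): SimilarNote[] {
  return candidates
    .filter((note) => note.id !== target.id)
    .map((note) => ({
      id: note.id,
      title: note.title,
      similarity: cosineSimilarity(target.embedding, note.embedding),
    }))
    .filter((note) => note.similarity > threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}
```

The same helper could serve the related-notes suggestion by passing a threshold of 0.6 and a limit of 5. Scanning every note in memory is linear in workspace size, so a large workspace may eventually need a vector index instead.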
### 2.5 Auto-link related notes (F9)
**File:** `backend/src/modules/note-prompts/routes.ts`
New endpoint: `POST /api/notes/:id/suggest-links`
1. Embed the current note
2. Find top 5 most similar notes (similarity > 0.6, excluding self)
3. Return as suggested links with similarity scores
4. UI can accept/dismiss suggestions
### 2.6 Knowledge gap detection (F12)
**File:** `backend/src/modules/note-prompts/routes.ts`
New endpoint: `POST /api/workspaces/:id/knowledge-gaps`
1. Fetch all notes in workspace
2. Extract topics from each note (via auto-tag or LLM)
3. Build topic frequency map
4. Send to LLM: "Given these topics and their coverage depth, what's missing?"
5. Return gap analysis as structured JSON
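
Step 3 can be sketched without any LLM call. The example assumes topics have already been extracted per note as string arrays; `buildTopicFrequency` and `thinlyCoveredTopics` are hypothetical names, and treating "covered by at most one note" as thin coverage is an illustrative default rather than a rule from this roadmap.

```typescript
// Sketch only: function names and the thin-coverage cutoff are assumptions.

/** Counts how many notes mention each topic (case-insensitive, once per note). */
function buildTopicFrequency(notes: Array<{ topics: string[] }>): Map<string, number> {
  const counts = new Map<string, number>();
  for (const note of notes) {
    const unique = new Set(
      note.topics.map((topic) => topic.trim().toLowerCase()).filter((topic) => topic !== ''),
    );
    unique.forEach((topic) => counts.set(topic, (counts.get(topic) ?? 0) + 1));
  }
  return counts;
}

/** Lists topics that appear in at most maxNotes notes, sorted alphabetically. */
function thinlyCoveredTopics(counts: Map<string, number>, maxNotes: number = 1): string[] {
  return Array.from(counts.entries())
    .filter(([, noteCount]) => noteCount <= maxNotes)
    .map(([topic]) => topic)
    .sort();
}
```

The frequency map, or just the thin topics, can then be serialized into the step 4 prompt so the model reasons over counts rather than full note bodies.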
### 2.7 Tests
| Test file | Coverage | Count |
|-----------|----------|-------|
| `lib/embeddings.test.ts` | Embed + cosine similarity | 6-8 |
| `lib/note-hooks.test.ts` | Auto-summarize trigger logic | 4-6 |
| `note-prompts/routes.test.ts` | Duplicate check, suggest links, knowledge gaps | 10-12 |
**Phase 2 Deliverables:**
- [ ] `lib/embeddings.ts` — embed text + cosine similarity
- [ ] Note embedding storage on create/update
- [ ] Auto-summarize on save (F6) — feature-flag gated
- [ ] Duplicate detection endpoint (F8)
- [ ] Related notes suggestion endpoint (F9)
- [ ] Knowledge gap detection endpoint (F12)
- [ ] **20-26 new tests**
---
## Phase 3 — Web: Smart Actions UI + Editor AI
**Repo:** `learning_ai_notes`
**Duration:** 4-5 days
**Depends on:** Phase 1 (Phase 2 optional for F8/F9 UI)
**Features:** F1-F4 (editor AI), F5 (tag UI), F8-F9 (duplicate/link UI), F10 (reading time UI), F14 (compare UI), F20-F23 (export UI)
**Can run in parallel with:** Phase 4, Phase 5
### 3.1 API client: `web/src/lib/prompt-client.ts`
```typescript
listPromptTemplates(): Promise<PromptTemplate[]>
getPromptTemplate(id: string): Promise<PromptTemplate>
createPromptTemplate(input): Promise<PromptTemplate>
updatePromptTemplate(id, input): Promise<PromptTemplate>
deletePromptTemplate(id): Promise<void>
runPrompt(input: RunPromptInput): Promise<RunPromptOutput>
runPromptStream(input: RunPromptInput): AsyncIterable<string> // F3
listPromptHistory(noteId?, limit?): Promise<AgentAction[]>
suggestTags(noteId, workspaceId): Promise<string[]> // F5
checkDuplicates(noteId, workspaceId): Promise<SimilarNote[]> // F8
suggestLinks(noteId, workspaceId): Promise<SuggestedLink[]> // F9
compareNotes(noteIds, workspaceId): Promise<RunPromptOutput> // F14
mergeNotes(noteIds, workspaceId): Promise<RunPromptOutput> // F13
getKnowledgeGaps(workspaceId): Promise<GapAnalysis> // F12
```
### 3.2 SmartActionsPanel component
**File:** `web/src/components/SmartActionsPanel.tsx`
Renders on the note detail page:
- Grid of action buttons grouped by category (built-in + custom)
- Each button: icon + name + inputType badge (text/image/multi)
- Click → opens RunPromptModal
- Shows recent prompt runs for this note
- Reading time display (F10)
- "Suggest tags" button (F5)
### 3.3 RunPromptModal component
**File:** `web/src/components/RunPromptModal.tsx`
- Template selector (filtered by input type compatibility)
- Parameter inputs (e.g., target language, tone)
- Image picker (browse note artifacts or upload new)
- Multi-note selector (for merge/compare — F13, F14)
- Inline prompt textarea (for custom one-off prompts)
- Chain toggle: "Continue from previous result" (F24)
- Dry-run checkbox
- "Run" button with loading spinner
### 3.4 PromptResultView component
**File:** `web/src/components/PromptResultView.tsx`
- Markdown renderer for LLM response
- Action buttons: "Save as Note", "Save as Artifact", "Apply to Note", "Discard"
- Token usage + model info footer
- Link to created note or artifact
### 3.5 Prompt Template Library page
**File:** `web/src/app/(app)/prompts/page.tsx`
- Browse all 20 built-in templates (read-only cards)
- User's custom templates (edit/delete)
- "Create Custom Prompt" → opens PromptTemplateEditor
- Category filter tabs: All, Analysis, Transform, Extract, Generate, Custom
### 3.6 PromptTemplateEditor component
**File:** `web/src/components/PromptTemplateEditor.tsx`
- Form: name, slug, description, category, system prompt, user prompt template
- Input type selector (text / image / text+image / multi-note)
- Output format + output action selectors
- Parameter builder (add/remove dynamic parameters)
- Template variable reference: `{{note.title}}`, `{{note.body}}`, `{{note.tags}}`, `{{params.X}}`
- Live preview
### 3.7 Upgrade NoteEditor with advanced Copilot (F1-F4)
**File:** `web/src/components/NoteEditor.tsx`
Enhance existing Copilot toolbar:
| Current | Upgraded |
|---------|----------|
| `shorten` | Keep (uses `@bytelyst/llm` now) |
| `expand` | Keep (uses `@bytelyst/llm` now) |
| `bulletize` | Keep (uses `@bytelyst/llm` now) |
| `grammar` | Replace with **"Fix & Rewrite"** (F1) — full rewrite, not just grammar |
| — | Add **"Change Tone"** (F2) — dropdown: formal/casual/professional/friendly |
| — | Add **"Continue Writing"** (F3) — inserts at cursor, streams token-by-token |
| — | Add **"Explain"** (F4) — tooltip popover with definition/explanation |
**F3 (Continue Writing)** implementation:
- Get text before cursor position from TipTap editor state
- Call `runPromptStream()` (SSE)
- Insert streamed tokens into editor in real-time via TipTap commands
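
On the client, the streamed response has to be split into events before tokens can be inserted. The sketch below assumes the endpoint emits standard Server-Sent Events with one `data:` payload per token; `SseDataParser` is a hypothetical helper, and how `runPromptStream()` actually frames its output is not specified here.

```typescript
// Sketch only: SseDataParser is a hypothetical helper for standard SSE framing.
class SseDataParser {
  private buffer = '';

  /** Feeds a raw network chunk and returns the data payload of every event completed so far. */
  push(chunk: string): string[] {
    // Normalize CRLF on the whole buffer so a CR/LF pair split across chunks is still handled.
    this.buffer = (this.buffer + chunk).replace(/\r\n/g, '\n');
    const events = this.buffer.split('\n\n');
    this.buffer = events.pop() ?? ''; // the last piece may be an incomplete event
    return events
      .map((event) =>
        event
          .split('\n')
          .filter((line) => line.startsWith('data:'))
          .map((line) => line.slice(5).replace(/^ /, '')) // drop one optional leading space
          .join('\n'),
      )
      .filter((data) => data !== '');
  }
}
```

Each string returned by `push()` can be handed to the editor as it arrives. Buffering incomplete events matters because network chunks rarely align with event boundaries.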
**F4 (Inline Q&A)** implementation:
- Select text → right-click or toolbar button → "Explain this"
- Opens floating popover below selection
- Calls LLM with "Explain this term/concept concisely: {selection}"
- Shows result in popover (dismissible)
### 3.8 Duplicate detection UI (F8)
After note save, if `notelett_duplicate_check_enabled` flag is on:
- Call `checkDuplicates()`
- If similar notes found → show toast: "This note is similar to 'Note X' (87% match). View?"
- Click → opens side-by-side comparison
### 3.9 Related notes suggestion UI (F9)
After note creation:
- Call `suggestLinks()`
- If suggestions found → show panel: "Related notes you might want to link"
- Each suggestion: note title + similarity % + "Link" / "Dismiss" buttons
### 3.10 Auto-tag suggestion UI (F5)
On the note detail page SmartActionsPanel:
- "Suggest Tags" button
- Calls `/api/notes/:id/suggest-tags`
- Shows tag chips with + button to accept each
- Accepted tags are added to the note
### 3.11 Export actions UI (F20-F23)
In SmartActionsPanel, export templates appear with share icons:
- "Shareable Summary" → generates polished version → copy or share via note-shares
- "Presentation Outline" → generates outline → saves as new note
- "Email Draft" → generates email → copy to clipboard
- "Social Post" → generates post → copy to clipboard
### 3.12 Knowledge gap analysis UI (F12)
**File:** `web/src/app/(app)/workspaces/[id]/gaps/page.tsx`
- "Analyze Knowledge Gaps" button on workspace page
- Shows gap analysis: topics with thin coverage, suggested new note topics
- "Create Note" button for each gap → pre-fills title
### 3.13 Wire into note detail page + sidebar
- **`web/src/app/(app)/notes/[noteId]/page.tsx`** — add SmartActionsPanel, reading time, duplicate warning
- **`web/src/components/Sidebar.tsx`** — add "Prompts" nav item (sparkle icon)
- **Keyboard shortcut:** `Cmd+Shift+A` → open Smart Actions panel
### 3.14 Tests
| Test file | Coverage | Count |
|-----------|----------|-------|
| `prompt-client.test.ts` | API client functions | 8-10 |
| `SmartActionsPanel.test.tsx` | Render + click handlers | 4-6 |
| `RunPromptModal.test.tsx` | Form + submission + multi-note | 4-6 |
| `NoteEditor.test.tsx` | Copilot upgrade (F1-F4) | 6-8 |
| `e2e/smart-actions.spec.ts` | Full flow E2E | 6 |
**Phase 3 Deliverables:**
- [ ] `prompt-client.ts` API client (all endpoints)
- [ ] 5 new components: SmartActionsPanel, RunPromptModal, PromptResultView, PromptTemplateEditor, KnowledgeGapView
- [ ] `/prompts` template library page
- [ ] `/workspaces/[id]/gaps` knowledge gap page (F12)
- [ ] NoteEditor upgraded with F1-F4 (Fix & Rewrite, Change Tone, Continue Writing, Inline Q&A)
- [ ] Duplicate detection toast (F8)
- [ ] Related notes suggestion panel (F9)
- [ ] Auto-tag suggestion UI (F5)
- [ ] Export actions UI (F20-F23)
- [ ] Reading time display (F10)
- [ ] Sidebar updated, keyboard shortcut
- [ ] **28-36 new tests** (including 6 E2E tests)
---
## Phase 4 — Mobile: Smart Actions + AI-Enhanced Capture
**Repo:** `learning_ai_notes`
**Duration:** 4-5 days
**Depends on:** Phase 1
**Features:** F15 (voice), F16 (screenshot), F17 (URL), F18 (multi-image), F19 (clipboard)
**Can run in parallel with:** Phase 3, Phase 5
### 4.1 New dependencies
| Package | Purpose |
|---------|---------|
| `expo-image-picker` | Camera capture + gallery selection |
| `expo-av` | Audio recording for voice-to-note (F15) |
| `expo-clipboard` | Clipboard access for AI paste (F19) |
| `expo-sharing` | Share results |
### 4.2 API client: `mobile/src/api/note-prompts.ts`
```typescript
listPromptTemplates(): Promise<PromptTemplate[]>
runPrompt(input: RunPromptInput): Promise<RunPromptOutput>
suggestTags(noteId, workspaceId): Promise<string[]>
```
### 4.3 Enhance blob upload: `mobile/src/api/blob-upload.ts`
Upgrade existing stub:
- Camera capture via `expo-image-picker` (photo + gallery)
- Image resize (max 2048px, compress to < 4MB)
- Upload to blob storage via `@bytelyst/blob-client`
- Return `blobPath` + SAS URL
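
The dimension part of the resize step is simple arithmetic and can be sketched independently of any image library. `fitWithin` is a hypothetical helper. Getting under the 4 MB limit would additionally need the image library's compression settings, which are not shown.

```typescript
// Sketch only: fitWithin is a hypothetical helper; 2048 comes from the resize rule above.
const MAX_EDGE_PX = 2048;

/** Scales dimensions down so the longest edge is at most maxEdge, preserving aspect ratio. */
function fitWithin(
  width: number,
  height: number,
  maxEdge: number = MAX_EDGE_PX,
): { width: number; height: number } {
  const longest = Math.max(width, height);
  if (longest <= maxEdge) return { width, height }; // never upscale small images
  const scale = maxEdge / longest;
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}
```

For example, a 4096 by 3072 photo scales by 0.5 to 2048 by 1536, while a 1000 by 500 image is returned unchanged.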
### 4.4 Zustand store: `mobile/src/store/prompt-store.ts`
```typescript
interface PromptState {
  templates: PromptTemplate[];
  isRunning: boolean;
  lastResult: RunPromptOutput | null;
  error: string | null;
  fetchTemplates(): Promise<void>;
  runPrompt(input: RunPromptInput): Promise<RunPromptOutput>;
  clearResult(): void;
}
```
### 4.5 SmartActionsSheet component
**File:** `mobile/src/app/note/SmartActionsSheet.tsx`
Bottom sheet (react-native-gesture-handler) that slides up from note detail:
- Scrollable grid of action buttons (icon + name)
- Category filter tabs (All, Text, Image, Custom)
- Actions trigger either:
  - Direct run (text actions)
  - Camera/gallery picker then run (image actions)
### 4.6 PromptResultScreen
**File:** `mobile/src/app/note/prompt-result.tsx`
- Markdown-rendered LLM response
- "Save as Note" / "Discard" buttons
- Model info + token count
- Navigate to new note after saving
### 4.7 Voice-to-note (F15)
**File:** `mobile/src/app/capture/voice.tsx` (sub-route of capture, NOT a new tab)
1. Record audio via `expo-av` (Audio.Recording)
2. Upload audio file to blob storage
3. Call backend transcription endpoint (or use extraction-service with speech task)
4. Show transcribed text for review/edit
5. Save as note
6. Optionally run a Smart Action on the result (e.g., "Extract Key Facts")
### 4.8 Screenshot-to-note (F16)
On the capture tab:
- "From Screenshot" button → gallery picker (images only)
- Upload image → blob storage
- Run "handwriting-to-text" or custom OCR prompt
- Show result for review → save as note
### 4.9 URL-to-note (F17)
On the capture tab:
- "From URL" input field
- Backend endpoint: `POST /api/note-prompts/url-extract`
  - Fetches URL content (server-side to avoid CORS)
  - Strips HTML → extracts main content
  - Runs "summarize" template
  - Returns structured result
- Show summary for review → save as note
### 4.10 Multi-image capture (F18)
On the capture tab:
- "Scan Document" button → camera in continuous mode
- Take multiple photos (whiteboard pages, multi-page document)
- Upload all images → blob storage
- Run each through vision model sequentially
- Combine results into single note body
- Show merged result for review → save
### 4.11 Clipboard AI paste (F19)
On the capture tab:
- "Paste & Clean" button
- Read clipboard via `expo-clipboard`
- If clipboard contains text → run "fix-rewrite" template
- If clipboard contains URL → trigger URL-to-note flow (F17)
- Show cleaned result → save as note
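The text-versus-URL branch can be expressed as a small routing function. This is a sketch under assumptions: `routeClipboard` and the `ClipboardRoute` shape are hypothetical, and the rule "a URL only when the whole clipboard is a single http(s) URL" is one reasonable interpretation of the bullets above.

```typescript
type ClipboardRoute =
  | { kind: "url"; url: string } // hand off to the URL-to-note flow (F17)
  | { kind: "text"; text: string } // run the "fix-rewrite" template
  | { kind: "empty" };

function routeClipboard(raw: string): ClipboardRoute {
  const text = raw.trim();
  if (text.length === 0) return { kind: "empty" };
  // Prose that merely mentions a link should still be cleaned as text.
  if (/^https?:\/\/\S+$/i.test(text)) return { kind: "url", url: text };
  return { kind: "text", text };
}
```

On mobile, the input string would come from the clipboard read via `expo-clipboard`.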
### 4.12 Enhance capture tab
**File:** `mobile/src/app/(tabs)/capture.tsx`
Add new capture methods alongside existing text draft:
```
┌─────────────────────────────────────┐
│ Quick Capture │
│ ┌───────┐ ┌───────┐ ┌───────┐ │
│ │ Text │ │ Photo │ │ Voice │ │
│ └───────┘ └───────┘ └───────┘ │
│ ┌───────┐ ┌───────┐ ┌───────┐ │
│ │ URL │ │ Scan │ │ Paste │ │
│ └───────┘ └───────┘ └───────┘ │
│ │
│ [existing text capture form] │
└─────────────────────────────────────┘
```
### 4.13 Wire Smart Actions into note detail
**File:** `mobile/src/app/note/[id].tsx`
- "AI Actions" button in the header
- Opens SmartActionsSheet
- Shows reading time (F10)
- Shows suggested tags after save (F5)
### 4.14 Offline queue integration
Failed prompt runs are queued via `@bytelyst/offline-queue` for retry.
### 4.15 Tests
| Test file | Coverage | Count |
|-----------|----------|-------|
| `api/note-prompts.test.ts` | API client | 4-6 |
| `store/prompt-store.test.ts` | Store actions | 6-8 |
| `SmartActionsSheet.test.tsx` | Render + interactions | 4-6 |
| `capture.test.tsx` | New capture methods | 4-6 |
**Phase 4 Deliverables:**
- [ ] `note-prompts.ts` API client + `prompt-store.ts` Zustand store
- [ ] Camera capture + image resize + blob upload
- [ ] SmartActionsSheet bottom sheet + PromptResultScreen
- [ ] Voice-to-note flow (F15): `expo-av` recording
- [ ] Screenshot-to-note (F16): gallery + vision OCR
- [ ] URL-to-note (F17): server-side fetch + summarize
- [ ] Multi-image scan (F18): continuous camera + combine
- [ ] Clipboard AI paste (F19): read + clean
- [ ] Enhanced capture tab with 6 capture modes
- [ ] Smart Actions on note detail
- [ ] Offline queue for failed runs
- [ ] **18-26 new tests**
---
## Phase 5 — Agent & Workflow Intelligence
**Repo:** `learning_ai_notes`
**Duration:** 2-3 days
**Depends on:** Phase 2
**Features:** F11 (weekly digest), F25 (scheduled), F26 (webhooks), F27 (approval-gated)
**Can run in parallel with:** Phases 3, 4
### 5.1 Scheduled Smart Actions (F25)
**File:** `backend/src/modules/note-prompts/scheduler.ts`
| Component | Detail |
|-----------|--------|
| `PromptScheduleDoc` | New Cosmos doc: scheduleId, templateId, workspaceId, cron expression, enabled, lastRunAt, nextRunAt |
| Cosmos container | `note_prompt_schedules` (partition: `/workspaceId`) |
| Scheduler loop | In-process interval (60s check), matches cron → invokes `runner.ts` |
| API endpoints | `POST /api/prompt-schedules` (create), `GET` (list), `PATCH/:id` (update), `DELETE/:id` (delete) |
Example: "Summarize all notes in 'Research' workspace every Friday at 5pm"
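The 60-second check boils down to "does this cron expression match the current minute?". The sketch below is a deliberately small matcher, not the planned `scheduler.ts`: it assumes five-field expressions evaluated in server-local time, supports only `*`, single values, ranges and comma lists, and (unlike classic cron) ANDs day-of-month with day-of-week. `cronMatches`, `dueSchedules` and the `ScheduleLike` shape are hypothetical names.

```typescript
// Matches one cron field ("*", "5", "1-5", "1,3,5") against a value.
function fieldMatches(field: string, value: number): boolean {
  if (field === "*") return true;
  return field.split(",").some((part) => {
    if (!/^\d+(-\d+)?$/.test(part)) return false; // reject anything unsupported
    const [loRaw, hiRaw] = part.split("-");
    const lo = Number(loRaw);
    const hi = hiRaw === undefined ? lo : Number(hiRaw);
    return value >= lo && value <= hi;
  });
}

// Five fields: minute hour day-of-month month day-of-week (0 = Sunday).
function cronMatches(expression: string, at: Date): boolean {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) return false;
  const [minute, hour, dayOfMonth, month, dayOfWeek] = fields;
  return (
    fieldMatches(minute, at.getMinutes()) &&
    fieldMatches(hour, at.getHours()) &&
    fieldMatches(dayOfMonth, at.getDate()) &&
    fieldMatches(month, at.getMonth() + 1) &&
    fieldMatches(dayOfWeek, at.getDay())
  );
}

// Subset of PromptScheduleDoc relevant to the check.
interface ScheduleLike {
  scheduleId: string;
  cron: string;
  enabled: boolean;
}

// Called once per 60s tick; returns the schedules the runner should execute.
function dueSchedules(schedules: ScheduleLike[], now: Date): ScheduleLike[] {
  return schedules.filter((s) => s.enabled && cronMatches(s.cron, now));
}
```

Under these assumptions, the Friday 5pm example corresponds to the expression `0 17 * * 5`.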
### 5.2 Weekly workspace digest (F11)
Built on F25 as a special scheduled action:
- Pre-configured template: `weekly-digest`
- Runs weekly, collects all notes created/modified in workspace that week
- Produces a digest note with: summary, key themes, new notes list, most active areas
- Linked to workspace
Add template #21: `weekly-digest` (system-only, runs via scheduler)
### 5.3 Webhook-triggered actions (F26)
**File:** `backend/src/modules/note-prompts/webhooks.ts`
| Component | Detail |
|-----------|--------|
| `PromptWebhookDoc` | webhookId, templateId, workspaceId, triggerEvent, enabled |
| API endpoint | `POST /api/prompt-webhooks` (create), `GET` (list), `DELETE/:id` |
| Trigger endpoint | `POST /api/prompt-webhooks/:id/trigger`, accepts `{ noteId, payload }` |
| Supported events | `note.created`, `note.updated`, `note.tagged`, `external` |
Example: "When a note is tagged 'receipt', auto-run Parse Receipt"
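Event matching for the example above might be sketched as follows. The `PromptWebhook` fields mirror the `PromptWebhookDoc` row; `tagFilter`, `NoteEvent` and `matchWebhooks` are hypothetical additions, since the roadmap does not specify how a webhook narrows `note.tagged` to one tag.

```typescript
type TriggerEvent = "note.created" | "note.updated" | "note.tagged" | "external";

interface PromptWebhook {
  webhookId: string;
  templateId: string;
  workspaceId: string;
  triggerEvent: TriggerEvent;
  enabled: boolean;
  tagFilter?: string; // assumption: optional tag restriction for note.tagged
}

interface NoteEvent {
  type: TriggerEvent;
  workspaceId: string;
  noteId: string;
  tag?: string;
}

// Returns the webhooks whose template should run for this event.
function matchWebhooks(hooks: PromptWebhook[], event: NoteEvent): PromptWebhook[] {
  return hooks.filter(
    (hook) =>
      hook.enabled &&
      hook.workspaceId === event.workspaceId &&
      hook.triggerEvent === event.type &&
      (hook.tagFilter === undefined || hook.tagFilter === event.tag),
  );
}
```

Each matched webhook would then invoke the runner with its `templateId` and the event's `noteId`.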
### 5.4 Approval-gated actions (F27)
Leverages existing `NoteAgentActionDoc` with approval states.
| Change | Detail |
|--------|--------|
| New prompt template field | `requiresApproval: boolean` (default: false) |
| Runner modification | If template has `requiresApproval`, create action with `state: 'proposed'` instead of `state: 'applied'` |
| Review endpoint | Already exists: `POST /api/agent-actions/:id/review` (approve/reject) |
| Post-approval hook | On approval, execute the saved output action (create note / update / artifact) |
| Web UI | ProposalReviewCard already exists; add Smart Action context |
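The runner change and post-approval hook in the table can be illustrated with a small state sketch. It is not the existing `NoteAgentActionDoc` logic: `SmartActionRecord`, `createAction`, `reviewAction` and the `"rejected"` state name are assumptions made for the example.

```typescript
type ActionState = "proposed" | "applied" | "rejected"; // "rejected" is an assumed name

interface SmartActionRecord {
  templateSlug: string;
  state: ActionState;
  savedOutput: string; // LLM output held until a reviewer decides
}

// Runner side: gated templates only record a proposal; others apply immediately.
function createAction(
  template: { slug: string; requiresApproval?: boolean },
  output: string,
  applyOutput: (output: string) => void,
): SmartActionRecord {
  const gated = template.requiresApproval === true;
  if (!gated) applyOutput(output);
  return { templateSlug: template.slug, state: gated ? "proposed" : "applied", savedOutput: output };
}

// Review side: the saved output is executed only on approval.
function reviewAction(
  action: SmartActionRecord,
  decision: "approve" | "reject",
  applyOutput: (output: string) => void,
): SmartActionRecord {
  if (action.state !== "proposed") return action; // only proposals are reviewable
  if (decision === "reject") return { ...action, state: "rejected" };
  applyOutput(action.savedOutput);
  return { ...action, state: "applied" };
}
```

Here `applyOutput` stands in for the create note / update / artifact step.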
### 5.5 Tests
| Test file | Coverage | Count |
|-----------|----------|-------|
| `note-prompts/scheduler.test.ts` | Cron matching, schedule CRUD, execution | 8-10 |
| `note-prompts/webhooks.test.ts` | Webhook CRUD, trigger, event matching | 6-8 |
| `note-prompts/runner.test.ts` | Approval-gated flow | 3-4 |
**Phase 5 Deliverables:**
- [ ] `scheduler.ts`: cron-based scheduled prompt execution (F25)
- [ ] `weekly-digest` template + scheduled action (F11)
- [ ] `webhooks.ts`: event-triggered prompt execution (F26)
- [ ] Approval-gated actions in runner (F27)
- [ ] `note_prompt_schedules` Cosmos container
- [ ] API endpoints for schedules + webhooks
- [ ] **17-22 new tests**
---
## Phase 6 — Polish, Integration Tests, Documentation
**Duration:** 2-3 days
**Depends on:** Phases 3-5
### 6.1 End-to-end integration testing
| Test | Flow |
|------|------|
| Web E2E: Food label | Create note → attach image → run "Rate Food Label" → verify result note |
| Web E2E: Summarize | Create long note → run "Summarize" → verify summary artifact |
| Web E2E: Compare | Select 2 notes → compare → verify comparison note |
| Web E2E: Template CRUD | Create custom template → use it → edit → delete |
| Mobile E2E: Camera capture | Photo → upload → run prompt → verify result |
| Mobile E2E: Voice-to-note | Record → transcribe → review → save |
| MCP E2E: Agent prompt | Agent calls `notes.prompts.run` → verify audit trail |
| Webhook E2E | Tag note → webhook fires → prompt runs automatically |
| Scheduler E2E | Schedule created → time triggers → digest generated |
### 6.2 Error handling
| Scenario | Handling |
|----------|----------|
| LLM API key not configured | Clear error, disable Smart Actions UI, show setup guide |
| LLM rate limit (429) | Retry with exponential backoff (3 attempts), show "try again later" |
| LLM timeout | 60s timeout, graceful error, suggest retry |
| Image too large | Client-side resize before upload (max 2048px, < 4MB) |
| Prompt template not found | 404 with helpful message |
| Empty note body (text prompt) | Require body or show warning |
| No images on note (image prompt) | Prompt to upload/capture first |
| Embedding service unavailable | Skip duplicate check/auto-link gracefully |
| Audio recording fails | Fallback to text capture, show error |
| URL fetch fails | Show error with suggestion to paste content manually |
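The 429 row (exponential backoff, 3 attempts) could be handled by a wrapper along these lines. This is a sketch, not the runner's code: `withBackoff`, `RetryOptions`, the base delay, and the assumption that provider errors expose a numeric `status` are all hypothetical.

```typescript
interface RetryOptions {
  maxAttempts: number; // the table above calls for 3
  baseDelayMs: number; // assumed starting delay, doubled after each failure
  sleep?: (ms: number) => Promise<void>; // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Assumption: rate-limit failures carry an HTTP-style `status` of 429.
function isRateLimited(error: unknown): boolean {
  return (
    typeof error === "object" &&
    error !== null &&
    (error as { status?: number }).status === 429
  );
}

async function withBackoff<T>(call: () => Promise<T>, options: RetryOptions): Promise<T> {
  const sleep = options.sleep ?? defaultSleep;
  let lastError: unknown;
  for (let attempt = 1; attempt <= options.maxAttempts; attempt++) {
    try {
      return await call();
    } catch (error) {
      lastError = error;
      // Only 429s are retried; other failures surface immediately.
      if (!isRateLimited(error) || attempt === options.maxAttempts) break;
      await sleep(options.baseDelayMs * 2 ** (attempt - 1));
    }
  }
  throw lastError;
}
```

When the final attempt still fails, the caller would map the thrown error to the "try again later" message.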
### 6.3 Feature flags
| Flag | Default | Controls |
|------|---------|----------|
| `notelett_smart_actions_enabled` | false | All Smart Actions UI + API |
| `notelett_auto_summarize_enabled` | false | F6 auto-summarize on save |
| `notelett_duplicate_check_enabled` | false | F8 duplicate detection |
| `notelett_auto_link_enabled` | false | F9 auto-link suggestions |
| `notelett_copilot_llm_enabled` | false | F1-F4 editor AI (vs extraction fallback) |
| `notelett_voice_capture_enabled` | false | F15 voice-to-note |
| `notelett_scheduled_actions_enabled` | false | F25 scheduled actions |
| `notelett_webhooks_enabled` | false | F26 webhook triggers |
### 6.4 Telemetry events
| Event | Properties |
|-------|------------|
| `smart_action_run` | templateSlug, inputType, model, durationMs, tokenUsage |
| `smart_action_result_saved` | outputAction, resultType |
| `smart_action_template_created` | category, inputType |
| `smart_action_error` | errorType, templateSlug |
| `copilot_transform` | action (rewrite/tone/continue/explain), durationMs |
| `auto_summarize_triggered` | wordCount, durationMs |
| `duplicate_detected` | similarityScore, noteId |
| `voice_capture_completed` | durationSecs, wordCount |
| `url_extract_completed` | domain, wordCount |
| `scheduled_action_fired` | scheduleId, templateSlug |
| `webhook_triggered` | webhookId, triggerEvent |
### 6.5 Documentation updates
- Update `docs/PRD.md`: Smart Actions section (§5.2 AI features)
- Update `AGENTS.md`: new MCP tool, new module, new env vars
- Update `docs/roadmaps/02_BACKEND_ROADMAP.md`: mark Smart Actions complete
- API reference for all new endpoints (15+ endpoints)
- `docs/SMART_ACTIONS_USER_GUIDE.md`: end-user documentation
### 6.6 Docker / CI updates
- Add LLM env vars to `.env.example`
- Add `@bytelyst/llm` to `scripts/docker-prep.sh` tarball list
- Update `backend/Dockerfile` for new deps
- Add `expo-image-picker`, `expo-av` to mobile CI build matrix
**Phase 6 Deliverables:**
- [ ] 9+ E2E integration tests + 1-6 additional integration tests
- [ ] Error handling for all edge cases
- [ ] 8 feature flags for gradual rollout
- [ ] 11 telemetry events
- [ ] Documentation updated (PRD, AGENTS.md, roadmaps, user guide)
- [ ] Docker + CI updated
---
## Test Budget Summary
| Phase | Unit Tests | E2E Tests | Total |
|-------|-----------|-----------|-------|
| 0 Common-plat LLM | 25-30 | | 25-30 |
| 1 Backend core | 45-57 | | 45-57 |
| 2 Note intelligence | 20-26 | | 20-26 |
| 3 Web UI + editor AI | 22-30 | 6 | 28-36 |
| 4 Mobile + capture | 18-26 | | 18-26 |
| 5 Agent/workflow | 17-22 | | 17-22 |
| 6 Integration/polish | | 10-15 | 10-15 |
| **Total** | **147-191** | **16-21** | **163-212** |
---
## New Files Summary
### `learning_ai_common_plat` (Phase 0) — 6-8 files modified
| File | Change |
|------|--------|
| `packages/llm/src/types.ts` | Add `ContentPart`, `EmbeddingRequest/Response`, update `ChatMessage` |
| `packages/llm/src/helpers.ts` | New: `isVisionMessage()`, `buildVisionMessage()` |
| `packages/llm/src/providers/openai.ts` | Vision + embedding support |
| `packages/llm/src/providers/azure-openai.ts` | Vision + embedding support |
| `packages/llm/src/providers/mock.ts` | Vision + embedding mocks |
| `packages/llm/src/index.ts` | Export new types + helpers |
| `packages/llm-router/src/types.ts` | Add `supportsVision`, `supportsEmbedding` |
| `packages/llm-router/src/classifier.ts` | Detect image content |
### `learning_ai_notes/backend` (Phases 1, 2, 5) — 11 new + 7 modified
| File | Status | Phase |
|------|--------|-------|
| `src/lib/llm.ts` | New | 1 |
| `src/lib/config.ts` | Modified | 1 |
| `src/lib/cosmos-init.ts` | Modified | 1 |
| `src/lib/copilot-transform.ts` | Modified | 1 |
| `src/lib/reading-time.ts` | New | 1 |
| `src/lib/embeddings.ts` | New | 2 |
| `src/lib/note-hooks.ts` | New | 2 |
| `src/modules/note-prompts/types.ts` | New | 1 |
| `src/modules/note-prompts/repository.ts` | New | 1 |
| `src/modules/note-prompts/runner.ts` | New | 1 |
| `src/modules/note-prompts/routes.ts` | New | 1 |
| `src/modules/note-prompts/seed.ts` | New | 1 |
| `src/modules/note-prompts/scheduler.ts` | New | 5 |
| `src/modules/note-prompts/webhooks.ts` | New | 5 |
| `src/modules/note-agent-actions/types.ts` | Modified | 1 |
| `src/mcp/note-tool-contracts.ts` | Modified | 1 |
| `src/mcp/note-tools.ts` | Modified | 1 |
| `src/server.ts` | Modified | 1 |
### `learning_ai_notes/web` (Phase 3) — 8 new + 5 modified
| File | Status |
|------|--------|
| `src/lib/prompt-client.ts` | New |
| `src/components/SmartActionsPanel.tsx` | New |
| `src/components/RunPromptModal.tsx` | New |
| `src/components/PromptResultView.tsx` | New |
| `src/components/PromptTemplateEditor.tsx` | New |
| `src/app/(app)/prompts/page.tsx` | New |
| `src/app/(app)/workspaces/[id]/gaps/page.tsx` | New |
| `e2e/smart-actions.spec.ts` | New |
| `src/app/(app)/notes/[noteId]/page.tsx` | Modified |
| `src/components/NoteEditor.tsx` | Modified |
| `src/components/Sidebar.tsx` | Modified |
| `src/lib/copilot-client.ts` | Modified (add new CopilotAction types) |
| `src/lib/types.ts` | Modified (add PromptTemplate, RunPromptInput/Output, etc.) |
### `learning_ai_notes/mobile` (Phase 4) — 8 new + 3 modified
| File | Status |
|------|--------|
| `src/api/note-prompts.ts` | New |
| `src/api/blob-upload.ts` | Modified |
| `src/store/prompt-store.ts` | New |
| `src/app/note/SmartActionsSheet.tsx` | New |
| `src/app/note/prompt-result.tsx` | New |
| `src/app/capture/voice.tsx` | New (sub-route of capture, NOT a tab) |
| `src/app/capture/url.tsx` | New (sub-route of capture, NOT a tab) |
| `src/app/capture/scan.tsx` | New (sub-route of capture, NOT a tab) |
| `src/app/(tabs)/capture.tsx` | Modified |
| `src/app/note/[id].tsx` | Modified |
---
## 20 Built-in Prompt Templates
| # | Slug | Name | Input | Output | Category |
|---|------|------|-------|--------|----------|
| 1 | `summarize` | Summarize | text | new_note | transform |
| 2 | `translate` | Translate | text | new_note | transform |
| 3 | `simplify` | Simplify / ELI5 | text | artifact | transform |
| 4 | `extract-key-facts` | Extract Key Facts | text | artifact | extract |
| 5 | `food-label-rating` | Rate Food Label | image | new_note | analysis |
| 6 | `parse-receipt` | Parse Receipt | image | new_note | extract |
| 7 | `read-business-card` | Read Business Card | image | new_note | extract |
| 8 | `handwriting-to-text` | Handwriting to Text | image | new_note | transform |
| 9 | `generate-flashcards` | Generate Flashcards | text | new_note | generate |
| 10 | `pros-and-cons` | Pros & Cons | text | artifact | analysis |
| 11 | `presentation-outline` | Presentation Outline | text | new_note | generate |
| 12 | `email-draft` | Email Draft | text | new_note | generate |
| 13 | `social-post` | Social Post | text | artifact | generate |
| 14 | `shareable-summary` | Shareable Summary | text | new_note | transform |
| 15 | `compare-notes` | Compare Notes | multi-note | new_note | analysis |
| 16 | `merge-notes` | Merge Notes | multi-note | new_note | transform |
| 17 | `fix-rewrite` | Fix & Rewrite | text | update_note | transform |
| 18 | `change-tone` | Change Tone | text | update_note | transform |
| 19 | `continue-writing` | Continue Writing | text | update_note | generate |
| 20 | `auto-tag` | Auto-Tag | text | update_note | extract |
---
## New Dependencies
| Package | Where | Purpose |
|---------|-------|---------|
| `@bytelyst/llm@^0.2.0` | backend | LLM with vision + embedding |
| `expo-image-picker` | mobile | Camera + gallery |
| `expo-av` | mobile | Audio recording (F15) |
| `expo-clipboard` | mobile | Clipboard access (F19) |
All other integrations use existing `@bytelyst/*` packages already in `package.json`.
---
## New Cosmos Containers
| Container | Partition Key | Phase | Purpose |
|-----------|--------------|-------|---------|
| `note_prompts` | `/userId` | 1 | Prompt templates (built-in + custom) |
| `note_prompt_schedules` | `/workspaceId` | 5 | Scheduled action definitions |
Prompt run results don't need containers: they produce notes (`notes`) and artifacts (`note_artifacts`).
---
## New Environment Variables
| Variable | Default | Phase | Description |
|----------|---------|-------|-------------|
| `LLM_PROVIDER` | `openai` | 1 | `openai` / `azure` / `mock` |
| `OPENAI_API_KEY` | | 1 | OpenAI API key |
| `OPENAI_BASE_URL` | | 1 | Optional base URL override |
| `AZURE_OPENAI_ENDPOINT` | | 1 | Azure OpenAI endpoint |
| `AZURE_OPENAI_API_KEY` | | 1 | Azure OpenAI key |
| `LLM_DEFAULT_MODEL` | `gpt-4o-mini` | 1 | Default text model |
| `LLM_VISION_MODEL` | `gpt-4o` | 1 | Default vision model |
| `LLM_EMBEDDING_MODEL` | `text-embedding-3-small` | 2 | Default embedding model |
---
## New API Endpoints (15 endpoints)
| Method | Path | Phase | Feature |
|--------|------|-------|---------|
| `GET` | `/api/prompt-templates` | 1 | List templates |
| `GET` | `/api/prompt-templates/:id` | 1 | Get template |
| `POST` | `/api/prompt-templates` | 1 | Create template |
| `PATCH` | `/api/prompt-templates/:id` | 1 | Update template |
| `DELETE` | `/api/prompt-templates/:id` | 1 | Delete template |
| `POST` | `/api/note-prompts/run` | 1 | Run prompt |
| `POST` | `/api/note-prompts/run-stream` | 1 | Run prompt (SSE) |
| `GET` | `/api/note-prompts/history` | 1 | Prompt run history |
| `POST` | `/api/notes/:id/suggest-tags` | 1 | F5 |
| `POST` | `/api/notes/compare` | 1 | F14 |
| `POST` | `/api/notes/merge` | 1 | F13 |
| `POST` | `/api/notes/:id/check-duplicates` | 2 | F8 |
| `POST` | `/api/notes/:id/suggest-links` | 2 | F9 |
| `POST` | `/api/workspaces/:id/knowledge-gaps` | 2 | F12 |
| `POST` | `/api/note-prompts/url-extract` | 1 | F17 |
| `CRUD` | `/api/prompt-schedules` | 5 | F25 |
| `CRUD` | `/api/prompt-webhooks` | 5 | F26 |
---
## Commit Strategy
### Phase 0 commits (common-plat)
```
feat(llm): add ContentPart type + multipart ChatMessage.content support
feat(llm): update OpenAI + Azure providers for vision messages
feat(llm): add embedding support (EmbeddingRequest/Response, embed())
feat(llm): add isVisionMessage + buildVisionMessage helpers
test(llm): add vision + embedding tests (30 tests)
feat(llm-router): add supportsVision + supportsEmbedding model capability flags
chore(llm): bump to 0.2.0 + publish
```
### Phase 1 commits (notelett backend)
```
feat(backend): add @bytelyst/llm + lib/llm.ts singleton + LLM config
feat(note-prompts): types + Zod schemas for templates and run input/output
feat(note-prompts): repository — template CRUD
feat(note-prompts): runner — LLM orchestration + multi-note + chains
feat(note-prompts): routes — REST API endpoints (12 routes)
feat(note-prompts): seed 20 built-in prompt templates
feat(backend): upgrade copilot-transform.ts to use @bytelyst/llm
feat(backend): add reading-time utility
feat(agent-actions): add smart_action + auto_enrich types
feat(mcp): add notes.prompts.run tool
test(note-prompts): full test suite (55 tests)
```
### Phase 2 commits
```
feat(backend): embeddings service — embed text + cosine similarity
feat(backend): note embedding storage on create/update
feat(backend): auto-summarize on save (feature-flag gated)
feat(backend): duplicate detection endpoint
feat(backend): related notes suggestion endpoint
feat(backend): knowledge gap detection endpoint
test(backend): intelligence tests (25 tests)
```
### Phase 3 commits
```
feat(web): prompt-client API client
feat(web): SmartActionsPanel + RunPromptModal + PromptResultView
feat(web): PromptTemplateEditor + /prompts library page
feat(web): upgrade NoteEditor — Fix & Rewrite, Change Tone, Continue Writing, Inline Q&A
feat(web): duplicate detection toast + related notes panel
feat(web): auto-tag suggestion UI + export actions
feat(web): knowledge gap analysis page
feat(web): wire Smart Actions into note detail + sidebar
test(web): Smart Actions unit + E2E tests (36 tests)
```
### Phase 4 commits
```
feat(mobile): note-prompts API client + prompt-store
feat(mobile): camera capture + image resize + blob upload
feat(mobile): SmartActionsSheet bottom sheet + PromptResultScreen
feat(mobile): voice-to-note — expo-av recording + transcription
feat(mobile): screenshot-to-note + multi-image scan
feat(mobile): URL-to-note + clipboard AI paste
feat(mobile): enhanced capture tab with 6 capture modes
test(mobile): Smart Actions tests (26 tests)
```
### Phase 5 commits
```
feat(backend): scheduled Smart Actions — cron scheduler + CRUD
feat(backend): weekly workspace digest template + scheduled action
feat(backend): webhook-triggered actions — CRUD + trigger endpoint
feat(backend): approval-gated actions in runner
test(backend): scheduler + webhook tests (22 tests)
```
### Phase 6 commits
```
feat(all): 8 feature flags for gradual rollout
feat(all): 11 telemetry events for Smart Actions
docs: update PRD, AGENTS.md, roadmaps for Smart Actions
docs: Smart Actions user guide
chore: update .env.example + Docker for LLM support
test: end-to-end integration tests (15 tests)
```
---
## Risk Mitigation
| Risk | Mitigation |
|------|------------|
| OpenAI API costs | Per-user daily quota, model tier selection (gpt-4o-mini default, gpt-4o vision only), feature flag gating |
| Vision prompt latency (5-15s) | Progress indicator, allow background processing, cache identical requests |
| Image size limits | Client-side resize to max 2048px, compress < 4MB before upload |
| Prompt injection | System prompt hardening, output validation, truncate excessively long inputs |
| LLM hallucination | JSON mode where possible, output schema validation, clear UI disclaimer |
| Corporate proxy blocking OpenAI | Support Azure OpenAI as alternative (already in `@bytelyst/llm`) |
| Embedding cost at scale | Batch embeddings, cache embeddings on note doc, recompute only on content change |
| Audio transcription accuracy | Show editable preview before saving, allow manual corrections |
| Scheduler reliability | In-process interval (simple), log missed runs, diagnostics endpoint |
---
## Future Extensions (Not in This Roadmap)
- **RAG context**: include related notes as context in prompts for better answers
- **Agent marketplace prompts**: share templates across ByteLyst products
- **Multi-step workflow builder**: visual chain editor (drag-and-drop)
- **Streaming for mobile**: SSE on React Native for real-time token display
- **Collaborative Smart Actions**: run prompts across shared workspaces
- **Custom model support**: plug in local Ollama models via `@bytelyst/ollama-client`
- **Action replay**: re-run a previous Smart Action with same parameters
- **Template versioning**: track changes to custom templates over time
---
## Appendix: Review Findings & Resolutions
Systematic code-level audit of this roadmap against the actual NoteLett and common-plat codebases, conducted April 2026. Each finding cross-references the real source files.
### Finding 1 — FIXED: Timeline diagram showed wrong dependency flow
**Severity:** Medium. An incorrect diagram could mislead parallel scheduling.
**Was:** Phase 3/4 branching from Phase 2
**Fix:** Phase 3/4 branch from Phase 1. Phase 2 → Phase 5. Diagram corrected above.
### Finding 2 — FIXED: `PromptTemplateDoc` was missing `productId` field
**Severity:** Critical. Violates NoteLett convention: every Cosmos document MUST include `productId: "notelett"`.
**Source:** `backend/src/modules/notes/types.ts`: all other docs (NoteDoc, NoteArtifactDoc, NoteAgentActionDoc) have `productId`
**Fix:** Added `productId`, `userId`, `createdAt`, `updatedAt` to the type definition in §1.5.
**Note:** `PromptScheduleDoc` (§5.1) and `PromptWebhookDoc` (§5.3) must also include `productId` and `userId` when implemented.
### Finding 3 — FIXED: Reading time endpoint was `POST`, should be `GET`
**Severity:** Low. Pure calculation with no side effects.
**Source:** REST convention: `GET` for idempotent read operations
**Fix:** Changed to `GET /api/notes/:id/reading-time` in §1.8.
### Finding 4 — FIXED: Backend file count was wrong (claimed 18+8, actual 11+7)
**Severity:** Low. Documentation accuracy.
**Fix:** Corrected to "11 new + 7 modified" in New Files Summary.
### Finding 5 — FIXED: Web file count missing `copilot-client.ts` and `types.ts`
**Severity:** Medium. These files MUST be updated but were omitted.
**Source:** `web/src/lib/copilot-client.ts` defines `CopilotAction = 'shorten' | 'expand' | 'bulletize' | 'grammar'` and needs new types for F1/F2.
`web/src/lib/types.ts` needs `PromptTemplate`, `RunPromptInput`, `RunPromptOutput`, `SimilarNote`, `SuggestedLink`, `GapAnalysis` types.
**Fix:** Added both to web modified files list. Count corrected to "8 new + 5 modified".
### Finding 6 — FIXED: Mobile capture sub-routes were listed as tabs
**Severity:** High. Would break the 5-tab navigator.
**Source:** `mobile/src/app/(tabs)/_layout.tsx` has exactly 5 tabs: Home, Search, Capture, Inbox, Settings. Adding 3 more tabs (voice-capture, url-capture, scan-capture) would overflow the tab bar.
**Fix:** Changed to sub-routes of capture: `src/app/capture/voice.tsx`, `src/app/capture/url.tsx`, `src/app/capture/scan.tsx`. These are navigated to FROM the capture tab, not separate tabs.
### Finding 7 — FIXED: Phase 6 deliverables listed test count twice (redundant)
**Severity:** Low
**Fix:** Consolidated into single line.
### Finding 8 — RESOLVED: Embedding storage strategy needs decision
**Severity:** High. Affects Cosmos RU cost and query patterns.
**Issue:** §2.2 proposes storing `embedding: number[]` directly on `NoteDoc`. For `text-embedding-3-small`, each embedding is 1536 floats (~6KB). This increases every `NoteDoc` read by ~6KB, affecting list queries and the `notes` container partition-level throughput.
**Recommendation:** Either:
- **(a)** Store embeddings in a SEPARATE `note_embeddings` container (partition: `/workspaceId`), with documents keyed by `noteId`. Keeps `NoteDoc` lean.
- **(b)** Store inline but use Cosmos projection queries (`SELECT c.id, c.title, c.embedding FROM c`) to avoid pulling full note bodies when only embeddings are needed.
- Option (a) is preferred for scale. Adds 1 new Cosmos container.
**Action:** Implementer should choose (a) or (b) at Phase 2 start and update `cosmos-init.ts` accordingly.
**Resolution:** `embed()` is implemented in the OpenAI, Azure, and Mock providers of `@bytelyst/llm`. The separate `note_embeddings` container is deferred to Phase 2.
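For orientation, 1536 values at 4 bytes each (float32) is 6144 bytes, which is where the ~6KB figure comes from. Under option (a), a duplicate check would read only the lean embedding documents and compare vectors by cosine similarity. The sketch below assumes a hypothetical `NoteEmbeddingDoc` shape and `findSimilar` helper; neither is defined elsewhere in this roadmap, and the threshold is left to the caller.

```typescript
// Hypothetical document shape for a separate `note_embeddings` container.
interface NoteEmbeddingDoc {
  id: string; // assumption: same value as noteId
  noteId: string;
  workspaceId: string; // partition key under option (a)
  productId: "notelett";
  userId: string;
  embedding: number[];
}

function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length || a.length === 0) {
    throw new Error("vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0; // zero vector: treat as unrelated
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Returns other notes at or above the threshold, most similar first.
function findSimilar(
  target: NoteEmbeddingDoc,
  candidates: NoteEmbeddingDoc[],
  threshold: number,
): { noteId: string; score: number }[] {
  return candidates
    .filter((doc) => doc.noteId !== target.noteId)
    .map((doc) => ({ noteId: doc.noteId, score: cosineSimilarity(target.embedding, doc.embedding) }))
    .filter((hit) => hit.score >= threshold)
    .sort((x, y) => y.score - x.score);
}
```

A linear scan like this is only reasonable for small per-workspace note counts.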
### Finding 9 — RESOLVED: Voice-to-note (F15) transcription backend not fully specified
**Severity:** Medium. Implementation decision needed.
**Issue:** §4.7 says "Call backend transcription endpoint (or use extraction-service with speech task)" but no endpoint or extraction task is defined.
**Options:**
- **(a)** Add `speech_transcription` task to extraction-service (Python sidecar already supports Whisper/Azure STT)
- **(b)** New backend endpoint `POST /api/note-prompts/transcribe` that calls Azure Speech SDK
- **(c)** Client-side transcription via `expo-speech` (limited quality)
**Recommendation:** Option (a): extraction-service already has Python sidecar infrastructure. Add task type `speech_transcription` and a new endpoint `POST /api/note-prompts/transcribe` that wraps extraction-service.
**Resolution:** `POST /api/transcribe` added to extraction-service (OpenAI Whisper API). `transcribe()` added to the `@bytelyst/extraction` client.
### Finding 10 — RESOLVED: URL-to-note backend endpoint assigned to Phase 4 but needs backend work
**Severity:** Medium. Mobile Phase 4 depends on a backend route that isn't in Phase 1.
**Issue:** `POST /api/note-prompts/url-extract` is listed in the API endpoints table as Phase 4, but this is a SERVER-SIDE endpoint (URL fetch, HTML strip, summarize). It must be implemented in the BACKEND before mobile can use it.
**Recommendation:** Move this endpoint to Phase 1 (backend routes) since the runner infrastructure is already being built there.
**Resolution:** `POST /api/note-prompts/url-extract` is implemented in the Phase 1 backend routes.
### Finding 11 — RESOLVED: Phase 5 `PromptWebhookDoc` needs its own Cosmos container
**Severity:** Low. Currently untracked.
**Issue:** §5.3 defines `PromptWebhookDoc` but no Cosmos container is mentioned for it. The "New Cosmos Containers" section only lists `note_prompts` and `note_prompt_schedules`.
**Recommendation:** Add `note_prompt_webhooks` container (partition: `/workspaceId`) or store webhooks in `note_prompt_schedules` with a discriminator.
**Resolution:** `note_prompt_webhooks` container added to `cosmos-init.ts`.
### Finding 12 — RESOLVED: `@bytelyst/llm` factory reads env vars directly, not via Zod config
**Severity:** Low. Clarification needed, not a bug.
**Issue:** `factory.ts` in `@bytelyst/llm` reads `process.env.LLM_PROVIDER`, `process.env.OPENAI_API_KEY`, etc. directly. The roadmap also adds these to NoteLett's Zod config schema (§1.3). These serve different purposes:
- **`@bytelyst/llm` factory** reads env at provider instantiation time
- **NoteLett config.ts** validates env at startup for fail-fast
**Clarification:** Both are correct. Config.ts validates upfront, but the LLM package uses its own env reads. No code conflict, but implementers should know the LLM package ignores NoteLett's parsed `config` object.
**Resolution:** Info only, no conflict.
### Finding 13 — RESOLVED: `chatCompletionStream()` now implemented in all providers
**Severity:** Medium. F3 (Continue Writing) depends on streaming.
**Source:** `packages/llm/src/providers/openai.ts`, `azure-openai.ts`, `mock.ts`
**Resolution:** `chatCompletionStream()` is fully implemented in OpenAI (SSE parsing, buffer handling, `[DONE]` sentinel), Azure OpenAI (same pattern), and Mock (word-by-word simulation). 3 streaming tests in `llm.test.ts`. No further work needed.
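For readers unfamiliar with the pattern named in the resolution (SSE parsing, buffer handling, `[DONE]` sentinel), the sketch below shows the general shape. It is not the code in `packages/llm`; `parseSseBuffer` is a hypothetical name, and the chunk layout assumed is the OpenAI chat-completions streaming format (`choices[0].delta.content`).

```typescript
interface SseParseResult {
  tokens: string[]; // content deltas found in complete lines
  done: boolean; // true once the [DONE] sentinel is seen
  rest: string; // trailing partial line to prepend to the next network chunk
}

function parseSseBuffer(buffer: string): SseParseResult {
  const lines = buffer.split("\n");
  // The last element is either "" or an incomplete line; keep it for next time.
  const rest = lines.pop() ?? "";
  const tokens: string[] = [];
  let done = false;
  for (const rawLine of lines) {
    const line = rawLine.trim();
    if (!line.startsWith("data:")) continue; // skip blank separators and comments
    const payload = line.slice(5).trim();
    if (payload === "[DONE]") {
      done = true;
      break;
    }
    try {
      const parsed = JSON.parse(payload) as {
        choices?: { delta?: { content?: string } }[];
      };
      const content = parsed.choices?.[0]?.delta?.content;
      if (typeof content === "string" && content.length > 0) tokens.push(content);
    } catch {
      // Ignore malformed payloads rather than aborting the whole stream.
    }
  }
  return { tokens, done, rest };
}
```

A caller would append each network chunk to `rest`, parse again, and stop reading once `done` is true.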
### Finding 14 — RESOLVED: CopilotAction union expanded
**Severity:** Medium. F1/F2 require new action types.
**Resolution:** `CopilotAction` expanded to include `'fix-rewrite'`, `'change-tone'`, `'continue'`, `'explain'`. `CopilotBodySchema` in `notes/routes.ts` updated. `grammar` kept as a deprecated alias for `fix-rewrite`.
### Finding 15 — NOTE: `@bytelyst/llm` has zero runtime dependencies
**Severity:** Info
**Source:** `packages/llm/package.json`: only `devDependencies: { vitest }`. Uses native `fetch()`.
**Impact:** No extra bundling concerns. Requires Node 18+ or a fetch polyfill.
### Finding 16 — NOTE: `note_artifacts` has `summary` as an existing artifact type
**Severity:** Info
**Source:** `backend/src/modules/note-artifacts/types.ts` line 3: `NOTE_ARTIFACT_TYPES = ['file', 'summary', 'extraction', 'citation', 'export']`
**Impact:** F6 (auto-summarize) can use `artifactType: 'summary'` directly; no schema changes needed for artifact types. F20-F23 (export actions) can use `artifactType: 'export'`. Good alignment.
### Finding 17 — NOTE: Agent action type `'summarize'` already exists
**Severity:** Info
**Source:** `backend/src/modules/note-agent-actions/types.ts` line 3: `NOTE_AGENT_ACTION_TYPES = ['create', 'update', 'summarize', 'extract_tasks', 'attach_citation']`
**Impact:** We can reuse `'summarize'` for F6 (auto-summarize) or still add `'smart_action'` as a general-purpose type. Recommendation: add `'smart_action'` and `'auto_enrich'` as planned, and use `'smart_action'` for all prompt runs (the template slug provides the specificity).
### Finding 18 — NOTE: Phase 5 `weekly-digest` is template #21, but §1.9 seeds only 20
**Severity:** Low. Consistency.
**Issue:** §5.2 says "Add template #21: `weekly-digest`". This means Phase 5 adds a 21st template, seeded separately from the initial 20.
**Clarification:** This is correct behavior: the built-in template count grows from 20 to 21 in Phase 5. The seed file should support incremental additions (upsert by slug, not hard-coded count).
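"Upsert by slug" can be sketched with an in-memory map standing in for the Cosmos container. `BuiltInTemplate` and `seedBuiltIns` are hypothetical names, and the real `seed.ts` would go through the repository rather than a `Map`.

```typescript
interface BuiltInTemplate {
  slug: string;
  name: string;
  systemOnly?: boolean;
}

// Idempotent seeding: safe to re-run whenever the built-in list grows or changes.
function seedBuiltIns(
  store: Map<string, BuiltInTemplate>,
  builtIns: BuiltInTemplate[],
): { created: string[]; updated: string[] } {
  const created: string[] = [];
  const updated: string[] = [];
  for (const template of builtIns) {
    // Key on slug, never on position or total count.
    (store.has(template.slug) ? updated : created).push(template.slug);
    store.set(template.slug, template);
  }
  return { created, updated };
}
```

With this approach, re-running the seed after Phase 5 creates only `weekly-digest` and leaves the other 20 slugs as updates.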
---
### Summary of Inline Fixes Applied
| # | Finding | Severity | Status |
|---|---------|----------|--------|
| 1 | Timeline diagram wrong dependency | Medium | **Fixed** |
| 2 | Missing `productId` in PromptTemplateDoc | Critical | **Fixed** |
| 3 | Reading time `POST` → `GET` | Low | **Fixed** |
| 4 | Backend file count 18+8 → 11+7 | Low | **Fixed** |
| 5 | Web missing copilot-client.ts + types.ts | Medium | **Fixed** |
| 6 | Mobile tabs overflow (voice/url/scan) | High | **Fixed** |
| 7 | Phase 6 duplicate test count | Low | **Fixed** |
| 8 | Embedding storage strategy | High | **Resolved**: `@bytelyst/llm` `embed()` implemented in OpenAI, Azure, Mock providers. Separate `note_embeddings` container deferred to Phase 2. |
| 9 | Voice transcription backend unspecified | Medium | **Resolved**: `POST /api/transcribe` added to extraction-service (OpenAI Whisper API). `transcribe()` added to `@bytelyst/extraction` client. |
| 10 | URL-extract endpoint in wrong phase | Medium | **Resolved**: `POST /api/note-prompts/url-extract` implemented in Phase 1 backend routes. |
| 11 | Webhook container missing | Low | **Resolved**: `note_prompt_webhooks` container added to `cosmos-init.ts`. |
| 12 | LLM factory vs Zod config clarification | Low | **Resolved**: info only, no conflict. |
| 13 | Streaming not implemented in providers | Medium | **Resolved**: `chatCompletionStream()` implemented in OpenAI, Azure, Mock providers with SSE parsing. |
| 14 | CopilotAction union needs expansion | Medium | **Resolved**: expanded to include `fix-rewrite`, `change-tone`, `continue`, `explain`. |
| 15 | Zero runtime deps in @bytelyst/llm | Info | Noted |
| 16 | Artifact type 'summary' already exists | Info | Noted |
| 17 | Agent action 'summarize' already exists | Info | Noted |
| 18 | Template #21 added in Phase 5 | Low | Noted |