learning_ai_notes/docs/SMART_ACTIONS_ROADMAP.md

Smart Actions Roadmap — End-to-End Implementation (v2)

Feature: AI-powered note intelligence (Smart Actions, inline editor AI, capture enhancement, cross-note intelligence, agent workflows)
Scope: learning_ai_common_plat (shared LLM packages) + learning_ai_notes (backend + web + mobile)
Author: Product Team
Date: April 2026
Version: 2.0 (full 27-feature roadmap across 6 categories, 7 phases)


Executive Summary

This roadmap delivers a comprehensive AI layer for NoteLett across 27 features in 6 categories, spanning the shared platform, backend, web, and mobile. It transforms NoteLett into an AI-native knowledge workspace where:

  • Users run Smart Actions on notes (text + images) — summarize, translate, rate food labels, parse receipts
  • The editor has inline AI — rewrite, change tone, continue writing, explain highlighted text
  • Note intelligence runs in the background — auto-summarize, auto-tag, detect duplicates, suggest links
  • Capture is AI-enhanced — voice-to-note, screenshot OCR, URL extraction, multi-image processing
  • Cross-note intelligence — weekly digests, knowledge gap detection, note merge/compare
  • Agent workflows — scheduled actions, webhook triggers, approval-gated actions, action chains

The feature spans two codebases:

  1. learning_ai_common_plat — Enhance @bytelyst/llm with vision + embedding support
  2. learning_ai_notes — Backend module + web UI + mobile app

Timeline Overview

| Phase | What | Where | Duration | Depends on | Parallel? |
| --- | --- | --- | --- | --- | --- |
| 0 | LLM vision + embedding support | common-plat | 3 days | | |
| 1 | Note prompts core + copilot upgrade | backend | 4-5 days | Phase 0 | |
| 2 | Note intelligence (background AI) | backend | 2-3 days | Phase 1 | |
| 3 | Smart Actions web UI + editor AI | web | 4-5 days | Phase 1 | Yes (with 4, 5) |
| 4 | Smart Actions mobile + capture | mobile | 4-5 days | Phase 1 | Yes (with 3, 5) |
| 5 | Agent & workflow intelligence | backend + web | 2-3 days | Phase 2 | Yes (with 3, 4) |
| 6 | Polish, E2E, documentation | all | 2-3 days | Phases 3-5 | |

Total: ~21-27 days sequential (sum of the phase durations), ~13-17 days with parallel execution (critical path: Phases 0, 1, 2, 5, 6)

Phase 0 (3d) → Phase 1 (5d) ─┬──→ Phase 2 (3d) ──→ Phase 5 (3d) ──┐
                              ├──→ Phase 3 (5d) ────────────────────→ Phase 6 (3d)
                              └──→ Phase 4 (5d) ────────────────────┘

Feature Master List — 27 Features, 6 Categories

Cat 1: Inline Editor AI

| # | Feature | Description | Phase |
| --- | --- | --- | --- |
| F1 | Fix & Rewrite | Select text → rewrite with proper grammar, tone, clarity | 3 |
| F2 | Change Tone | Rewrite selection as formal / casual / professional / friendly | 3 |
| F3 | Continue Writing | LLM generates next 2-3 paragraphs from cursor context (streaming) | 3 |
| F4 | Inline Q&A | Highlight term → "Explain this" → tooltip with definition | 3 |
| F5 | Auto-tag suggestion | After save, LLM suggests 3-5 tags based on content | 1 |

Cat 2: Note Intelligence

| # | Feature | Description | Phase |
| --- | --- | --- | --- |
| F6 | Auto-summarize on save | Note body > 300 words → auto-generate summary artifact | 2 |
| F7 | Smart title suggestion | Upgrade existing suggestTitleFromBody() to use @bytelyst/llm | 1 |
| F8 | Duplicate/similar note detection | Before save, warn if semantically similar notes exist | 2 |
| F9 | Auto-link related notes | After creation, suggest 3-5 related notes to link to | 2 |
| F10 | Reading time estimate | Display estimated reading time on each note | 1 |

Cat 3: Multi-Note Intelligence

| # | Feature | Description | Phase |
| --- | --- | --- | --- |
| F11 | Weekly workspace digest | Auto-generate summary of all workspace activity this week | 5 |
| F12 | Knowledge gap detection | Identify topics mentioned but under-covered in workspace | 2 |
| F13 | Note merge | Select 2+ notes → LLM merges into single coherent note | 1 |
| F14 | Compare notes | Select 2 notes → LLM produces comparison summary | 1 |

Cat 4: Capture Enhancement

| # | Feature | Description | Phase |
| --- | --- | --- | --- |
| F15 | Voice-to-note | Record audio → transcribe → save as note | 4 |
| F16 | Screenshot-to-note | Share screenshot → OCR + LLM cleanup → structured note | 4 |
| F17 | URL-to-note | Paste URL → extract content → summarize → save | 4 |
| F18 | Multi-image capture | Photograph multiple pages → combine into one note | 4 |
| F19 | Clipboard AI paste | Paste messy text → LLM cleans and structures it | 4 |

Cat 5: Export & Sharing Intelligence

| # | Feature | Description | Phase |
| --- | --- | --- | --- |
| F20 | Shareable summary | One-click polished shareable version of a note | 1+3 |
| F21 | Presentation outline | Note → structured slide outline (title + bullets) | 1+3 |
| F22 | Email draft | Note → formatted email with subject, greeting, body | 1+3 |
| F23 | Social post | Note → Twitter/LinkedIn post draft | 1+3 |

Cat 6: Agent & Workflow Intelligence

| # | Feature | Description | Phase |
| --- | --- | --- | --- |
| F24 | Smart Action chains | Pipe output of one action as input to next | 1 |
| F25 | Scheduled Smart Actions | Cron-like: "Summarize workspace every Friday" | 5 |
| F26 | Webhook-triggered actions | External event → auto-run a Smart Action | 5 |
| F27 | Approval-gated actions | High-risk actions require human review before applying | 5 |

Phase 0 — Common Platform: LLM Vision + Embedding Support

Repo: learning_ai_common_plat
Duration: 3 days
Depends on: Nothing
Features enabled: Foundation for all F1-F27

0.1 Enhance @bytelyst/llm ChatMessage for multipart content

The current ChatMessage.content is string-only. Vision models (GPT-4o, Gemini) require multipart content arrays.

File: packages/llm/src/types.ts

Change Detail
New ContentPart type { type: 'text'; text: string } | { type: 'image_url'; image_url: { url: string; detail?: 'auto' | 'low' | 'high' } }
Update ChatMessage.content string | ContentPart[]
New isVisionMessage() helper Type guard to check if a message contains image parts
New buildVisionMessage() helper Convenience: (text: string, imageUrl: string) => ChatMessage

Tests: 8-10 new tests
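
A minimal sketch of the types and helpers listed in 0.1, assuming the role set shown here and a default `detail` of `'auto'` (neither is specified in this roadmap). The actual shapes in packages/llm/src/types.ts may differ.

```typescript
// Multipart content: a message is either a plain string or a list of parts.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string; detail?: 'auto' | 'low' | 'high' } };

interface ChatMessage {
  role: 'system' | 'user' | 'assistant'; // assumed role set
  content: string | ContentPart[];
}

// Type guard: true when the message carries at least one image part.
function isVisionMessage(
  message: ChatMessage,
): message is ChatMessage & { content: ContentPart[] } {
  return Array.isArray(message.content) && message.content.some((p) => p.type === 'image_url');
}

// Convenience builder: one text part followed by one image part.
function buildVisionMessage(text: string, imageUrl: string): ChatMessage {
  return {
    role: 'user',
    content: [
      { type: 'text', text },
      { type: 'image_url', image_url: { url: imageUrl, detail: 'auto' } }, // 'auto' is an assumed default
    ],
  };
}
```

Because `content` stays a union with `string`, existing string-only callers keep compiling unchanged.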

0.2 Update OpenAIProvider for vision

File: packages/llm/src/providers/openai.ts

| Change | Detail |
| --- | --- |
| Pass multipart content to API | When content is an array, send as-is (OpenAI format) |
| Default model upgrade | If any message has image content, auto-suggest gpt-4o |

Tests: 4-6 new tests (mock HTTP)

0.3 Update AzureOpenAIProvider for vision

Same multipart content handling as OpenAI provider. Tests: 4-6 new tests.

0.4 Update MockLLMProvider

Return deterministic mock responses when vision content is detected, for downstream test use.

0.5 Add streaming support enhancement

Ensure chatCompletionStream() works with multipart content for F3 (Continue Writing).

0.6 Add embedding support (for F8, F9, F12)

File: packages/llm/src/types.ts + providers

Change Detail
New EmbeddingRequest type { input: string | string[]; model?: string }
New EmbeddingResponse type { embeddings: number[][]; model: string; usage: TokenUsage }
Add embed() to LLMProvider Optional method for embedding generation
Implement in OpenAIProvider Call /v1/embeddings endpoint
Implement in AzureOpenAIProvider Call Azure embeddings endpoint
Implement in MockLLMProvider Return deterministic fake embeddings

Tests: 6-8 new tests
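
One way MockLLMProvider could return deterministic fake embeddings is sketched below. `fakeEmbedding`, `mockEmbed`, the 8-dimension default, the `'mock-embedding'` model name, and the `TokenUsage` field names are all hypothetical; only the request/response shapes come from the table above.

```typescript
// Hypothetical helper: maps text to a fixed-length unit vector, deterministically.
function fakeEmbedding(text: string, dimensions = 8): number[] {
  const vector = new Array<number>(dimensions).fill(0);
  for (let i = 0; i < text.length; i++) {
    // Spread character codes across slots so different texts give different vectors.
    vector[(i + text.charCodeAt(i)) % dimensions] += text.charCodeAt(i) / 255;
  }
  const norm = Math.sqrt(vector.reduce((sum, v) => sum + v * v, 0));
  return norm === 0 ? vector : vector.map((v) => v / norm);
}

// Mirrors the EmbeddingRequest / EmbeddingResponse shapes listed above.
function mockEmbed(request: { input: string | string[]; model?: string }) {
  const inputs = Array.isArray(request.input) ? request.input : [request.input];
  return {
    embeddings: inputs.map((text) => fakeEmbedding(text)),
    model: request.model ?? 'mock-embedding', // assumed placeholder name
    usage: { promptTokens: 0, completionTokens: 0, totalTokens: 0 }, // assumed TokenUsage fields
  };
}
```

The same input always yields the same vector, so downstream similarity tests can assert on exact results without network calls.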

0.7 Export new types + helpers

File: packages/llm/src/index.ts — export ContentPart, EmbeddingRequest, EmbeddingResponse, isVisionMessage, buildVisionMessage

0.8 Update @bytelyst/llm-router

| Change | Detail |
| --- | --- |
| Vision-aware routing | classifyPrompt() detects image content → routes to vision-capable models |
| Model capability flags | Add supportsVision: boolean and supportsEmbedding: boolean to ModelConfig |
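
The routing rule could look roughly like the sketch below. Only `supportsVision` and `supportsEmbedding` come from the table; `pickModel`, `needsVision`, the message types, and the "first match in list order" policy are illustrative assumptions, not the actual classifyPrompt() logic.

```typescript
interface ModelConfig {
  id: string;
  supportsVision: boolean;
  supportsEmbedding: boolean;
}

type RoutedPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

interface RoutedMessage {
  content: string | RoutedPart[];
}

// True when any message contains an image part.
function needsVision(messages: RoutedMessage[]): boolean {
  return messages.some(
    (m) => Array.isArray(m.content) && m.content.some((p) => p.type === 'image_url'),
  );
}

// Picks the first configured model that can handle the request, in list order.
function pickModel(messages: RoutedMessage[], models: ModelConfig[]): ModelConfig {
  const vision = needsVision(messages);
  const match = models.find((m) => !vision || m.supportsVision);
  if (!match) throw new Error('No suitable model configured');
  return match;
}
```

Listing cheaper text-only models first means image-free prompts never pay for a vision model under this policy.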

0.9 Publish updated packages

Bump versions → publish to Gitea npm registry.

Phase 0 Deliverables:

  • @bytelyst/llm@0.2.0 — vision + embedding + streaming enhancements
  • @bytelyst/llm-router@0.2.0 — vision-aware routing + capability flags
  • All existing tests pass + 25-30 new tests
  • Published to Gitea npm registry

Phase 1 — Backend: Note Prompts Core + Copilot Upgrade

Repo: learning_ai_notes
Duration: 4-5 days
Depends on: Phase 0
Features: F5 (auto-tag), F7 (smart title), F10 (reading time), F13 (merge), F14 (compare), F20-F23 (templates), F24 (chains)

1.1 Add LLM dependency to backend

File: backend/package.json — add "@bytelyst/llm": "^0.2.0"

1.2 Create backend/src/lib/llm.ts

Singleton wrapper over @bytelyst/llm:

import { getLLM, type LLMProvider } from '@bytelyst/llm';

let _llm: LLMProvider | null = null;
export function getNoteLettLLM(): LLMProvider {
  if (!_llm) _llm = getLLM();
  return _llm;
}

1.3 Add LLM env vars to config

File: backend/src/lib/config.ts

| Variable | Default | Description |
| --- | --- | --- |
| LLM_PROVIDER | openai | openai / azure / mock |
| OPENAI_API_KEY | | OpenAI API key |
| OPENAI_BASE_URL | | Optional base URL override |
| AZURE_OPENAI_ENDPOINT | | Azure OpenAI endpoint |
| AZURE_OPENAI_API_KEY | | Azure OpenAI key |
| LLM_DEFAULT_MODEL | gpt-4o-mini | Default model for text prompts |
| LLM_VISION_MODEL | gpt-4o | Default model for image prompts |
| LLM_EMBEDDING_MODEL | text-embedding-3-small | Default model for embeddings |
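
A sketch of how these variables might be read and validated at startup. `loadLLMConfig`, `LLMEnvConfig`, and the fail-fast checks for missing keys are assumptions; the real config.ts may use Zod or a different shape. The defaults match the table.

```typescript
type ProviderName = 'openai' | 'azure' | 'mock';

interface LLMEnvConfig {
  provider: ProviderName;
  defaultModel: string;
  visionModel: string;
  embeddingModel: string;
  openaiApiKey?: string;
  openaiBaseUrl?: string;
  azureEndpoint?: string;
  azureApiKey?: string;
}

function isProviderName(value: string): value is ProviderName {
  return value === 'openai' || value === 'azure' || value === 'mock';
}

// Reads the variables from the table and applies the documented defaults.
function loadLLMConfig(env: Record<string, string | undefined>): LLMEnvConfig {
  const provider = env.LLM_PROVIDER ?? 'openai';
  if (!isProviderName(provider)) throw new Error(`Unsupported LLM_PROVIDER: ${provider}`);
  // Assumed policy: fail fast when the selected provider has no credentials.
  if (provider === 'openai' && !env.OPENAI_API_KEY) {
    throw new Error('OPENAI_API_KEY is required when LLM_PROVIDER=openai');
  }
  if (provider === 'azure' && (!env.AZURE_OPENAI_ENDPOINT || !env.AZURE_OPENAI_API_KEY)) {
    throw new Error('AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY are required when LLM_PROVIDER=azure');
  }
  return {
    provider,
    defaultModel: env.LLM_DEFAULT_MODEL ?? 'gpt-4o-mini',
    visionModel: env.LLM_VISION_MODEL ?? 'gpt-4o',
    embeddingModel: env.LLM_EMBEDDING_MODEL ?? 'text-embedding-3-small',
    openaiApiKey: env.OPENAI_API_KEY,
    openaiBaseUrl: env.OPENAI_BASE_URL,
    azureEndpoint: env.AZURE_OPENAI_ENDPOINT,
    azureApiKey: env.AZURE_OPENAI_API_KEY,
  };
}
```

In the backend this would be called once with `process.env`; tests can pass a plain object instead.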

1.4 New Cosmos container: note_prompts

File: backend/src/lib/cosmos-init.ts — register note_prompts container (partition key: /userId)

1.5 Create backend/src/modules/note-prompts/types.ts

Key types:

  • PromptTemplateDoc — id, productId, userId, slug, name, description, category, systemPrompt, userPromptTemplate, inputType (text/image/text+image/multi-note), outputFormat, outputAction (new_note/artifact/update_note), parameters, builtIn, createdAt, updatedAt
  • PromptParameter — key, label, type (string/select), options, default, required
  • RunPromptInput — noteId, workspaceId, promptTemplateId OR inlinePrompt, parameters, imageUrls, additionalNoteIds (for F13/F14 merge/compare), previousResultNoteId (for F24 chains), dryRun, agentId
  • RunPromptOutput — resultNoteId, resultArtifactId, content, model, tokenUsage, agentActionId, suggestedTags (for F5)

Zod schemas for all of the above.

1.6 Create backend/src/modules/note-prompts/repository.ts

CRUD for PromptTemplateDoc:

  • listTemplates(userId) — returns built-in + user's custom templates
  • getTemplate(id, userId)
  • createTemplate(doc)
  • updateTemplate(id, userId, updates)
  • deleteTemplate(id, userId) — cannot delete built-in

1.7 Create backend/src/modules/note-prompts/runner.ts

The core orchestration logic:

1. Validate input (template or inline prompt)
2. Fetch the source note (verify ownership + productId)
3. If additionalNoteIds provided (F13/F14 merge/compare):
   a. Fetch all additional notes
   b. Combine content into multi-note context
4. If template has inputType 'image' or 'text+image':
   a. Fetch artifact images from blob storage via SAS URLs
   b. Build vision message with buildVisionMessage()
5. If previousResultNoteId provided (F24 chains):
   a. Fetch previous result note
   b. Include its content as additional context
6. Build LLM messages array:
   - System: template.systemPrompt (or default)
   - User: interpolated template with note content + images + additional notes
7. Call getNoteLettLLM().chatCompletion(request)
8. Post-process response:
   - If template slug is 'auto-tag': parse tags from response → return as suggestedTags
   - If outputAction is 'new_note': createNote() → link to source via note-relationships
   - If outputAction is 'artifact': createNoteArtifact() on source note
   - If outputAction is 'update_note': updateNote() body/tags on source note
9. Record NoteAgentActionDoc (actionType: 'smart_action')
10. Return RunPromptOutput
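
Step 8 (post-processing) can be kept side-effect free by first deciding what to do and only then calling createNote(), createNoteArtifact(), or updateNote(). The sketch below is one possible shape. `planPostProcessing`, `parseTags`, and the `PostProcessStep` union are hypothetical names, and the tag parser assumes the auto-tag prompt asks for a comma- or newline-separated list capped at 5 tags.

```typescript
type OutputAction = 'new_note' | 'artifact' | 'update_note';

interface TemplateLike {
  slug: string;
  outputAction: OutputAction;
}

type PostProcessStep =
  | { kind: 'suggest_tags'; tags: string[] }
  | { kind: 'create_note'; body: string; linkToSource: true }
  | { kind: 'create_artifact'; body: string }
  | { kind: 'update_note'; body: string };

// Normalizes "AI, #Notes\nai" style output into unique lowercase tags.
function parseTags(response: string): string[] {
  return response
    .split(/[,\n]/)
    .map((t) => t.trim().replace(/^#/, '').toLowerCase())
    .filter((t, i, all) => t.length > 0 && all.indexOf(t) === i)
    .slice(0, 5);
}

// Decides which side effect the runner should perform for an LLM response.
function planPostProcessing(template: TemplateLike, response: string): PostProcessStep {
  if (template.slug === 'auto-tag') {
    return { kind: 'suggest_tags', tags: parseTags(response) };
  }
  switch (template.outputAction) {
    case 'new_note':
      return { kind: 'create_note', body: response, linkToSource: true };
    case 'artifact':
      return { kind: 'create_artifact', body: response };
    case 'update_note':
      return { kind: 'update_note', body: response };
  }
}
```

Returning a plan rather than writing directly also gives dryRun a natural place to stop before anything is persisted.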

1.8 Create backend/src/modules/note-prompts/routes.ts

| Method | Path | Auth | Description | Features |
| --- | --- | --- | --- | --- |
| GET | /api/prompt-templates | viewer | List built-in + user templates | Core |
| GET | /api/prompt-templates/:id | viewer | Get single template | Core |
| POST | /api/prompt-templates | admin | Create custom template | Core |
| PATCH | /api/prompt-templates/:id | admin | Update custom template | Core |
| DELETE | /api/prompt-templates/:id | admin | Delete custom template | Core |
| POST | /api/note-prompts/run | admin | Run a prompt on a note | Core |
| POST | /api/note-prompts/run-stream | admin | Run with SSE streaming | F3 |
| GET | /api/note-prompts/history | viewer | List past prompt runs | Core |
| POST | /api/notes/:id/suggest-tags | admin | Suggest tags via LLM | F5 |
| GET | /api/notes/:id/reading-time | viewer | Calculate reading time | F10 |
| POST | /api/notes/compare | admin | Compare 2+ notes | F14 |
| POST | /api/notes/merge | admin | Merge 2+ notes | F13 |

1.9 Seed 20 built-in prompt templates

File: backend/src/modules/note-prompts/seed.ts

| # | Slug | Name | Input | Output | Category | Feature |
| --- | --- | --- | --- | --- | --- | --- |
| 1 | summarize | Summarize | text | new_note | transform | Core |
| 2 | translate | Translate | text | new_note | transform | Core |
| 3 | simplify | Simplify / ELI5 | text | artifact | transform | Core |
| 4 | extract-key-facts | Extract Key Facts | text | artifact | extract | Core |
| 5 | food-label-rating | Rate Food Label | image | new_note | analysis | Core |
| 6 | parse-receipt | Parse Receipt | image | new_note | extract | Core |
| 7 | read-business-card | Read Business Card | image | new_note | extract | Core |
| 8 | handwriting-to-text | Handwriting to Text | image | new_note | transform | Core |
| 9 | generate-flashcards | Generate Flashcards | text | new_note | generate | Core |
| 10 | pros-and-cons | Pros & Cons | text | artifact | analysis | Core |
| 11 | presentation-outline | Presentation Outline | text | new_note | generate | F21 |
| 12 | email-draft | Email Draft | text | new_note | generate | F22 |
| 13 | social-post | Social Post | text | artifact | generate | F23 |
| 14 | shareable-summary | Shareable Summary | text | new_note | transform | F20 |
| 15 | compare-notes | Compare Notes | multi-note | new_note | analysis | F14 |
| 16 | merge-notes | Merge Notes | multi-note | new_note | transform | F13 |
| 17 | fix-rewrite | Fix & Rewrite | text | update_note | transform | F1 |
| 18 | change-tone | Change Tone | text | update_note | transform | F2 |
| 19 | continue-writing | Continue Writing | text | update_note | generate | F3 |
| 20 | auto-tag | Auto-Tag | text | update_note | extract | F5 |

1.10 Upgrade copilot-transform.ts to use @bytelyst/llm (F1, F2, F7)

File: backend/src/lib/copilot-transform.ts

Replace extraction-service calls with direct @bytelyst/llm calls:

  • runCopilotTransform() → getNoteLettLLM().chatCompletion() with action-specific system prompts
  • suggestTitleFromBody() → getNoteLettLLM().chatCompletion() with title-suggestion prompt
  • Add rewriteText(text, style) for F1/F2 — accepts tone parameter
  • Keep extraction-service fallback for graceful degradation

1.11 Reading time utility (F10)

File: backend/src/lib/reading-time.ts

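export function estimateReadingTime(html: string): { minutes: number; words: number } {
  // Strip tags and collapse whitespace so only visible text is counted.
  const plain = html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
  // ''.split(...) yields [''], so an empty note must be special-cased to 0 words.
  const words = plain === '' ? 0 : plain.split(' ').length;
  // 238 words per minute; every note shows at least 1 minute.
  return { minutes: Math.max(1, Math.ceil(words / 238)), words };
}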

Pure calculation, no LLM needed. Expose via GET /api/notes/:id/reading-time (see 1.8) and include the value in the GET /api/notes/:id note detail response.

1.12 Extend agent action types

File: backend/src/modules/note-agent-actions/types.ts

Add 'smart_action' and 'auto_enrich' to NOTE_AGENT_ACTION_TYPES.

1.13 Register routes in server.ts + MCP tool

  • backend/src/server.ts — register notePromptRoutes
  • backend/src/mcp/note-tool-contracts.ts — add notes.prompts.run to NOTES_MCP_TOOL_NAMES
  • backend/src/mcp/note-tools.ts — implement executeRunPrompt()

1.14 Tests

| Test file | Coverage | Count |
| --- | --- | --- |
| note-prompts/repository.test.ts | Template CRUD | 8-10 |
| note-prompts/runner.test.ts | Prompt execution with mock LLM, chains, multi-note | 15-18 |
| note-prompts/routes.test.ts | API endpoint integration | 10-12 |
| lib/copilot-transform.test.ts | Upgraded copilot with LLM | 4-6 |
| lib/reading-time.test.ts | Reading time calculation | 4-5 |
| mcp/note-tools.test.ts | notes.prompts.run MCP tool | 4-6 |

Phase 1 Deliverables:

  • note-prompts module: types, repository, runner, routes, seed (20 templates)
  • lib/llm.ts singleton + config extended with LLM env vars
  • lib/reading-time.ts pure utility (F10)
  • Upgraded copilot-transform.ts using @bytelyst/llm (F1, F2, F7)
  • Multi-note support in runner (F13 merge, F14 compare)
  • Chain support in runner via previousResultNoteId (F24)
  • smart_action + auto_enrich agent action types
  • notes.prompts.run MCP tool
  • note_prompts Cosmos container
  • 45-57 new tests

Phase 2 — Backend: Note Intelligence (Background AI)

Repo: learning_ai_notes
Duration: 2-3 days
Depends on: Phase 1
Features: F6 (auto-summarize), F8 (duplicate detection), F9 (auto-link), F12 (knowledge gaps)

2.1 Embedding service: backend/src/lib/embeddings.ts (F8, F9, F12)

import { getNoteLettLLM } from './llm.js';

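export async function embedText(text: string): Promise<number[]> {
  const llm = getNoteLettLLM();
  // embed() is optional on LLMProvider, so check before calling.
  if (!llm.embed) throw new Error('Embedding not supported by current LLM provider');
  const res = await llm.embed({ input: text });
  const [embedding] = res.embeddings;
  if (!embedding) throw new Error('LLM provider returned no embedding');
  return embedding;
}

export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Embedding dimensions do not match');
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  // A zero vector has no direction; return 0 instead of dividing by zero (NaN).
  if (magA === 0 || magB === 0) return 0;
  return dot / (Math.sqrt(magA) * Math.sqrt(magB));
}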

2.2 Note embedding storage

File: backend/src/modules/notes/types.ts — add optional embedding: number[] field to NoteDoc

On note create/update, compute embedding in background (non-blocking). Store in Cosmos alongside the note.

2.3 Auto-summarize on save (F6)

File: backend/src/lib/note-hooks.ts

After a note is saved with body > 300 words:

  1. Run "summarize" template via runner.ts
  2. Store result as artifact (type: summary) on the note
  3. Record agent action with actionType: 'auto_enrich'
  4. Gated behind feature flag notelett_auto_summarize_enabled
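
The trigger condition for the save hook could be isolated as a pure function, sketched below. `shouldAutoSummarize`, `countWords`, and the flag lookup via a plain record are assumptions; the real hook would read flags from whatever feature-flag service the backend uses.

```typescript
const AUTO_SUMMARIZE_MIN_WORDS = 300;

// Counts visible words in an HTML body (tags stripped, whitespace collapsed).
function countWords(html: string): number {
  const plain = html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
  return plain === '' ? 0 : plain.split(' ').length;
}

// True only when the feature flag is on and the body exceeds 300 words.
function shouldAutoSummarize(bodyHtml: string, flags: Record<string, boolean>): boolean {
  if (!flags['notelett_auto_summarize_enabled']) return false;
  return countWords(bodyHtml) > AUTO_SUMMARIZE_MIN_WORDS;
}
```

Keeping the check separate from the LLM call makes the 4-6 trigger-logic tests in lib/note-hooks.test.ts cheap to write.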

2.4 Duplicate/similar note detection (F8)

File: backend/src/modules/note-prompts/routes.ts

New endpoint: POST /api/notes/:id/check-duplicates

  1. Embed the current note's body
  2. Fetch all notes in workspace with embeddings
  3. Compute cosine similarity
  4. Return notes with similarity > 0.85 threshold

2.5 Auto-link related notes (F9)

File: backend/src/modules/note-prompts/routes.ts

New endpoint: POST /api/notes/:id/suggest-links

  1. Embed the current note
  2. Find top 5 most similar notes (similarity > 0.6, excluding self)
  3. Return as suggested links with similarity scores
  4. UI can accept/dismiss suggestions
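
The check-duplicates and suggest-links endpoints differ only in threshold and result limit, so both could share one ranking helper. The sketch below is an in-memory version; `findSimilarNotes`, `EmbeddedNote`, and `SimilarNoteHit` are hypothetical names, and it assumes every candidate already has a stored embedding of the same dimension.

```typescript
interface EmbeddedNote {
  id: string;
  title: string;
  embedding: number[];
}

interface SimilarNoteHit {
  id: string;
  title: string;
  similarity: number;
}

// Local cosine similarity so the sketch is self-contained.
function cosine(a: number[], b: number[]): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  return magA === 0 || magB === 0 ? 0 : dot / (Math.sqrt(magA) * Math.sqrt(magB));
}

// Ranks candidates by similarity to the target, excluding the target itself.
function findSimilarNotes(
  target: EmbeddedNote,
  candidates: EmbeddedNote[],
  minSimilarity: number,
  limit = Infinity,
): SimilarNoteHit[] {
  return candidates
    .filter((n) => n.id !== target.id)
    .map((n) => ({ id: n.id, title: n.title, similarity: cosine(target.embedding, n.embedding) }))
    .filter((hit) => hit.similarity > minSimilarity)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}
```

Duplicate detection would call `findSimilarNotes(note, workspaceNotes, 0.85)` and link suggestion `findSimilarNotes(note, workspaceNotes, 0.6, 5)`. A linear scan like this is only reasonable for small workspaces; larger ones would likely need an index.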

2.6 Knowledge gap detection (F12)

File: backend/src/modules/note-prompts/routes.ts

New endpoint: POST /api/workspaces/:id/knowledge-gaps

  1. Fetch all notes in workspace
  2. Extract topics from each note (via auto-tag or LLM)
  3. Build topic frequency map
  4. Send to LLM: "Given these topics and their coverage depth, what's missing?"
  5. Return gap analysis as structured JSON
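
Steps 3 and 4 might be sketched as follows. `buildTopicFrequency`, `buildGapPrompt`, and counting a topic once per note are assumptions; the roadmap does not define "coverage depth", so note count is used here as a simple stand-in.

```typescript
interface TopicCount {
  topic: string;
  noteCount: number;
}

// Builds a topic → note-count list, least-covered topics first.
function buildTopicFrequency(topicsPerNote: string[][]): TopicCount[] {
  const counts = new Map<string, number>();
  for (const topics of topicsPerNote) {
    // Count each topic once per note so one long note cannot dominate.
    const unique = new Set(topics.map((t) => t.trim().toLowerCase()).filter((t) => t !== ''));
    unique.forEach((topic) => counts.set(topic, (counts.get(topic) ?? 0) + 1));
  }
  const result: TopicCount[] = [];
  counts.forEach((noteCount, topic) => result.push({ topic, noteCount }));
  return result.sort((a, b) => a.noteCount - b.noteCount || a.topic.localeCompare(b.topic));
}

// Renders the frequency list into the user prompt sent to the LLM.
function buildGapPrompt(frequency: TopicCount[]): string {
  const lines = frequency.map((f) => `- ${f.topic}: ${f.noteCount} note(s)`);
  return `Given these topics and their coverage depth, what's missing?\n${lines.join('\n')}`;
}
```

Sorting thinly covered topics to the top puts the most likely gaps at the start of the prompt.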

2.7 Tests

| Test file | Coverage | Count |
| --- | --- | --- |
| lib/embeddings.test.ts | Embed + cosine similarity | 6-8 |
| lib/note-hooks.test.ts | Auto-summarize trigger logic | 4-6 |
| note-prompts/routes.test.ts | Duplicate check, suggest links, knowledge gaps | 10-12 |

Phase 2 Deliverables:

  • lib/embeddings.ts — embed text + cosine similarity
  • Note embedding storage on create/update
  • Auto-summarize on save (F6) — feature-flag gated
  • Duplicate detection endpoint (F8)
  • Related notes suggestion endpoint (F9)
  • Knowledge gap detection endpoint (F12)
  • 20-26 new tests

Phase 3 — Web: Smart Actions UI + Editor AI

Repo: learning_ai_notes
Duration: 4-5 days
Depends on: Phase 1 (Phase 2 optional for F8/F9 UI)
Features: F1-F4 (editor AI), F5 (tag UI), F8-F9 (duplicate/link UI), F10 (reading time UI), F14 (compare UI), F20-F23 (export UI)
Can run in parallel with: Phase 4, Phase 5

3.1 API client: web/src/lib/prompt-client.ts

listPromptTemplates(): Promise<PromptTemplate[]>
getPromptTemplate(id: string): Promise<PromptTemplate>
createPromptTemplate(input): Promise<PromptTemplate>
updatePromptTemplate(id, input): Promise<PromptTemplate>
deletePromptTemplate(id): Promise<void>
runPrompt(input: RunPromptInput): Promise<RunPromptOutput>
runPromptStream(input: RunPromptInput): AsyncIterable<string>  // F3
listPromptHistory(noteId?, limit?): Promise<AgentAction[]>
suggestTags(noteId, workspaceId): Promise<string[]>           // F5
checkDuplicates(noteId, workspaceId): Promise<SimilarNote[]>  // F8
suggestLinks(noteId, workspaceId): Promise<SuggestedLink[]>   // F9
compareNotes(noteIds, workspaceId): Promise<RunPromptOutput>   // F14
mergeNotes(noteIds, workspaceId): Promise<RunPromptOutput>     // F13
getKnowledgeGaps(workspaceId): Promise<GapAnalysis>            // F12

3.2 SmartActionsPanel component

File: web/src/components/SmartActionsPanel.tsx

Renders on the note detail page:

  • Grid of action buttons grouped by category (built-in + custom)
  • Each button: icon + name + inputType badge (text/image/multi)
  • Click → opens RunPromptModal
  • Shows recent prompt runs for this note
  • Reading time display (F10)
  • "Suggest tags" button (F5)

3.3 RunPromptModal component

File: web/src/components/RunPromptModal.tsx

  • Template selector (filtered by input type compatibility)
  • Parameter inputs (e.g., target language, tone)
  • Image picker (browse note artifacts or upload new)
  • Multi-note selector (for merge/compare — F13, F14)
  • Inline prompt textarea (for custom one-off prompts)
  • Chain toggle: "Continue from previous result" (F24)
  • Dry-run checkbox
  • "Run" button with loading spinner

3.4 PromptResultView component

File: web/src/components/PromptResultView.tsx

  • Markdown renderer for LLM response
  • Action buttons: "Save as Note", "Save as Artifact", "Apply to Note", "Discard"
  • Token usage + model info footer
  • Link to created note or artifact

3.5 Prompt Template Library page

File: web/src/app/(app)/prompts/page.tsx

  • Browse all 20 built-in templates (read-only cards)
  • User's custom templates (edit/delete)
  • "Create Custom Prompt" → opens PromptTemplateEditor
  • Category filter tabs: All, Analysis, Transform, Extract, Generate, Custom

3.6 PromptTemplateEditor component

File: web/src/components/PromptTemplateEditor.tsx

  • Form: name, slug, description, category, system prompt, user prompt template
  • Input type selector (text / image / text+image / multi-note)
  • Output format + output action selectors
  • Parameter builder (add/remove dynamic parameters)
  • Template variable reference: {{note.title}}, {{note.body}}, {{note.tags}}, {{params.X}}
  • Live preview
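
The template variables above imply a small interpolation step, which the live preview and the backend runner could share. A minimal sketch, assuming unknown placeholders are left untouched and tags are joined with commas (both are choices made here, not stated in the roadmap); `interpolate` and `TemplateContext` are hypothetical names.

```typescript
interface TemplateContext {
  note: { title: string; body: string; tags: string[] };
  params: Record<string, string>;
}

// Replaces {{note.title}}, {{note.body}}, {{note.tags}} and {{params.X}}.
function interpolate(template: string, ctx: TemplateContext): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (placeholder: string, path: string) => {
    if (path === 'note.title') return ctx.note.title;
    if (path === 'note.body') return ctx.note.body;
    if (path === 'note.tags') return ctx.note.tags.join(', ');
    if (path.startsWith('params.')) {
      const value = ctx.params[path.slice('params.'.length)];
      if (value !== undefined) return value;
    }
    return placeholder; // leave unknown variables visible rather than dropping them
  });
}
```

Using a replacer function instead of a replacement string avoids `$`-sequences in note bodies being interpreted as special patterns.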

3.7 Upgrade NoteEditor with advanced Copilot (F1-F4)

File: web/src/components/NoteEditor.tsx

Enhance existing Copilot toolbar:

| Current | Upgraded |
| --- | --- |
| shorten | Keep (uses @bytelyst/llm now) |
| expand | Keep (uses @bytelyst/llm now) |
| bulletize | Keep (uses @bytelyst/llm now) |
| grammar | Replace with "Fix & Rewrite" (F1): full rewrite, not just grammar |
| | Add "Change Tone" (F2): dropdown with formal/casual/professional/friendly |
| | Add "Continue Writing" (F3): inserts at cursor, streams token-by-token |
| | Add "Explain" (F4): tooltip popover with definition/explanation |

F3 (Continue Writing) implementation:

  • Get text before cursor position from TipTap editor state
  • Call runPromptStream() (SSE)
  • Insert streamed tokens into editor in real-time via TipTap commands
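
On the client, runPromptStream() has to turn raw SSE text into tokens before anything is inserted into the editor. The parsing step might look like the sketch below. It assumes the backend sends each token as a `data:` line, separates events with a blank line using `\n` line endings, and ends with a `[DONE]` sentinel; none of this is specified in the roadmap, and `parseSseBuffer` is a hypothetical name.

```typescript
interface SseParseResult {
  tokens: string[];
  rest: string;
}

// Splits buffered SSE text into complete events; unfinished data is returned as `rest`.
function parseSseBuffer(buffer: string): SseParseResult {
  const rawEvents = buffer.split('\n\n');
  const rest = rawEvents.pop() ?? ''; // the last piece may be an incomplete event
  const tokens: string[] = [];
  for (const rawEvent of rawEvents) {
    const data = rawEvent
      .split('\n')
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice(5).replace(/^ /, '')) // drop "data:" and one optional space
      .join('\n');
    if (data !== '' && data !== '[DONE]') tokens.push(data);
  }
  return { tokens, rest };
}
```

The caller would keep `rest`, prepend it to the next network chunk, and pass each returned token to the editor insertion step.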

F4 (Inline Q&A) implementation:

  • Select text → right-click or toolbar button → "Explain this"
  • Opens floating popover below selection
  • Calls LLM with "Explain this term/concept concisely: {selection}"
  • Shows result in popover (dismissible)

3.8 Duplicate detection UI (F8)

After note save, if notelett_duplicate_check_enabled flag is on:

  • Call checkDuplicates()
  • If similar notes found → show toast: "This note is similar to 'Note X' (87% match). View?"
  • Click → opens side-by-side comparison

3.9 Related notes suggestion UI (F9)

After note creation:

  • Call suggestLinks()
  • If suggestions found → show panel: "Related notes you might want to link"
  • Each suggestion: note title + similarity % + "Link" / "Dismiss" buttons

3.10 Auto-tag suggestion UI (F5)

On the note detail page SmartActionsPanel:

  • "Suggest Tags" button
  • Calls /api/notes/:id/suggest-tags
  • Shows tag chips with + button to accept each
  • Accepted tags are added to the note

3.11 Export actions UI (F20-F23)

In SmartActionsPanel, export templates appear with share icons:

  • "Shareable Summary" → generates polished version → copy or share via note-shares
  • "Presentation Outline" → generates outline → saves as new note
  • "Email Draft" → generates email → copy to clipboard
  • "Social Post" → generates post → copy to clipboard

3.12 Knowledge gap analysis UI (F12)

File: web/src/app/(app)/workspaces/[id]/gaps/page.tsx

  • "Analyze Knowledge Gaps" button on workspace page
  • Shows gap analysis: topics with thin coverage, suggested new note topics
  • "Create Note" button for each gap → pre-fills title

3.13 Wire into note detail page + sidebar

  • web/src/app/(app)/notes/[noteId]/page.tsx — add SmartActionsPanel, reading time, duplicate warning
  • web/src/components/Sidebar.tsx — add "Prompts" nav item (sparkle icon)
  • Keyboard shortcut: Cmd+Shift+A → open Smart Actions panel

3.14 Tests

| Test file | Coverage | Count |
| --- | --- | --- |
| prompt-client.test.ts | API client functions | 8-10 |
| SmartActionsPanel.test.tsx | Render + click handlers | 4-6 |
| RunPromptModal.test.tsx | Form + submission + multi-note | 4-6 |
| NoteEditor.test.tsx | Copilot upgrade (F1-F4) | 6-8 |
| e2e/smart-actions.spec.ts | Full flow E2E | 6 |

Phase 3 Deliverables:

  • prompt-client.ts API client (all endpoints)
  • 5 new components: SmartActionsPanel, RunPromptModal, PromptResultView, PromptTemplateEditor, KnowledgeGapView
  • /prompts template library page
  • /workspaces/[id]/gaps knowledge gap page (F12)
  • NoteEditor upgraded with F1-F4 (Fix & Rewrite, Change Tone, Continue Writing, Inline Q&A)
  • Duplicate detection toast (F8)
  • Related notes suggestion panel (F9)
  • Auto-tag suggestion UI (F5)
  • Export actions UI (F20-F23)
  • Reading time display (F10)
  • Sidebar updated, keyboard shortcut
  • 22-30 new unit/component tests + 6 E2E tests (28-36 total)

Phase 4 — Mobile: Smart Actions + AI-Enhanced Capture

Repo: learning_ai_notes
Duration: 4-5 days
Depends on: Phase 1
Features: F15 (voice), F16 (screenshot), F17 (URL), F18 (multi-image), F19 (clipboard)
Can run in parallel with: Phase 3, Phase 5

4.1 New dependencies

| Package | Purpose |
| --- | --- |
| expo-image-picker | Camera capture + gallery selection |
| expo-av | Audio recording for voice-to-note (F15) |
| expo-clipboard | Clipboard access for AI paste (F19) |
| expo-sharing | Share results |

4.2 API client: mobile/src/api/note-prompts.ts

listPromptTemplates(): Promise<PromptTemplate[]>
runPrompt(input: RunPromptInput): Promise<RunPromptOutput>
suggestTags(noteId, workspaceId): Promise<string[]>

4.3 Enhance blob upload: mobile/src/api/blob-upload.ts

Upgrade existing stub:

  • Camera capture via expo-image-picker (photo + gallery)
  • Image resize (max 2048px, compress to < 4MB)
  • Upload to blob storage via @bytelyst/blob-client
  • Return blobPath + SAS URL
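
The 2048px limit comes down to a small aspect-ratio calculation before the image is handed to whatever resize API the app uses. A sketch of that arithmetic only; `fitWithin` is a hypothetical helper, and compressing to under 4MB is not covered here.

```typescript
// Scales dimensions down so the longest edge is at most maxEdge, preserving aspect ratio.
function fitWithin(
  width: number,
  height: number,
  maxEdge = 2048,
): { width: number; height: number } {
  const longest = Math.max(width, height);
  if (longest <= maxEdge) return { width, height }; // never upscale
  const scale = maxEdge / longest;
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}
```

For example, a 4032×3024 camera photo maps to 2048×1536, while an 800×600 image is left unchanged.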

4.4 Zustand store: mobile/src/store/prompt-store.ts

interface PromptState {
  templates: PromptTemplate[];
  isRunning: boolean;
  lastResult: RunPromptOutput | null;
  error: string | null;
  fetchTemplates(): Promise<void>;
  runPrompt(input: RunPromptInput): Promise<RunPromptOutput>;
  clearResult(): void;
}

4.5 SmartActionsSheet component

File: mobile/src/app/note/SmartActionsSheet.tsx

Bottom sheet (react-native-gesture-handler) that slides up from note detail:

  • Scrollable grid of action buttons (icon + name)
  • Category filter tabs (All, Text, Image, Custom)
  • Actions trigger either:
    • Direct run (text actions)
    • Camera/gallery picker → then run (image actions)

4.6 PromptResultScreen

File: mobile/src/app/note/prompt-result.tsx

  • Markdown-rendered LLM response
  • "Save as Note" / "Discard" buttons
  • Model info + token count
  • Navigate to new note after saving

4.7 Voice-to-note (F15)

File: mobile/src/app/capture/voice.tsx (sub-route of capture, NOT a new tab)

  1. Record audio via expo-av (Audio.Recording)
  2. Upload audio file to blob storage
  3. Call backend transcription endpoint (or use extraction-service with speech task)
  4. Show transcribed text for review/edit
  5. Save as note
  6. Optionally run a Smart Action on the result (e.g., "Extract Key Facts")

4.8 Screenshot-to-note (F16)

On the capture tab:

  • "From Screenshot" button → gallery picker (images only)
  • Upload image → blob storage
  • Run "handwriting-to-text" or custom OCR prompt
  • Show result for review → save as note

4.9 URL-to-note (F17)

On the capture tab:

  • "From URL" input field
  • Backend endpoint: POST /api/note-prompts/url-extract
    • Fetches URL content (server-side to avoid CORS)
    • Strips HTML → extracts main content
    • Runs "summarize" template
    • Returns structured result
  • Show summary for review → save as note

4.10 Multi-image capture (F18)

On the capture tab:

  • "Scan Document" button → camera in continuous mode
  • Take multiple photos (whiteboard pages, multi-page document)
  • Upload all images → blob storage
  • Run each through vision model sequentially
  • Combine results into single note body
  • Show merged result for review → save

4.11 Clipboard AI paste (F19)

On the capture tab:

  • "Paste & Clean" button
  • Read clipboard via expo-clipboard
  • If clipboard contains text → run "fix-rewrite" template
  • If clipboard contains URL → trigger URL-to-note flow (F17)
  • Show cleaned result → save as note
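
The text-versus-URL branch could be decided by a small classifier like the one below. `routeClipboard` and the rule "a single http(s) URL with no surrounding text counts as a URL" are assumptions; text that merely contains a link is treated as text.

```typescript
type ClipboardRoute =
  | { flow: 'url-to-note'; url: string }
  | { flow: 'fix-rewrite'; text: string }
  | { flow: 'empty' };

// Chooses which capture flow should handle the clipboard contents.
function routeClipboard(raw: string): ClipboardRoute {
  const text = raw.trim();
  if (text === '') return { flow: 'empty' };
  // Treat the clipboard as a URL only when it is one http(s) URL and nothing else.
  if (/^https?:\/\/\S+$/i.test(text)) return { flow: 'url-to-note', url: text };
  return { flow: 'fix-rewrite', text };
}
```

The `empty` case lets the UI show a hint instead of sending a blank prompt to the backend.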

4.12 Enhance capture tab

File: mobile/src/app/(tabs)/capture.tsx

Add new capture methods alongside existing text draft:

┌─────────────────────────────────────┐
│  Quick Capture                      │
│  ┌───────┐ ┌───────┐ ┌───────┐      │
│  │ Text  │ │ Photo │ │ Voice │      │
│  └───────┘ └───────┘ └───────┘      │
│  ┌───────┐ ┌───────┐ ┌───────┐      │
│  │  URL  │ │ Scan  │ │ Paste │      │
│  └───────┘ └───────┘ └───────┘      │
│                                     │
│  [existing text capture form]       │
└─────────────────────────────────────┘

4.13 Wire Smart Actions into note detail

File: mobile/src/app/note/[id].tsx

  • "AI Actions" button in the header
  • Opens SmartActionsSheet
  • Shows reading time (F10)
  • Shows suggested tags after save (F5)

4.14 Offline queue integration

Prompt runs that fail → queue via @bytelyst/offline-queue for retry.

4.15 Tests

| Test file | Coverage | Count |
| --- | --- | --- |
| api/note-prompts.test.ts | API client | 4-6 |
| store/prompt-store.test.ts | Store actions | 6-8 |
| SmartActionsSheet.test.tsx | Render + interactions | 4-6 |
| capture.test.tsx | New capture methods | 4-6 |

Phase 4 Deliverables:

  • note-prompts.ts API client + prompt-store.ts Zustand store
  • Camera capture + image resize + blob upload
  • SmartActionsSheet bottom sheet + PromptResultScreen
  • Voice-to-note flow (F15) — expo-av recording
  • Screenshot-to-note (F16) — gallery + vision OCR
  • URL-to-note (F17) — server-side fetch + summarize
  • Multi-image scan (F18) — continuous camera + combine
  • Clipboard AI paste (F19) — read + clean
  • Enhanced capture tab with 6 capture modes
  • Smart Actions on note detail
  • Offline queue for failed runs
  • 18-26 new tests

Phase 5 — Agent & Workflow Intelligence

Repo: learning_ai_notes
Duration: 2-3 days
Depends on: Phase 2
Features: F11 (weekly digest), F25 (scheduled), F26 (webhooks), F27 (approval-gated)
Can run in parallel with: Phases 3, 4

5.1 Scheduled Smart Actions (F25)

File: backend/src/modules/note-prompts/scheduler.ts

| Component | Detail |
| --- | --- |
| PromptScheduleDoc | New Cosmos doc: scheduleId, templateId, workspaceId, cron expression, enabled, lastRunAt, nextRunAt |
| Cosmos container | note_prompt_schedules (partition: /workspaceId) |
| Scheduler loop | In-process interval (60s check), matches cron → invokes runner.ts |
| API endpoints | POST /api/prompt-schedules (create), GET /api/prompt-schedules (list), PATCH /api/prompt-schedules/:id (update), DELETE /api/prompt-schedules/:id (delete) |

Example: "Summarize all notes in 'Research' workspace every Friday at 5pm"
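
The cron match inside the 60-second loop could be done with a small matcher like the sketch below, or with an existing cron library. This version is deliberately simplified: it supports only 5-field expressions with `*`, single values, lists (`a,b`), ranges (`a-b`) and steps (`*/n`, `a-b/n`), evaluates everything in UTC, and requires all five fields to match (standard cron treats day-of-month and day-of-week as alternatives when both are restricted). `cronMatches` and `fieldMatches` are hypothetical names.

```typescript
// Checks one cron field against a value; `min` is the field's lowest legal value.
function fieldMatches(field: string, value: number, min: number): boolean {
  return field.split(',').some((part) => {
    const [range, stepText] = part.split('/');
    const step = stepText === undefined ? 1 : Number(stepText);
    if (!Number.isInteger(step) || step < 1) return false;
    if (range === '*') return (value - min) % step === 0;
    const [startText, endText] = range.split('-');
    const start = Number(startText);
    const end = endText === undefined ? start : Number(endText);
    if (!Number.isInteger(start) || !Number.isInteger(end)) return false;
    return value >= start && value <= end && (value - start) % step === 0;
  });
}

// Fields: minute hour day-of-month month day-of-week (0 = Sunday), all in UTC.
function cronMatches(expression: string, date: Date): boolean {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) throw new Error('Expected 5 cron fields');
  const [minute, hour, dayOfMonth, month, dayOfWeek] = fields;
  return (
    fieldMatches(minute, date.getUTCMinutes(), 0) &&
    fieldMatches(hour, date.getUTCHours(), 0) &&
    fieldMatches(dayOfMonth, date.getUTCDate(), 1) &&
    fieldMatches(month, date.getUTCMonth() + 1, 1) &&
    fieldMatches(dayOfWeek, date.getUTCDay(), 0)
  );
}
```

Under these assumptions, "every Friday at 5pm" is `0 17 * * 5` interpreted in UTC. A real scheduler would also need per-workspace time zones and a guard (for example lastRunAt) so a schedule does not fire twice within the same minute.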

5.2 Weekly workspace digest (F11)

Built on F25 — a special scheduled action:

  • Pre-configured template: weekly-digest
  • Runs weekly, collects all notes created/modified in workspace that week
  • Produces a digest note with: summary, key themes, new notes list, most active areas
  • Linked to workspace

Add template #21: weekly-digest (system-only, runs via scheduler)

5.3 Webhook-triggered actions (F26)

File: backend/src/modules/note-prompts/webhooks.ts

| Component | Detail |
| --- | --- |
| PromptWebhookDoc | webhookId, templateId, workspaceId, triggerEvent, enabled |
| API endpoints | POST /api/prompt-webhooks (create), GET /api/prompt-webhooks (list), DELETE /api/prompt-webhooks/:id (delete) |
| Trigger endpoint | POST /api/prompt-webhooks/:id/trigger, accepts { noteId, payload } |
| Supported events | note.created, note.updated, note.tagged, external |

Example: "When a note is tagged 'receipt', auto-run Parse Receipt"

5.4 Approval-gated actions (F27)

Leverages existing NoteAgentActionDoc with approval states.

Change Detail
New prompt template field requiresApproval: boolean (default: false)
Runner modification If template has requiresApproval, create action with state: 'proposed' instead of state: 'applied'
Review endpoint Already exists: POST /api/agent-actions/:id/review (approve/reject)
Post-approval hook On approval, execute the saved output action (create note / update / artifact)
Web UI ProposalReviewCard already exists — add Smart Action context
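
The approval-gated flow in the table above can be sketched as a small state transition. This is an illustration under assumptions: `PendingAction`, `initialState`, `review`, and the `'rejected'` state name are hypothetical; the roadmap only names `'proposed'` and `'applied'`.

```typescript
type ActionState = "proposed" | "applied" | "rejected"; // "rejected" is an assumed state name

interface PromptTemplate {
  slug: string;
  requiresApproval?: boolean; // defaults to false when omitted
}

interface PendingAction {
  actionId: string;
  templateSlug: string;
  state: ActionState;
  savedOutput: string; // LLM output held until a reviewer decides
}

// Runner side: approval-gated templates start as "proposed"; all others apply immediately.
function initialState(template: PromptTemplate): ActionState {
  return template.requiresApproval === true ? "proposed" : "applied";
}

// Review side: only a "proposed" action can be decided. Approval runs the saved
// output action (create note / update / artifact) through the supplied callback.
function review(
  action: PendingAction,
  decision: "approve" | "reject",
  applyOutput: (output: string) => void,
): PendingAction {
  if (action.state !== "proposed") {
    throw new Error(`Action ${action.actionId} is already ${action.state}`);
  }
  if (decision === "reject") return { ...action, state: "rejected" };
  applyOutput(action.savedOutput);
  return { ...action, state: "applied" };
}
```

Keeping the LLM output on the proposed action means approval does not have to re-run the prompt, so the reviewer approves exactly the text that will be applied.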

5.5 Tests

Test file Coverage Count
note-prompts/scheduler.test.ts Cron matching, schedule CRUD, execution 8-10
note-prompts/webhooks.test.ts Webhook CRUD, trigger, event matching 6-8
note-prompts/runner.test.ts Approval-gated flow 3-4

Phase 5 Deliverables:

  • scheduler.ts — cron-based scheduled prompt execution (F25)
  • weekly-digest template + scheduled action (F11)
  • webhooks.ts — event-triggered prompt execution (F26)
  • Approval-gated actions in runner (F27)
  • note_prompt_schedules Cosmos container
  • API endpoints for schedules + webhooks
  • 17-22 new tests

Phase 6 — Polish, Integration Tests, Documentation

Duration: 2-3 days Depends on: Phases 3-5

6.1 End-to-end integration testing

Test Flow
Web E2E: Food label Create note → attach image → run "Rate Food Label" → verify result note
Web E2E: Summarize Create long note → run "Summarize" → verify summary artifact
Web E2E: Compare Select 2 notes → compare → verify comparison note
Web E2E: Template CRUD Create custom template → use it → edit → delete
Mobile E2E: Camera capture Photo → upload → run prompt → verify result
Mobile E2E: Voice-to-note Record → transcribe → review → save
MCP E2E: Agent prompt Agent calls notes.prompts.run → verify audit trail
Webhook E2E Tag note → webhook fires → prompt runs automatically
Scheduler E2E Schedule created → time triggers → digest generated

6.2 Error handling

Scenario Handling
LLM API key not configured Clear error, disable Smart Actions UI, show setup guide
LLM rate limit (429) Retry with exponential backoff (3 attempts), show "try again later"
LLM timeout 60s timeout, graceful error, suggest retry
Image too large Client-side resize before upload (max 2048px, < 4MB)
Prompt template not found 404 with helpful message
Empty note body (text prompt) Require body or show warning
No images on note (image prompt) Prompt to upload/capture first
Embedding service unavailable Skip duplicate check/auto-link gracefully
Audio recording fails Fallback to text capture, show error
URL fetch fails Show error with suggestion to paste content manually
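
For the 429 row above, a retry wrapper with exponential backoff could look like this sketch. The helper names, the 500 ms base delay, and the convention that a rate-limit error carries a numeric `status` of 429 are assumptions, not the behaviour of `@bytelyst/llm`.

```typescript
// Assumed convention: rate-limit errors expose a numeric `status` of 429.
function isRateLimited(err: unknown): boolean {
  return typeof err === "object" && err !== null && (err as { status?: unknown }).status === 429;
}

// Delay before retry number `retry` (1-based): base, 2x base, 4x base, ...
function backoffDelayMs(retry: number, baseMs = 500): number {
  return baseMs * 2 ** (retry - 1);
}

// Calls `call` up to `maxAttempts` times, sleeping between attempts only when
// the failure was a rate limit. Any other error is rethrown immediately.
async function withRateLimitRetry<T>(
  call: () => Promise<T>,
  maxAttempts = 3,
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => {
      setTimeout(resolve, ms);
    }),
): Promise<T> {
  for (let attempt = 1; ; attempt++) {
    try {
      return await call();
    } catch (err) {
      if (!isRateLimited(err) || attempt >= maxAttempts) throw err;
      await sleep(backoffDelayMs(attempt));
    }
  }
}
```

When the final attempt also fails, the error propagates to the caller, which is where the "try again later" message would be shown. Injecting `sleep` keeps the wrapper testable without real delays.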

6.3 Feature flags

Flag Default Controls
notelett_smart_actions_enabled false All Smart Actions UI + API
notelett_auto_summarize_enabled false F6 auto-summarize on save
notelett_duplicate_check_enabled false F8 duplicate detection
notelett_auto_link_enabled false F9 auto-link suggestions
notelett_copilot_llm_enabled false F1-F4 editor AI (vs extraction fallback)
notelett_voice_capture_enabled false F15 voice-to-note
notelett_scheduled_actions_enabled false F25 scheduled actions
notelett_webhooks_enabled false F26 webhook triggers

6.4 Telemetry events

Event Properties
smart_action_run templateSlug, inputType, model, durationMs, tokenUsage
smart_action_result_saved outputAction, resultType
smart_action_template_created category, inputType
smart_action_error errorType, templateSlug
copilot_transform action (rewrite/tone/continue/explain), durationMs
auto_summarize_triggered wordCount, durationMs
duplicate_detected similarityScore, noteId
voice_capture_completed durationSecs, wordCount
url_extract_completed domain, wordCount
scheduled_action_fired scheduleId, templateSlug
webhook_triggered webhookId, triggerEvent

6.5 Documentation updates

  • Update docs/PRD.md — Smart Actions section (§5.2 AI features)
  • Update AGENTS.md — new MCP tool, new module, new env vars
  • Update docs/roadmaps/02_BACKEND_ROADMAP.md — mark Smart Actions complete
  • API reference for all new endpoints (15+ endpoints)
  • docs/SMART_ACTIONS_USER_GUIDE.md — end-user documentation

6.6 Docker / CI updates

  • Add LLM env vars to .env.example
  • Add @bytelyst/llm to scripts/docker-prep.sh tarball list
  • Update backend/Dockerfile for new deps
  • Add expo-image-picker, expo-av to mobile CI build matrix

Phase 6 Deliverables:

  • 9 E2E integration tests (one per flow in §6.1) plus 1-6 additional integration tests, 10-15 in total
  • Error handling for all edge cases
  • 8 feature flags for gradual rollout
  • 11 telemetry events
  • Documentation updated (PRD, AGENTS.md, roadmaps, user guide)
  • Docker + CI updated

Test Budget Summary

Phase Unit Tests E2E Tests Total
0 — Common-plat LLM 25-30 25-30
1 — Backend core 45-57 45-57
2 — Note intelligence 20-26 20-26
3 — Web UI + editor AI 22-30 6 28-36
4 — Mobile + capture 18-26 18-26
5 — Agent/workflow 17-22 17-22
6 — Integration/polish 10-15 10-15
Total 147-191 16-21 163-212

New Files Summary

learning_ai_common_plat (Phase 0) — 8 files (1 new + 7 modified)

File Change
packages/llm/src/types.ts Add ContentPart, EmbeddingRequest/Response, update ChatMessage
packages/llm/src/helpers.ts New: isVisionMessage(), buildVisionMessage()
packages/llm/src/providers/openai.ts Vision + embedding support
packages/llm/src/providers/azure-openai.ts Vision + embedding support
packages/llm/src/providers/mock.ts Vision + embedding mocks
packages/llm/src/index.ts Export new types + helpers
packages/llm-router/src/types.ts Add supportsVision, supportsEmbedding
packages/llm-router/src/classifier.ts Detect image content

learning_ai_notes/backend (Phases 1, 2, 5) — 11 new + 7 modified

File Status Phase
src/lib/llm.ts New 1
src/lib/config.ts Modified 1
src/lib/cosmos-init.ts Modified 1
src/lib/copilot-transform.ts Modified 1
src/lib/reading-time.ts New 1
src/lib/embeddings.ts New 2
src/lib/note-hooks.ts New 2
src/modules/note-prompts/types.ts New 1
src/modules/note-prompts/repository.ts New 1
src/modules/note-prompts/runner.ts New 1
src/modules/note-prompts/routes.ts New 1
src/modules/note-prompts/seed.ts New 1
src/modules/note-prompts/scheduler.ts New 5
src/modules/note-prompts/webhooks.ts New 5
src/modules/note-agent-actions/types.ts Modified 1
src/mcp/note-tool-contracts.ts Modified 1
src/mcp/note-tools.ts Modified 1
src/server.ts Modified 1

learning_ai_notes/web (Phase 3) — 8 new + 5 modified

File Status
src/lib/prompt-client.ts New
src/components/SmartActionsPanel.tsx New
src/components/RunPromptModal.tsx New
src/components/PromptResultView.tsx New
src/components/PromptTemplateEditor.tsx New
src/app/(app)/prompts/page.tsx New
src/app/(app)/workspaces/[id]/gaps/page.tsx New
e2e/smart-actions.spec.ts New
src/app/(app)/notes/[noteId]/page.tsx Modified
src/components/NoteEditor.tsx Modified
src/components/Sidebar.tsx Modified
src/lib/copilot-client.ts Modified (add new CopilotAction types)
src/lib/types.ts Modified (add PromptTemplate, RunPromptInput/Output, etc.)

learning_ai_notes/mobile (Phase 4) — 7 new + 3 modified

File Status
src/api/note-prompts.ts New
src/api/blob-upload.ts Modified
src/store/prompt-store.ts New
src/app/note/SmartActionsSheet.tsx New
src/app/note/prompt-result.tsx New
src/app/capture/voice.tsx New (sub-route of capture, NOT a tab)
src/app/capture/url.tsx New (sub-route of capture, NOT a tab)
src/app/capture/scan.tsx New (sub-route of capture, NOT a tab)
src/app/(tabs)/capture.tsx Modified
src/app/note/[id].tsx Modified

20 Built-in Prompt Templates

# Slug Name Input Output Category
1 summarize Summarize text new_note transform
2 translate Translate text new_note transform
3 simplify Simplify / ELI5 text artifact transform
4 extract-key-facts Extract Key Facts text artifact extract
5 food-label-rating Rate Food Label image new_note analysis
6 parse-receipt Parse Receipt image new_note extract
7 read-business-card Read Business Card image new_note extract
8 handwriting-to-text Handwriting to Text image new_note transform
9 generate-flashcards Generate Flashcards text new_note generate
10 pros-and-cons Pros & Cons text artifact analysis
11 presentation-outline Presentation Outline text new_note generate
12 email-draft Email Draft text new_note generate
13 social-post Social Post text artifact generate
14 shareable-summary Shareable Summary text new_note transform
15 compare-notes Compare Notes multi-note new_note analysis
16 merge-notes Merge Notes multi-note new_note transform
17 fix-rewrite Fix & Rewrite text update_note transform
18 change-tone Change Tone text update_note transform
19 continue-writing Continue Writing text update_note generate
20 auto-tag Auto-Tag text update_note extract

New Dependencies

Package Where Purpose
@bytelyst/llm@^0.2.0 backend LLM with vision + embedding
expo-image-picker mobile Camera + gallery
expo-av mobile Audio recording (F15)
expo-clipboard mobile Clipboard access (F19)

All other integrations use existing @bytelyst/* packages already in package.json.


New Cosmos Containers

Container Partition Key Phase Purpose
note_prompts /userId 1 Prompt templates (built-in + custom)
note_prompt_schedules /workspaceId 5 Scheduled action definitions

Prompt run results don't need containers — they produce notes (notes) and artifacts (note_artifacts).


New Environment Variables

Variable Default Phase Description
LLM_PROVIDER openai 1 openai / azure / mock
OPENAI_API_KEY 1 OpenAI API key
OPENAI_BASE_URL 1 Optional base URL override
AZURE_OPENAI_ENDPOINT 1 Azure OpenAI endpoint
AZURE_OPENAI_API_KEY 1 Azure OpenAI key
LLM_DEFAULT_MODEL gpt-4o-mini 1 Default text model
LLM_VISION_MODEL gpt-4o 1 Default vision model
LLM_EMBEDDING_MODEL text-embedding-3-small 2 Default embedding model
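
A fail-fast check over the variables above might be sketched as follows. NoteLett's real config.ts uses a Zod schema; this dependency-free version only illustrates the defaults from the table. `parseLlmConfig`, the `LlmConfig` shape, and the rule that each non-mock provider requires its own credentials are assumptions.

```typescript
type Env = Record<string, string | undefined>;

const PROVIDERS = ["openai", "azure", "mock"] as const;
type Provider = (typeof PROVIDERS)[number];

function isProvider(value: string): value is Provider {
  return (PROVIDERS as readonly string[]).indexOf(value) !== -1;
}

interface LlmConfig {
  provider: Provider;
  defaultModel: string;
  visionModel: string;
  embeddingModel: string;
  openaiApiKey: string | undefined;
  openaiBaseUrl: string | undefined;
  azureEndpoint: string | undefined;
  azureApiKey: string | undefined;
}

// Validates the environment once at startup and applies the documented defaults.
function parseLlmConfig(env: Env): LlmConfig {
  const provider = env.LLM_PROVIDER ?? "openai";
  if (!isProvider(provider)) {
    throw new Error(`LLM_PROVIDER must be openai, azure, or mock (got "${provider}")`);
  }
  // Assumed rule: real providers need credentials; the mock provider needs none.
  if (provider === "openai" && !env.OPENAI_API_KEY) {
    throw new Error("OPENAI_API_KEY is required when LLM_PROVIDER=openai");
  }
  if (provider === "azure" && (!env.AZURE_OPENAI_ENDPOINT || !env.AZURE_OPENAI_API_KEY)) {
    throw new Error("AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY are required when LLM_PROVIDER=azure");
  }
  return {
    provider,
    defaultModel: env.LLM_DEFAULT_MODEL ?? "gpt-4o-mini",
    visionModel: env.LLM_VISION_MODEL ?? "gpt-4o",
    embeddingModel: env.LLM_EMBEDDING_MODEL ?? "text-embedding-3-small",
    openaiApiKey: env.OPENAI_API_KEY,
    openaiBaseUrl: env.OPENAI_BASE_URL,
    azureEndpoint: env.AZURE_OPENAI_ENDPOINT,
    azureApiKey: env.AZURE_OPENAI_API_KEY,
  };
}
```

In a Node process this would be called as `parseLlmConfig(process.env)` during startup, so a missing key surfaces before the first Smart Action runs.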

New API Endpoints (15 routes + 2 CRUD groups)

Method Path Phase Feature
GET /api/prompt-templates 1 List templates
GET /api/prompt-templates/:id 1 Get template
POST /api/prompt-templates 1 Create template
PATCH /api/prompt-templates/:id 1 Update template
DELETE /api/prompt-templates/:id 1 Delete template
POST /api/note-prompts/run 1 Run prompt
POST /api/note-prompts/run-stream 1 Run prompt (SSE)
GET /api/note-prompts/history 1 Prompt run history
POST /api/notes/:id/suggest-tags 1 F5
POST /api/notes/compare 1 F14
POST /api/notes/merge 1 F13
POST /api/notes/:id/check-duplicates 2 F8
POST /api/notes/:id/suggest-links 2 F9
POST /api/workspaces/:id/knowledge-gaps 2 F12
POST /api/note-prompts/url-extract 4 F17
CRUD /api/prompt-schedules 5 F25
CRUD /api/prompt-webhooks 5 F26

Commit Strategy

Phase 0 commits (common-plat)

feat(llm): add ContentPart type + multipart ChatMessage.content support
feat(llm): update OpenAI + Azure providers for vision messages
feat(llm): add embedding support (EmbeddingRequest/Response, embed())
feat(llm): add isVisionMessage + buildVisionMessage helpers
test(llm): add vision + embedding tests (30 tests)
feat(llm-router): add supportsVision + supportsEmbedding model capability flags
chore(llm): bump to 0.2.0 + publish

Phase 1 commits (notelett backend)

feat(backend): add @bytelyst/llm + lib/llm.ts singleton + LLM config
feat(note-prompts): types + Zod schemas for templates and run input/output
feat(note-prompts): repository — template CRUD
feat(note-prompts): runner — LLM orchestration + multi-note + chains
feat(note-prompts): routes — REST API endpoints (12 routes)
feat(note-prompts): seed 20 built-in prompt templates
feat(backend): upgrade copilot-transform.ts to use @bytelyst/llm
feat(backend): add reading-time utility
feat(agent-actions): add smart_action + auto_enrich types
feat(mcp): add notes.prompts.run tool
test(note-prompts): full test suite (55 tests)

Phase 2 commits

feat(backend): embeddings service — embed text + cosine similarity
feat(backend): note embedding storage on create/update
feat(backend): auto-summarize on save (feature-flag gated)
feat(backend): duplicate detection endpoint
feat(backend): related notes suggestion endpoint
feat(backend): knowledge gap detection endpoint
test(backend): intelligence tests (25 tests)

Phase 3 commits

feat(web): prompt-client API client
feat(web): SmartActionsPanel + RunPromptModal + PromptResultView
feat(web): PromptTemplateEditor + /prompts library page
feat(web): upgrade NoteEditor — Fix & Rewrite, Change Tone, Continue Writing, Inline Q&A
feat(web): duplicate detection toast + related notes panel
feat(web): auto-tag suggestion UI + export actions
feat(web): knowledge gap analysis page
feat(web): wire Smart Actions into note detail + sidebar
test(web): Smart Actions unit + E2E tests (36 tests)

Phase 4 commits

feat(mobile): note-prompts API client + prompt-store
feat(mobile): camera capture + image resize + blob upload
feat(mobile): SmartActionsSheet bottom sheet + PromptResultScreen
feat(mobile): voice-to-note — expo-av recording + transcription
feat(mobile): screenshot-to-note + multi-image scan
feat(mobile): URL-to-note + clipboard AI paste
feat(mobile): enhanced capture tab with 6 capture modes
test(mobile): Smart Actions tests (26 tests)

Phase 5 commits

feat(backend): scheduled Smart Actions — cron scheduler + CRUD
feat(backend): weekly workspace digest template + scheduled action
feat(backend): webhook-triggered actions — CRUD + trigger endpoint
feat(backend): approval-gated actions in runner
test(backend): scheduler + webhook tests (22 tests)

Phase 6 commits

feat(all): 8 feature flags for gradual rollout
feat(all): 11 telemetry events for Smart Actions
docs: update PRD, AGENTS.md, roadmaps for Smart Actions
docs: Smart Actions user guide
chore: update .env.example + Docker for LLM support
test: end-to-end integration tests (15 tests)

Risk Mitigation

Risk Mitigation
OpenAI API costs Per-user daily quota, model tier selection (gpt-4o-mini default, gpt-4o vision only), feature flag gating
Vision prompt latency (5-15s) Progress indicator, allow background processing, cache identical requests
Image size limits Client-side resize to max 2048px, compress < 4MB before upload
Prompt injection System prompt hardening, output validation, truncate excessively long inputs
LLM hallucination JSON mode where possible, output schema validation, clear UI disclaimer
Corporate proxy blocking OpenAI Support Azure OpenAI as alternative (already in @bytelyst/llm)
Embedding cost at scale Batch embeddings, cache embeddings on note doc, recompute only on content change
Audio transcription accuracy Show editable preview before saving, allow manual corrections
Scheduler reliability In-process interval (simple), log missed runs, diagnostics endpoint
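
The image size mitigation above comes down to a small dimension calculation before upload. A minimal sketch, assuming the 2048px limit applies to the longest edge and "4MB" means 4 × 1024 × 1024 bytes; `fitWithin` and `withinUploadLimit` are hypothetical helper names.

```typescript
interface Size {
  width: number;
  height: number;
}

// Scale so the longest edge is at most maxEdge, preserving aspect ratio. Never upscales.
function fitWithin(size: Size, maxEdge = 2048): Size {
  const longest = Math.max(size.width, size.height);
  if (longest <= maxEdge) return { ...size };
  const scale = maxEdge / longest;
  return {
    width: Math.round(size.width * scale),
    height: Math.round(size.height * scale),
  };
}

const MAX_UPLOAD_BYTES = 4 * 1024 * 1024; // assumed interpretation of "< 4MB"

// True when the compressed image is small enough to upload.
function withinUploadLimit(byteLength: number): boolean {
  return byteLength < MAX_UPLOAD_BYTES;
}
```

A client would typically resize to `fitWithin(...)` first, then lower the JPEG quality step by step until `withinUploadLimit` passes.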

Future Extensions (Not in This Roadmap)

  • RAG context — include related notes as context in prompts for better answers
  • Agent marketplace prompts — share templates across ByteLyst products
  • Multi-step workflow builder — visual chain editor (drag-and-drop)
  • Streaming for mobile — SSE on React Native for real-time token display
  • Collaborative Smart Actions — run prompts across shared workspaces
  • Custom model support — plug in local Ollama models via @bytelyst/ollama-client
  • Action replay — re-run a previous Smart Action with same parameters
  • Template versioning — track changes to custom templates over time

Appendix: Review Findings & Resolutions

Systematic code-level audit of this roadmap against the actual NoteLett and common-plat codebases, conducted April 2026. Each finding cross-references the real source files.

Finding 1 — FIXED: Timeline diagram showed wrong dependency flow

Severity: Medium — Incorrect diagram could mislead parallel scheduling Was: Phase 3/4 branching from Phase 2 Fix: Phase 3/4 branch from Phase 1. Phase 2 → Phase 5. Diagram corrected above.

Finding 2 — FIXED: PromptTemplateDoc was missing productId field

Severity: Critical — violates NoteLett convention: every Cosmos document MUST include productId: "notelett" Source: backend/src/modules/notes/types.ts — all other docs (NoteDoc, NoteArtifactDoc, NoteAgentActionDoc) have productId Fix: Added productId, userId, createdAt, updatedAt to the type definition in §1.5. Note: PromptScheduleDoc (§5.1) and PromptWebhookDoc (§5.3) must also include productId and userId when implemented.

Finding 3 — FIXED: Reading time endpoint was POST, should be GET

Severity: Low — Pure calculation with no side effects Source: REST convention — GET for idempotent read operations Fix: Changed to GET /api/notes/:id/reading-time in §1.8.

Finding 4 — FIXED: Backend file count was wrong (claimed 18+8, actual 11+7)

Severity: Low — Documentation accuracy Fix: Corrected to "11 new + 7 modified" in New Files Summary.

Finding 5 — FIXED: Web file count missing copilot-client.ts and types.ts

Severity: Medium — These files MUST be updated but were omitted Source: web/src/lib/copilot-client.ts defines CopilotAction = 'shorten' | 'expand' | 'bulletize' | 'grammar' — needs new types for F1/F2. web/src/lib/types.ts needs PromptTemplate, RunPromptInput, RunPromptOutput, SimilarNote, SuggestedLink, GapAnalysis types. Fix: Added both to web modified files list. Count corrected to "8 new + 5 modified".

Finding 6 — FIXED: Mobile capture sub-routes were listed as tabs

Severity: High — Would break the 5-tab navigator Source: mobile/src/app/(tabs)/_layout.tsx has exactly 5 tabs: Home, Search, Capture, Inbox, Settings. Adding 3 more tabs (voice-capture, url-capture, scan-capture) would overflow the tab bar. Fix: Changed to sub-routes of capture: src/app/capture/voice.tsx, src/app/capture/url.tsx, src/app/capture/scan.tsx. These are navigated to FROM the capture tab, not separate tabs.

Finding 7 — FIXED: Phase 6 deliverables listed test count twice (redundant)

Severity: Low Fix: Consolidated into single line.

Finding 8 — OPEN: Embedding storage strategy needs decision

Severity: High — Affects Cosmos RU cost and query patterns Issue: §2.2 proposes storing embedding: number[] directly on NoteDoc. For text-embedding-3-small, each embedding is 1536 floats (~6KB as 32-bit floats, and more once serialized as a JSON number array). This adds at least ~6KB to every NoteDoc read, affecting list queries and the partition-level throughput of the notes container. Recommendation: Either:

  • (a) Store embeddings in a SEPARATE note_embeddings container (partition: /workspaceId), with documents keyed by noteId. Keeps NoteDoc lean.
  • (b) Store inline but use Cosmos projection queries (SELECT c.id, c.title, c.embedding FROM c) to avoid pulling full note bodies when only embeddings are needed.

Option (a) is preferred for scale; it adds 1 new Cosmos container. Action: Implementer should choose (a) or (b) at Phase 2 start and update cosmos-init.ts accordingly.
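
To make the trade-off concrete, here is a sketch of the similarity lookup that either storage option has to serve. `NoteEmbeddingDoc` mirrors option (a) but its shape is hypothetical, as are `cosineSimilarity`, `findSimilar`, and the 0.9 threshold.

```typescript
// Hypothetical document shape for a separate note_embeddings container (option a).
interface NoteEmbeddingDoc {
  noteId: string;
  workspaceId: string;
  embedding: number[];
}

// Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for zero-length vectors.
function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length === 0 || a.length !== b.length) {
    throw new Error("Vectors must be non-empty and of equal length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Notes in the same workspace whose similarity to the target meets the threshold,
// most similar first. The 0.9 default is an assumed starting point, not a tuned value.
function findSimilar(
  target: NoteEmbeddingDoc,
  candidates: NoteEmbeddingDoc[],
  threshold = 0.9,
): Array<{ noteId: string; score: number }> {
  return candidates
    .filter((c) => c.noteId !== target.noteId && c.workspaceId === target.workspaceId)
    .map((c) => ({ noteId: c.noteId, score: cosineSimilarity(target.embedding, c.embedding) }))
    .filter((r) => r.score >= threshold)
    .sort((x, y) => y.score - x.score);
}

// Rough per-note storage cost: 4 bytes per value as float32, versus the JSON text length.
function float32Bytes(embedding: readonly number[]): number {
  return embedding.length * 4;
}
function jsonBytes(embedding: readonly number[]): number {
  return JSON.stringify(embedding).length;
}
```

Because the lookup only needs `noteId`, `workspaceId`, and `embedding`, it never touches note bodies, which is the main argument for option (a) or for the projection query in option (b).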

Finding 9 — OPEN: Voice-to-note (F15) transcription backend not fully specified

Severity: Medium — Implementation decision needed Issue: §4.7 says "Call backend transcription endpoint (or use extraction-service with speech task)" but no endpoint or extraction task is defined. Options:

  • (a) Add speech_transcription task to extraction-service (Python sidecar already supports Whisper/Azure STT)
  • (b) New backend endpoint POST /api/note-prompts/transcribe that calls Azure Speech SDK
  • (c) Client-side transcription on the device (limited quality; note that expo-speech itself only provides text-to-speech, so a different library would be needed)

Recommendation: Option (a), because extraction-service already has Python sidecar infrastructure. Add task type speech_transcription and a new endpoint POST /api/note-prompts/transcribe that wraps extraction-service.

Finding 10 — OPEN: URL-to-note backend endpoint assigned to Phase 4 but needs backend work

Severity: Medium — Mobile Phase 4 depends on backend route that isn't in Phase 1 Issue: POST /api/note-prompts/url-extract is listed in the API endpoints table as Phase 4, but this is a SERVER-SIDE endpoint (URL fetch, HTML strip, summarize). It must be implemented in the BACKEND before mobile can use it. Recommendation: Move this endpoint to Phase 1 (backend routes) since the runner infrastructure is already being built there.

Finding 11 — OPEN: Phase 5 PromptWebhookDoc needs its own Cosmos container

Severity: Low — Currently untracked Issue: §5.3 defines PromptWebhookDoc but no Cosmos container is mentioned for it. The "New Cosmos Containers" section only lists note_prompts and note_prompt_schedules. Recommendation: Add note_prompt_webhooks container (partition: /workspaceId) or store webhooks in note_prompt_schedules with a discriminator.

Finding 12 — OPEN: @bytelyst/llm factory reads env vars directly, not via Zod config

Severity: Low — Clarification needed, not a bug Issue: factory.ts in @bytelyst/llm reads process.env.LLM_PROVIDER, process.env.OPENAI_API_KEY, etc. directly. The roadmap also adds these to NoteLett's Zod config schema (§1.3). These serve different purposes:

  • @bytelyst/llm factory — reads env at provider instantiation time
  • NoteLett config.ts — validates env at startup for fail-fast

Clarification: Both are correct. config.ts validates upfront, while the LLM package performs its own env reads. There is no code conflict, but implementers should know that the LLM package ignores NoteLett's parsed config object.

Finding 13 — RESOLVED: chatCompletionStream() now implemented in all providers

Severity: Medium — F3 (Continue Writing) depends on streaming Source: packages/llm/src/providers/openai.ts, azure-openai.ts, mock.ts Resolution: chatCompletionStream() is fully implemented in OpenAI (SSE parsing, buffer handling, [DONE] sentinel), Azure OpenAI (same pattern), and Mock (word-by-word simulation). 3 streaming tests in llm.test.ts. No further work needed.
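
For readers unfamiliar with the pattern, the SSE parsing, buffer handling, and `[DONE]` sentinel mentioned above can be sketched as follows. This is not the code in the providers: `createSseParser` and `deltaText` are hypothetical names, and the chunk shape assumed by `deltaText` is the OpenAI chat-completions streaming format.

```typescript
interface SseParser {
  // Feed a raw text chunk; returns the complete `data:` payloads it contained.
  push(chunk: string): string[];
  readonly done: boolean;
}

function createSseParser(): SseParser {
  let buffer = "";
  let finished = false;
  return {
    push(chunk: string): string[] {
      const payloads: string[] = [];
      if (finished) return payloads;
      buffer += chunk;
      const lines = buffer.split("\n");
      // The last element is an incomplete line (or ""); keep it for the next chunk.
      buffer = lines.pop() ?? "";
      for (const raw of lines) {
        const line = raw.trimEnd(); // tolerate \r\n line endings
        if (!line.startsWith("data:")) continue; // skip blank separators and other fields
        const data = line.slice(5).trimStart();
        if (data === "[DONE]") {
          finished = true;
          break;
        }
        payloads.push(data);
      }
      return payloads;
    },
    get done(): boolean {
      return finished;
    },
  };
}

// Extracts the text delta from one OpenAI-style streaming chunk; "" if absent.
function deltaText(payload: string): string {
  const parsed = JSON.parse(payload) as {
    choices?: Array<{ delta?: { content?: string } }>;
  };
  return parsed.choices?.[0]?.delta?.content ?? "";
}
```

The buffer matters because network chunks do not align with event boundaries: a JSON payload can be split across two reads and must not be parsed until its line is complete.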

Finding 14 — RESOLVED: CopilotAction union expanded

Severity: Medium — F1/F2 require new action types Resolution: CopilotAction expanded to include 'fix-rewrite', 'change-tone', 'continue', 'explain'. CopilotBodySchema in notes/routes.ts updated. grammar is kept as a deprecated alias that maps to fix-rewrite.

Finding 15 — NOTE: @bytelyst/llm has zero runtime dependencies

Severity: Info Source: packages/llm/package.json — only devDependencies: { vitest }. Uses native fetch(). Impact: No extra bundling concerns. Requires Node 18+ or a fetch polyfill.

Finding 16 — NOTE: note_artifacts has summary as an existing artifact type

Severity: Info Source: backend/src/modules/note-artifacts/types.ts line 3: NOTE_ARTIFACT_TYPES = ['file', 'summary', 'extraction', 'citation', 'export'] Impact: F6 (auto-summarize) can use artifactType: 'summary' directly — no schema changes needed for artifact types. F20-F23 (export actions) can use artifactType: 'export'. Good alignment.

Finding 17 — NOTE: Agent action type 'summarize' already exists

Severity: Info Source: backend/src/modules/note-agent-actions/types.ts line 3: NOTE_AGENT_ACTION_TYPES = ['create', 'update', 'summarize', 'extract_tasks', 'attach_citation'] Impact: We can reuse 'summarize' for F6 (auto-summarize) or still add 'smart_action' as a general-purpose type. Recommendation: add 'smart_action' and 'auto_enrich' as planned, and use 'smart_action' for all prompt runs (the template slug provides the specificity).

Finding 18 — NOTE: Phase 5 weekly-digest is template #21, but §1.9 seeds only 20

Severity: Low — Consistency Issue: §5.2 says "Add template #21: weekly-digest". This means Phase 5 adds a 21st template, seeded separately from the initial 20. Clarification: This is correct behavior — built-in template count grows from 20 to 21 in Phase 5. The seed file should support incremental additions (upsert by slug, not hard-coded count).
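
The upsert-by-slug behaviour recommended above could be sketched like this. The `Map` is an in-memory stand-in for the note_prompts container, and `BuiltInTemplate` and `seedTemplates` are hypothetical names rather than the actual seed.ts API.

```typescript
interface BuiltInTemplate {
  slug: string;
  name: string;
  systemOnly?: boolean;
}

// In-memory stand-in for the note_prompts container, keyed by slug.
type TemplateStore = Map<string, BuiltInTemplate>;

// Idempotent seeding: existing slugs are overwritten, new slugs are inserted.
// No template count is hard-coded, so later phases can append templates freely.
function seedTemplates(
  store: TemplateStore,
  templates: BuiltInTemplate[],
): { created: number; updated: number } {
  let created = 0;
  let updated = 0;
  for (const template of templates) {
    if (store.has(template.slug)) updated++;
    else created++;
    store.set(template.slug, { ...template });
  }
  return { created, updated };
}
```

Running the seed again after Phase 5 adds `weekly-digest` would then create exactly one new template and leave the original 20 in place.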


Summary of Inline Fixes Applied

# Finding Severity Status
1 Timeline diagram wrong dependency Medium Fixed
2 Missing productId in PromptTemplateDoc Critical Fixed
3 Reading time POST → GET Low Fixed
4 Backend file count 18+8 → 11+7 Low Fixed
5 Web missing copilot-client.ts + types.ts Medium Fixed
6 Mobile tabs overflow (voice/url/scan) High Fixed
7 Phase 6 duplicate test count Low Fixed
8 Embedding storage strategy High Resolved: @bytelyst/llm embed() implemented in OpenAI, Azure, Mock providers. Separate note_embeddings container deferred to Phase 2.
9 Voice transcription backend unspecified Medium Resolved: POST /api/transcribe added to extraction-service (OpenAI Whisper API). transcribe() added to @bytelyst/extraction client.
10 URL-extract endpoint in wrong phase Medium Resolved: POST /api/note-prompts/url-extract implemented in Phase 1 backend routes.
11 Webhook container missing Low Resolved: note_prompt_webhooks container added to cosmos-init.ts.
12 LLM factory vs Zod config clarification Low Resolved: info only, no conflict.
13 Streaming not implemented in providers Medium Resolved: chatCompletionStream() implemented in OpenAI, Azure, Mock providers with SSE parsing.
14 CopilotAction union needs expansion Medium Resolved: expanded to include fix-rewrite, change-tone, continue, explain.
15 Zero runtime deps in @bytelyst/llm Info Noted
16 Artifact type 'summary' already exists Info Noted
17 Agent action 'summarize' already exists Info Noted
18 Template #21 added in Phase 5 Low Noted