saravanakumardb1 f1c08d1a83 docs: update AGENTS.md test counts and SMART_ACTIONS_ROADMAP Phase 6 status (G20)

2026-04-06 13:44:41 -07:00

Smart Actions Roadmap — End-to-End Implementation (v2)

Feature: AI-powered note intelligence (Smart Actions, inline editor AI, capture enhancement, cross-note intelligence, agent workflows)
Scope: learning_ai_common_plat (shared LLM packages) + learning_ai_notes (backend + web + mobile)
Author: Product Team
Date: April 2026
Version: 2.0 (full 27-feature roadmap across 6 categories, 7 phases)

Executive Summary

This roadmap delivers a comprehensive AI layer for NoteLett across 27 features in 6 categories, spanning the shared platform, backend, web, and mobile. It transforms NoteLett into an AI-native knowledge workspace where:

Users run Smart Actions on notes (text + images) — summarize, translate, rate food labels, parse receipts
The editor has inline AI — rewrite, change tone, continue writing, explain highlighted text
Note intelligence runs in the background — auto-summarize, auto-tag, detect duplicates, suggest links
Capture is AI-enhanced — voice-to-note, screenshot OCR, URL extraction, multi-image processing
Cross-note intelligence — weekly digests, knowledge gap detection, note merge/compare
Agent workflows — scheduled actions, webhook triggers, approval-gated actions, action chains

The feature spans two codebases:

learning_ai_common_plat — Enhance @bytelyst/llm with vision + embedding support
learning_ai_notes — Backend module + web UI + mobile app

Timeline Overview

Phase	What	Where	Duration	Depends on	Parallel?
0	LLM vision + embedding support	common-plat	3 days	—	—
1	Note prompts core + copilot upgrade	backend	4-5 days	Phase 0	—
2	Note intelligence (background AI)	backend	2-3 days	Phase 1	—
3	Smart Actions web UI + editor AI	web	4-5 days	Phase 1	Yes (with 4, 5)
4	Smart Actions mobile + capture	mobile	4-5 days	Phase 1	Yes (with 3, 5)
5	Agent & workflow intelligence	backend + web	2-3 days	Phase 2	Yes (with 3, 4)
6	Polish, E2E, documentation	all	2-3 days	Phases 3-5	—

Total: ~21-27 days sequential, ~13-17 days with parallel execution (critical path: Phase 0 → 1 → 2 → 5 → 6)

Phase 0 (3d) → Phase 1 (5d) ─┬──→ Phase 2 (3d) ──→ Phase 5 (3d) ──┐
                              ├──→ Phase 3 (5d) ────────────────────→ Phase 6 (3d)
                              └──→ Phase 4 (5d) ────────────────────┘

Feature Master List — 27 Features, 6 Categories

Cat 1: Inline Editor AI

#	Feature	Description	Phase
F1	Fix & Rewrite	Select text → rewrite with proper grammar, tone, clarity	3
F2	Change Tone	Rewrite selection as formal / casual / professional / friendly	3
F3	Continue Writing	LLM generates next 2-3 paragraphs from cursor context (streaming)	3
F4	Inline Q&A	Highlight term → "Explain this" → tooltip with definition	3
F5	Auto-tag suggestion	After save, LLM suggests 3-5 tags based on content	1

Cat 2: Note Intelligence

#	Feature	Description	Phase
F6	Auto-summarize on save	Note body > 300 words → auto-generate summary artifact	2
F7	Smart title suggestion	Upgrade existing `suggestTitleFromBody()` to use `@bytelyst/llm`	1
F8	Duplicate/similar note detection	Before save, warn if semantically similar notes exist	2
F9	Auto-link related notes	After creation, suggest 3-5 related notes to link to	2
F10	Reading time estimate	Display estimated reading time on each note	1

Cat 3: Multi-Note Intelligence

#	Feature	Description	Phase
F11	Weekly workspace digest	Auto-generate summary of all workspace activity this week	5
F12	Knowledge gap detection	Identify topics mentioned but under-covered in workspace	2
F13	Note merge	Select 2+ notes → LLM merges into single coherent note	1
F14	Compare notes	Select 2 notes → LLM produces comparison summary	1

Cat 4: Capture Enhancement

#	Feature	Description	Phase
F15	Voice-to-note	Record audio → transcribe → save as note	4
F16	Screenshot-to-note	Share screenshot → OCR + LLM cleanup → structured note	4
F17	URL-to-note	Paste URL → extract content → summarize → save	4
F18	Multi-image capture	Photograph multiple pages → combine into one note	4
F19	Clipboard AI paste	Paste messy text → LLM cleans and structures it	4

Cat 5: Export Actions

#	Feature	Description	Phase
F20	Shareable summary	One-click polished shareable version of a note	1+3
F21	Presentation outline	Note → structured slide outline (title + bullets)	1+3
F22	Email draft	Note → formatted email with subject, greeting, body	1+3
F23	Social post	Note → Twitter/LinkedIn post draft	1+3

Cat 6: Agent & Workflow Intelligence

#	Feature	Description	Phase
F24	Smart Action chains	Pipe output of one action as input to next	1
F25	Scheduled Smart Actions	Cron-like: "Summarize workspace every Friday"	5
F26	Webhook-triggered actions	External event → auto-run a Smart Action	5
F27	Approval-gated actions	High-risk actions require human review before applying	5

Phase 0 — Common Platform: LLM Vision + Embedding Support

Repo: learning_ai_common_plat
Duration: 3 days
Depends on: Nothing
Features enabled: Foundation for all F1-F27

0.1 Enhance `@bytelyst/llm` ChatMessage for multipart content

The current ChatMessage.content is string-only. Vision models (GPT-4o, Gemini) require multipart content arrays.

File: packages/llm/src/types.ts

Change	Detail
New `ContentPart` type	`{ type: 'text'; text: string } | { type: 'image_url'; image_url: { url: string; detail?: 'auto' | 'low' | 'high' } }`
Update `ChatMessage.content`	`string | ContentPart[]`
New `isVisionMessage()` helper	Type guard to check if a message contains image parts
New `buildVisionMessage()` helper	Convenience: `(text: string, imageUrl: string) => ChatMessage`

Tests: 8-10 new tests
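
The table above can be sketched roughly as follows. This is a minimal illustration and not the package's actual source: the `role` field on `ChatMessage` and the default `detail: 'auto'` are assumptions, since only the `content` change is specified here.

```typescript
// Multipart content as described in the table above.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string; detail?: 'auto' | 'low' | 'high' } };

// Assumption: the existing ChatMessage already carries a `role` field like this.
interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string | ContentPart[];
}

// Type guard: true when the message has array content with at least one image part.
function isVisionMessage(
  message: ChatMessage,
): message is ChatMessage & { content: ContentPart[] } {
  return (
    Array.isArray(message.content) &&
    message.content.some((part) => part.type === 'image_url')
  );
}

// Convenience builder: one text part followed by one image part.
function buildVisionMessage(text: string, imageUrl: string): ChatMessage {
  return {
    role: 'user',
    content: [
      { type: 'text', text },
      { type: 'image_url', image_url: { url: imageUrl, detail: 'auto' } },
    ],
  };
}
```

Keeping `content` as `string | ContentPart[]` means existing text-only callers should not need to change.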

0.2 Update `OpenAIProvider` for vision

File: packages/llm/src/providers/openai.ts

Change	Detail
Pass multipart `content` to API	When `content` is an array, send as-is (OpenAI format)
Default model upgrade	If any message has image content, auto-suggest `gpt-4o`

Tests: 4-6 new tests (mock HTTP)

0.3 Update `AzureOpenAIProvider` for vision

Same multipart content handling as OpenAI provider. Tests: 4-6 new tests.

0.4 Update `MockLLMProvider`

Return deterministic mock responses when vision content is detected, for downstream test use.

0.5 Enhance streaming support

Ensure chatCompletionStream() works with multipart content for F3 (Continue Writing).

0.6 Add embedding support (for F8, F9, F12)

File: packages/llm/src/types.ts + providers

Change	Detail
New `EmbeddingRequest` type	`{ input: string | string[]; model?: string }`
New `EmbeddingResponse` type	`{ embeddings: number[][]; model: string; usage: TokenUsage }`
Add `embed()` to `LLMProvider`	Optional method for embedding generation
Implement in `OpenAIProvider`	Call `/v1/embeddings` endpoint
Implement in `AzureOpenAIProvider`	Call Azure embeddings endpoint
Implement in `MockLLMProvider`	Return deterministic fake embeddings

Tests: 6-8 new tests
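
As a rough illustration of the "deterministic fake embeddings" row, a mock could derive each vector purely from the input text. The `TokenUsage` shape, the function name `mockEmbed`, the 8-dimension default, and the `mock-embedding` model name are all assumptions made for this sketch.

```typescript
// Assumed shape; the real TokenUsage type lives in @bytelyst/llm.
interface TokenUsage { promptTokens: number; completionTokens: number; totalTokens: number }

interface EmbeddingRequest { input: string | string[]; model?: string }
interface EmbeddingResponse { embeddings: number[][]; model: string; usage: TokenUsage }

// Hypothetical mock: the same text always yields the same vector, with no network call.
function mockEmbed(request: EmbeddingRequest, dimensions = 8): EmbeddingResponse {
  const inputs = Array.isArray(request.input) ? request.input : [request.input];
  const embeddings = inputs.map((text) => {
    const vector: number[] = new Array<number>(dimensions).fill(0);
    // Fold character codes into a fixed number of buckets.
    for (let i = 0; i < text.length; i++) {
      vector[i % dimensions] += text.charCodeAt(i) / 1000;
    }
    return vector;
  });
  return {
    embeddings,
    model: request.model ?? 'mock-embedding',
    usage: { promptTokens: 0, completionTokens: 0, totalTokens: 0 },
  };
}
```

Because the output depends only on the input, downstream tests for F8 and F9 can assert on exact similarity values.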

0.7 Export new types + helpers

File: packages/llm/src/index.ts — export ContentPart, EmbeddingRequest, EmbeddingResponse, isVisionMessage, buildVisionMessage

0.8 Update `@bytelyst/llm-router`

Change	Detail
Vision-aware routing	`classifyPrompt()` detects image content → routes to vision-capable models
Model capability flags	Add `supportsVision: boolean` and `supportsEmbedding: boolean` to `ModelConfig`
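
One way the capability flags might drive routing is sketched below. `pickModel` and `hasImageContent` are hypothetical helpers, the `name` field on `ModelConfig` is an assumption, and the model names are placeholders; the real `classifyPrompt()` logic is not shown in this roadmap.

```typescript
// Only the two capability flags come from the table above; `name` is assumed.
interface ModelConfig {
  name: string;
  supportsVision: boolean;
  supportsEmbedding: boolean;
}

type Part =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

interface Message { role: string; content: string | Part[] }

// True when any message carries an image part.
function hasImageContent(messages: Message[]): boolean {
  return messages.some(
    (m) => Array.isArray(m.content) && m.content.some((p) => p.type === 'image_url'),
  );
}

// Hypothetical router step: first configured model that satisfies the request.
function pickModel(messages: Message[], models: ModelConfig[]): ModelConfig | undefined {
  const needsVision = hasImageContent(messages);
  return models.find((model) => !needsVision || model.supportsVision);
}
```

Ordering `models` from cheapest to most capable would keep text-only prompts on the cheaper model.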

0.9 Publish updated packages

Bump versions → publish to Gitea npm registry.

Phase 0 Deliverables:

@bytelyst/llm@0.2.0 — vision + embedding + streaming enhancements
@bytelyst/llm-router@0.2.0 — vision-aware routing + capability flags
All existing tests pass + 25-30 new tests
Published to Gitea npm registry

Phase 1 — Backend: Note Prompts Core + Copilot Upgrade

Repo: learning_ai_notes
Duration: 4-5 days
Depends on: Phase 0
Features: F5 (auto-tag), F7 (smart title), F10 (reading time), F13 (merge), F14 (compare), F20-F23 (templates), F24 (chains)

1.1 Add LLM dependency to backend

File: backend/package.json — add "@bytelyst/llm": "^0.2.0"

1.2 Create `backend/src/lib/llm.ts`

Singleton wrapper over @bytelyst/llm:

import { getLLM, type LLMProvider } from '@bytelyst/llm';

let _llm: LLMProvider | null = null;
export function getNoteLettLLM(): LLMProvider {
  if (!_llm) _llm = getLLM();
  return _llm;
}

1.3 Add LLM env vars to config

File: backend/src/lib/config.ts

Variable	Default	Description
`LLM_PROVIDER`	`openai`	`openai` / `azure` / `mock`
`OPENAI_API_KEY`	—	OpenAI API key
`OPENAI_BASE_URL`	—	Optional base URL override
`AZURE_OPENAI_ENDPOINT`	—	Azure OpenAI endpoint
`AZURE_OPENAI_API_KEY`	—	Azure OpenAI key
`LLM_DEFAULT_MODEL`	`gpt-4o-mini`	Default model for text prompts
`LLM_VISION_MODEL`	`gpt-4o`	Default model for image prompts
`LLM_EMBEDDING_MODEL`	`text-embedding-3-small`	Default model for embeddings
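
A dependency-free sketch of how these variables and defaults could be read is shown below. The `LLMConfig` field names and `loadLLMConfig` are hypothetical; the real `config.ts` may well use Zod, as the rest of the backend does for validation.

```typescript
type LLMProviderName = 'openai' | 'azure' | 'mock';

// Hypothetical config shape; field names are illustrative only.
interface LLMConfig {
  provider: LLMProviderName;
  openaiApiKey?: string;
  openaiBaseUrl?: string;
  azureEndpoint?: string;
  azureApiKey?: string;
  defaultModel: string;
  visionModel: string;
  embeddingModel: string;
}

function isProviderName(value: string): value is LLMProviderName {
  return value === 'openai' || value === 'azure' || value === 'mock';
}

// Reads the variables from the table above, applying the documented defaults.
function loadLLMConfig(env: Record<string, string | undefined>): LLMConfig {
  const provider = env.LLM_PROVIDER ?? 'openai';
  if (!isProviderName(provider)) {
    throw new Error(`Unsupported LLM_PROVIDER: ${provider}`);
  }
  return {
    provider,
    openaiApiKey: env.OPENAI_API_KEY,
    openaiBaseUrl: env.OPENAI_BASE_URL,
    azureEndpoint: env.AZURE_OPENAI_ENDPOINT,
    azureApiKey: env.AZURE_OPENAI_API_KEY,
    defaultModel: env.LLM_DEFAULT_MODEL ?? 'gpt-4o-mini',
    visionModel: env.LLM_VISION_MODEL ?? 'gpt-4o',
    embeddingModel: env.LLM_EMBEDDING_MODEL ?? 'text-embedding-3-small',
  };
}
```

Passing the environment in as an argument, rather than reading `process.env` directly, keeps the loader easy to unit test.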

1.4 New Cosmos container: `note_prompts`

File: backend/src/lib/cosmos-init.ts — register note_prompts container (partition key: /userId)

1.5 Create `backend/src/modules/note-prompts/types.ts`

Key types:

PromptTemplateDoc — id, productId, userId, slug, name, description, category, systemPrompt, userPromptTemplate, inputType (text/image/text+image/multi-note), outputFormat, outputAction (new_note/artifact/update_note), parameters, builtIn, createdAt, updatedAt
PromptParameter — key, label, type (string/select), options, default, required
RunPromptInput — noteId, workspaceId, promptTemplateId OR inlinePrompt, parameters, imageUrls, additionalNoteIds (for F13/F14 merge/compare), previousResultNoteId (for F24 chains), dryRun, agentId
RunPromptOutput — resultNoteId, resultArtifactId, content, model, tokenUsage, agentActionId, suggestedTags (for F5)

Zod schemas for all of the above.

1.6 Create `backend/src/modules/note-prompts/repository.ts`

CRUD for PromptTemplateDoc:

listTemplates(userId) — returns built-in + user's custom templates
getTemplate(id, userId)
createTemplate(doc)
updateTemplate(id, userId, updates)
deleteTemplate(id, userId) — cannot delete built-in

1.7 Create `backend/src/modules/note-prompts/runner.ts`

The core orchestration logic:

1. Validate input (template or inline prompt)
2. Fetch the source note (verify ownership + productId)
3. If additionalNoteIds provided (F13/F14 merge/compare):
   a. Fetch all additional notes
   b. Combine content into multi-note context
4. If template has inputType 'image' or 'text+image':
   a. Fetch artifact images from blob storage via SAS URLs
   b. Build vision message with buildVisionMessage()
5. If previousResultNoteId provided (F24 chains):
   a. Fetch previous result note
   b. Include its content as additional context
6. Build LLM messages array:
   - System: template.systemPrompt (or default)
   - User: interpolated template with note content + images + additional notes
7. Call getNoteLettLLM().chatCompletion(request)
8. Post-process response:
   - If template slug is 'auto-tag': parse tags from response → return as suggestedTags
   - If outputAction is 'new_note': createNote() → link to source via note-relationships
   - If outputAction is 'artifact': createNoteArtifact() on source note
   - If outputAction is 'update_note': updateNote() body/tags on source note
9. Record NoteAgentActionDoc (actionType: 'smart_action')
10. Return RunPromptOutput
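
Step 6 (interpolating the user prompt template) might look roughly like the sketch below. It assumes the `{{note.title}}`, `{{note.body}}`, `{{note.tags}}`, and `{{params.X}}` placeholders listed for the template editor in Phase 3; `NoteLike` and `interpolateTemplate` are hypothetical names, and leaving unknown placeholders untouched is a design choice of this sketch.

```typescript
// Minimal view of a note for interpolation purposes (hypothetical type).
interface NoteLike {
  title: string;
  body: string;
  tags: string[];
}

// Replaces {{note.*}} and {{params.*}} placeholders; unknown ones are left as-is.
function interpolateTemplate(
  template: string,
  note: NoteLike,
  params: Record<string, string>,
): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match: string, key: string) => {
    if (key === 'note.title') return note.title;
    if (key === 'note.body') return note.body;
    if (key === 'note.tags') return note.tags.join(', ');
    if (key.startsWith('params.')) {
      const value = params[key.slice('params.'.length)];
      return value ?? match;
    }
    return match;
  });
}

// Usage: build the user message for a "translate" style template.
const userPrompt = interpolateTemplate(
  'Translate "{{note.title}}" into {{params.language}}:\n{{note.body}}',
  { title: 'Groceries', body: 'Milk, eggs', tags: ['home'] },
  { language: 'French' },
);
```

Leaving unresolved placeholders visible makes a missing required parameter easy to spot in a dry run.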

1.8 Create `backend/src/modules/note-prompts/routes.ts`

Method	Path	Auth	Description	Features
`GET`	`/api/prompt-templates`	viewer	List built-in + user templates	Core
`GET`	`/api/prompt-templates/:id`	viewer	Get single template	Core
`POST`	`/api/prompt-templates`	admin	Create custom template	Core
`PATCH`	`/api/prompt-templates/:id`	admin	Update custom template	Core
`DELETE`	`/api/prompt-templates/:id`	admin	Delete custom template	Core
`POST`	`/api/note-prompts/run`	admin	Run a prompt on a note	Core
`POST`	`/api/note-prompts/run-stream`	admin	Run with SSE streaming	F3
`GET`	`/api/note-prompts/history`	viewer	List past prompt runs	Core
`POST`	`/api/notes/:id/suggest-tags`	admin	Suggest tags via LLM	F5
`GET`	`/api/notes/:id/reading-time`	viewer	Calculate reading time	F10
`POST`	`/api/notes/compare`	admin	Compare 2+ notes	F14
`POST`	`/api/notes/merge`	admin	Merge 2+ notes	F13

1.9 Seed 20 built-in prompt templates

File: backend/src/modules/note-prompts/seed.ts

#	Slug	Name	Input	Output	Category	Feature
1	`summarize`	Summarize	text	new_note	transform	Core
2	`translate`	Translate	text	new_note	transform	Core
3	`simplify`	Simplify / ELI5	text	artifact	transform	Core
4	`extract-key-facts`	Extract Key Facts	text	artifact	extract	Core
5	`food-label-rating`	Rate Food Label	image	new_note	analysis	Core
6	`parse-receipt`	Parse Receipt	image	new_note	extract	Core
7	`read-business-card`	Read Business Card	image	new_note	extract	Core
8	`handwriting-to-text`	Handwriting to Text	image	new_note	transform	Core
9	`generate-flashcards`	Generate Flashcards	text	new_note	generate	Core
10	`pros-and-cons`	Pros & Cons	text	artifact	analysis	Core
11	`presentation-outline`	Presentation Outline	text	new_note	generate	F21
12	`email-draft`	Email Draft	text	new_note	generate	F22
13	`social-post`	Social Post	text	artifact	generate	F23
14	`shareable-summary`	Shareable Summary	text	new_note	transform	F20
15	`compare-notes`	Compare Notes	multi-note	new_note	analysis	F14
16	`merge-notes`	Merge Notes	multi-note	new_note	transform	F13
17	`fix-rewrite`	Fix & Rewrite	text	update_note	transform	F1
18	`change-tone`	Change Tone	text	update_note	transform	F2
19	`continue-writing`	Continue Writing	text	update_note	generate	F3
20	`auto-tag`	Auto-Tag	text	update_note	extract	F5

1.10 Upgrade `copilot-transform.ts` to use `@bytelyst/llm` (F1, F2, F7)

File: backend/src/lib/copilot-transform.ts

Replace extraction-service calls with direct @bytelyst/llm calls:

runCopilotTransform() → getNoteLettLLM().chatCompletion() with action-specific system prompts
suggestTitleFromBody() → getNoteLettLLM().chatCompletion() with title-suggestion prompt
Add rewriteText(text, style) for F1/F2 — accepts tone parameter
Keep extraction-service fallback for graceful degradation

1.11 Reading time utility (F10)

File: backend/src/lib/reading-time.ts

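export function estimateReadingTime(html: string): { minutes: number; words: number } {
  // Strip tags and collapse whitespace to get plain text.
  const plain = html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
  // An empty body has zero words (splitting '' would otherwise report one).
  const words = plain === '' ? 0 : plain.split(' ').length;
  // 238 words per minute; always report at least one minute.
  return { minutes: Math.max(1, Math.ceil(words / 238)), words };
}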

Pure calculation; no LLM needed. Expose via GET /api/notes/:id/reading-time and include the result in the GET /api/notes/:id note detail response.

1.12 Extend agent action types

File: backend/src/modules/note-agent-actions/types.ts

Add 'smart_action' and 'auto_enrich' to NOTE_AGENT_ACTION_TYPES.

1.13 Register routes in server.ts + MCP tool

backend/src/server.ts — register notePromptRoutes
backend/src/mcp/note-tool-contracts.ts — add notes.prompts.run to NOTES_MCP_TOOL_NAMES
backend/src/mcp/note-tools.ts — implement executeRunPrompt()

1.14 Tests

Test file	Coverage	Count
`note-prompts/repository.test.ts`	Template CRUD	8-10
`note-prompts/runner.test.ts`	Prompt execution with mock LLM, chains, multi-note	15-18
`note-prompts/routes.test.ts`	API endpoint integration	10-12
`lib/copilot-transform.test.ts`	Upgraded copilot with LLM	4-6
`lib/reading-time.test.ts`	Reading time calculation	4-5
`mcp/note-tools.test.ts`	`notes.prompts.run` MCP tool	4-6

Phase 1 Deliverables:

note-prompts module: types, repository, runner, routes, seed (20 templates)
lib/llm.ts singleton + config extended with LLM env vars
lib/reading-time.ts pure utility (F10)
Upgraded copilot-transform.ts using @bytelyst/llm (F1, F2, F7)
Multi-note support in runner (F13 merge, F14 compare)
Chain support in runner via previousResultNoteId (F24)
smart_action + auto_enrich agent action types
notes.prompts.run MCP tool
note_prompts Cosmos container
45-57 new tests

Phase 2 — Backend: Note Intelligence (Background AI)

Repo: learning_ai_notes
Duration: 2-3 days
Depends on: Phase 1
Features: F6 (auto-summarize), F8 (duplicate detection), F9 (auto-link), F12 (knowledge gaps)

2.1 Embedding service: `backend/src/lib/embeddings.ts` (F8, F9, F12)

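import { getNoteLettLLM } from './llm.js';

// Embed a single text; throws if the active provider has no embedding support.
export async function embedText(text: string): Promise<number[]> {
  const llm = getNoteLettLLM();
  if (!llm.embed) throw new Error('Embedding not supported by current LLM provider');
  const res = await llm.embed({ input: text });
  const [embedding] = res.embeddings;
  if (!embedding) throw new Error('Embedding response contained no vectors');
  return embedding;
}

export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  const denom = Math.sqrt(magA) * Math.sqrt(magB);
  // A zero-magnitude vector has no direction; return 0 instead of NaN.
  return denom === 0 ? 0 : dot / denom;
}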

2.2 Note embedding storage

File: backend/src/modules/notes/types.ts — add optional embedding: number[] field to NoteDoc

On note create/update, compute embedding in background (non-blocking). Store in Cosmos alongside the note.

2.3 Auto-summarize on save (F6)

File: backend/src/lib/note-hooks.ts

After a note is saved with body > 300 words:

Run "summarize" template via runner.ts
Store result as artifact (type: summary) on the note
Record agent action with actionType: 'auto_enrich'
Gated behind feature flag notelett_auto_summarize_enabled
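
The trigger condition for the hook could be sketched as below. `countWords` and `shouldAutoSummarize` are hypothetical helpers, and representing feature flags as a plain boolean map is an assumption of this sketch; only the 300-word threshold and the flag name come from the list above.

```typescript
// Counts words in an HTML body after stripping tags.
function countWords(html: string): number {
  const plain = html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
  return plain === '' ? 0 : plain.split(' ').length;
}

// Auto-summarize only when the flag is on and the body exceeds 300 words.
function shouldAutoSummarize(body: string, flags: Record<string, boolean>): boolean {
  return flags['notelett_auto_summarize_enabled'] === true && countWords(body) > 300;
}
```

Evaluating this check before invoking the runner keeps short notes and flagged-off workspaces from spending LLM tokens.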

2.4 Duplicate/similar note detection (F8)

File: backend/src/modules/note-prompts/routes.ts

New endpoint: POST /api/notes/:id/check-duplicates

Embed the current note's body
Fetch all notes in workspace with embeddings
Compute cosine similarity
Return notes with similarity > 0.85 threshold

2.5 Auto-link related notes (F9)

File: backend/src/modules/note-prompts/routes.ts

New endpoint: POST /api/notes/:id/suggest-links

Embed the current note
Find top 5 most similar notes (similarity > 0.6, excluding self)
Return as suggested links with similarity scores
UI can accept/dismiss suggestions
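
Both endpoints reduce to the same ranking step with different thresholds, which could be sketched as follows. `EmbeddedNote`, `SimilarNote`, and `findSimilarNotes` are hypothetical names, and the in-memory scan over all workspace notes is a simplification made for illustration.

```typescript
interface EmbeddedNote { id: string; embedding: number[] }
interface SimilarNote { id: string; similarity: number }

// Local copy of cosine similarity so the sketch is self-contained.
function cosineSimilarity(a: number[], b: number[]): number {
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  const denom = Math.sqrt(magA) * Math.sqrt(magB);
  return denom === 0 ? 0 : dot / denom;
}

// Ranks candidates by similarity to the target, excluding the target itself.
function findSimilarNotes(
  target: EmbeddedNote,
  candidates: EmbeddedNote[],
  threshold: number,
  limit: number = Infinity,
): SimilarNote[] {
  return candidates
    .filter((note) => note.id !== target.id)
    .map((note) => ({ id: note.id, similarity: cosineSimilarity(target.embedding, note.embedding) }))
    .filter((result) => result.similarity > threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}
```

Under these assumptions, duplicate detection would call `findSimilarNotes(note, all, 0.85)` and link suggestion would call `findSimilarNotes(note, all, 0.6, 5)`.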

2.6 Knowledge gap detection (F12)

File: backend/src/modules/note-prompts/routes.ts

New endpoint: POST /api/workspaces/:id/knowledge-gaps

Fetch all notes in workspace
Extract topics from each note (via auto-tag or LLM)
Build topic frequency map
Send to LLM: "Given these topics and their coverage depth, what's missing?"
Return gap analysis as structured JSON
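
The topic frequency map in step 3 might be built along these lines. `buildTopicFrequency` and `formatTopicsForPrompt` are hypothetical helpers; counting each topic once per note and normalizing to lowercase are assumptions of this sketch.

```typescript
// Counts how many notes mention each topic (case-insensitive, once per note).
function buildTopicFrequency(topicsPerNote: string[][]): Record<string, number> {
  const frequency: Record<string, number> = {};
  for (const topics of topicsPerNote) {
    const normalized = topics.map((t) => t.trim().toLowerCase()).filter((t) => t !== '');
    const unique = normalized.filter((t, index) => normalized.indexOf(t) === index);
    for (const topic of unique) {
      frequency[topic] = (frequency[topic] ?? 0) + 1;
    }
  }
  return frequency;
}

// Renders the map as "topic: N notes" lines, least-covered first, for the LLM prompt.
function formatTopicsForPrompt(frequency: Record<string, number>): string {
  return Object.keys(frequency)
    .sort((a, b) => frequency[a] - frequency[b] || a.localeCompare(b))
    .map((topic) => `${topic}: ${frequency[topic]} notes`)
    .join('\n');
}
```

Listing thinly covered topics first puts the likeliest gaps at the top of the prompt.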

2.7 Tests

Test file	Coverage	Count
`lib/embeddings.test.ts`	Embed + cosine similarity	6-8
`lib/note-hooks.test.ts`	Auto-summarize trigger logic	4-6
`note-prompts/routes.test.ts`	Duplicate check, suggest links, knowledge gaps	10-12

Phase 2 Deliverables:

lib/embeddings.ts — embed text + cosine similarity
Note embedding storage on create/update
Auto-summarize on save (F6) — feature-flag gated
Duplicate detection endpoint (F8)
Related notes suggestion endpoint (F9)
Knowledge gap detection endpoint (F12)
20-26 new tests

Phase 3 — Web: Smart Actions UI + Editor AI

Repo: learning_ai_notes
Duration: 4-5 days
Depends on: Phase 1 (Phase 2 needed only for the F8/F9/F12 UI)
Features: F1-F4 (editor AI), F5 (tag UI), F8-F9 (duplicate/link UI), F10 (reading time UI), F12 (knowledge gap UI), F14 (compare UI), F20-F23 (export UI)
Can run in parallel with: Phase 4, Phase 5

3.1 API client: `web/src/lib/prompt-client.ts`

listPromptTemplates(): Promise<PromptTemplate[]>
getPromptTemplate(id: string): Promise<PromptTemplate>
createPromptTemplate(input): Promise<PromptTemplate>
updatePromptTemplate(id, input): Promise<PromptTemplate>
deletePromptTemplate(id): Promise<void>
runPrompt(input: RunPromptInput): Promise<RunPromptOutput>
runPromptStream(input: RunPromptInput): AsyncIterable<string>  // F3
listPromptHistory(noteId?, limit?): Promise<AgentAction[]>
suggestTags(noteId, workspaceId): Promise<string[]>           // F5
checkDuplicates(noteId, workspaceId): Promise<SimilarNote[]>  // F8
suggestLinks(noteId, workspaceId): Promise<SuggestedLink[]>   // F9
compareNotes(noteIds, workspaceId): Promise<RunPromptOutput>   // F14
mergeNotes(noteIds, workspaceId): Promise<RunPromptOutput>     // F13
getKnowledgeGaps(workspaceId): Promise<GapAnalysis>            // F12

3.2 SmartActionsPanel component

File: web/src/components/SmartActionsPanel.tsx

Renders on the note detail page:

Grid of action buttons grouped by category (built-in + custom)
Each button: icon + name + inputType badge (text/image/multi)
Click → opens RunPromptModal
Shows recent prompt runs for this note
Reading time display (F10)
"Suggest tags" button (F5)

3.3 RunPromptModal component

File: web/src/components/RunPromptModal.tsx

Template selector (filtered by input type compatibility)
Parameter inputs (e.g., target language, tone)
Image picker (browse note artifacts or upload new)
Multi-note selector (for merge/compare — F13, F14)
Inline prompt textarea (for custom one-off prompts)
Chain toggle: "Continue from previous result" (F24)
Dry-run checkbox
"Run" button with loading spinner

3.4 PromptResultView component

File: web/src/components/PromptResultView.tsx

Markdown renderer for LLM response
Action buttons: "Save as Note", "Save as Artifact", "Apply to Note", "Discard"
Token usage + model info footer
Link to created note or artifact

3.5 Prompt Template Library page

File: web/src/app/(app)/prompts/page.tsx

Browse all 20 built-in templates (read-only cards)
User's custom templates (edit/delete)
"Create Custom Prompt" → opens PromptTemplateEditor
Category filter tabs: All, Analysis, Transform, Extract, Generate, Custom

3.6 PromptTemplateEditor component

File: web/src/components/PromptTemplateEditor.tsx

Form: name, slug, description, category, system prompt, user prompt template
Input type selector (text / image / text+image / multi-note)
Output format + output action selectors
Parameter builder (add/remove dynamic parameters)
Template variable reference: {{note.title}}, {{note.body}}, {{note.tags}}, {{params.X}}
Live preview

3.7 Upgrade NoteEditor with advanced Copilot (F1-F4)

File: web/src/components/NoteEditor.tsx

Enhance existing Copilot toolbar:

Current	Upgraded
`shorten`	Keep (uses `@bytelyst/llm` now)
`expand`	Keep (uses `@bytelyst/llm` now)
`bulletize`	Keep (uses `@bytelyst/llm` now)
`grammar`	Replace with "Fix & Rewrite" (F1) — full rewrite, not just grammar
—	Add "Change Tone" (F2) — dropdown: formal/casual/professional/friendly
—	Add "Continue Writing" (F3) — inserts at cursor, streams token-by-token
—	Add "Explain" (F4) — tooltip popover with definition/explanation

F3 (Continue Writing) implementation:

Get text before cursor position from TipTap editor state
Call runPromptStream() (SSE)
Insert streamed tokens into editor in real-time via TipTap commands
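
On the client side, `runPromptStream()` needs to turn raw SSE chunks into tokens before they reach the editor. The sketch below assumes the `/api/note-prompts/run-stream` endpoint emits one `data: <token>` line per token and ends with `data: [DONE]`; that wire format and the name `parseSseTokens` are assumptions, not something this roadmap specifies.

```typescript
// Parses an SSE text stream into tokens, buffering across chunk boundaries.
async function* parseSseTokens(chunks: AsyncIterable<string>): AsyncGenerator<string> {
  let buffer = '';
  for await (const chunk of chunks) {
    buffer += chunk;
    let newline = buffer.indexOf('\n');
    while (newline !== -1) {
      const line = buffer.slice(0, newline).replace(/\r$/, '');
      buffer = buffer.slice(newline + 1);
      if (line.startsWith('data:')) {
        // SSE strips a single optional space after the colon.
        const data = line.slice('data:'.length).replace(/^ /, '');
        if (data === '[DONE]') return; // assumed end-of-stream sentinel
        yield data;
      }
      newline = buffer.indexOf('\n');
    }
  }
}
```

Each yielded token could then be inserted at the cursor, for example with TipTap's `insertContent` command.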

F4 (Inline Q&A) implementation:

Select text → right-click or toolbar button → "Explain this"
Opens floating popover below selection
Calls LLM with "Explain this term/concept concisely: {selection}"
Shows result in popover (dismissible)

3.8 Duplicate detection UI (F8)

After note save, if notelett_duplicate_check_enabled flag is on:

Call checkDuplicates()
If similar notes found → show toast: "This note is similar to 'Note X' (87% match). View?"
Click → opens side-by-side comparison

3.9 Related notes suggestion UI (F9)

After note creation:

Call suggestLinks()
If suggestions found → show panel: "Related notes you might want to link"
Each suggestion: note title + similarity % + "Link" / "Dismiss" buttons

3.10 Auto-tag suggestion UI (F5)

On the note detail page SmartActionsPanel:

"Suggest Tags" button
Calls /api/notes/:id/suggest-tags
Shows tag chips with + button to accept each
Accepted tags are added to the note

3.11 Export actions UI (F20-F23)

In SmartActionsPanel, export templates appear with share icons:

"Shareable Summary" → generates polished version → copy or share via note-shares
"Presentation Outline" → generates outline → saves as new note
"Email Draft" → generates email → copy to clipboard
"Social Post" → generates post → copy to clipboard

3.12 Knowledge gap analysis UI (F12)

File: web/src/app/(app)/workspaces/[id]/gaps/page.tsx

"Analyze Knowledge Gaps" button on workspace page
Shows gap analysis: topics with thin coverage, suggested new note topics
"Create Note" button for each gap → pre-fills title

3.13 Wire into note detail page + sidebar

web/src/app/(app)/notes/[noteId]/page.tsx — add SmartActionsPanel, reading time, duplicate warning
web/src/components/Sidebar.tsx — add "Prompts" nav item (sparkle icon)
Keyboard shortcut: Cmd+Shift+A → open Smart Actions panel

3.14 Tests

Test file	Coverage	Count
`prompt-client.test.ts`	API client functions	8-10
`SmartActionsPanel.test.tsx`	Render + click handlers	4-6
`RunPromptModal.test.tsx`	Form + submission + multi-note	4-6
`NoteEditor.test.tsx`	Copilot upgrade (F1-F4)	6-8
`e2e/smart-actions.spec.ts`	Full flow E2E	6

Phase 3 Deliverables:

prompt-client.ts API client (all endpoints)
5 new components: SmartActionsPanel, RunPromptModal, PromptResultView, PromptTemplateEditor, KnowledgeGapView
/prompts template library page
/workspaces/[id]/gaps knowledge gap page (F12)
NoteEditor upgraded with F1-F4 (Fix & Rewrite, Change Tone, Continue Writing, Inline Q&A)
Duplicate detection toast (F8)
Related notes suggestion panel (F9)
Auto-tag suggestion UI (F5)
Export actions UI (F20-F23)
Reading time display (F10)
Sidebar updated, keyboard shortcut
22-30 new unit tests + 6 E2E tests (28-36 total)

Phase 4 — Mobile: Smart Actions + AI-Enhanced Capture

Repo: learning_ai_notes
Duration: 4-5 days
Depends on: Phase 1
Features: F15 (voice), F16 (screenshot), F17 (URL), F18 (multi-image), F19 (clipboard)
Can run in parallel with: Phase 3, Phase 5

4.1 New dependencies

Package	Purpose
`expo-image-picker`	Camera capture + gallery selection
`expo-av`	Audio recording for voice-to-note (F15)
`expo-clipboard`	Clipboard access for AI paste (F19)
`expo-sharing`	Share results

4.2 API client: `mobile/src/api/note-prompts.ts`

listPromptTemplates(): Promise<PromptTemplate[]>
runPrompt(input: RunPromptInput): Promise<RunPromptOutput>
suggestTags(noteId, workspaceId): Promise<string[]>

4.3 Enhance blob upload: `mobile/src/api/blob-upload.ts`

Upgrade existing stub:

Camera capture via expo-image-picker (photo + gallery)
Image resize (max 2048px, compress to < 4MB)
Upload to blob storage via @bytelyst/blob-client
Return blobPath + SAS URL
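
The 2048px limit amounts to scaling the longest edge while preserving aspect ratio. A small sketch of that calculation follows; `fitWithin` is a hypothetical helper, and the compression pass needed to get under 4MB is not covered here.

```typescript
// Computes target dimensions so the longest edge is at most maxEdge pixels.
function fitWithin(
  width: number,
  height: number,
  maxEdge = 2048,
): { width: number; height: number } {
  const longest = Math.max(width, height);
  // Never upscale images that already fit.
  if (longest <= maxEdge) return { width, height };
  const scale = maxEdge / longest;
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}
```

The resulting dimensions would then be handed to whichever image manipulation library the app uses for the actual resize.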

4.4 Zustand store: `mobile/src/store/prompt-store.ts`

interface PromptState {
  templates: PromptTemplate[];
  isRunning: boolean;
  lastResult: RunPromptOutput | null;
  error: string | null;
  fetchTemplates(): Promise<void>;
  runPrompt(input: RunPromptInput): Promise<RunPromptOutput>;
  clearResult(): void;
}

4.5 SmartActionsSheet component

File: mobile/src/app/note/SmartActionsSheet.tsx

Bottom sheet (react-native-gesture-handler) that slides up from note detail:

Scrollable grid of action buttons (icon + name)
Category filter tabs (All, Text, Image, Custom)
Actions trigger either:
- Direct run (text actions)
- Camera/gallery picker → then run (image actions)

4.6 PromptResultScreen

File: mobile/src/app/note/prompt-result.tsx

Markdown-rendered LLM response
"Save as Note" / "Discard" buttons
Model info + token count
Navigate to new note after saving

4.7 Voice-to-note (F15)

File: mobile/src/app/capture/voice.tsx (sub-route of capture, NOT a new tab)

Record audio via expo-av (Audio.Recording)
Upload audio file to blob storage
Call backend transcription endpoint (or use extraction-service with speech task)
Show transcribed text for review/edit
Save as note
Optionally run a Smart Action on the result (e.g., "Extract Key Facts")

4.8 Screenshot-to-note (F16)

On the capture tab:

"From Screenshot" button → gallery picker (images only)
Upload image → blob storage
Run "handwriting-to-text" or custom OCR prompt
Show result for review → save as note

4.9 URL-to-note (F17)

On the capture tab:

"From URL" input field
Backend endpoint: POST /api/note-prompts/url-extract
- Fetches URL content (server-side to avoid CORS)
- Strips HTML → extracts main content
- Runs "summarize" template
- Returns structured result
Show summary for review → save as note

4.10 Multi-image capture (F18)

On the capture tab:

"Scan Document" button → camera in continuous mode
Take multiple photos (whiteboard pages, multi-page document)
Upload all images → blob storage
Run each through vision model sequentially
Combine results into single note body
Show merged result for review → save

4.11 Clipboard AI paste (F19)

On the capture tab:

"Paste & Clean" button
Read clipboard via expo-clipboard
If clipboard contains text → run "fix-rewrite" template
If clipboard contains URL → trigger URL-to-note flow (F17)
Show cleaned result → save as note
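
The branch between the text and URL flows could be decided by a small classifier like the one below. `routeClipboard` and its result type are hypothetical, and treating only a single bare `http(s)` link as a URL is an assumption of this sketch.

```typescript
type ClipboardRoute =
  | { kind: 'empty' }
  | { kind: 'url'; url: string }
  | { kind: 'text'; text: string };

// Decides which capture flow to use for the clipboard contents.
function routeClipboard(raw: string): ClipboardRoute {
  const text = raw.trim();
  if (text === '') return { kind: 'empty' };
  // A lone http(s) link goes to URL-to-note (F17); anything else is cleaned as text.
  if (/^https?:\/\/\S+$/i.test(text)) return { kind: 'url', url: text };
  return { kind: 'text', text };
}
```

Text that merely contains a link stays on the "fix-rewrite" path, so a pasted paragraph with a citation is not mistaken for a URL capture.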

4.12 Enhance capture tab

File: mobile/src/app/(tabs)/capture.tsx

Add new capture methods alongside existing text draft:

┌─────────────────────────────────────┐
│  Quick Capture                      │
│  ┌───────┐ ┌───────┐ ┌───────┐      │
│  │ Text  │ │ Photo │ │ Voice │      │
│  └───────┘ └───────┘ └───────┘      │
│  ┌───────┐ ┌───────┐ ┌───────┐      │
│  │  URL  │ │ Scan  │ │ Paste │      │
│  └───────┘ └───────┘ └───────┘      │
│                                     │
│  [existing text capture form]       │
└─────────────────────────────────────┘

4.13 Wire Smart Actions into note detail

File: mobile/src/app/note/[id].tsx

"AI Actions" button in the header
Opens SmartActionsSheet
Shows reading time (F10)
Shows suggested tags after save (F5)

4.14 Offline queue integration

Prompt runs that fail → queue via @bytelyst/offline-queue for retry.

4.15 Tests

Test file	Coverage	Count
`api/note-prompts.test.ts`	API client	4-6
`store/prompt-store.test.ts`	Store actions	6-8
`SmartActionsSheet.test.tsx`	Render + interactions	4-6
`capture.test.tsx`	New capture methods	4-6

Phase 4 Deliverables:

note-prompts.ts API client + prompt-store.ts Zustand store
Camera capture + image resize + blob upload
SmartActionsSheet bottom sheet + PromptResultScreen
Voice-to-note flow (F15) — expo-av recording
Screenshot-to-note (F16) — gallery + vision OCR
URL-to-note (F17) — server-side fetch + summarize
Multi-image scan (F18) — continuous camera + combine
Clipboard AI paste (F19) — read + clean
Enhanced capture tab with 6 capture modes
Smart Actions on note detail
Offline queue for failed runs
18-26 new tests

Phase 5 — Agent & Workflow Intelligence

Repo: learning_ai_notes
Duration: 2-3 days
Depends on: Phase 2
Features: F11 (weekly digest), F25 (scheduled), F26 (webhooks), F27 (approval-gated)
Can run in parallel with: Phases 3, 4

5.1 Scheduled Smart Actions (F25)

File: backend/src/modules/note-prompts/scheduler.ts

Component	Detail
`PromptScheduleDoc`	New Cosmos doc: scheduleId, templateId, workspaceId, cron expression, enabled, lastRunAt, nextRunAt
Cosmos container	`note_prompt_schedules` (partition: `/workspaceId`)
Scheduler loop	In-process interval (60s check), matches cron → invokes `runner.ts`
API endpoints	`POST /api/prompt-schedules` (create), `GET` (list), `PATCH/:id` (update), `DELETE/:id` (delete)

Example: "Summarize all notes in 'Research' workspace every Friday at 5pm"
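
The 60-second scheduler check needs to decide whether a cron expression matches the current minute. A deliberately minimal matcher is sketched below: it assumes standard 5-field expressions (minute, hour, day of month, month, day of week), supports only `*`, single numbers, and comma lists, and evaluates in local time. `cronMatches` is a hypothetical name; a production scheduler would more likely rely on an established cron library.

```typescript
// Matches one cron field: "*", a number, or a comma-separated list of numbers.
function fieldMatches(field: string, value: number): boolean {
  if (field === '*') return true;
  return field.split(',').some((part) => Number(part) === value);
}

// True when every field of the expression matches the given date (local time).
// Simplification: day-of-month and day-of-week are ANDed, unlike classic cron.
function cronMatches(expression: string, date: Date): boolean {
  const fields = expression.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(`Expected 5 cron fields, got ${fields.length}`);
  }
  const [minute, hour, dayOfMonth, month, dayOfWeek] = fields;
  return (
    fieldMatches(minute, date.getMinutes()) &&
    fieldMatches(hour, date.getHours()) &&
    fieldMatches(dayOfMonth, date.getDate()) &&
    fieldMatches(month, date.getMonth() + 1) &&
    fieldMatches(dayOfWeek, date.getDay())
  );
}
```

Under this convention, the "every Friday at 5pm" example corresponds to the expression `0 17 * * 5`.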

5.2 Weekly workspace digest (F11)

Built on F25 — a special scheduled action:

Pre-configured template: weekly-digest
Runs weekly, collects all notes created/modified in workspace that week
Produces a digest note with: summary, key themes, new notes list, most active areas
Linked to workspace

Add template #21: weekly-digest (system-only, runs via scheduler)

5.3 Webhook-triggered actions (F26)

File: backend/src/modules/note-prompts/webhooks.ts

Component	Detail
`PromptWebhookDoc`	webhookId, templateId, workspaceId, triggerEvent, enabled
API endpoint	`POST /api/prompt-webhooks` (create), `GET` (list), `DELETE/:id`
Trigger endpoint	`POST /api/prompt-webhooks/:id/trigger` — accepts `{ noteId, payload }`
Supported events	`note.created`, `note.updated`, `note.tagged`, `external`

Example: "When a note is tagged 'receipt', auto-run Parse Receipt"
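
Event matching for webhooks could be as simple as a filter over the stored documents. The sketch uses only the `PromptWebhookDoc` fields listed above; `webhooksToFire` is a hypothetical helper, and filtering by a specific tag (as in the 'receipt' example) would need an extra field that this roadmap does not define.

```typescript
type TriggerEvent = 'note.created' | 'note.updated' | 'note.tagged' | 'external';

interface PromptWebhookDoc {
  webhookId: string;
  templateId: string;
  workspaceId: string;
  triggerEvent: TriggerEvent;
  enabled: boolean;
}

// Returns the enabled webhooks in a workspace that listen for the given event.
function webhooksToFire(
  hooks: PromptWebhookDoc[],
  workspaceId: string,
  event: TriggerEvent,
): PromptWebhookDoc[] {
  return hooks.filter(
    (hook) => hook.enabled && hook.workspaceId === workspaceId && hook.triggerEvent === event,
  );
}
```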

5.4 Approval-gated actions (F27)

Leverages existing NoteAgentActionDoc with approval states.

Change	Detail
New prompt template field	`requiresApproval: boolean` (default: false)
Runner modification	If template has `requiresApproval`, create action with `state: 'proposed'` instead of `state: 'applied'`
Review endpoint	Already exists: `POST /api/agent-actions/:id/review` (approve/reject)
Post-approval hook	On approval, execute the saved output action (create note / update / artifact)
Web UI	ProposalReviewCard already exists — add Smart Action context
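
The state handling in the table can be illustrated with two small functions. The `'rejected'` state, the `PendingAction` shape, and both function names are assumptions of this sketch; the existing `NoteAgentActionDoc` may model approval states differently.

```typescript
type ActionState = 'proposed' | 'applied' | 'rejected'; // 'rejected' is assumed
type OutputAction = 'new_note' | 'artifact' | 'update_note';

interface PendingAction {
  state: ActionState;
  outputAction: OutputAction;
  content: string; // LLM output saved until the action is approved
}

// Runner side: gated templates produce a proposal instead of applying directly.
function createAction(
  template: { requiresApproval?: boolean; outputAction: OutputAction },
  content: string,
): PendingAction {
  return {
    state: template.requiresApproval ? 'proposed' : 'applied',
    outputAction: template.outputAction,
    content,
  };
}

// Review side: only proposals can be approved or rejected.
function reviewAction(action: PendingAction, decision: 'approve' | 'reject'): PendingAction {
  if (action.state !== 'proposed') {
    throw new Error('Only proposed actions can be reviewed');
  }
  return { ...action, state: decision === 'approve' ? 'applied' : 'rejected' };
}
```

Storing the generated content on the proposal lets the post-approval hook apply the output without a second LLM call.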

5.5 Tests

Test file	Coverage	Count
`note-prompts/scheduler.test.ts`	Cron matching, schedule CRUD, execution	8-10
`note-prompts/webhooks.test.ts`	Webhook CRUD, trigger, event matching	6-8
`note-prompts/runner.test.ts`	Approval-gated flow	3-4

Phase 5 Deliverables:

scheduler.ts — cron-based scheduled prompt execution (F25)
weekly-digest template + scheduled action (F11)
webhooks.ts — event-triggered prompt execution (F26)
Approval-gated actions in runner (F27)
note_prompt_schedules Cosmos container
API endpoints for schedules + webhooks
17-22 new tests

Phase 6 — Polish, Integration Tests, Documentation

Duration: 2-3 days
Depends on: Phases 3-5

6.1 End-to-end integration testing

Test	Flow
Web E2E: Food label	Create note → attach image → run "Rate Food Label" → verify result note
Web E2E: Summarize	Create long note → run "Summarize" → verify summary artifact
Web E2E: Compare	Select 2 notes → compare → verify comparison note
Web E2E: Template CRUD	Create custom template → use it → edit → delete
Mobile E2E: Camera capture	Photo → upload → run prompt → verify result
Mobile E2E: Voice-to-note	Record → transcribe → review → save
MCP E2E: Agent prompt	Agent calls `notes.prompts.run` → verify audit trail
Webhook E2E	Tag note → webhook fires → prompt runs automatically
Scheduler E2E	Schedule created → time triggers → digest generated

6.2 Error handling

Scenario	Handling
LLM API key not configured	Clear error, disable Smart Actions UI, show setup guide
LLM rate limit (429)	Retry with exponential backoff (3 attempts), show "try again later"
LLM timeout	60s timeout, graceful error, suggest retry
Image too large	Client-side resize before upload (max 2048px, < 4MB)
Prompt template not found	404 with helpful message
Empty note body (text prompt)	Require body or show warning
No images on note (image prompt)	Prompt to upload/capture first
Embedding service unavailable	Skip duplicate check/auto-link gracefully
Audio recording fails	Fallback to text capture, show error
URL fetch fails	Show error with suggestion to paste content manually
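The 429 handling in the table above (exponential backoff, 3 attempts) can be sketched as a small wrapper around any LLM call. This is an illustration only: `withRateLimitRetry`, the 1000ms base delay, and the convention that a rate-limit error carries `status === 429` are all assumptions, not the behaviour of `@bytelyst/llm`.

```typescript
interface RetryOptions {
  maxAttempts?: number;                  // total attempts, including the first (default 3)
  baseDelayMs?: number;                  // delay before the second attempt (assumed default 1000)
  sleep?: (ms: number) => Promise<void>; // injectable for tests
}

// Delay before the next attempt: base, 2x base, 4x base, ...
function backoffDelayMs(attempt: number, baseDelayMs: number): number {
  return baseDelayMs * 2 ** (attempt - 1);
}

// Assumption: rate-limit errors expose the HTTP status as `status`.
function isRateLimited(error: unknown): boolean {
  return typeof error === "object" && error !== null && (error as { status?: unknown }).status === 429;
}

const realSleep = (ms: number): Promise<void> => new Promise((resolve) => setTimeout(resolve, ms));

async function withRateLimitRetry<T>(operation: () => Promise<T>, options: RetryOptions = {}): Promise<T> {
  const maxAttempts = options.maxAttempts ?? 3;
  const baseDelayMs = options.baseDelayMs ?? 1000;
  const sleep = options.sleep ?? realSleep;

  for (let attempt = 1; ; attempt++) {
    try {
      return await operation();
    } catch (error) {
      // Non-429 errors and the final failed attempt are rethrown to the caller,
      // which can then show the "try again later" message.
      if (!isRateLimited(error) || attempt >= maxAttempts) throw error;
      await sleep(backoffDelayMs(attempt, baseDelayMs));
    }
  }
}
```

With three attempts there are at most two waits (1s, then 2s under the assumed base delay), which stays well inside the 60s timeout listed for LLM calls.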

6.3 Feature flags

Flag	Default	Controls
`notelett_smart_actions_enabled`	false	All Smart Actions UI + API
`notelett_auto_summarize_enabled`	false	F6 auto-summarize on save
`notelett_duplicate_check_enabled`	false	F8 duplicate detection
`notelett_auto_link_enabled`	false	F9 auto-link suggestions
`notelett_copilot_llm_enabled`	false	F1-F4 editor AI (vs extraction fallback)
`notelett_voice_capture_enabled`	false	F15 voice-to-note
`notelett_scheduled_actions_enabled`	false	F25 scheduled actions
`notelett_webhooks_enabled`	false	F26 webhook triggers
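One way to keep the eight flags above type-safe across backend and web is a single defaults object with a typed lookup. A minimal sketch: the flag names and defaults come from the table, while `isFlagEnabled` and the overrides parameter are hypothetical, since the roadmap does not say how flag values are delivered at runtime.

```typescript
// Flag names and defaults are taken from the table above.
const FLAG_DEFAULTS = {
  notelett_smart_actions_enabled: false,
  notelett_auto_summarize_enabled: false,
  notelett_duplicate_check_enabled: false,
  notelett_auto_link_enabled: false,
  notelett_copilot_llm_enabled: false,
  notelett_voice_capture_enabled: false,
  notelett_scheduled_actions_enabled: false,
  notelett_webhooks_enabled: false,
} as const;

type FlagName = keyof typeof FLAG_DEFAULTS;
type FlagOverrides = Partial<Record<FlagName, boolean>>;

// Resolve a flag: an explicit override wins, otherwise the documented default applies.
// Where overrides come from (env, remote config, per-user settings) is left open.
function isFlagEnabled(flag: FlagName, overrides: FlagOverrides = {}): boolean {
  return overrides[flag] ?? FLAG_DEFAULTS[flag];
}
```

Because `FlagName` is derived from the defaults object, a misspelled flag name fails at compile time instead of silently evaluating to false.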

6.4 Telemetry events

Event	Properties
`smart_action_run`	templateSlug, inputType, model, durationMs, tokenUsage
`smart_action_result_saved`	outputAction, resultType
`smart_action_template_created`	category, inputType
`smart_action_error`	errorType, templateSlug
`copilot_transform`	action (rewrite/tone/continue/explain), durationMs
`auto_summarize_triggered`	wordCount, durationMs
`duplicate_detected`	similarityScore, noteId
`voice_capture_completed`	durationSecs, wordCount
`url_extract_completed`	domain, wordCount
`scheduled_action_fired`	scheduleId, templateSlug
`webhook_triggered`	webhookId, triggerEvent

6.5 Documentation updates

Update docs/PRD.md — Smart Actions section (§5.2 AI features)
Update AGENTS.md — new MCP tool, new module, new env vars
Update docs/roadmaps/02_BACKEND_ROADMAP.md — mark Smart Actions complete
API reference for all new endpoints (15+ endpoints)
docs/SMART_ACTIONS_USER_GUIDE.md — end-user documentation

6.6 Docker / CI updates

Add LLM env vars to .env.example
Add @bytelyst/llm to scripts/docker-prep.sh tarball list
Update backend/Dockerfile for new deps
Add expo-image-picker, expo-av to mobile CI build matrix

Phase 6 Deliverables:

10-15 integration tests: the 9 E2E flows in §6.1 plus 1-6 additional integration tests
Error handling for all edge cases
8 feature flags for gradual rollout
11 telemetry events
Documentation updated (PRD, AGENTS.md, roadmaps, user guide)
Docker + CI updated

Test Budget Summary

Phase	Unit Tests	E2E Tests	Total
0 — Common-plat LLM	25-30	—	25-30
1 — Backend core	45-57	—	45-57
2 — Note intelligence	20-26	—	20-26
3 — Web UI + editor AI	22-30	6	28-36
4 — Mobile + capture	18-26	—	18-26
5 — Agent/workflow	17-22	—	17-22
6 — Integration/polish	—	10-15	10-15
Total	147-191	16-21	163-212

New Files Summary

`learning_ai_common_plat` (Phase 0) — 6-8 files modified

File	Change
`packages/llm/src/types.ts`	Add `ContentPart`, `EmbeddingRequest/Response`, update `ChatMessage`
`packages/llm/src/helpers.ts`	New: `isVisionMessage()`, `buildVisionMessage()`
`packages/llm/src/providers/openai.ts`	Vision + embedding support
`packages/llm/src/providers/azure-openai.ts`	Vision + embedding support
`packages/llm/src/providers/mock.ts`	Vision + embedding mocks
`packages/llm/src/index.ts`	Export new types + helpers
`packages/llm-router/src/types.ts`	Add `supportsVision`, `supportsEmbedding`
`packages/llm-router/src/classifier.ts`	Detect image content

`learning_ai_notes/backend` (Phases 1, 2, 5) — 11 new + 7 modified

File	Status	Phase
`src/lib/llm.ts`	New	1
`src/lib/config.ts`	Modified	1
`src/lib/cosmos-init.ts`	Modified	1
`src/lib/copilot-transform.ts`	Modified	1
`src/lib/reading-time.ts`	New	1
`src/lib/embeddings.ts`	New	2
`src/lib/note-hooks.ts`	New	2
`src/modules/note-prompts/types.ts`	New	1
`src/modules/note-prompts/repository.ts`	New	1
`src/modules/note-prompts/runner.ts`	New	1
`src/modules/note-prompts/routes.ts`	New	1
`src/modules/note-prompts/seed.ts`	New	1
`src/modules/note-prompts/scheduler.ts`	New	5
`src/modules/note-prompts/webhooks.ts`	New	5
`src/modules/note-agent-actions/types.ts`	Modified	1
`src/mcp/note-tool-contracts.ts`	Modified	1
`src/mcp/note-tools.ts`	Modified	1
`src/server.ts`	Modified	1

`learning_ai_notes/web` (Phase 3) — 8 new + 5 modified

File	Status
`src/lib/prompt-client.ts`	New
`src/components/SmartActionsPanel.tsx`	New
`src/components/RunPromptModal.tsx`	New
`src/components/PromptResultView.tsx`	New
`src/components/PromptTemplateEditor.tsx`	New
`src/app/(app)/prompts/page.tsx`	New
`src/app/(app)/workspaces/[id]/gaps/page.tsx`	New
`e2e/smart-actions.spec.ts`	New
`src/app/(app)/notes/[noteId]/page.tsx`	Modified
`src/components/NoteEditor.tsx`	Modified
`src/components/Sidebar.tsx`	Modified
`src/lib/copilot-client.ts`	Modified (add new CopilotAction types)
`src/lib/types.ts`	Modified (add PromptTemplate, RunPromptInput/Output, etc.)

`learning_ai_notes/mobile` (Phase 4) — 7 new + 3 modified

File	Status
`src/api/note-prompts.ts`	New
`src/api/blob-upload.ts`	Modified
`src/store/prompt-store.ts`	New
`src/app/note/SmartActionsSheet.tsx`	New
`src/app/note/prompt-result.tsx`	New
`src/app/capture/voice.tsx`	New (sub-route of capture, NOT a tab)
`src/app/capture/url.tsx`	New (sub-route of capture, NOT a tab)
`src/app/capture/scan.tsx`	New (sub-route of capture, NOT a tab)
`src/app/(tabs)/capture.tsx`	Modified
`src/app/note/[id].tsx`	Modified

20 Built-in Prompt Templates

#	Slug	Name	Input	Output	Category
1	`summarize`	Summarize	text	new_note	transform
2	`translate`	Translate	text	new_note	transform
3	`simplify`	Simplify / ELI5	text	artifact	transform
4	`extract-key-facts`	Extract Key Facts	text	artifact	extract
5	`food-label-rating`	Rate Food Label	image	new_note	analysis
6	`parse-receipt`	Parse Receipt	image	new_note	extract
7	`read-business-card`	Read Business Card	image	new_note	extract
8	`handwriting-to-text`	Handwriting to Text	image	new_note	transform
9	`generate-flashcards`	Generate Flashcards	text	new_note	generate
10	`pros-and-cons`	Pros & Cons	text	artifact	analysis
11	`presentation-outline`	Presentation Outline	text	new_note	generate
12	`email-draft`	Email Draft	text	new_note	generate
13	`social-post`	Social Post	text	artifact	generate
14	`shareable-summary`	Shareable Summary	text	new_note	transform
15	`compare-notes`	Compare Notes	multi-note	new_note	analysis
16	`merge-notes`	Merge Notes	multi-note	new_note	transform
17	`fix-rewrite`	Fix & Rewrite	text	update_note	transform
18	`change-tone`	Change Tone	text	update_note	transform
19	`continue-writing`	Continue Writing	text	update_note	generate
20	`auto-tag`	Auto-Tag	text	update_note	extract

New Dependencies

Package	Where	Purpose
`@bytelyst/llm@^0.2.0`	backend	LLM with vision + embedding
`expo-image-picker`	mobile	Camera + gallery
`expo-av`	mobile	Audio recording (F15)
`expo-clipboard`	mobile	Clipboard access (F19)

All other integrations use existing @bytelyst/* packages already in package.json.

New Cosmos Containers

Container	Partition Key	Phase	Purpose
`note_prompts`	`/userId`	1	Prompt templates (built-in + custom)
`note_prompt_schedules`	`/workspaceId`	5	Scheduled action definitions

Prompt run results need no container of their own: each run produces notes (stored in notes) and artifacts (stored in note_artifacts).

New Environment Variables

Variable	Default	Phase	Description
`LLM_PROVIDER`	`openai`	1	`openai` / `azure` / `mock`
`OPENAI_API_KEY`	—	1	OpenAI API key
`OPENAI_BASE_URL`	—	1	Optional base URL override
`AZURE_OPENAI_ENDPOINT`	—	1	Azure OpenAI endpoint
`AZURE_OPENAI_API_KEY`	—	1	Azure OpenAI key
`LLM_DEFAULT_MODEL`	`gpt-4o-mini`	1	Default text model
`LLM_VISION_MODEL`	`gpt-4o`	1	Default vision model
`LLM_EMBEDDING_MODEL`	`text-embedding-3-small`	2	Default embedding model
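The defaults in the table above can be applied with a small loader. The roadmap validates these variables through NoteLett's Zod config schema; the sketch below uses plain TypeScript instead so it stays dependency-free. `loadLlmEnvConfig` and the rule that a provider's credentials are required at load time are assumptions for illustration.

```typescript
type LlmProviderName = "openai" | "azure" | "mock";

interface LlmEnvConfig {
  provider: LlmProviderName;
  openaiApiKey: string | undefined;
  openaiBaseUrl: string | undefined;
  azureEndpoint: string | undefined;
  azureApiKey: string | undefined;
  defaultModel: string;
  visionModel: string;
  embeddingModel: string;
}

function isProviderName(value: string): value is LlmProviderName {
  return value === "openai" || value === "azure" || value === "mock";
}

// `env` is passed in (for example process.env) so the function is easy to test.
function loadLlmEnvConfig(env: Record<string, string | undefined>): LlmEnvConfig {
  const provider = env["LLM_PROVIDER"] ?? "openai";
  if (!isProviderName(provider)) {
    throw new Error(`Unsupported LLM_PROVIDER: ${provider}`);
  }
  // Assumption: fail fast when the selected provider has no credentials.
  if (provider === "openai" && !env["OPENAI_API_KEY"]) {
    throw new Error("OPENAI_API_KEY is required when LLM_PROVIDER=openai");
  }
  if (provider === "azure" && (!env["AZURE_OPENAI_ENDPOINT"] || !env["AZURE_OPENAI_API_KEY"])) {
    throw new Error("AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY are required when LLM_PROVIDER=azure");
  }
  return {
    provider,
    openaiApiKey: env["OPENAI_API_KEY"],
    openaiBaseUrl: env["OPENAI_BASE_URL"],
    azureEndpoint: env["AZURE_OPENAI_ENDPOINT"],
    azureApiKey: env["AZURE_OPENAI_API_KEY"],
    defaultModel: env["LLM_DEFAULT_MODEL"] ?? "gpt-4o-mini",
    visionModel: env["LLM_VISION_MODEL"] ?? "gpt-4o",
    embeddingModel: env["LLM_EMBEDDING_MODEL"] ?? "text-embedding-3-small",
  };
}
```

Setting `LLM_PROVIDER=mock` lets tests and CI run without any API key, which matches the mock provider listed for Phase 0.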

New API Endpoints (15 endpoints + 2 CRUD groups)

Method	Path	Phase	Feature
`GET`	`/api/prompt-templates`	1	List templates
`GET`	`/api/prompt-templates/:id`	1	Get template
`POST`	`/api/prompt-templates`	1	Create template
`PATCH`	`/api/prompt-templates/:id`	1	Update template
`DELETE`	`/api/prompt-templates/:id`	1	Delete template
`POST`	`/api/note-prompts/run`	1	Run prompt
`POST`	`/api/note-prompts/run-stream`	1	Run prompt (SSE)
`GET`	`/api/note-prompts/history`	1	Prompt run history
`POST`	`/api/notes/:id/suggest-tags`	1	F5
`POST`	`/api/notes/compare`	1	F14
`POST`	`/api/notes/merge`	1	F13
`POST`	`/api/notes/:id/check-duplicates`	2	F8
`POST`	`/api/notes/:id/suggest-links`	2	F9
`POST`	`/api/workspaces/:id/knowledge-gaps`	2	F12
`POST`	`/api/note-prompts/url-extract`	4	F17
`CRUD`	`/api/prompt-schedules`	5	F25
`CRUD`	`/api/prompt-webhooks`	5	F26

Commit Strategy

Phase 0 commits (common-plat)

feat(llm): add ContentPart type + multipart ChatMessage.content support
feat(llm): update OpenAI + Azure providers for vision messages
feat(llm): add embedding support (EmbeddingRequest/Response, embed())
feat(llm): add isVisionMessage + buildVisionMessage helpers
test(llm): add vision + embedding tests (30 tests)
feat(llm-router): add supportsVision + supportsEmbedding model capability flags
chore(llm): bump to 0.2.0 + publish

Phase 1 commits (notelett backend)

feat(backend): add @bytelyst/llm + lib/llm.ts singleton + LLM config
feat(note-prompts): types + Zod schemas for templates and run input/output
feat(note-prompts): repository — template CRUD
feat(note-prompts): runner — LLM orchestration + multi-note + chains
feat(note-prompts): routes — REST API endpoints (12 routes)
feat(note-prompts): seed 20 built-in prompt templates
feat(backend): upgrade copilot-transform.ts to use @bytelyst/llm
feat(backend): add reading-time utility
feat(agent-actions): add smart_action + auto_enrich types
feat(mcp): add notes.prompts.run tool
test(note-prompts): full test suite (55 tests)

Phase 2 commits

feat(backend): embeddings service — embed text + cosine similarity
feat(backend): note embedding storage on create/update
feat(backend): auto-summarize on save (feature-flag gated)
feat(backend): duplicate detection endpoint
feat(backend): related notes suggestion endpoint
feat(backend): knowledge gap detection endpoint
test(backend): intelligence tests (25 tests)

Phase 3 commits

feat(web): prompt-client API client
feat(web): SmartActionsPanel + RunPromptModal + PromptResultView
feat(web): PromptTemplateEditor + /prompts library page
feat(web): upgrade NoteEditor — Fix & Rewrite, Change Tone, Continue Writing, Inline Q&A
feat(web): duplicate detection toast + related notes panel
feat(web): auto-tag suggestion UI + export actions
feat(web): knowledge gap analysis page
feat(web): wire Smart Actions into note detail + sidebar
test(web): Smart Actions unit + E2E tests (36 tests)

Phase 4 commits

feat(mobile): note-prompts API client + prompt-store
feat(mobile): camera capture + image resize + blob upload
feat(mobile): SmartActionsSheet bottom sheet + PromptResultScreen
feat(mobile): voice-to-note — expo-av recording + transcription
feat(mobile): screenshot-to-note + multi-image scan
feat(mobile): URL-to-note + clipboard AI paste
feat(mobile): enhanced capture tab with 6 capture modes
test(mobile): Smart Actions tests (26 tests)

Phase 5 commits

feat(backend): scheduled Smart Actions — cron scheduler + CRUD
feat(backend): weekly workspace digest template + scheduled action
feat(backend): webhook-triggered actions — CRUD + trigger endpoint
feat(backend): approval-gated actions in runner
test(backend): scheduler + webhook tests (22 tests)

Phase 6 commits

feat(all): 8 feature flags for gradual rollout
feat(all): 11 telemetry events for Smart Actions
docs: update PRD, AGENTS.md, roadmaps for Smart Actions
docs: Smart Actions user guide
chore: update .env.example + Docker for LLM support
test: end-to-end integration tests (15 tests)

Risk Mitigation

Risk	Mitigation
OpenAI API costs	Per-user daily quota, model tier selection (gpt-4o-mini default, gpt-4o vision only), feature flag gating
Vision prompt latency (5-15s)	Progress indicator, allow background processing, cache identical requests
Image size limits	Client-side resize to max 2048px, compress < 4MB before upload
Prompt injection	System prompt hardening, output validation, truncate excessively long inputs
LLM hallucination	JSON mode where possible, output schema validation, clear UI disclaimer
Corporate proxy blocking OpenAI	Support Azure OpenAI as alternative (already in `@bytelyst/llm`)
Embedding cost at scale	Batch embeddings, cache embeddings on note doc, recompute only on content change
Audio transcription accuracy	Show editable preview before saving, allow manual corrections
Scheduler reliability	In-process interval (simple), log missed runs, diagnostics endpoint
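For the image size mitigation in the table above, the target dimensions for a client-side resize follow from scaling the longest edge down to 2048px while preserving aspect ratio. A minimal sketch of just that calculation; `fitWithin` is a hypothetical helper, and the actual resize and the compression to under 4MB would be done by whatever image library the client uses.

```typescript
interface ImageSize {
  width: number;
  height: number;
}

// Scale (width, height) so the longest edge is at most maxEdge, keeping aspect ratio.
// Images already within the limit are returned unchanged (never upscaled).
function fitWithin(width: number, height: number, maxEdge = 2048): ImageSize {
  if (width <= 0 || height <= 0 || maxEdge <= 0) {
    throw new RangeError("width, height and maxEdge must be positive");
  }
  const longest = Math.max(width, height);
  if (longest <= maxEdge) return { width, height };
  return {
    width: Math.round((width * maxEdge) / longest),
    height: Math.round((height * maxEdge) / longest),
  };
}
```

For example, a 4032x3024 camera photo would be resized to 2048x1536, while an 800x600 screenshot would be uploaded as-is.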

Future Extensions (Not in This Roadmap)

RAG context — include related notes as context in prompts for better answers
Agent marketplace prompts — share templates across ByteLyst products
Multi-step workflow builder — visual chain editor (drag-and-drop)
Streaming for mobile — SSE on React Native for real-time token display
Collaborative Smart Actions — run prompts across shared workspaces
Custom model support — plug in local Ollama models via @bytelyst/ollama-client
Action replay — re-run a previous Smart Action with same parameters
Template versioning — track changes to custom templates over time

Appendix: Review Findings & Resolutions

This appendix records a systematic code-level audit of this roadmap against the actual NoteLett and common-plat codebases, conducted in April 2026. Each finding cross-references the real source files.

Finding 1 — FIXED: Timeline diagram showed wrong dependency flow

Severity: Medium (an incorrect diagram could mislead parallel scheduling)
Was: Phases 3 and 4 branching from Phase 2
Fix: Phases 3 and 4 branch from Phase 1, and Phase 2 leads to Phase 5. Diagram corrected above.

Finding 2 — FIXED: `PromptTemplateDoc` was missing `productId` field

Severity: Critical (violates the NoteLett convention that every Cosmos document MUST include productId: "notelett")
Source: backend/src/modules/notes/types.ts; all other docs (NoteDoc, NoteArtifactDoc, NoteAgentActionDoc) have productId
Fix: Added productId, userId, createdAt, updatedAt to the type definition in §1.5.
Note: PromptScheduleDoc (§5.1) and PromptWebhookDoc (§5.3) must also include productId and userId when implemented.

Finding 3 — FIXED: Reading time endpoint was `POST`, should be `GET`

Severity: Low (pure calculation with no side effects)
Source: REST convention (GET for idempotent read operations)
Fix: Changed to GET /api/notes/:id/reading-time in §1.8.

Finding 4 — FIXED: Backend file count was wrong (claimed 18+8, actual 11+7)

Severity: Low (documentation accuracy)
Fix: Corrected to "11 new + 7 modified" in New Files Summary.

Finding 5 — FIXED: Web file count missing `copilot-client.ts` and `types.ts`

Severity: Medium (these files MUST be updated but were omitted)
Source: web/src/lib/copilot-client.ts defines CopilotAction = 'shorten' | 'expand' | 'bulletize' | 'grammar' and needs new types for F1/F2. web/src/lib/types.ts needs PromptTemplate, RunPromptInput, RunPromptOutput, SimilarNote, SuggestedLink, GapAnalysis types.
Fix: Added both to the web modified files list. Count corrected to "8 new + 5 modified".

Finding 6 — FIXED: Mobile capture sub-routes were listed as tabs

Severity: High (would break the 5-tab navigator)
Source: mobile/src/app/(tabs)/_layout.tsx has exactly 5 tabs: Home, Search, Capture, Inbox, Settings. Adding 3 more tabs (voice-capture, url-capture, scan-capture) would overflow the tab bar.
Fix: Changed to sub-routes of capture: src/app/capture/voice.tsx, src/app/capture/url.tsx, src/app/capture/scan.tsx. These are navigated to FROM the capture tab and are not separate tabs.

Finding 7 — FIXED: Phase 6 deliverables listed test count twice (redundant)

Severity: Low
Fix: Consolidated into a single line.

Finding 8 — OPEN: Embedding storage strategy needs decision

Severity: High (affects Cosmos RU cost and query patterns)
Issue: §2.2 proposes storing embedding: number[] directly on NoteDoc. For text-embedding-3-small, each embedding is 1536 floats (~6KB as 32-bit floats, and more once serialized as JSON numbers). This adds at least ~6KB to every NoteDoc read, affecting list queries and the partition-level throughput of the notes container.
Recommendation: Either:

(a) Store embeddings in a SEPARATE note_embeddings container (partition: /workspaceId), with documents keyed by noteId. Keeps NoteDoc lean.
(b) Store inline but use Cosmos projection queries (SELECT c.id, c.title, c.embedding FROM c) to avoid pulling full note bodies when only embeddings are needed.

Option (a) is preferred for scale. It adds 1 new Cosmos container.
Action: Implementer should choose (a) or (b) at Phase 2 start and update cosmos-init.ts accordingly.
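To make option (a) concrete, the sketch below shows one possible document shape for a separate embeddings container together with the cosine-similarity comparison that duplicate detection (F8) would run over it. The `NoteEmbeddingDoc` field names, the `findSimilarNotes` helper, and the 0.9 threshold are assumptions for illustration, not decisions recorded in this roadmap.

```typescript
// Hypothetical document shape for a note_embeddings container (option a).
interface NoteEmbeddingDoc {
  id: string;            // assumed equal to noteId
  noteId: string;
  workspaceId: string;   // partition key
  productId: "notelett"; // required on every Cosmos document by NoteLett convention
  embedding: number[];   // 1536 values for text-embedding-3-small
}

function cosineSimilarity(a: readonly number[], b: readonly number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Vector length mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  // A zero vector has no direction; treat it as dissimilar to everything.
  if (normA === 0 || normB === 0) return 0;
  return dot / Math.sqrt(normA * normB);
}

// Return candidates in the same workspace whose similarity meets the threshold,
// most similar first. The default threshold is an assumed starting point.
function findSimilarNotes(
  target: NoteEmbeddingDoc,
  candidates: readonly NoteEmbeddingDoc[],
  threshold = 0.9,
): Array<{ noteId: string; similarityScore: number }> {
  return candidates
    .filter((c) => c.noteId !== target.noteId && c.workspaceId === target.workspaceId)
    .map((c) => ({ noteId: c.noteId, similarityScore: cosineSimilarity(target.embedding, c.embedding) }))
    .filter((r) => r.similarityScore >= threshold)
    .sort((x, y) => y.similarityScore - x.similarityScore);
}
```

Partitioning by /workspaceId keeps the candidate scan to a single partition, and list queries on the notes container never touch the vectors.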

Finding 9 — OPEN: Voice-to-note (F15) transcription backend not fully specified

Severity: Medium (implementation decision needed)
Issue: §4.7 says "Call backend transcription endpoint (or use extraction-service with speech task)" but no endpoint or extraction task is defined.
Options:

(a) Add speech_transcription task to extraction-service (Python sidecar already supports Whisper/Azure STT)
(b) New backend endpoint POST /api/note-prompts/transcribe that calls Azure Speech SDK
(c) Client-side transcription via an on-device speech-recognition library (limited quality); expo-speech itself only provides text-to-speech

Recommendation: Option (a), since extraction-service already has Python sidecar infrastructure. Add task type speech_transcription and a new endpoint POST /api/note-prompts/transcribe that wraps extraction-service.

Finding 10 — OPEN: URL-to-note backend endpoint assigned to Phase 4 but needs backend work

Severity: Medium (mobile Phase 4 depends on a backend route that isn't in Phase 1)
Issue: POST /api/note-prompts/url-extract is listed in the API endpoints table as Phase 4, but this is a SERVER-SIDE endpoint (URL fetch, HTML strip, summarize). It must be implemented in the BACKEND before mobile can use it.
Recommendation: Move this endpoint to Phase 1 (backend routes) since the runner infrastructure is already being built there.

Finding 11 — OPEN: Phase 5 `PromptWebhookDoc` needs its own Cosmos container

Severity: Low (currently untracked)
Issue: §5.3 defines PromptWebhookDoc but no Cosmos container is mentioned for it. The "New Cosmos Containers" section only lists note_prompts and note_prompt_schedules.
Recommendation: Add a note_prompt_webhooks container (partition: /workspaceId) or store webhooks in note_prompt_schedules with a discriminator.

Finding 12 — OPEN: `@bytelyst/llm` factory reads env vars directly, not via Zod config

Severity: Low (clarification needed, not a bug)
Issue: factory.ts in @bytelyst/llm reads process.env.LLM_PROVIDER, process.env.OPENAI_API_KEY, etc. directly. The roadmap also adds these to NoteLett's Zod config schema (§1.3). These serve different purposes:

@bytelyst/llm factory: reads env at provider instantiation time
NoteLett config.ts: validates env at startup for fail-fast

Clarification: Both are correct. config.ts validates upfront, but the LLM package uses its own env reads. There is no code conflict, but implementers should know the LLM package ignores NoteLett's parsed config object.

Finding 13 — RESOLVED: `chatCompletionStream()` now implemented in all providers

Severity: Medium (F3, Continue Writing, depends on streaming)
Source: packages/llm/src/providers/openai.ts, azure-openai.ts, mock.ts
Resolution: chatCompletionStream() is fully implemented in OpenAI (SSE parsing, buffer handling, [DONE] sentinel), Azure OpenAI (same pattern), and Mock (word-by-word simulation). 3 streaming tests in llm.test.ts. No further work needed.
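The three pieces named in the resolution (SSE parsing, buffer handling, [DONE] sentinel) follow a common pattern, sketched below for readers unfamiliar with it. This is an illustration of the general technique, not the code in `@bytelyst/llm`; `SseDataBuffer` is a hypothetical name, and the sketch ignores SSE features such as multi-line `data:` fields and `event:` lines.

```typescript
// Accumulates raw text chunks from a streaming response and yields complete
// `data:` payloads. Network chunks can split a line anywhere, so the trailing
// partial line is kept in a buffer until the rest arrives.
class SseDataBuffer {
  private buffer = "";
  private done = false;

  // Feed one chunk; returns the payloads completed by this chunk.
  push(chunk: string): string[] {
    if (this.done) return [];
    this.buffer += chunk;

    const lines = this.buffer.split("\n");
    this.buffer = lines.pop() ?? ""; // last element is an incomplete line (or "")

    const payloads: string[] = [];
    for (const rawLine of lines) {
      const line = rawLine.trim(); // also strips a trailing "\r"
      if (!line.startsWith("data:")) continue; // skip blank separators and comments
      const data = line.slice("data:".length).trim();
      if (data === "[DONE]") {
        this.done = true; // sentinel: the stream has ended
        break;
      }
      payloads.push(data);
    }
    return payloads;
  }

  get finished(): boolean {
    return this.done;
  }
}
```

A caller would decode each network chunk to text, pass it to `push()`, `JSON.parse` each returned payload, and stop reading once `finished` is true.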

Finding 14 — RESOLVED: CopilotAction union expanded

Severity: Medium (F1/F2 require new action types)
Resolution: CopilotAction expanded to include 'fix-rewrite', 'change-tone', 'continue', 'explain'. CopilotBodySchema in notes/routes.ts updated. grammar kept as a deprecated alias for fix-rewrite.

Finding 15 — NOTE: `@bytelyst/llm` has zero runtime dependencies

Severity: Info
Source: packages/llm/package.json has only devDependencies: { vitest } and uses native fetch().
Impact: No extra bundling concerns. Requires Node 18+ or a fetch polyfill.

Finding 16 — NOTE: `note_artifacts` has `summary` as an existing artifact type

Severity: Info
Source: backend/src/modules/note-artifacts/types.ts line 3: NOTE_ARTIFACT_TYPES = ['file', 'summary', 'extraction', 'citation', 'export']
Impact: F6 (auto-summarize) can use artifactType: 'summary' directly, with no schema changes needed for artifact types. F20-F23 (export actions) can use artifactType: 'export'. Good alignment.

Finding 17 — NOTE: Agent action type `'summarize'` already exists

Severity: Info
Source: backend/src/modules/note-agent-actions/types.ts line 3: NOTE_AGENT_ACTION_TYPES = ['create', 'update', 'summarize', 'extract_tasks', 'attach_citation']
Impact: We can reuse 'summarize' for F6 (auto-summarize) or still add 'smart_action' as a general-purpose type.
Recommendation: Add 'smart_action' and 'auto_enrich' as planned, and use 'smart_action' for all prompt runs (the template slug provides the specificity).

Finding 18 — NOTE: Phase 5 `weekly-digest` is template #21, but §1.9 seeds only 20

Severity: Low (consistency)
Issue: §5.2 says "Add template #21: weekly-digest". This means Phase 5 adds a 21st template, seeded separately from the initial 20.
Clarification: This is correct behavior: the built-in template count grows from 20 to 21 in Phase 5. The seed file should support incremental additions (upsert by slug, not hard-coded count).
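The "upsert by slug" behaviour can be sketched with an in-memory map standing in for the note_prompts container. `seedTemplates` and `BuiltInTemplate` are hypothetical names, and the real seed.ts would issue Cosmos upserts instead of `Map.set`.

```typescript
// Trimmed-down template shape; real templates carry prompts, input/output types, etc.
interface BuiltInTemplate {
  slug: string;
  name: string;
}

// Idempotent seeding: every seed is written under its slug, so re-running the
// seed never duplicates a template and later phases can append new ones.
function seedTemplates(
  store: Map<string, BuiltInTemplate>,
  seeds: readonly BuiltInTemplate[],
): { inserted: string[]; updated: string[] } {
  const inserted: string[] = [];
  const updated: string[] = [];
  for (const seed of seeds) {
    (store.has(seed.slug) ? updated : inserted).push(seed.slug);
    store.set(seed.slug, { ...seed });
  }
  return { inserted, updated };
}
```

Running the Phase 1 seed and then a Phase 5 seed that adds `weekly-digest` leaves the earlier templates in place and inserts only the new slug, with no count hard-coded anywhere.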

Summary of Inline Fixes Applied

#	Finding	Severity	Status
1	Timeline diagram wrong dependency	Medium	Fixed
2	Missing `productId` in PromptTemplateDoc	Critical	Fixed
3	Reading time `POST` → `GET`	Low	Fixed
4	Backend file count 18+8 → 11+7	Low	Fixed
5	Web missing copilot-client.ts + types.ts	Medium	Fixed
6	Mobile tabs overflow (voice/url/scan)	High	Fixed
7	Phase 6 duplicate test count	Low	Fixed
8	Embedding storage strategy	High	Resolved — `@bytelyst/llm` `embed()` implemented in OpenAI, Azure, Mock providers. Separate `note_embeddings` container deferred to Phase 2.
9	Voice transcription backend unspecified	Medium	Resolved — `POST /api/transcribe` added to extraction-service (OpenAI Whisper API). `transcribe()` added to `@bytelyst/extraction` client.
10	URL-extract endpoint in wrong phase	Medium	Resolved — `POST /api/note-prompts/url-extract` implemented in Phase 1 backend routes.
11	Webhook container missing	Low	Resolved — `note_prompt_webhooks` container added to `cosmos-init.ts`.
12	LLM factory vs Zod config clarification	Low	Resolved — info only, no conflict.
13	Streaming not implemented in providers	Medium	Resolved — `chatCompletionStream()` implemented in OpenAI, Azure, Mock providers with SSE parsing.
14	CopilotAction union needs expansion	Medium	Resolved — expanded to include `fix-rewrite`, `change-tone`, `continue`, `explain`.
15	Zero runtime deps in @bytelyst/llm	Info	Noted
16	Artifact type 'summary' already exists	Info	Noted
17	Agent action 'summarize' already exists	Info	Noted
18	Template #21 added in Phase 5	Low	Noted
