Smart Actions Roadmap — End-to-End Implementation (v2)
Feature: AI-powered note intelligence: Smart Actions, inline editor AI, capture enhancement, cross-note intelligence, agent workflows
Scope: learning_ai_common_plat (shared LLM packages) + learning_ai_notes (backend + web + mobile)
Author: Product Team
Date: April 2026
Version: 2.0 (full 27-feature roadmap across 6 categories, 7 phases)
Executive Summary
This roadmap delivers a comprehensive AI layer for NoteLett across 27 features in 6 categories, spanning the shared platform, backend, web, and mobile. It transforms NoteLett into an AI-native knowledge workspace where:
- Users run Smart Actions on notes (text + images) — summarize, translate, rate food labels, parse receipts
- The editor has inline AI — rewrite, change tone, continue writing, explain highlighted text
- Note intelligence runs in the background — auto-summarize, auto-tag, detect duplicates, suggest links
- Capture is AI-enhanced — voice-to-note, screenshot OCR, URL extraction, multi-image processing
- Cross-note intelligence — weekly digests, knowledge gap detection, note merge/compare
- Agent workflows — scheduled actions, webhook triggers, approval-gated actions, action chains
The feature spans two codebases:
- learning_ai_common_plat: Enhance @bytelyst/llm with vision + embedding support
- learning_ai_notes: Backend module + web UI + mobile app
Timeline Overview
| Phase | What | Where | Duration | Depends on | Parallel? |
|---|---|---|---|---|---|
| 0 | LLM vision + embedding support | common-plat | 3 days | — | — |
| 1 | Note prompts core + copilot upgrade | backend | 4-5 days | Phase 0 | — |
| 2 | Note intelligence (background AI) | backend | 2-3 days | Phase 1 | — |
| 3 | Smart Actions web UI + editor AI | web | 4-5 days | Phase 1 | Yes (with 4, 5) |
| 4 | Smart Actions mobile + capture | mobile | 4-5 days | Phase 1 | Yes (with 3, 5) |
| 5 | Agent & workflow intelligence | backend + web | 2-3 days | Phase 2 | Yes (with 3, 4) |
| 6 | Polish, E2E, documentation | all | 2-3 days | Phases 3-5 | — |
Total: ~21-27 days sequential (sum of the phase durations above), ~13-17 days with parallel execution (critical path: Phase 0 → 1 → 2 → 5 → 6)
```text
Phase 0 (3d) → Phase 1 (5d) ─┬──→ Phase 2 (3d) ──→ Phase 5 (3d) ──┐
                             ├──→ Phase 3 (5d) ───────────────────┼──→ Phase 6 (3d)
                             └──→ Phase 4 (5d) ───────────────────┘
```
Feature Master List — 27 Features, 6 Categories
Cat 1: Inline Editor AI
| # | Feature | Description | Phase |
|---|---|---|---|
| F1 | Fix & Rewrite | Select text → rewrite with proper grammar, tone, clarity | 3 |
| F2 | Change Tone | Rewrite selection as formal / casual / professional / friendly | 3 |
| F3 | Continue Writing | LLM generates next 2-3 paragraphs from cursor context (streaming) | 3 |
| F4 | Inline Q&A | Highlight term → "Explain this" → tooltip with definition | 3 |
| F5 | Auto-tag suggestion | After save, LLM suggests 3-5 tags based on content | 1 |
Cat 2: Note Intelligence
| # | Feature | Description | Phase |
|---|---|---|---|
| F6 | Auto-summarize on save | Note body > 300 words → auto-generate summary artifact | 2 |
| F7 | Smart title suggestion | Upgrade existing suggestTitleFromBody() to use @bytelyst/llm | 1 |
| F8 | Duplicate/similar note detection | Before save, warn if semantically similar notes exist | 2 |
| F9 | Auto-link related notes | After creation, suggest 3-5 related notes to link to | 2 |
| F10 | Reading time estimate | Display estimated reading time on each note | 1 |
Cat 3: Multi-Note Intelligence
| # | Feature | Description | Phase |
|---|---|---|---|
| F11 | Weekly workspace digest | Auto-generate summary of all workspace activity this week | 5 |
| F12 | Knowledge gap detection | Identify topics mentioned but under-covered in workspace | 2 |
| F13 | Note merge | Select 2+ notes → LLM merges into single coherent note | 1 |
| F14 | Compare notes | Select 2 notes → LLM produces comparison summary | 1 |
Cat 4: Capture Enhancement
| # | Feature | Description | Phase |
|---|---|---|---|
| F15 | Voice-to-note | Record audio → transcribe → save as note | 4 |
| F16 | Screenshot-to-note | Share screenshot → OCR + LLM cleanup → structured note | 4 |
| F17 | URL-to-note | Paste URL → extract content → summarize → save | 4 |
| F18 | Multi-image capture | Photograph multiple pages → combine into one note | 4 |
| F19 | Clipboard AI paste | Paste messy text → LLM cleans and structures it | 4 |
Cat 5: Export & Sharing Intelligence
| # | Feature | Description | Phase |
|---|---|---|---|
| F20 | Shareable summary | One-click polished shareable version of a note | 1+3 |
| F21 | Presentation outline | Note → structured slide outline (title + bullets) | 1+3 |
| F22 | Email draft | Note → formatted email with subject, greeting, body | 1+3 |
| F23 | Social post | Note → Twitter/LinkedIn post draft | 1+3 |
Cat 6: Agent & Workflow Intelligence
| # | Feature | Description | Phase |
|---|---|---|---|
| F24 | Smart Action chains | Pipe output of one action as input to next | 1 |
| F25 | Scheduled Smart Actions | Cron-like: "Summarize workspace every Friday" | 5 |
| F26 | Webhook-triggered actions | External event → auto-run a Smart Action | 5 |
| F27 | Approval-gated actions | High-risk actions require human review before applying | 5 |
Phase 0 — Common Platform: LLM Vision + Embedding Support
Repo: learning_ai_common_plat
Duration: 3 days
Depends on: Nothing
Features enabled: Foundation for all F1-F27
0.1 Enhance @bytelyst/llm ChatMessage for multipart content
The current ChatMessage.content is string-only. Vision models (GPT-4o, Gemini) require multipart content arrays.
File: packages/llm/src/types.ts
| Change | Detail |
|---|---|
| New ContentPart type | { type: 'text'; text: string } \| { type: 'image_url'; image_url: { url: string; detail?: 'auto' \| 'low' \| 'high' } } |
| Update ChatMessage.content | string \| ContentPart[] |
| New isVisionMessage() helper | Type guard to check if a message contains image parts |
| New buildVisionMessage() helper | Convenience: (text: string, imageUrl: string) => ChatMessage |
Tests: 8-10 new tests
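A minimal sketch of the types and helpers listed in 0.1, for orientation only. The ContentPart shape and the helper signatures come from the table above; the role values, the default detail setting, and the helper bodies are assumptions, not the package's actual code.

```typescript
// ContentPart follows the table in 0.1; the role set below is an assumption.
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string; detail?: 'auto' | 'low' | 'high' } };

interface ChatMessage {
  role: 'system' | 'user' | 'assistant';
  content: string | ContentPart[];
}

/** Type guard: true when the message carries at least one image part. */
function isVisionMessage(
  message: ChatMessage,
): message is ChatMessage & { content: ContentPart[] } {
  return (
    Array.isArray(message.content) &&
    message.content.some((part) => part.type === 'image_url')
  );
}

/** Convenience builder: one text part followed by one image part. */
function buildVisionMessage(text: string, imageUrl: string): ChatMessage {
  return {
    role: 'user',
    content: [
      { type: 'text', text },
      // 'auto' is an assumed default for the optional detail field.
      { type: 'image_url', image_url: { url: imageUrl, detail: 'auto' } },
    ],
  };
}
```

Keeping content as string | ContentPart[] means existing text-only callers keep working unchanged, while vision callers opt in through the builder.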
0.2 Update OpenAIProvider for vision
File: packages/llm/src/providers/openai.ts
| Change | Detail |
|---|---|
| Pass multipart content to API | When content is an array, send as-is (OpenAI format) |
| Default model upgrade | If any message has image content, auto-suggest gpt-4o |
Tests: 4-6 new tests (mock HTTP)
0.3 Update AzureOpenAIProvider for vision
Same multipart content handling as OpenAI provider. Tests: 4-6 new tests.
0.4 Update MockLLMProvider
Return deterministic mock responses when vision content is detected, for downstream test use.
0.5 Add streaming support enhancement
Ensure chatCompletionStream() works with multipart content for F3 (Continue Writing).
0.6 Add embedding support (for F8, F9, F12)
File: packages/llm/src/types.ts + providers
| Change | Detail |
|---|---|
| New EmbeddingRequest type | { input: string \| string[]; model?: string } |
| New EmbeddingResponse type | { embeddings: number[][]; model: string; usage: TokenUsage } |
| Add embed() to LLMProvider | Optional method for embedding generation |
| Implement in OpenAIProvider | Call /v1/embeddings endpoint |
| Implement in AzureOpenAIProvider | Call Azure embeddings endpoint |
| Implement in MockLLMProvider | Return deterministic fake embeddings |
Tests: 6-8 new tests
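One way the mock provider's "deterministic fake embeddings" could work is sketched below. The request and response shapes follow the table in 0.6; the TokenUsage fields, the 8-dimension vector size, the hashing scheme, and the synchronous mockEmbed() function are all illustrative assumptions (the real embed() is presumably asynchronous).

```typescript
// Shapes follow section 0.6; TokenUsage field names are assumed.
interface TokenUsage { promptTokens: number; completionTokens: number; totalTokens: number }
interface EmbeddingRequest { input: string | string[]; model?: string }
interface EmbeddingResponse { embeddings: number[][]; model: string; usage: TokenUsage }

const MOCK_DIMENSIONS = 8; // hypothetical size; real models return far more dimensions

/** Map text to a fixed-length vector with a rolling hash, so equal text gives equal vectors. */
function fakeEmbedding(text: string): number[] {
  const vector: number[] = [];
  for (let d = 0; d < MOCK_DIMENSIONS; d++) vector.push(0);
  for (let i = 0; i < text.length; i++) {
    const slot = i % MOCK_DIMENSIONS;
    vector[slot] = ((vector[slot] ?? 0) * 31 + text.charCodeAt(i)) % 1000;
  }
  return vector.map((v) => v / 1000);
}

/** Hypothetical stand-in for MockLLMProvider.embed(). */
function mockEmbed(request: EmbeddingRequest): EmbeddingResponse {
  const inputs = Array.isArray(request.input) ? request.input : [request.input];
  return {
    embeddings: inputs.map((text) => fakeEmbedding(text)),
    model: request.model ?? 'mock-embedding',
    usage: { promptTokens: 0, completionTokens: 0, totalTokens: 0 },
  };
}
```

Because the vectors depend only on the input text, downstream similarity tests can assert exact values without any network access.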
0.7 Export new types + helpers
File: packages/llm/src/index.ts — export ContentPart, EmbeddingRequest, EmbeddingResponse, isVisionMessage, buildVisionMessage
0.8 Update @bytelyst/llm-router
| Change | Detail |
|---|---|
| Vision-aware routing | classifyPrompt() detects image content → routes to vision-capable models |
| Model capability flags | Add supportsVision: boolean and supportsEmbedding: boolean to ModelConfig |
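The routing rule in 0.8 can be pictured with the sketch below. Only the supportsVision and supportsEmbedding flags come from the table; the simplified message types, the pickModel() name, and the "first capable model wins" policy are assumptions, and the real classifyPrompt() may weigh other signals.

```typescript
// Capability flags follow section 0.8; everything else here is illustrative.
interface ModelConfig { name: string; supportsVision: boolean; supportsEmbedding: boolean }
interface MessagePart { type: 'text' | 'image_url' }
interface RoutedMessage { content: string | MessagePart[] }

/** True when any message contains an image part. */
function hasImageContent(messages: RoutedMessage[]): boolean {
  return messages.some(
    (m) => typeof m.content !== 'string' && m.content.some((p) => p.type === 'image_url'),
  );
}

/** Return the first model (in preference order) that can serve the request. */
function pickModel(messages: RoutedMessage[], models: ModelConfig[]): ModelConfig | undefined {
  const needsVision = hasImageContent(messages);
  for (const model of models) {
    if (!needsVision || model.supportsVision) return model;
  }
  return undefined; // caller decides how to report "no capable model"
}
```

Listing cheaper text-only models first keeps text prompts on them, while image prompts skip ahead to the first vision-capable entry.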
0.9 Publish updated packages
Bump versions → publish to Gitea npm registry.
Phase 0 Deliverables:
- @bytelyst/llm@0.2.0: vision + embedding + streaming enhancements
- @bytelyst/llm-router@0.2.0: vision-aware routing + capability flags
- All existing tests pass + 25-30 new tests
- Published to Gitea npm registry
Phase 1 — Backend: Note Prompts Core + Copilot Upgrade
Repo: learning_ai_notes
Duration: 4-5 days
Depends on: Phase 0
Features: F5 (auto-tag), F7 (smart title), F10 (reading time), F13 (merge), F14 (compare), F20-F23 (templates), F24 (chains)
1.1 Add LLM dependency to backend
File: backend/package.json — add "@bytelyst/llm": "^0.2.0"
1.2 Create backend/src/lib/llm.ts
Singleton wrapper over @bytelyst/llm:
```typescript
import { getLLM, type LLMProvider } from '@bytelyst/llm';

// Lazily created provider, shared by the whole backend process.
let _llm: LLMProvider | null = null;

export function getNoteLettLLM(): LLMProvider {
  if (!_llm) _llm = getLLM();
  return _llm;
}
```
1.3 Add LLM env vars to config
File: backend/src/lib/config.ts
| Variable | Default | Description |
|---|---|---|
| LLM_PROVIDER | openai | openai / azure / mock |
| OPENAI_API_KEY | (none) | OpenAI API key |
| OPENAI_BASE_URL | (none) | Optional base URL override |
| AZURE_OPENAI_ENDPOINT | (none) | Azure OpenAI endpoint |
| AZURE_OPENAI_API_KEY | (none) | Azure OpenAI key |
| LLM_DEFAULT_MODEL | gpt-4o-mini | Default model for text prompts |
| LLM_VISION_MODEL | gpt-4o | Default model for image prompts |
| LLM_EMBEDDING_MODEL | text-embedding-3-small | Default model for embeddings |
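As a rough illustration of how these variables might be read with their defaults, here is a sketch. The variable names and default values come from the table; the LLMConfig shape and the loadLLMConfig() function are hypothetical and do not reflect the existing config.ts.

```typescript
type LLMProviderName = 'openai' | 'azure' | 'mock';

// Hypothetical config shape; only the env var names and defaults come from the table.
interface LLMConfig {
  provider: LLMProviderName;
  openaiApiKey: string | undefined;
  openaiBaseUrl: string | undefined;
  azureEndpoint: string | undefined;
  azureApiKey: string | undefined;
  defaultModel: string;
  visionModel: string;
  embeddingModel: string;
}

/** Read LLM settings from an env-like map, applying the documented defaults. */
function loadLLMConfig(env: Record<string, string | undefined>): LLMConfig {
  const provider = env['LLM_PROVIDER'] ?? 'openai';
  if (provider !== 'openai' && provider !== 'azure' && provider !== 'mock') {
    throw new Error(`Unsupported LLM_PROVIDER: ${provider}`);
  }
  return {
    provider,
    openaiApiKey: env['OPENAI_API_KEY'],
    openaiBaseUrl: env['OPENAI_BASE_URL'],
    azureEndpoint: env['AZURE_OPENAI_ENDPOINT'],
    azureApiKey: env['AZURE_OPENAI_API_KEY'],
    defaultModel: env['LLM_DEFAULT_MODEL'] ?? 'gpt-4o-mini',
    visionModel: env['LLM_VISION_MODEL'] ?? 'gpt-4o',
    embeddingModel: env['LLM_EMBEDDING_MODEL'] ?? 'text-embedding-3-small',
  };
}
```

Passing the environment in as a plain map (for example process.env) keeps the loader easy to test without touching real environment variables.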
1.4 New Cosmos container: note_prompts
File: backend/src/lib/cosmos-init.ts — register note_prompts container (partition key: /userId)
1.5 Create backend/src/modules/note-prompts/types.ts
Key types:
- PromptTemplateDoc: id, productId, userId, slug, name, description, category, systemPrompt, userPromptTemplate, inputType (text/image/text+image/multi-note), outputFormat, outputAction (new_note/artifact/update_note), parameters, builtIn, createdAt, updatedAt
- PromptParameter: key, label, type (string/select), options, default, required
- RunPromptInput: noteId, workspaceId, promptTemplateId OR inlinePrompt, parameters, imageUrls, additionalNoteIds (for F13/F14 merge/compare), previousResultNoteId (for F24 chains), dryRun, agentId
- RunPromptOutput: resultNoteId, resultArtifactId, content, model, tokenUsage, agentActionId, suggestedTags (for F5)
Zod schemas for all of the above.
1.6 Create backend/src/modules/note-prompts/repository.ts
CRUD for PromptTemplateDoc:
- listTemplates(userId): returns built-in + user's custom templates
- getTemplate(id, userId)
- createTemplate(doc)
- updateTemplate(id, userId, updates)
- deleteTemplate(id, userId): cannot delete built-in templates
1.7 Create backend/src/modules/note-prompts/runner.ts
The core orchestration logic:
1. Validate input (template or inline prompt)
2. Fetch the source note (verify ownership + productId)
3. If additionalNoteIds provided (F13/F14 merge/compare):
   a. Fetch all additional notes
   b. Combine content into multi-note context
4. If template has inputType 'image' or 'text+image':
   a. Fetch artifact images from blob storage via SAS URLs
   b. Build vision message with buildVisionMessage()
5. If previousResultNoteId provided (F24 chains):
   a. Fetch previous result note
   b. Include its content as additional context
6. Build LLM messages array:
   - System: template.systemPrompt (or default)
   - User: interpolated template with note content + images + additional notes
7. Call getNoteLettLLM().chatCompletion(request)
8. Post-process response:
   - If template slug is 'auto-tag': parse tags from response → return as suggestedTags
   - If outputAction is 'new_note': createNote() → link to source via note-relationships
   - If outputAction is 'artifact': createNoteArtifact() on source note
   - If outputAction is 'update_note': updateNote() body/tags on source note
9. Record NoteAgentActionDoc (actionType: 'smart_action')
10. Return RunPromptOutput
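Steps 3 to 6 amount to assembling one system message and one user message. The sketch below shows one possible assembly, assuming the note content and template text have already been fetched and interpolated. The buildRunnerMessages() name, the section separators, and the default system prompt are hypothetical; the actual runner may structure the context differently.

```typescript
type ContentPart =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

interface ChatMessage { role: 'system' | 'user'; content: string | ContentPart[] }
interface NoteLike { title: string; body: string }

interface RunnerContext {
  systemPrompt?: string;        // template.systemPrompt, if any
  userText: string;             // interpolated user prompt (step 6)
  additionalNotes?: NoteLike[]; // step 3: merge/compare inputs
  previousResult?: NoteLike;    // step 5: chained input
  imageUrls?: string[];         // step 4: SAS URLs for artifact images
}

const DEFAULT_SYSTEM_PROMPT = 'You are a helpful note assistant.'; // assumed default

function buildRunnerMessages(ctx: RunnerContext): ChatMessage[] {
  const sections: string[] = [ctx.userText];
  for (const note of ctx.additionalNotes ?? []) {
    sections.push(`--- Additional note: ${note.title} ---\n${note.body}`);
  }
  if (ctx.previousResult) {
    sections.push(`--- Previous result: ${ctx.previousResult.title} ---\n${ctx.previousResult.body}`);
  }
  const text = sections.join('\n\n');

  // Text-only runs keep a plain string; image runs switch to multipart content.
  let userContent: string | ContentPart[] = text;
  const images = ctx.imageUrls ?? [];
  if (images.length > 0) {
    const parts: ContentPart[] = [{ type: 'text', text }];
    for (const url of images) parts.push({ type: 'image_url', image_url: { url } });
    userContent = parts;
  }

  return [
    { role: 'system', content: ctx.systemPrompt ?? DEFAULT_SYSTEM_PROMPT },
    { role: 'user', content: userContent },
  ];
}
```

Keeping this step a pure function of already-fetched data makes it straightforward to cover with the mock LLM tests planned in 1.14.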
1.8 Create backend/src/modules/note-prompts/routes.ts
| Method | Path | Auth | Description | Features |
|---|---|---|---|---|
| GET | /api/prompt-templates | viewer | List built-in + user templates | Core |
| GET | /api/prompt-templates/:id | viewer | Get single template | Core |
| POST | /api/prompt-templates | admin | Create custom template | Core |
| PATCH | /api/prompt-templates/:id | admin | Update custom template | Core |
| DELETE | /api/prompt-templates/:id | admin | Delete custom template | Core |
| POST | /api/note-prompts/run | admin | Run a prompt on a note | Core |
| POST | /api/note-prompts/run-stream | admin | Run with SSE streaming | F3 |
| GET | /api/note-prompts/history | viewer | List past prompt runs | Core |
| POST | /api/notes/:id/suggest-tags | admin | Suggest tags via LLM | F5 |
| GET | /api/notes/:id/reading-time | viewer | Calculate reading time | F10 |
| POST | /api/notes/compare | admin | Compare 2+ notes | F14 |
| POST | /api/notes/merge | admin | Merge 2+ notes | F13 |
1.9 Seed 20 built-in prompt templates
File: backend/src/modules/note-prompts/seed.ts
| # | Slug | Name | Input | Output | Category | Feature |
|---|---|---|---|---|---|---|
| 1 | summarize | Summarize | text | new_note | transform | Core |
| 2 | translate | Translate | text | new_note | transform | Core |
| 3 | simplify | Simplify / ELI5 | text | artifact | transform | Core |
| 4 | extract-key-facts | Extract Key Facts | text | artifact | extract | Core |
| 5 | food-label-rating | Rate Food Label | image | new_note | analysis | Core |
| 6 | parse-receipt | Parse Receipt | image | new_note | extract | Core |
| 7 | read-business-card | Read Business Card | image | new_note | extract | Core |
| 8 | handwriting-to-text | Handwriting to Text | image | new_note | transform | Core |
| 9 | generate-flashcards | Generate Flashcards | text | new_note | generate | Core |
| 10 | pros-and-cons | Pros & Cons | text | artifact | analysis | Core |
| 11 | presentation-outline | Presentation Outline | text | new_note | generate | F21 |
| 12 | email-draft | Email Draft | text | new_note | generate | F22 |
| 13 | social-post | Social Post | text | artifact | generate | F23 |
| 14 | shareable-summary | Shareable Summary | text | new_note | transform | F20 |
| 15 | compare-notes | Compare Notes | multi-note | new_note | analysis | F14 |
| 16 | merge-notes | Merge Notes | multi-note | new_note | transform | F13 |
| 17 | fix-rewrite | Fix & Rewrite | text | update_note | transform | F1 |
| 18 | change-tone | Change Tone | text | update_note | transform | F2 |
| 19 | continue-writing | Continue Writing | text | update_note | generate | F3 |
| 20 | auto-tag | Auto-Tag | text | update_note | extract | F5 |
1.10 Upgrade copilot-transform.ts to use @bytelyst/llm (F1, F2, F7)
File: backend/src/lib/copilot-transform.ts
Replace extraction-service calls with direct @bytelyst/llm calls:
- runCopilotTransform() → getNoteLettLLM().chatCompletion() with action-specific system prompts
- suggestTitleFromBody() → getNoteLettLLM().chatCompletion() with title-suggestion prompt
- Add rewriteText(text, style) for F1/F2; accepts a tone parameter
- Keep extraction-service fallback for graceful degradation
1.11 Reading time utility (F10)
File: backend/src/lib/reading-time.ts
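```typescript
/** Estimate reading time for an HTML note body at roughly 238 words per minute. */
export function estimateReadingTime(html: string): { minutes: number; words: number } {
  // Strip tags and collapse whitespace to get plain text.
  const plain = html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
  // An empty body has zero words; ''.split(...) would otherwise report one.
  const words = plain === '' ? 0 : plain.split(' ').length;
  return { minutes: Math.max(1, Math.ceil(words / 238)), words };
}
```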
Pure calculation — no LLM needed. Expose via GET /api/notes/:id response and note detail endpoints.
1.12 Extend agent action types
File: backend/src/modules/note-agent-actions/types.ts
Add 'smart_action' and 'auto_enrich' to NOTE_AGENT_ACTION_TYPES.
1.13 Register routes in server.ts + MCP tool
- backend/src/server.ts: register notePromptRoutes
- backend/src/mcp/note-tool-contracts.ts: add notes.prompts.run to NOTES_MCP_TOOL_NAMES
- backend/src/mcp/note-tools.ts: implement executeRunPrompt()
1.14 Tests
| Test file | Coverage | Count |
|---|---|---|
| note-prompts/repository.test.ts | Template CRUD | 8-10 |
| note-prompts/runner.test.ts | Prompt execution with mock LLM, chains, multi-note | 15-18 |
| note-prompts/routes.test.ts | API endpoint integration | 10-12 |
| lib/copilot-transform.test.ts | Upgraded copilot with LLM | 4-6 |
| lib/reading-time.test.ts | Reading time calculation | 4-5 |
| mcp/note-tools.test.ts | notes.prompts.run MCP tool | 4-6 |
Phase 1 Deliverables:
- note-prompts module: types, repository, runner, routes, seed (20 templates)
- lib/llm.ts singleton + config extended with LLM env vars
- lib/reading-time.ts pure utility (F10)
- Upgraded copilot-transform.ts using @bytelyst/llm (F1, F2, F7)
- Multi-note support in runner (F13 merge, F14 compare)
- Chain support in runner via previousResultNoteId (F24)
- smart_action + auto_enrich agent action types
- notes.prompts.run MCP tool
- note_prompts Cosmos container
- 45-57 new tests
Phase 2 — Backend: Note Intelligence (Background AI)
Repo: learning_ai_notes
Duration: 2-3 days
Depends on: Phase 1
Features: F6 (auto-summarize), F8 (duplicate detection), F9 (auto-link), F12 (knowledge gaps)
2.1 Embedding service: backend/src/lib/embeddings.ts (F8, F9, F12)
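```typescript
import { getNoteLettLLM } from './llm.js';

/** Embed a single text using the configured provider. */
export async function embedText(text: string): Promise<number[]> {
  const llm = getNoteLettLLM();
  if (!llm.embed) throw new Error('Embedding not supported by current LLM provider');
  const res = await llm.embed({ input: text });
  const [embedding] = res.embeddings;
  if (!embedding) throw new Error('Embedding response contained no vectors');
  return embedding;
}

/** Cosine similarity in [-1, 1]; returns 0 when either vector has zero magnitude. */
export function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    magA += a[i] * a[i];
    magB += b[i] * b[i];
  }
  const denom = Math.sqrt(magA) * Math.sqrt(magB);
  return denom === 0 ? 0 : dot / denom; // avoid NaN for all-zero vectors
}
```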
2.2 Note embedding storage
File: backend/src/modules/notes/types.ts — add optional embedding: number[] field to NoteDoc
On note create/update, compute embedding in background (non-blocking). Store in Cosmos alongside the note.
2.3 Auto-summarize on save (F6)
File: backend/src/lib/note-hooks.ts
After a note is saved with body > 300 words:
- Run "summarize" template via runner.ts
- Store result as artifact (type: summary) on the note
- Record agent action with actionType: 'auto_enrich'
- Gated behind feature flag notelett_auto_summarize_enabled
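The trigger condition for this hook could be isolated as a small pure check, sketched below. The 300-word threshold and the flag name come from this section; the function names, the flags map, and the assumption that the note body is HTML (as in the reading-time utility) are illustrative.

```typescript
const AUTO_SUMMARIZE_MIN_WORDS = 300; // threshold from section 2.3

/** Count words in an HTML body by stripping tags first (body format is assumed). */
function countWords(html: string): number {
  const plain = html.replace(/<[^>]*>/g, ' ').replace(/\s+/g, ' ').trim();
  return plain === '' ? 0 : plain.split(' ').length;
}

/** Decide whether the save hook should queue a summarize run. */
function shouldAutoSummarize(bodyHtml: string, flags: Record<string, boolean | undefined>): boolean {
  const enabled = flags['notelett_auto_summarize_enabled'] === true;
  return enabled && countWords(bodyHtml) > AUTO_SUMMARIZE_MIN_WORDS;
}
```

Separating the decision from the side effects (running the template, storing the artifact, recording the action) keeps the trigger logic trivially testable.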
2.4 Duplicate/similar note detection (F8)
File: backend/src/modules/note-prompts/routes.ts
New endpoint: POST /api/notes/:id/check-duplicates
- Embed the current note's body
- Fetch all notes in workspace with embeddings
- Compute cosine similarity
- Return notes with similarity > 0.85 threshold
2.5 Auto-link related notes (F9)
File: backend/src/modules/note-prompts/routes.ts
New endpoint: POST /api/notes/:id/suggest-links
- Embed the current note
- Find top 5 most similar notes (similarity > 0.6, excluding self)
- Return as suggested links with similarity scores
- UI can accept/dismiss suggestions
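Duplicate detection (2.4) and link suggestion (2.5) differ only in threshold and result limit, so both could share one ranking helper. The sketch below assumes embeddings are already loaded in memory; the rankSimilar() name and the EmbeddedNote and SimilarNote shapes are hypothetical.

```typescript
interface EmbeddedNote { id: string; title: string; embedding: number[] }
interface SimilarNote { id: string; title: string; similarity: number }

function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) throw new Error('Vectors must have the same length');
  let dot = 0, magA = 0, magB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    magA += x * x;
    magB += y * y;
  }
  const denom = Math.sqrt(magA) * Math.sqrt(magB);
  return denom === 0 ? 0 : dot / denom;
}

/** Rank other notes by similarity to the target, keeping those above the threshold. */
function rankSimilar(
  target: EmbeddedNote,
  candidates: EmbeddedNote[],
  threshold: number,
  limit: number,
): SimilarNote[] {
  return candidates
    .filter((note) => note.id !== target.id) // exclude self
    .map((note) => ({
      id: note.id,
      title: note.title,
      similarity: cosine(target.embedding, note.embedding),
    }))
    .filter((hit) => hit.similarity > threshold)
    .sort((x, y) => y.similarity - x.similarity)
    .slice(0, limit);
}

// Duplicates (2.4): rankSimilar(note, all, 0.85, all.length)
// Related links (2.5): rankSimilar(note, all, 0.6, 5)
```

This is a linear scan over every note in the workspace, which is fine as a first cut but would likely need an index or a vector search feature for large workspaces.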
2.6 Knowledge gap detection (F12)
File: backend/src/modules/note-prompts/routes.ts
New endpoint: POST /api/workspaces/:id/knowledge-gaps
- Fetch all notes in workspace
- Extract topics from each note (via auto-tag or LLM)
- Build topic frequency map
- Send to LLM: "Given these topics and their coverage depth, what's missing?"
- Return gap analysis as structured JSON
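The "topic frequency map" step might look like the sketch below, which counts in how many notes each topic appears and lists the thinnest topics first for the LLM prompt. The TaggedNote shape, the function names, and the output line format are assumptions for illustration.

```typescript
interface TaggedNote { id: string; topics: string[] }

/** Count the number of notes mentioning each topic (case-insensitive). */
function buildTopicFrequency(notes: TaggedNote[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const note of notes) {
    const seen = new Set<string>(); // count a topic once per note
    for (const raw of note.topics) {
      const topic = raw.trim().toLowerCase();
      if (topic === '' || seen.has(topic)) continue;
      seen.add(topic);
      counts.set(topic, (counts.get(topic) ?? 0) + 1);
    }
  }
  return counts;
}

/** Render the map as prompt lines, least-covered topics first. */
function formatTopicCoverage(counts: Map<string, number>): string {
  const rows: Array<{ topic: string; count: number }> = [];
  counts.forEach((count, topic) => rows.push({ topic, count }));
  rows.sort((a, b) => a.count - b.count || a.topic.localeCompare(b.topic));
  return rows.map((r) => `${r.topic}: ${r.count} note(s)`).join('\n');
}
```

Note count is only a rough proxy for "coverage depth"; word counts per topic could be added to the same map if the gap analysis needs a finer signal.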
2.7 Tests
| Test file | Coverage | Count |
|---|---|---|
| lib/embeddings.test.ts | Embed + cosine similarity | 6-8 |
| lib/note-hooks.test.ts | Auto-summarize trigger logic | 4-6 |
| note-prompts/routes.test.ts | Duplicate check, suggest links, knowledge gaps | 10-12 |
Phase 2 Deliverables:
- lib/embeddings.ts: embed text + cosine similarity
- Note embedding storage on create/update
- Auto-summarize on save (F6) — feature-flag gated
- Duplicate detection endpoint (F8)
- Related notes suggestion endpoint (F9)
- Knowledge gap detection endpoint (F12)
- 20-26 new tests
Phase 3 — Web: Smart Actions UI + Editor AI
Repo: learning_ai_notes
Duration: 4-5 days
Depends on: Phase 1 (Phase 2 optional for F8/F9 UI)
Features: F1-F4 (editor AI), F5 (tag UI), F8-F9 (duplicate/link UI), F10 (reading time UI), F12 (knowledge gap UI), F13-F14 (merge/compare UI), F20-F23 (export UI)
Can run in parallel with: Phase 4, Phase 5
3.1 API client: web/src/lib/prompt-client.ts
listPromptTemplates(): Promise<PromptTemplate[]>
getPromptTemplate(id: string): Promise<PromptTemplate>
createPromptTemplate(input): Promise<PromptTemplate>
updatePromptTemplate(id, input): Promise<PromptTemplate>
deletePromptTemplate(id): Promise<void>
runPrompt(input: RunPromptInput): Promise<RunPromptOutput>
runPromptStream(input: RunPromptInput): AsyncIterable<string> // F3
listPromptHistory(noteId?, limit?): Promise<AgentAction[]>
suggestTags(noteId, workspaceId): Promise<string[]> // F5
checkDuplicates(noteId, workspaceId): Promise<SimilarNote[]> // F8
suggestLinks(noteId, workspaceId): Promise<SuggestedLink[]> // F9
compareNotes(noteIds, workspaceId): Promise<RunPromptOutput> // F14
mergeNotes(noteIds, workspaceId): Promise<RunPromptOutput> // F13
getKnowledgeGaps(workspaceId): Promise<GapAnalysis> // F12
3.2 SmartActionsPanel component
File: web/src/components/SmartActionsPanel.tsx
Renders on the note detail page:
- Grid of action buttons grouped by category (built-in + custom)
- Each button: icon + name + inputType badge (text/image/multi)
- Click → opens RunPromptModal
- Shows recent prompt runs for this note
- Reading time display (F10)
- "Suggest tags" button (F5)
3.3 RunPromptModal component
File: web/src/components/RunPromptModal.tsx
- Template selector (filtered by input type compatibility)
- Parameter inputs (e.g., target language, tone)
- Image picker (browse note artifacts or upload new)
- Multi-note selector (for merge/compare — F13, F14)
- Inline prompt textarea (for custom one-off prompts)
- Chain toggle: "Continue from previous result" (F24)
- Dry-run checkbox
- "Run" button with loading spinner
3.4 PromptResultView component
File: web/src/components/PromptResultView.tsx
- Markdown renderer for LLM response
- Action buttons: "Save as Note", "Save as Artifact", "Apply to Note", "Discard"
- Token usage + model info footer
- Link to created note or artifact
3.5 Prompt Template Library page
File: web/src/app/(app)/prompts/page.tsx
- Browse all 20 built-in templates (read-only cards)
- User's custom templates (edit/delete)
- "Create Custom Prompt" → opens PromptTemplateEditor
- Category filter tabs: All, Analysis, Transform, Extract, Generate, Custom
3.6 PromptTemplateEditor component
File: web/src/components/PromptTemplateEditor.tsx
- Form: name, slug, description, category, system prompt, user prompt template
- Input type selector (text / image / text+image / multi-note)
- Output format + output action selectors
- Parameter builder (add/remove dynamic parameters)
- Template variable reference: {{note.title}}, {{note.body}}, {{note.tags}}, {{params.X}}
- Live preview
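The live preview needs some way to expand the template variables listed above. A minimal interpolation sketch follows; the variable names come from this section, while the renderTemplate() function, the TemplateContext shape, and the choice to leave unknown placeholders untouched are assumptions.

```typescript
interface TemplateContext {
  note: { title: string; body: string; tags: string[] };
  params: Record<string, string>;
}

/** Replace {{note.*}} and {{params.*}} placeholders; unknown ones are left as written. */
function renderTemplate(template: string, context: TemplateContext): string {
  return template.replace(/\{\{\s*([\w.]+)\s*\}\}/g, (match: string, path: string) => {
    if (path === 'note.title') return context.note.title;
    if (path === 'note.body') return context.note.body;
    if (path === 'note.tags') return context.note.tags.join(', ');
    if (path.indexOf('params.') === 0) {
      const value: string | undefined = context.params[path.slice('params.'.length)];
      return value ?? match; // keep the placeholder visible when the parameter is missing
    }
    return match;
  });
}
```

Leaving unresolved placeholders in the output makes a misspelled variable obvious in the preview instead of silently producing an empty string.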
3.7 Upgrade NoteEditor with advanced Copilot (F1-F4)
File: web/src/components/NoteEditor.tsx
Enhance existing Copilot toolbar:
| Current | Upgraded |
|---|---|
| shorten | Keep (uses @bytelyst/llm now) |
| expand | Keep (uses @bytelyst/llm now) |
| bulletize | Keep (uses @bytelyst/llm now) |
| grammar | Replace with "Fix & Rewrite" (F1): full rewrite, not just grammar |
| (new) | Add "Change Tone" (F2): dropdown with formal/casual/professional/friendly |
| (new) | Add "Continue Writing" (F3): inserts at cursor, streams token-by-token |
| (new) | Add "Explain" (F4): tooltip popover with definition/explanation |
F3 (Continue Writing) implementation:
- Get text before cursor position from TipTap editor state
- Call runPromptStream() (SSE)
- Insert streamed tokens into editor in real-time via TipTap commands
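On the client, runPromptStream() has to turn raw SSE text into tokens, including events that arrive split across network chunks. The sketch below shows one buffering approach. It assumes each event is a single "data:" line terminated by a blank line and that the stream ends with a "[DONE]" sentinel (an OpenAI-style convention); the actual wire format of /api/note-prompts/run-stream is not specified in this document.

```typescript
interface SseParseResult {
  tokens: string[]; // payloads of complete events
  rest: string;     // trailing partial event to prepend to the next chunk
}

/** Extract complete "data:" payloads from a buffer of SSE text. */
function parseSseBuffer(buffer: string): SseParseResult {
  const tokens: string[] = [];
  const events = buffer.split('\n\n');
  const rest = events.pop() ?? ''; // the last piece may be incomplete
  for (const event of events) {
    for (const line of event.split('\n')) {
      if (line.indexOf('data:') !== 0) continue; // skip comments and other fields
      const payload = line.slice(5).replace(/^ /, '');
      if (payload !== '[DONE]') tokens.push(payload); // assumed end-of-stream marker
    }
  }
  return { tokens, rest };
}
```

A caller would append each decoded chunk to `rest`, call parseSseBuffer() again, and insert the returned tokens at the cursor. Streams using CRLF line endings or multi-line data fields would need extra handling not shown here.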
F4 (Inline Q&A) implementation:
- Select text → right-click or toolbar button → "Explain this"
- Opens floating popover below selection
- Calls LLM with "Explain this term/concept concisely: {selection}"
- Shows result in popover (dismissible)
3.8 Duplicate detection UI (F8)
After note save, if notelett_duplicate_check_enabled flag is on:
- Call checkDuplicates()
- If similar notes found → show toast: "This note is similar to 'Note X' (87% match). View?"
- Click → opens side-by-side comparison
3.9 Related notes suggestion UI (F9)
After note creation:
- Call suggestLinks()
- If suggestions found → show panel: "Related notes you might want to link"
- Each suggestion: note title + similarity % + "Link" / "Dismiss" buttons
3.10 Auto-tag suggestion UI (F5)
On the note detail page SmartActionsPanel:
- "Suggest Tags" button
- Calls /api/notes/:id/suggest-tags
- Shows tag chips with + button to accept each
- Accepted tags are added to the note
3.11 Export actions UI (F20-F23)
In SmartActionsPanel, export templates appear with share icons:
- "Shareable Summary" → generates polished version → copy or share via note-shares
- "Presentation Outline" → generates outline → saves as new note
- "Email Draft" → generates email → copy to clipboard
- "Social Post" → generates post → copy to clipboard
3.12 Knowledge gap analysis UI (F12)
File: web/src/app/(app)/workspaces/[id]/gaps/page.tsx
- "Analyze Knowledge Gaps" button on workspace page
- Shows gap analysis: topics with thin coverage, suggested new note topics
- "Create Note" button for each gap → pre-fills title
3.13 Wire into note detail page + sidebar
- web/src/app/(app)/notes/[noteId]/page.tsx: add SmartActionsPanel, reading time, duplicate warning
- web/src/components/Sidebar.tsx: add "Prompts" nav item (sparkle icon)
- Keyboard shortcut: Cmd+Shift+A → open Smart Actions panel
3.14 Tests
| Test file | Coverage | Count |
|---|---|---|
| prompt-client.test.ts | API client functions | 8-10 |
| SmartActionsPanel.test.tsx | Render + click handlers | 4-6 |
| RunPromptModal.test.tsx | Form + submission + multi-note | 4-6 |
| NoteEditor.test.tsx | Copilot upgrade (F1-F4) | 6-8 |
| e2e/smart-actions.spec.ts | Full flow E2E | 6 |
Phase 3 Deliverables:
- prompt-client.ts API client (all endpoints)
- 5 new components: SmartActionsPanel, RunPromptModal, PromptResultView, PromptTemplateEditor, KnowledgeGapView
- /prompts template library page
- /workspaces/[id]/gaps knowledge gap page (F12)
- NoteEditor upgraded with F1-F4 (Fix & Rewrite, Change Tone, Continue Writing, Inline Q&A)
- Duplicate detection toast (F8)
- Related notes suggestion panel (F9)
- Auto-tag suggestion UI (F5)
- Export actions UI (F20-F23)
- Reading time display (F10)
- Sidebar updated, keyboard shortcut
- 28-36 new tests, including 6 E2E tests
Phase 4 — Mobile: Smart Actions + AI-Enhanced Capture
Repo: learning_ai_notes
Duration: 4-5 days
Depends on: Phase 1
Features: F15 (voice), F16 (screenshot), F17 (URL), F18 (multi-image), F19 (clipboard)
Can run in parallel with: Phase 3, Phase 5
4.1 New dependencies
| Package | Purpose |
|---|---|
| expo-image-picker | Camera capture + gallery selection |
| expo-av | Audio recording for voice-to-note (F15) |
| expo-clipboard | Clipboard access for AI paste (F19) |
| expo-sharing | Share results |
4.2 API client: mobile/src/api/note-prompts.ts
listPromptTemplates(): Promise<PromptTemplate[]>
runPrompt(input: RunPromptInput): Promise<RunPromptOutput>
suggestTags(noteId, workspaceId): Promise<string[]>
4.3 Enhance blob upload: mobile/src/api/blob-upload.ts
Upgrade existing stub:
- Camera capture via expo-image-picker (photo + gallery)
- Image resize (max 2048px, compress to < 4MB)
- Upload to blob storage via @bytelyst/blob-client
- Return blobPath + SAS URL
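For the resize step, the target dimensions can be computed before handing the image to whatever resize API the app uses. The sketch below assumes "max 2048px" refers to the longest edge and preserves aspect ratio; the fitWithin() name is hypothetical, and the compression pass to stay under 4MB is not covered.

```typescript
const MAX_EDGE_PX = 2048; // limit from section 4.3, assumed to apply to the longest edge

/** Scale dimensions down so the longest edge is at most maxEdge; never upscale. */
function fitWithin(
  width: number,
  height: number,
  maxEdge: number = MAX_EDGE_PX,
): { width: number; height: number } {
  const longest = Math.max(width, height);
  if (longest <= maxEdge) return { width, height };
  const scale = maxEdge / longest;
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}
```

For example, a 4096x3072 photo maps to 2048x1536, while a 1000x800 screenshot is left untouched.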
4.4 Zustand store: mobile/src/store/prompt-store.ts
```typescript
interface PromptState {
  templates: PromptTemplate[];
  isRunning: boolean;
  lastResult: RunPromptOutput | null;
  error: string | null;
  fetchTemplates(): Promise<void>;
  runPrompt(input: RunPromptInput): Promise<RunPromptOutput>;
  clearResult(): void;
}
```
4.5 SmartActionsSheet component
File: mobile/src/app/note/SmartActionsSheet.tsx
Bottom sheet (react-native-gesture-handler) that slides up from note detail:
- Scrollable grid of action buttons (icon + name)
- Category filter tabs (All, Text, Image, Custom)
- Actions trigger either:
  - Direct run (text actions)
  - Camera/gallery picker → then run (image actions)
4.6 PromptResultScreen
File: mobile/src/app/note/prompt-result.tsx
- Markdown-rendered LLM response
- "Save as Note" / "Discard" buttons
- Model info + token count
- Navigate to new note after saving
4.7 Voice-to-note (F15)
File: mobile/src/app/capture/voice.tsx (sub-route of capture, NOT a new tab)
- Record audio via expo-av (Audio.Recording)
- Upload audio file to blob storage
- Call backend transcription endpoint (or use extraction-service with speech task)
- Show transcribed text for review/edit
- Save as note
- Optionally run a Smart Action on the result (e.g., "Extract Key Facts")
4.8 Screenshot-to-note (F16)
On the capture tab:
- "From Screenshot" button → gallery picker (images only)
- Upload image → blob storage
- Run "handwriting-to-text" or custom OCR prompt
- Show result for review → save as note
4.9 URL-to-note (F17)
On the capture tab:
- "From URL" input field
- Backend endpoint: POST /api/note-prompts/url-extract
  - Fetches URL content (server-side to avoid CORS)
  - Strips HTML → extracts main content
  - Runs "summarize" template
  - Returns structured result
- Show summary for review → save as note
4.10 Multi-image capture (F18)
On the capture tab:
- "Scan Document" button → camera in continuous mode
- Take multiple photos (whiteboard pages, multi-page document)
- Upload all images → blob storage
- Run each through vision model sequentially
- Combine results into single note body
- Show merged result for review → save
4.11 Clipboard AI paste (F19)
On the capture tab:
- "Paste & Clean" button
- Read clipboard via expo-clipboard
- If clipboard contains text → run "fix-rewrite" template
- If clipboard contains URL → trigger URL-to-note flow (F17)
- Show cleaned result → save as note
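The text-versus-URL branch could be decided by a small classifier like the sketch below. The routeClipboard() name and the rule that only a clipboard consisting of a single http(s) link counts as a URL are assumptions; text that merely contains a link is treated as text to clean.

```typescript
type ClipboardRoute =
  | { kind: 'url'; url: string }   // hand off to the URL-to-note flow (F17)
  | { kind: 'text'; text: string } // run the "fix-rewrite" template
  | { kind: 'empty' };

/** Classify clipboard content for the "Paste & Clean" button. */
function routeClipboard(raw: string): ClipboardRoute {
  const text = raw.trim();
  if (text === '') return { kind: 'empty' };
  // Only treat the content as a URL when it is a single http(s) link with no whitespace.
  if (/^https?:\/\/\S+$/i.test(text)) return { kind: 'url', url: text };
  return { kind: 'text', text };
}
```

Checking for a URL first keeps the two flows from overlapping, since a URL is also valid text.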
4.12 Enhance capture tab
File: mobile/src/app/(tabs)/capture.tsx
Add new capture methods alongside existing text draft:
┌─────────────────────────────────────┐
│ Quick Capture │
│ ┌───────┐ ┌───────┐ ┌───────┐ │
│ │ Text │ │ Photo │ │ Voice │ │
│ └───────┘ └───────┘ └───────┘ │
│ ┌───────┐ ┌───────┐ ┌───────┐ │
│ │ URL │ │ Scan │ │ Paste │ │
│ └───────┘ └───────┘ └───────┘ │
│ │
│ [existing text capture form] │
└─────────────────────────────────────┘
4.13 Wire Smart Actions into note detail
File: mobile/src/app/note/[id].tsx
- "AI Actions" button in the header
- Opens SmartActionsSheet
- Shows reading time (F10)
- Shows suggested tags after save (F5)
4.14 Offline queue integration
Prompt runs that fail → queue via @bytelyst/offline-queue for retry.
4.15 Tests
| Test file | Coverage | Count |
|---|---|---|
| api/note-prompts.test.ts | API client | 4-6 |
| store/prompt-store.test.ts | Store actions | 6-8 |
| SmartActionsSheet.test.tsx | Render + interactions | 4-6 |
| capture.test.tsx | New capture methods | 4-6 |
Phase 4 Deliverables:
- note-prompts.ts API client + prompt-store.ts Zustand store
- Camera capture + image resize + blob upload
- SmartActionsSheet bottom sheet + PromptResultScreen
- Voice-to-note flow (F15): expo-av recording
- Screenshot-to-note (F16): gallery + vision OCR
- URL-to-note (F17) — server-side fetch + summarize
- Multi-image scan (F18) — continuous camera + combine
- Clipboard AI paste (F19) — read + clean
- Enhanced capture tab with 6 capture modes
- Smart Actions on note detail
- Offline queue for failed runs
- 18-26 new tests
Phase 5 — Agent & Workflow Intelligence
Repo: learning_ai_notes
Duration: 2-3 days
Depends on: Phase 2
Features: F11 (weekly digest), F25 (scheduled), F26 (webhooks), F27 (approval-gated)
Can run in parallel with: Phases 3, 4
5.1 Scheduled Smart Actions (F25)
File: backend/src/modules/note-prompts/scheduler.ts
| Component | Detail |
|---|---|
| PromptScheduleDoc | New Cosmos doc: scheduleId, templateId, workspaceId, cron expression, enabled, lastRunAt, nextRunAt |
| Cosmos container | note_prompt_schedules (partition: /workspaceId) |
| Scheduler loop | In-process interval (60s check), matches cron → invokes runner.ts |
| API endpoints | POST /api/prompt-schedules (create), GET /api/prompt-schedules (list), PATCH /api/prompt-schedules/:id (update), DELETE /api/prompt-schedules/:id (delete) |
Example: "Summarize all notes in 'Research' workspace every Friday at 5pm"
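The scheduler loop above checks once a minute whether a schedule's cron expression matches the current time. A minimal sketch of that matching step follows, assuming standard five-field cron syntax evaluated in UTC. `fieldMatches` and `cronMatches` are hypothetical helpers, not functions from the codebase, and they support only `*`, single values, comma lists, and ranges; a real scheduler would more likely rely on an existing cron library.

```typescript
// Hypothetical helper: does one cron field match a numeric value?
// Supports "*", single values ("5"), comma lists ("0,30") and ranges ("1-5").
// Step syntax ("*/5") is not supported and simply never matches.
function fieldMatches(field: string, value: number): boolean {
  if (field === "*") return true;
  return field.split(",").some((part) => {
    const bounds = part.split("-").map(Number);
    const lo = bounds[0] ?? NaN;
    const hi = bounds[1] ?? lo; // a single value is a range of one
    return value >= lo && value <= hi;
  });
}

// Hypothetical helper: does a 5-field cron expression match the given instant (UTC)?
// Field order: minute, hour, day-of-month, month (1-12), day-of-week (0 = Sunday).
function cronMatches(expr: string, at: Date): boolean {
  const fields = expr.trim().split(/\s+/);
  if (fields.length !== 5) {
    throw new Error(`Expected 5 cron fields, got ${fields.length}`);
  }
  const values = [
    at.getUTCMinutes(),
    at.getUTCHours(),
    at.getUTCDate(),
    at.getUTCMonth() + 1,
    at.getUTCDay(),
  ];
  return fields.every((field, i) => fieldMatches(field, values[i] ?? -1));
}

// "Every Friday at 5pm" from the example above, expressed as cron.
const fridayFivePm = "0 17 * * 5";
```

With a 60s check interval, the loop would call `cronMatches(schedule.cron, new Date())` once per tick and compare against `lastRunAt` to avoid firing twice in the same minute; that de-duplication is left out of the sketch.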
5.2 Weekly workspace digest (F11)
Built on F25 — a special scheduled action:
- Pre-configured template: `weekly-digest`
- Runs weekly, collects all notes created/modified in workspace that week
- Produces a digest note with: summary, key themes, new notes list, most active areas
- Linked to workspace
Add template #21: weekly-digest (system-only, runs via scheduler)
5.3 Webhook-triggered actions (F26)
File: backend/src/modules/note-prompts/webhooks.ts
| Component | Detail |
|---|---|
| `PromptWebhookDoc` | webhookId, templateId, workspaceId, triggerEvent, enabled |
| API endpoint | POST /api/prompt-webhooks (create), GET (list), DELETE/:id |
| Trigger endpoint | POST /api/prompt-webhooks/:id/trigger — accepts { noteId, payload } |
| Supported events | note.created, note.updated, note.tagged, external |
Example: "When a note is tagged 'receipt', auto-run Parse Receipt"
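The event-to-webhook lookup implied by the table above can be sketched as a simple filter. The `PromptWebhookDoc` fields and event names come from the table; the `tagFilter` field, the `NoteEvent` shape, and the `matchingWebhooks` function are assumptions added for illustration (the source does not say how "tagged 'receipt'" is stored).

```typescript
type TriggerEvent = "note.created" | "note.updated" | "note.tagged" | "external";

// Fields from the PromptWebhookDoc row above, plus one hypothetical filter.
interface PromptWebhookDoc {
  webhookId: string;
  templateId: string;
  workspaceId: string;
  triggerEvent: TriggerEvent;
  enabled: boolean;
  tagFilter?: string; // assumption: restricts note.tagged webhooks to one tag
}

// Hypothetical shape of an internal note event.
interface NoteEvent {
  type: TriggerEvent;
  workspaceId: string;
  noteId: string;
  tag?: string;
}

// Returns the enabled webhooks in the same workspace that listen for this event.
function matchingWebhooks(webhooks: PromptWebhookDoc[], event: NoteEvent): PromptWebhookDoc[] {
  return webhooks.filter(
    (w) =>
      w.enabled &&
      w.workspaceId === event.workspaceId &&
      w.triggerEvent === event.type &&
      (w.tagFilter === undefined || w.tagFilter === event.tag),
  );
}
```

Each matched webhook would then hand its `templateId` and the event's `noteId` to the prompt runner.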
5.4 Approval-gated actions (F27)
Leverages existing NoteAgentActionDoc with approval states.
| Change | Detail |
|---|---|
| New prompt template field | requiresApproval: boolean (default: false) |
| Runner modification | If template has requiresApproval, create action with state: 'proposed' instead of state: 'applied' |
| Review endpoint | Already exists: POST /api/agent-actions/:id/review (approve/reject) |
| Post-approval hook | On approval, execute the saved output action (create note / update / artifact) |
| Web UI | ProposalReviewCard already exists — add Smart Action context |
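The runner change in the table above reduces to one branch on `requiresApproval`. The sketch below shows that branch in isolation; `PromptTemplateLike`, `initialActionState`, and `shouldExecuteOutput` are hypothetical names, and only the two states named in the table (`proposed`, `applied`) are modeled.

```typescript
type ActionState = "proposed" | "applied";

// Hypothetical minimal view of a prompt template; only the new field matters here.
interface PromptTemplateLike {
  slug: string;
  requiresApproval?: boolean; // defaults to false, per the table above
}

// State the runner assigns when it records the agent action.
function initialActionState(template: PromptTemplateLike): ActionState {
  return template.requiresApproval === true ? "proposed" : "applied";
}

// The saved output action (create note / update / artifact) runs immediately
// only when the action is already applied; otherwise it waits for review.
function shouldExecuteOutput(state: ActionState): boolean {
  return state === "applied";
}
```

On approval through the existing review endpoint, the post-approval hook would run the same output step that `shouldExecuteOutput` deferred.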
5.5 Tests
| Test file | Coverage | Count |
|---|---|---|
| `note-prompts/scheduler.test.ts` | Cron matching, schedule CRUD, execution | 8-10 |
| `note-prompts/webhooks.test.ts` | Webhook CRUD, trigger, event matching | 6-8 |
| `note-prompts/runner.test.ts` | Approval-gated flow | 3-4 |
Phase 5 Deliverables:
- `scheduler.ts` — cron-based scheduled prompt execution (F25)
- `weekly-digest` template + scheduled action (F11)
- `webhooks.ts` — event-triggered prompt execution (F26)
- Approval-gated actions in runner (F27)
- `note_prompt_schedules` Cosmos container
- 17-22 new tests
Phase 6 — Polish, Integration Tests, Documentation
Duration: 2-3 days
Depends on: Phases 3-5
6.1 End-to-end integration testing
| Test | Flow |
|---|---|
| Web E2E: Food label | Create note → attach image → run "Rate Food Label" → verify result note |
| Web E2E: Summarize | Create long note → run "Summarize" → verify summary artifact |
| Web E2E: Compare | Select 2 notes → compare → verify comparison note |
| Web E2E: Template CRUD | Create custom template → use it → edit → delete |
| Mobile E2E: Camera capture | Photo → upload → run prompt → verify result |
| Mobile E2E: Voice-to-note | Record → transcribe → review → save |
| MCP E2E: Agent prompt | Agent calls notes.prompts.run → verify audit trail |
| Webhook E2E | Tag note → webhook fires → prompt runs automatically |
| Scheduler E2E | Schedule created → time triggers → digest generated |
6.2 Error handling
| Scenario | Handling |
|---|---|
| LLM API key not configured | Clear error, disable Smart Actions UI, show setup guide |
| LLM rate limit (429) | Retry with exponential backoff (3 attempts), show "try again later" |
| LLM timeout | 60s timeout, graceful error, suggest retry |
| Image too large | Client-side resize before upload (max 2048px, < 4MB) |
| Prompt template not found | 404 with helpful message |
| Empty note body (text prompt) | Require body or show warning |
| No images on note (image prompt) | Prompt to upload/capture first |
| Embedding service unavailable | Skip duplicate check/auto-link gracefully |
| Audio recording fails | Fallback to text capture, show error |
| URL fetch fails | Show error with suggestion to paste content manually |
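The 429 row above (exponential backoff, 3 attempts) can be sketched as follows. The 500 ms base delay, the `LlmResponse` shape, and the function names are assumptions; the roadmap specifies only the attempt count and the backoff strategy.

```typescript
// Delays to wait before each retry. maxAttempts = 3 means one initial call
// plus up to two retries, so two delays: base, then base * 2.
function retryDelaysMs(maxAttempts: number, baseMs: number): number[] {
  const delays: number[] = [];
  for (let retry = 0; retry < maxAttempts - 1; retry++) {
    delays.push(baseMs * 2 ** retry);
  }
  return delays;
}

// Only rate-limit responses are retried in this sketch.
function isRetryable(status: number): boolean {
  return status === 429;
}

// Hypothetical response shape; the real provider types are not shown here.
interface LlmResponse<T> {
  status: number;
  body?: T;
}

async function withRetry<T>(
  call: () => Promise<LlmResponse<T>>,
  maxAttempts = 3,
  baseMs = 500, // assumption: base delay is not specified in the roadmap
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise<void>((resolve) => setTimeout(resolve, ms)),
): Promise<LlmResponse<T>> {
  let response = await call();
  for (const delay of retryDelaysMs(maxAttempts, baseMs)) {
    if (!isRetryable(response.status)) break;
    await sleep(delay);
    response = await call();
  }
  // A 429 at this point means all attempts were used: show "try again later".
  return response;
}
```

Injecting `sleep` keeps the helper testable without real timers.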
6.3 Feature flags
| Flag | Default | Controls |
|---|---|---|
| `notelett_smart_actions_enabled` | false | All Smart Actions UI + API |
| `notelett_auto_summarize_enabled` | false | F6 auto-summarize on save |
| `notelett_duplicate_check_enabled` | false | F8 duplicate detection |
| `notelett_auto_link_enabled` | false | F9 auto-link suggestions |
| `notelett_copilot_llm_enabled` | false | F1-F4 editor AI (vs extraction fallback) |
| `notelett_voice_capture_enabled` | false | F15 voice-to-note |
| `notelett_scheduled_actions_enabled` | false | F25 scheduled actions |
| `notelett_webhooks_enabled` | false | F26 webhook triggers |
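A flag lookup with the defaults from the table above might look like the sketch below. The flag names and default values are from the table; `isEnabled`, `FlagOverrides`, and the rule that `notelett_smart_actions_enabled` acts as a master switch for auto-summarize are assumptions, since the roadmap does not describe how flags are resolved or combined.

```typescript
// Defaults copied from the feature-flag table: everything ships off.
const FLAG_DEFAULTS = {
  notelett_smart_actions_enabled: false,
  notelett_auto_summarize_enabled: false,
  notelett_duplicate_check_enabled: false,
  notelett_auto_link_enabled: false,
  notelett_copilot_llm_enabled: false,
  notelett_voice_capture_enabled: false,
  notelett_scheduled_actions_enabled: false,
  notelett_webhooks_enabled: false,
};

type FlagName = keyof typeof FLAG_DEFAULTS;
type FlagOverrides = Partial<Record<FlagName, boolean>>;

// Hypothetical resolver: an override (for example from a remote flag service) wins,
// otherwise the default applies.
function isEnabled(flag: FlagName, overrides: FlagOverrides = {}): boolean {
  return overrides[flag] ?? FLAG_DEFAULTS[flag];
}

// Assumption: auto-summarize also requires the top-level Smart Actions flag.
function canAutoSummarize(overrides: FlagOverrides = {}): boolean {
  return (
    isEnabled("notelett_smart_actions_enabled", overrides) &&
    isEnabled("notelett_auto_summarize_enabled", overrides)
  );
}
```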
6.4 Telemetry events
| Event | Properties |
|---|---|
| `smart_action_run` | templateSlug, inputType, model, durationMs, tokenUsage |
| `smart_action_result_saved` | outputAction, resultType |
| `smart_action_template_created` | category, inputType |
| `smart_action_error` | errorType, templateSlug |
| `copilot_transform` | action (rewrite/tone/continue/explain), durationMs |
| `auto_summarize_triggered` | wordCount, durationMs |
| `duplicate_detected` | similarityScore, noteId |
| `voice_capture_completed` | durationSecs, wordCount |
| `url_extract_completed` | domain, wordCount |
| `scheduled_action_fired` | scheduleId, templateSlug |
| `webhook_triggered` | webhookId, triggerEvent |
6.5 Documentation updates
- Update `docs/PRD.md` — Smart Actions section (§5.2 AI features)
- Update `AGENTS.md` — new MCP tool, new module, new env vars
- Update `docs/roadmaps/02_BACKEND_ROADMAP.md` — mark Smart Actions complete
- API reference for all new endpoints (15+ endpoints)
- `docs/SMART_ACTIONS_USER_GUIDE.md` — end-user documentation
6.6 Docker / CI updates
- Add LLM env vars to `.env.example`
- Add `@bytelyst/llm` to `scripts/docker-prep.sh` tarball list
- Update `backend/Dockerfile` for new deps
- Add `expo-image-picker`, `expo-av` to mobile CI build matrix
Phase 6 Deliverables:
- 10-15 E2E integration tests (the 9 flows in §6.1 plus additional integration cases)
- Error handling for all edge cases
- 8 feature flags for gradual rollout
- 11 telemetry events
- Documentation updated (PRD, AGENTS.md, roadmaps, user guide)
- Docker + CI updated
Test Budget Summary
| Phase | Unit Tests | E2E Tests | Total |
|---|---|---|---|
| 0 — Common-plat LLM | 25-30 | — | 25-30 |
| 1 — Backend core | 45-57 | — | 45-57 |
| 2 — Note intelligence | 20-26 | — | 20-26 |
| 3 — Web UI + editor AI | 22-30 | 6 | 28-36 |
| 4 — Mobile + capture | 18-26 | — | 18-26 |
| 5 — Agent/workflow | 17-22 | — | 17-22 |
| 6 — Integration/polish | — | 10-15 | 10-15 |
| Total | 147-191 | 16-21 | 163-212 |
New Files Summary
learning_ai_common_plat (Phase 0) — 6-8 files modified
| File | Change |
|---|---|
| `packages/llm/src/types.ts` | Add ContentPart, EmbeddingRequest/Response, update ChatMessage |
| `packages/llm/src/helpers.ts` | New: isVisionMessage(), buildVisionMessage() |
| `packages/llm/src/providers/openai.ts` | Vision + embedding support |
| `packages/llm/src/providers/azure-openai.ts` | Vision + embedding support |
| `packages/llm/src/providers/mock.ts` | Vision + embedding mocks |
| `packages/llm/src/index.ts` | Export new types + helpers |
| `packages/llm-router/src/types.ts` | Add supportsVision, supportsEmbedding |
| `packages/llm-router/src/classifier.ts` | Detect image content |
learning_ai_notes/backend (Phases 1, 2, 5) — 11 new + 7 modified
| File | Status | Phase |
|---|---|---|
| `src/lib/llm.ts` | New | 1 |
| `src/lib/config.ts` | Modified | 1 |
| `src/lib/cosmos-init.ts` | Modified | 1 |
| `src/lib/copilot-transform.ts` | Modified | 1 |
| `src/lib/reading-time.ts` | New | 1 |
| `src/lib/embeddings.ts` | New | 2 |
| `src/lib/note-hooks.ts` | New | 2 |
| `src/modules/note-prompts/types.ts` | New | 1 |
| `src/modules/note-prompts/repository.ts` | New | 1 |
| `src/modules/note-prompts/runner.ts` | New | 1 |
| `src/modules/note-prompts/routes.ts` | New | 1 |
| `src/modules/note-prompts/seed.ts` | New | 1 |
| `src/modules/note-prompts/scheduler.ts` | New | 5 |
| `src/modules/note-prompts/webhooks.ts` | New | 5 |
| `src/modules/note-agent-actions/types.ts` | Modified | 1 |
| `src/mcp/note-tool-contracts.ts` | Modified | 1 |
| `src/mcp/note-tools.ts` | Modified | 1 |
| `src/server.ts` | Modified | 1 |
learning_ai_notes/web (Phase 3) — 8 new + 5 modified
| File | Status |
|---|---|
| `src/lib/prompt-client.ts` | New |
| `src/components/SmartActionsPanel.tsx` | New |
| `src/components/RunPromptModal.tsx` | New |
| `src/components/PromptResultView.tsx` | New |
| `src/components/PromptTemplateEditor.tsx` | New |
| `src/app/(app)/prompts/page.tsx` | New |
| `src/app/(app)/workspaces/[id]/gaps/page.tsx` | New |
| `e2e/smart-actions.spec.ts` | New |
| `src/app/(app)/notes/[noteId]/page.tsx` | Modified |
| `src/components/NoteEditor.tsx` | Modified |
| `src/components/Sidebar.tsx` | Modified |
| `src/lib/copilot-client.ts` | Modified (add new CopilotAction types) |
| `src/lib/types.ts` | Modified (add PromptTemplate, RunPromptInput/Output, etc.) |
learning_ai_notes/mobile (Phase 4) — 7 new + 3 modified
| File | Status |
|---|---|
| `src/api/note-prompts.ts` | New |
| `src/api/blob-upload.ts` | Modified |
| `src/store/prompt-store.ts` | New |
| `src/app/note/SmartActionsSheet.tsx` | New |
| `src/app/note/prompt-result.tsx` | New |
| `src/app/capture/voice.tsx` | New (sub-route of capture, NOT a tab) |
| `src/app/capture/url.tsx` | New (sub-route of capture, NOT a tab) |
| `src/app/capture/scan.tsx` | New (sub-route of capture, NOT a tab) |
| `src/app/(tabs)/capture.tsx` | Modified |
| `src/app/note/[id].tsx` | Modified |
20 Built-in Prompt Templates
| # | Slug | Name | Input | Output | Category |
|---|---|---|---|---|---|
| 1 | `summarize` | Summarize | text | new_note | transform |
| 2 | `translate` | Translate | text | new_note | transform |
| 3 | `simplify` | Simplify / ELI5 | text | artifact | transform |
| 4 | `extract-key-facts` | Extract Key Facts | text | artifact | extract |
| 5 | `food-label-rating` | Rate Food Label | image | new_note | analysis |
| 6 | `parse-receipt` | Parse Receipt | image | new_note | extract |
| 7 | `read-business-card` | Read Business Card | image | new_note | extract |
| 8 | `handwriting-to-text` | Handwriting to Text | image | new_note | transform |
| 9 | `generate-flashcards` | Generate Flashcards | text | new_note | generate |
| 10 | `pros-and-cons` | Pros & Cons | text | artifact | analysis |
| 11 | `presentation-outline` | Presentation Outline | text | new_note | generate |
| 12 | `email-draft` | Email Draft | text | new_note | generate |
| 13 | `social-post` | Social Post | text | artifact | generate |
| 14 | `shareable-summary` | Shareable Summary | text | new_note | transform |
| 15 | `compare-notes` | Compare Notes | multi-note | new_note | analysis |
| 16 | `merge-notes` | Merge Notes | multi-note | new_note | transform |
| 17 | `fix-rewrite` | Fix & Rewrite | text | update_note | transform |
| 18 | `change-tone` | Change Tone | text | update_note | transform |
| 19 | `continue-writing` | Continue Writing | text | update_note | generate |
| 20 | `auto-tag` | Auto-Tag | text | update_note | extract |
New Dependencies
| Package | Where | Purpose |
|---|---|---|
| `@bytelyst/llm@^0.2.0` | backend | LLM with vision + embedding |
| `expo-image-picker` | mobile | Camera + gallery |
| `expo-av` | mobile | Audio recording (F15) |
| `expo-clipboard` | mobile | Clipboard access (F19) |
All other integrations use existing @bytelyst/* packages already in package.json.
New Cosmos Containers
| Container | Partition Key | Phase | Purpose |
|---|---|---|---|
| `note_prompts` | `/userId` | 1 | Prompt templates (built-in + custom) |
| `note_prompt_schedules` | `/workspaceId` | 5 | Scheduled action definitions |
Prompt run results don't need containers — they produce notes (notes) and artifacts (note_artifacts).
New Environment Variables
| Variable | Default | Phase | Description |
|---|---|---|---|
| `LLM_PROVIDER` | `openai` | 1 | openai / azure / mock |
| `OPENAI_API_KEY` | — | 1 | OpenAI API key |
| `OPENAI_BASE_URL` | — | 1 | Optional base URL override |
| `AZURE_OPENAI_ENDPOINT` | — | 1 | Azure OpenAI endpoint |
| `AZURE_OPENAI_API_KEY` | — | 1 | Azure OpenAI key |
| `LLM_DEFAULT_MODEL` | `gpt-4o-mini` | 1 | Default text model |
| `LLM_VISION_MODEL` | `gpt-4o` | 1 | Default vision model |
| `LLM_EMBEDDING_MODEL` | `text-embedding-3-small` | 2 | Default embedding model |
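The defaults in the table above can be applied with a small startup-time loader. The roadmap places this validation in NoteLett's Zod config schema; the sketch below uses plain TypeScript only to stay dependency-free. `loadLlmConfig`, the `LlmConfig` shape, and the rule that each provider requires its own credentials are assumptions for illustration.

```typescript
type LlmProvider = "openai" | "azure" | "mock";

// Hypothetical parsed config; field names are illustrative.
interface LlmConfig {
  provider: LlmProvider;
  defaultModel: string;
  visionModel: string;
  embeddingModel: string;
}

// Reads the variables from the table, applies the documented defaults,
// and fails fast when the selected provider lacks credentials (assumed rule).
function loadLlmConfig(env: Record<string, string | undefined>): LlmConfig {
  const provider = env["LLM_PROVIDER"] ?? "openai";
  if (provider !== "openai" && provider !== "azure" && provider !== "mock") {
    throw new Error(`Unsupported LLM_PROVIDER: ${provider}`);
  }
  if (provider === "openai" && !env["OPENAI_API_KEY"]) {
    throw new Error("OPENAI_API_KEY is required when LLM_PROVIDER=openai");
  }
  if (provider === "azure" && (!env["AZURE_OPENAI_ENDPOINT"] || !env["AZURE_OPENAI_API_KEY"])) {
    throw new Error("AZURE_OPENAI_ENDPOINT and AZURE_OPENAI_API_KEY are required when LLM_PROVIDER=azure");
  }
  return {
    provider,
    defaultModel: env["LLM_DEFAULT_MODEL"] ?? "gpt-4o-mini",
    visionModel: env["LLM_VISION_MODEL"] ?? "gpt-4o",
    embeddingModel: env["LLM_EMBEDDING_MODEL"] ?? "text-embedding-3-small",
  };
}
```

In the backend this would be called once with `process.env`; passing the environment as a parameter keeps the function easy to test.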
New API Endpoints (15 endpoints)
| Method | Path | Phase | Feature |
|---|---|---|---|
| `GET` | `/api/prompt-templates` | 1 | List templates |
| `GET` | `/api/prompt-templates/:id` | 1 | Get template |
| `POST` | `/api/prompt-templates` | 1 | Create template |
| `PATCH` | `/api/prompt-templates/:id` | 1 | Update template |
| `DELETE` | `/api/prompt-templates/:id` | 1 | Delete template |
| `POST` | `/api/note-prompts/run` | 1 | Run prompt |
| `POST` | `/api/note-prompts/run-stream` | 1 | Run prompt (SSE) |
| `GET` | `/api/note-prompts/history` | 1 | Prompt run history |
| `POST` | `/api/notes/:id/suggest-tags` | 1 | F5 |
| `POST` | `/api/notes/compare` | 1 | F14 |
| `POST` | `/api/notes/merge` | 1 | F13 |
| `POST` | `/api/notes/:id/check-duplicates` | 2 | F8 |
| `POST` | `/api/notes/:id/suggest-links` | 2 | F9 |
| `POST` | `/api/workspaces/:id/knowledge-gaps` | 2 | F12 |
| `POST` | `/api/note-prompts/url-extract` | 1 | F17 (backend route; consumed by mobile in Phase 4) |
| `CRUD` | `/api/prompt-schedules` | 5 | F25 |
| `CRUD` | `/api/prompt-webhooks` | 5 | F26 |
Commit Strategy
Phase 0 commits (common-plat)
feat(llm): add ContentPart type + multipart ChatMessage.content support
feat(llm): update OpenAI + Azure providers for vision messages
feat(llm): add embedding support (EmbeddingRequest/Response, embed())
feat(llm): add isVisionMessage + buildVisionMessage helpers
test(llm): add vision + embedding tests (30 tests)
feat(llm-router): add supportsVision + supportsEmbedding model capability flags
chore(llm): bump to 0.2.0 + publish
Phase 1 commits (notelett backend)
feat(backend): add @bytelyst/llm + lib/llm.ts singleton + LLM config
feat(note-prompts): types + Zod schemas for templates and run input/output
feat(note-prompts): repository — template CRUD
feat(note-prompts): runner — LLM orchestration + multi-note + chains
feat(note-prompts): routes — REST API endpoints (12 routes)
feat(note-prompts): seed 20 built-in prompt templates
feat(backend): upgrade copilot-transform.ts to use @bytelyst/llm
feat(backend): add reading-time utility
feat(agent-actions): add smart_action + auto_enrich types
feat(mcp): add notes.prompts.run tool
test(note-prompts): full test suite (55 tests)
Phase 2 commits
feat(backend): embeddings service — embed text + cosine similarity
feat(backend): note embedding storage on create/update
feat(backend): auto-summarize on save (feature-flag gated)
feat(backend): duplicate detection endpoint
feat(backend): related notes suggestion endpoint
feat(backend): knowledge gap detection endpoint
test(backend): intelligence tests (25 tests)
Phase 3 commits
feat(web): prompt-client API client
feat(web): SmartActionsPanel + RunPromptModal + PromptResultView
feat(web): PromptTemplateEditor + /prompts library page
feat(web): upgrade NoteEditor — Fix & Rewrite, Change Tone, Continue Writing, Inline Q&A
feat(web): duplicate detection toast + related notes panel
feat(web): auto-tag suggestion UI + export actions
feat(web): knowledge gap analysis page
feat(web): wire Smart Actions into note detail + sidebar
test(web): Smart Actions unit + E2E tests (36 tests)
Phase 4 commits
feat(mobile): note-prompts API client + prompt-store
feat(mobile): camera capture + image resize + blob upload
feat(mobile): SmartActionsSheet bottom sheet + PromptResultScreen
feat(mobile): voice-to-note — expo-av recording + transcription
feat(mobile): screenshot-to-note + multi-image scan
feat(mobile): URL-to-note + clipboard AI paste
feat(mobile): enhanced capture tab with 6 capture modes
test(mobile): Smart Actions tests (26 tests)
Phase 5 commits
feat(backend): scheduled Smart Actions — cron scheduler + CRUD
feat(backend): weekly workspace digest template + scheduled action
feat(backend): webhook-triggered actions — CRUD + trigger endpoint
feat(backend): approval-gated actions in runner
test(backend): scheduler + webhook tests (22 tests)
Phase 6 commits
feat(all): 8 feature flags for gradual rollout
feat(all): 11 telemetry events for Smart Actions
docs: update PRD, AGENTS.md, roadmaps for Smart Actions
docs: Smart Actions user guide
chore: update .env.example + Docker for LLM support
test: end-to-end integration tests (15 tests)
Risk Mitigation
| Risk | Mitigation |
|---|---|
| OpenAI API costs | Per-user daily quota, model tier selection (gpt-4o-mini default, gpt-4o vision only), feature flag gating |
| Vision prompt latency (5-15s) | Progress indicator, allow background processing, cache identical requests |
| Image size limits | Client-side resize to max 2048px, compress < 4MB before upload |
| Prompt injection | System prompt hardening, output validation, truncate excessively long inputs |
| LLM hallucination | JSON mode where possible, output schema validation, clear UI disclaimer |
| Corporate proxy blocking OpenAI | Support Azure OpenAI as alternative (already in @bytelyst/llm) |
| Embedding cost at scale | Batch embeddings, cache embeddings on note doc, recompute only on content change |
| Audio transcription accuracy | Show editable preview before saving, allow manual corrections |
| Scheduler reliability | In-process interval (simple), log missed runs, diagnostics endpoint |
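The client-side resize in the image-size row above (max 2048px on the longest edge) comes down to one scale computation. `fitWithin` is a hypothetical helper; the actual resize call on web or mobile, and the compression step that brings the file under 4MB, are not shown.

```typescript
// Scales dimensions down so the longest edge is at most maxEdge,
// preserving aspect ratio. Images already within the limit are unchanged.
function fitWithin(
  width: number,
  height: number,
  maxEdge = 2048,
): { width: number; height: number } {
  const longest = Math.max(width, height);
  if (longest <= maxEdge) return { width, height };
  const scale = maxEdge / longest;
  return {
    width: Math.round(width * scale),
    height: Math.round(height * scale),
  };
}
```

A 4032x3024 phone photo, for example, would be resized to 2048x1536 before upload.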
Future Extensions (Not in This Roadmap)
- RAG context — include related notes as context in prompts for better answers
- Agent marketplace prompts — share templates across ByteLyst products
- Multi-step workflow builder — visual chain editor (drag-and-drop)
- Streaming for mobile — SSE on React Native for real-time token display
- Collaborative Smart Actions — run prompts across shared workspaces
- Custom model support — plug in local Ollama models via `@bytelyst/ollama-client`
- Action replay — re-run a previous Smart Action with same parameters
- Template versioning — track changes to custom templates over time
Appendix: Review Findings & Resolutions
Systematic code-level audit of this roadmap against the actual NoteLett and common-plat codebases, conducted April 2026. Each finding cross-references the real source files.
Finding 1 — FIXED: Timeline diagram showed wrong dependency flow
Severity: Medium — Incorrect diagram could mislead parallel scheduling
Was: Phase 3/4 branching from Phase 2
Fix: Phase 3/4 branch from Phase 1. Phase 2 → Phase 5. Diagram corrected above.
Finding 2 — FIXED: PromptTemplateDoc was missing productId field
Severity: Critical — violates NoteLett convention: every Cosmos document MUST include productId: "notelett"
Source: backend/src/modules/notes/types.ts — all other docs (NoteDoc, NoteArtifactDoc, NoteAgentActionDoc) have productId
Fix: Added productId, userId, createdAt, updatedAt to the type definition in §1.5.
Note: PromptScheduleDoc (§5.1) and PromptWebhookDoc (§5.3) must also include productId and userId when implemented.
Finding 3 — FIXED: Reading time endpoint was POST, should be GET
Severity: Low — Pure calculation with no side effects
Source: REST convention — GET for idempotent read operations
Fix: Changed to GET /api/notes/:id/reading-time in §1.8.
Finding 4 — FIXED: Backend file count was wrong (claimed 18+8, actual 11+7)
Severity: Low — Documentation accuracy
Fix: Corrected to "11 new + 7 modified" in New Files Summary.
Finding 5 — FIXED: Web file count missing copilot-client.ts and types.ts
Severity: Medium — These files MUST be updated but were omitted
Source: web/src/lib/copilot-client.ts defines CopilotAction = 'shorten' | 'expand' | 'bulletize' | 'grammar' — needs new types for F1/F2.
web/src/lib/types.ts needs PromptTemplate, RunPromptInput, RunPromptOutput, SimilarNote, SuggestedLink, GapAnalysis types.
Fix: Added both to web modified files list. Count corrected to "8 new + 5 modified".
Finding 6 — FIXED: Mobile capture sub-routes were listed as tabs
Severity: High — Would break the 5-tab navigator
Source: mobile/src/app/(tabs)/_layout.tsx has exactly 5 tabs: Home, Search, Capture, Inbox, Settings. Adding 3 more tabs (voice-capture, url-capture, scan-capture) would overflow the tab bar.
Fix: Changed to sub-routes of capture: src/app/capture/voice.tsx, src/app/capture/url.tsx, src/app/capture/scan.tsx. These are navigated to FROM the capture tab, not separate tabs.
Finding 7 — FIXED: Phase 6 deliverables listed test count twice (redundant)
Severity: Low
Fix: Consolidated into single line.
Finding 8 — RESOLVED: Embedding storage strategy needs decision
Severity: High — Affects Cosmos RU cost and query patterns
Issue: §2.2 proposes storing embedding: number[] directly on NoteDoc. For text-embedding-3-small, each embedding is 1536 floats (~6KB). This increases every NoteDoc read by ~6KB, affecting list queries and the notes container partition-level throughput.
Recommendation: Either:
- (a) Store embeddings in a SEPARATE `note_embeddings` container (partition: `/workspaceId`), with documents keyed by `noteId`. Keeps `NoteDoc` lean.
- (b) Store inline but use Cosmos projection queries (`SELECT c.id, c.title, c.embedding FROM c`) to avoid pulling full note bodies when only embeddings are needed.
Option (a) is preferred for scale. Adds 1 new Cosmos container.
Action: Implementer should choose (a) or (b) at Phase 2 start and update `cosmos-init.ts` accordingly.
Resolution: @bytelyst/llm embed() implemented in OpenAI, Azure, Mock providers. Separate note_embeddings container deferred to Phase 2.
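Whichever storage option is chosen, duplicate detection and link suggestions compare embeddings by cosine similarity, and the ~6KB figure above follows from 1536 values at 4 bytes each. A minimal sketch of both, where `cosineSimilarity` and the constants are illustrative names rather than the contents of `src/lib/embeddings.ts`:

```typescript
// Cosine similarity of two equal-length vectors: dot(a, b) / (|a| * |b|).
// Returns 0 when either vector has zero magnitude.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    const x = a[i] ?? 0;
    const y = b[i] ?? 0;
    dot += x * y;
    normA += x * x;
    normB += y * y;
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Size estimate behind the "~6KB" figure: 1536 dimensions as 32-bit floats.
// (Assumes binary float32; a JSON-serialized number[] takes more space.)
const EMBEDDING_DIMS = 1536;
const BYTES_PER_FLOAT32 = 4;
const rawEmbeddingBytes = EMBEDDING_DIMS * BYTES_PER_FLOAT32; // 6144 bytes
```

A duplicate check would flag pairs whose similarity exceeds a threshold; the roadmap does not state the threshold, so none is assumed here.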
Finding 9 — RESOLVED: Voice-to-note (F15) transcription backend not fully specified
Severity: Medium — Implementation decision needed
Issue: §4.7 says "Call backend transcription endpoint (or use extraction-service with speech task)" but no endpoint or extraction task is defined. Options:
- (a) Add `speech_transcription` task to extraction-service (Python sidecar already supports Whisper/Azure STT)
- (b) New backend endpoint `POST /api/note-prompts/transcribe` that calls Azure Speech SDK
- (c) Client-side transcription via `expo-speech` (limited quality; note that `expo-speech` itself provides text-to-speech only, so a separate speech-recognition library would be required)
Recommendation: Option (a) — extraction-service already has Python sidecar infrastructure. Add task type `speech_transcription` and a new endpoint `POST /api/note-prompts/transcribe` that wraps extraction-service.
Resolution: POST /api/transcribe added to extraction-service (OpenAI Whisper API). transcribe() added to @bytelyst/extraction client.
Finding 10 — RESOLVED: URL-to-note backend endpoint assigned to Phase 4 but needs backend work
Severity: Medium — Mobile Phase 4 depends on backend route that isn't in Phase 1
Issue: POST /api/note-prompts/url-extract was assigned to Phase 4 in the API endpoints table, but this is a SERVER-SIDE endpoint (URL fetch, HTML strip, summarize). It must be implemented in the BACKEND before mobile can use it.
Recommendation: Move this endpoint to Phase 1 (backend routes) since the runner infrastructure is already being built there.
Resolution: POST /api/note-prompts/url-extract implemented in Phase 1 backend routes.
Finding 11 — RESOLVED: Phase 5 PromptWebhookDoc needs its own Cosmos container
Severity: Low — Currently untracked
Issue: §5.3 defines PromptWebhookDoc but no Cosmos container is mentioned for it. The "New Cosmos Containers" section only lists note_prompts and note_prompt_schedules.
Recommendation: Add note_prompt_webhooks container (partition: /workspaceId) or store webhooks in note_prompt_schedules with a discriminator.
Resolution: note_prompt_webhooks container added to cosmos-init.ts.
Finding 12 — RESOLVED: @bytelyst/llm factory reads env vars directly, not via Zod config
Severity: Low — Clarification needed, not a bug
Issue: factory.ts in @bytelyst/llm reads process.env.LLM_PROVIDER, process.env.OPENAI_API_KEY, etc. directly. The roadmap also adds these to NoteLett's Zod config schema (§1.3). These serve different purposes:
- `@bytelyst/llm` factory — reads env at provider instantiation time
- NoteLett config.ts — validates env at startup for fail-fast
Clarification: Both are correct. Config.ts validates upfront, but the LLM package uses its own env reads. No code conflict, but implementers should know the LLM package ignores NoteLett's parsed `config` object.
Finding 13 — RESOLVED: chatCompletionStream() now implemented in all providers
Severity: Medium — F3 (Continue Writing) depends on streaming
Source: packages/llm/src/providers/openai.ts, azure-openai.ts, mock.ts
Resolution: chatCompletionStream() is fully implemented in OpenAI (SSE parsing, buffer handling, [DONE] sentinel), Azure OpenAI (same pattern), and Mock (word-by-word simulation). 3 streaming tests in llm.test.ts. No further work needed.
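The SSE handling mentioned in the resolution above (buffering partial lines, stopping at the `[DONE]` sentinel) follows a common pattern for OpenAI-style streams. The sketch below illustrates that pattern only; it is not the code in `packages/llm`, and `createSseParser` is a hypothetical name. It assumes each event is a `data:` line whose JSON payload carries the token at `choices[0].delta.content`.

```typescript
// Returns a function that accepts raw text chunks from the network and
// emits tokens via onToken. The returned function yields true once the
// [DONE] sentinel has been seen.
function createSseParser(onToken: (text: string) => void): (chunk: string) => boolean {
  let buffer = "";
  let done = false;
  return (chunk: string): boolean => {
    buffer += chunk;
    const lines = buffer.split("\n");
    buffer = lines.pop() ?? ""; // keep the trailing partial line for the next chunk
    for (const raw of lines) {
      const line = raw.trim();
      if (done || !line.startsWith("data:")) continue; // skip blanks and comments
      const data = line.slice("data:".length).trim();
      if (data === "[DONE]") {
        done = true;
        continue;
      }
      const parsed = JSON.parse(data) as {
        choices?: { delta?: { content?: string } }[];
      };
      const text = parsed.choices?.[0]?.delta?.content;
      if (text) onToken(text);
    }
    return done;
  };
}
```

The buffer matters because network chunks can split an event mid-line; without it, `JSON.parse` would fail on the fragment.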
Finding 14 — RESOLVED: CopilotAction union expanded
Severity: Medium — F1/F2 require new action types
Resolution: CopilotAction expanded to include 'fix-rewrite', 'change-tone', 'continue', 'explain'. CopilotBodySchema in notes/routes.ts updated. grammar is kept as a deprecated alias for fix-rewrite.
Finding 15 — NOTE: @bytelyst/llm has zero runtime dependencies
Severity: Info
Source: packages/llm/package.json — only devDependencies: { vitest }. Uses native fetch().
Impact: No extra bundling concerns. Requires Node 18+ or a fetch polyfill.
Finding 16 — NOTE: note_artifacts has summary as an existing artifact type
Severity: Info
Source: backend/src/modules/note-artifacts/types.ts line 3: NOTE_ARTIFACT_TYPES = ['file', 'summary', 'extraction', 'citation', 'export']
Impact: F6 (auto-summarize) can use artifactType: 'summary' directly — no schema changes needed for artifact types. F20-F23 (export actions) can use artifactType: 'export'. Good alignment.
Finding 17 — NOTE: Agent action type 'summarize' already exists
Severity: Info
Source: backend/src/modules/note-agent-actions/types.ts line 3: NOTE_AGENT_ACTION_TYPES = ['create', 'update', 'summarize', 'extract_tasks', 'attach_citation']
Impact: We can reuse 'summarize' for F6 (auto-summarize) or still add 'smart_action' as a general-purpose type. Recommendation: add 'smart_action' and 'auto_enrich' as planned, and use 'smart_action' for all prompt runs (the template slug provides the specificity).
Finding 18 — NOTE: Phase 5 weekly-digest is template #21, but §1.9 seeds only 20
Severity: Low — Consistency
Issue: §5.2 says "Add template #21: weekly-digest". This means Phase 5 adds a 21st template, seeded separately from the initial 20.
Clarification: This is correct behavior — built-in template count grows from 20 to 21 in Phase 5. The seed file should support incremental additions (upsert by slug, not hard-coded count).
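The "upsert by slug" behavior recommended above can be sketched with an in-memory list standing in for the `note_prompts` container. `SeedTemplate` and `upsertBySlug` are hypothetical names; the real seed file would write to Cosmos instead.

```typescript
// Minimal stand-in for a built-in template document.
interface SeedTemplate {
  slug: string;
  name: string;
}

// Inserts templates whose slug is new and replaces those whose slug exists,
// so re-running the seed is idempotent and adding template #21 needs no
// hard-coded count.
function upsertBySlug(existing: SeedTemplate[], seeds: SeedTemplate[]): SeedTemplate[] {
  const bySlug = new Map<string, SeedTemplate>();
  for (const template of existing) bySlug.set(template.slug, template);
  for (const seed of seeds) bySlug.set(seed.slug, seed);
  return Array.from(bySlug.values());
}
```

Seeding `weekly-digest` in Phase 5 against the 20 Phase 1 templates yields 21, and running the same seed again still yields 21.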
Summary of Inline Fixes Applied
| # | Finding | Severity | Status |
|---|---|---|---|
| 1 | Timeline diagram wrong dependency | Medium | Fixed |
| 2 | Missing productId in `PromptTemplateDoc` | Critical | Fixed |
| 3 | Reading time POST → `GET` | Low | Fixed |
| 4 | Backend file count 18+8 → 11+7 | Low | Fixed |
| 5 | Web missing copilot-client.ts + types.ts | Medium | Fixed |
| 6 | Mobile tabs overflow (voice/url/scan) | High | Fixed |
| 7 | Phase 6 duplicate test count | Low | Fixed |
| 8 | Embedding storage strategy | High | Resolved — @bytelyst/llm embed() implemented in OpenAI, Azure, Mock providers. Separate note_embeddings container deferred to Phase 2. |
| 9 | Voice transcription backend unspecified | Medium | Resolved — POST /api/transcribe added to extraction-service (OpenAI Whisper API). transcribe() added to @bytelyst/extraction client. |
| 10 | URL-extract endpoint in wrong phase | Medium | Resolved — POST /api/note-prompts/url-extract implemented in Phase 1 backend routes. |
| 11 | Webhook container missing | Low | Resolved — note_prompt_webhooks container added to cosmos-init.ts. |
| 12 | LLM factory vs Zod config clarification | Low | Resolved — info only, no conflict. |
| 13 | Streaming not implemented in providers | Medium | Resolved — chatCompletionStream() implemented in OpenAI, Azure, Mock providers with SSE parsing. |
| 14 | CopilotAction union needs expansion | Medium | Resolved — expanded to include fix-rewrite, change-tone, continue, explain. |
| 15 | Zero runtime deps in @bytelyst/llm | Info | Noted |
| 16 | Artifact type 'summary' already exists | Info | Noted |
| 17 | Agent action 'summarize' already exists | Info | Noted |
| 18 | Template #21 added in Phase 5 | Low | Noted |